# AML for AI as a verification mechanism

> Source: <https://www.lesswrong.com/posts/AMTx3EBBgyGc32tKH/aml-for-ai-as-a-verification-mechanism>
> Published: 2026-06-13 11:59:11+00:00

The idea is to build a system that tracks key nodes in AI infrastructure in order to detect preparation for, or execution of, large training runs, and to monitor the overall situation more generally. If an international agreement limiting AI development is reached in the future, for example one that caps FLOPs per training run, such a system could be used to detect rogue data centers and hidden training runs. Compute thresholds already have a precedent: the EU AI Act uses a threshold of around 10²⁵ FLOPs for GPAI models with systemic risk [1], and providers are required to notify the AI Office without undue delay.
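To give a feel for what a threshold of around 10²⁵ FLOPs means in hardware terms, here is a rough back-of-the-envelope sketch. The cluster size, per-chip peak throughput, and utilization are hypothetical numbers picked for illustration, and the formula (chips × peak throughput × utilization × time) is a common approximation, not an official accounting method.

```python
# Threshold taken from the text above: about 10^25 FLOPs per training run.
THRESHOLD_FLOPS = 1e25

SECONDS_PER_DAY = 86_400


def estimated_training_flops(num_accelerators, peak_flops_per_s, utilization, days):
    """Rough estimate: chips x peak throughput x utilization x wall-clock time."""
    return num_accelerators * peak_flops_per_s * utilization * days * SECONDS_PER_DAY


def days_to_threshold(num_accelerators, peak_flops_per_s, utilization,
                      threshold=THRESHOLD_FLOPS):
    """Days of continuous training this cluster needs to reach the threshold."""
    per_day = estimated_training_flops(num_accelerators, peak_flops_per_s, utilization, 1)
    return threshold / per_day


# Hypothetical cluster (assumed values, not real data):
# 10,000 accelerators, 1e15 FLOP/s peak each, 40% utilization.
flops_30_days = estimated_training_flops(10_000, 1e15, 0.4, 30)
print(f"{flops_30_days:.3e}", flops_30_days > THRESHOLD_FLOPS)
print(round(days_to_threshold(10_000, 1e15, 0.4), 1))
```

Under these assumptions, a cluster of that size crosses the threshold in about 29 days, which suggests the scale of facility (power, cooling, chip shipments) a monitoring system would need to notice.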

Something similar either already exists or is being developed. As far as I know, one example is the SemiAnalysis AI Datacenter Model [2], although access to it is expensive. Some acquaintances of mine from Ukraine created an MCP

For myself, I call this idea “AML for AI,” by analogy with anti-money laundering: we collect as much information as possible about the domain of interest and try to combine it into a unified picture, instead of seeing only scattered fragments. This project would probably require a large number of people and significant funding, but people who are interested in it can start with an MVP.

All information would be collected from open sources (OSINT).

What could we track? I should honestly say that I took the answers to “how” from GPT. Please do not treat them as a ready-made list of actual sources to monitor, but rather as examples.

**How**: OpenCorporates, OpenSanctions.

**How**: TED (Tenders Electronic Daily), SAM.gov Contract Opportunities.

**How**: ImportGenius, Panjiva.

**How**: OpenStreetMap, NASA FIRMS, US Census Building Permits Survey.

**How**: EIA Electricity Data Browser, FERC.

**How**: Greenhouse job boards, Lever job sites.

**How**: Copernicus Browser / Copernicus Data Space.

**How**: USGS Water Data, EPA ECHO + NPDES monitoring data.

All of this could be organized into a network of connected elements, making it possible to identify interacting entities and potentially detect hidden construction or model training. Such a system could become the basis for a more advanced high-level system in the future, acting as a verifier for international AI agreements.
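As a minimal sketch of what "a network of connected elements" could look like in an MVP, the snippet below links entities that appear together in an observation and flags clusters where several independent signal types converge. Every entity name, signal label, and the `min_signal_types` cutoff is invented for illustration; a real system would need entity resolution, confidence scores, and much more.

```python
from collections import defaultdict

# Hypothetical observations: (signal_type, entity_a, entity_b).
# Every name below is invented for illustration; none is real data.
OBSERVATIONS = [
    ("registry", "Shell Co A", "Holding B"),
    ("import", "Shell Co A", "Site X"),
    ("power", "Utility C", "Site X"),
    ("permits", "Builder D", "Site X"),
    ("hiring", "Startup E", "Site Y"),
]


def build_graph(observations):
    """Return adjacency sets and, per entity, the signal types that mention it."""
    neighbors = defaultdict(set)
    signals = defaultdict(set)
    for signal_type, a, b in observations:
        neighbors[a].add(b)
        neighbors[b].add(a)
        signals[a].add(signal_type)
        signals[b].add(signal_type)
    return neighbors, signals


def connected_components(neighbors):
    """Group entities that are linked directly or indirectly (iterative DFS)."""
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbors[node] - seen)
        components.append(component)
    return components


def flag_clusters(observations, min_signal_types=3):
    """Return clusters whose members are covered by several distinct signal types."""
    neighbors, signals = build_graph(observations)
    flagged = []
    for component in connected_components(neighbors):
        kinds = set().union(*(signals[n] for n in component))
        if len(kinds) >= min_signal_types:
            flagged.append((sorted(component), sorted(kinds)))
    return flagged


for members, kinds in flag_clusters(OBSERVATIONS):
    print(members, kinds)
```

In this toy data, the cluster around "Site X" is flagged because registry, import, power, and permit signals all point at it, while the isolated hiring signal at "Site Y" is not. The design choice is deliberately simple: a single signal is weak evidence, but several unrelated sources converging on one site is what an analyst would want surfaced.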

As I present it here, the idea is still very raw, and I have spent very little time thinking it through. I suspect that people with a deeper understanding of AI infrastructure and policy would be much better suited to develop it.
