New tools are proposed to prevent nuclear proliferation

JAN 29, 2021

Machine learning algorithms trained with publicly available information can detect previously unknown commerce involving nuclear weapons–related materials.

DOI: 10.1063/PT.6.2.20210129a

David Kramer

Little by little, a nation or a subnational or terrorist group aspiring to build a nuclear weapon will acquire necessary technologies, materials, and precision equipment that by themselves may be innocuous. Those items include maraging steels, used to fabricate gas centrifuges for enriching uranium, and CNC (computer numerical control) milling machines that can shape weapons components within tight tolerances.

In concert with most of the world’s nuclear powers, the US restricts and closely monitors exports of such dual-use items, so-called because they have legitimate non-weapons purposes. The Department of Commerce (DOC) maintains a list of more than 1000 “entities” (companies, government agencies, and individuals that are suspected proliferators) to which exports of countless items included in a 77-page single-spaced index are barred. Some 330 Russian and 260 Chinese entities are included. Iran is well represented on the list, as are Pakistan, Turkey, Malaysia, and many other nations. A surprising number of entities are located in Canada.

Nongovernmental organizations that are active in nonproliferation, such as the James Martin Center for Nonproliferation Studies, the Center for Strategic and International Studies, 38 North, and King’s College London’s Project Alpha, have become adept at using satellite imagery to expose suspicious activities. But vast quantities of other public data could be exploited, including shipping manifests, corporate registry filings, procurement tenders, and vessel or aircraft position data.

In a new report, researchers from the Nuclear Threat Initiative (NTI) and the Center for Advanced Defense Studies (C4ADS) describe how they used machine learning and other data-analysis tools to sift through more than 4 million transactions in publicly available trade records. They ultimately uncovered 10 new entities that were added to the DOC list. The researchers say that using multiple data sources helped to fill gaps in individual sources or corroborate details from third-party data sets, which may vary in their scope, completeness, or reliability.

The report “identifies certain limited cases where machine learning proves useful in processing massive quantities of publicly available trade data to identify signals of potentially illicit activities,” says Edwin Lyman, director of nuclear power safety at the Union of Concerned Scientists, who did not contribute to the report.

Six of the newly listed entities were detected during a 2019 C4ADS exercise that scoured trade records to map Pakistan’s nuclear procurement infrastructure. The nonprofit began with the 55 known entities in Pakistan that the DOC and the Japanese Ministry of Economy, Trade, and Industry had identified as procuring on behalf of Pakistan’s nuclear program. Using publicly available trade data such as bills of lading, C4ADS identified all overseas companies from which the known Pakistani entities were procuring goods. Then C4ADS identified all the previously unknown companies in Pakistan that had procured materials from the same overseas suppliers. Analysts used a variety of social-network analyses and investigative techniques to assess each company’s risk.
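The supplier-tracing steps described above can be illustrated with a minimal Python sketch. It is not the C4ADS method: the record layout, the entity names, and the idea of ranking candidates by the number of shared suppliers are all assumptions made for illustration, and the report's analysts relied on much richer social-network analysis and investigation to assess risk.

```python
from collections import defaultdict

# Hypothetical (buyer, supplier) pairs standing in for bills of lading.
# All names are invented for illustration.
records = [
    ("Known Entity A", "Supplier X"),
    ("Known Entity A", "Supplier Y"),
    ("Known Entity B", "Supplier Y"),
    ("Company C", "Supplier X"),
    ("Company C", "Supplier Y"),
    ("Company D", "Supplier Y"),
    ("Company E", "Supplier Z"),
]
known_entities = {"Known Entity A", "Known Entity B"}


def expand_network(records, known):
    """Return the suppliers used by known entities and, for every other
    buyer, the subset of those suppliers it also bought from."""
    # Step 1: every supplier that shipped to a known entity.
    suppliers = {supplier for buyer, supplier in records if buyer in known}
    # Step 2: previously unknown buyers that used the same suppliers.
    shared = defaultdict(set)
    for buyer, supplier in records:
        if buyer not in known and supplier in suppliers:
            shared[buyer].add(supplier)
    return suppliers, dict(shared)


suppliers, candidates = expand_network(records, known_entities)

# Assumed triage rule: more shared suppliers first, then alphabetical.
ranked = sorted(candidates, key=lambda b: (-len(candidates[b]), b))
for buyer in ranked:
    print(buyer, sorted(candidates[buyer]))
```

A ranking like this only produces leads for review. Sharing a supplier with a listed entity is not evidence of wrongdoing on its own.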

Jason Arterburn, a C4ADS program director, says that the process worked, but it was inefficient and complicated by the fact that no bills of lading specifically identify nuclear weapon components. Analysts had to incorporate other forms of data and assessment through a much broader investigative process that would take other analysts anywhere from hours to years depending on available resources and tools and access to data. Such delays limit the time in which authorities can act to halt the illicit activity.

Arterburn and his colleagues worked to lessen those inefficiencies using machine learning. Preparing trade data for analysis requires significant data preprocessing and standardization because of variance in the way that key data fields such as company names are recorded. The collaborators developed machine learning tools to automate that operation. Other algorithms identified high-risk trading patterns more accurately and quickly than trained human analysts, Arterburn says. So the artificial intelligence tools not only reduced the time to process the data, but also improved the quality of insights the analysts had in their review.
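To give a sense of the standardization problem, the following Python sketch groups spelling variants of a company name under one key. It is a simple rule-based stand-in, not the machine learning tooling the collaborators built; the suffix list and the example names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

# Assumed list of legal-form words to ignore; a real list would be far longer.
LEGAL_SUFFIXES = {
    "ltd", "limited", "pvt", "private", "co", "company",
    "inc", "corp", "corporation", "llc",
}


def normalize_name(raw):
    """Lowercase, strip accents and punctuation, and drop legal-form words."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    tokens = [tok for tok in text.split() if tok not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def group_by_normalized_name(names):
    """Map each normalized key to the raw spellings that produced it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return dict(groups)


# Invented examples of how one company might appear across trade records.
raw_names = [
    "ACME Trading Co., Ltd.",
    "Acme Trading (Pvt) Limited",
    "Acmé Trading Company",
    "Borealis Metals Inc",
]
groups = group_by_normalized_name(raw_names)
for key, variants in groups.items():
    print(key, "<-", variants)
```

Fixed rules like these miss transliteration differences, abbreviations, and typos, which is one reason a learned matching approach may be attractive for data at this scale.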

In a separate exercise, NTI and C4ADS used an autoencoder, a type of deep-learning model often used to detect credit card fraud. Engineers trained the model to detect proliferation as an anomaly using records of all shipments excluding those from companies with known associations to a country’s weapons of mass destruction program. Then, when the model analyzed shipments by entities that weren’t known to be of concern, it flagged possible dual-use items as anomalies. Further screening of the flagged shipments by subject-matter experts led to the addition of four entities to the DOC list.
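The idea of flagging shipments by reconstruction error can be shown with a toy example. The Python sketch below is an assumption-laden illustration, not the NTI and C4ADS model: it uses a one-unit linear autoencoder with tied weights rather than a deep network, and the two numeric features per shipment are synthetic. It trains only on baseline records, sets a threshold at the largest baseline error, and flags new records that reconstruct poorly.

```python
import random


def reconstruct(w, x):
    """Encode x to a single number z = w.x, then decode it as w * z."""
    z = w[0] * x[0] + w[1] * x[1]
    return (w[0] * z, w[1] * z)


def reconstruction_error(w, x):
    """Squared distance between a record and its reconstruction."""
    x_hat = reconstruct(w, x)
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def train_autoencoder(data, epochs=500, lr=0.1):
    """Full-batch gradient descent on the mean reconstruction error."""
    w = [1.0, 0.0]
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]
            r0 = x[0] - w[0] * z          # residual, first feature
            r1 = x[1] - w[1] * z          # residual, second feature
            wr = w[0] * r0 + w[1] * r1
            # Gradient of |x - w (w.x)|^2 with respect to w.
            g0 += -2.0 * (z * r0 + wr * x[0])
            g1 += -2.0 * (z * r1 + wr * x[1])
        w[0] -= lr * g0 / n
        w[1] -= lr * g1 / n
    return w


# Synthetic baseline: two standardized features that normally move together.
rng = random.Random(0)
baseline = []
for _ in range(200):
    a = rng.uniform(-1.0, 1.0)
    baseline.append((a, a + rng.gauss(0.0, 0.05)))

w = train_autoencoder(baseline)
# Assumed rule: anything reconstructed worse than every baseline record is flagged.
threshold = max(reconstruction_error(w, x) for x in baseline)

new_shipments = {"routine": (0.5, 0.5), "unusual": (0.8, -0.8)}
flagged = [name for name, x in new_shipments.items()
           if reconstruction_error(w, x) > threshold]
print(flagged)
```

As in the exercise described above, a flag is only a prompt for review by subject-matter experts. The model knows what typical records look like, not what proliferation looks like.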

A deep-learning model trained with the trade records of companies with no known connection to nuclear weapons programs was able to flag other shipments that might be connected to proliferation (blue dots inside box).

NTI & C4ADS

“This is another tool in raising the bar to proliferation. It has enormous deterrent value,” says Ernest Moniz, NTI cochair and former US secretary of energy. “Ultimately we are working to build a safer world by making it nearly impossible for a proliferator to escape detection.”

However, Lyman notes that machine learning is limited by its reliance on test data for training. Adversaries who are aware of the technique could try to outsmart it by making changes to their behavior to disguise the signals that the system is trained to detect. He also expresses concerns about the amount of resources required for implementation. “I don’t think it’s clear from the study that the benefits outweigh the costs and that the technology is ready for widespread adoption,” he says.

The report recommends that leaders of nuclear nonproliferation efforts around the world integrate publicly available information more deeply into their existing monitoring and verification regimes, use machine learning and other analytical approaches to plumb big data, and allow analysts to access shared data internationally.

More about the authors

David Kramer, dkramer@aip.org