The National Institute of Standards and Technology (NIST), a key agency under the U.S. Commerce Department, has reintroduced Dioptra, a testbed designed to measure how malicious attacks, particularly those that "poison" an AI model's training data, can degrade the system's performance.
Dioptra, named after a classical astronomical and surveying instrument, is a modular, open-source web-based tool first released in 2022. Its purpose is to help companies training AI models, as well as users of these models, assess, analyze, and track AI risks. According to NIST, Dioptra can be used to benchmark and research models, and to provide a common platform for exposing models to simulated threats in a "red-teaming" environment.
“Testing the effects of adversarial attacks on machine learning models is one of the goals of Dioptra,” NIST stated in a press release. “The open-source software, available for free download, could help the community, including government agencies and small to medium-sized businesses, conduct evaluations to assess AI developers’ claims about their systems’ performance.”
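To make the idea of an adversarial attack concrete, here is a minimal, generic sketch (not Dioptra's actual interface or workflow) of one common attack class: an evasion attack that nudges test inputs in the direction that most increases a model's loss, in the style of the fast gradient sign method. The toy model, data, and perturbation budget are placeholder assumptions for illustration.

```python
# A minimal sketch of an evasion-style adversarial attack (FGSM-like).
# The model and data below are toy placeholders, not anything from Dioptra.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classifier standing in for a model under test.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 20)           # a small batch of inputs
y = torch.randint(0, 2, (8,))    # their true labels
epsilon = 0.1                    # perturbation budget (an assumed value)

# Compute the gradient of the loss with respect to the inputs.
x_adv = x.clone().requires_grad_(True)
loss = loss_fn(model(x_adv), y)
loss.backward()

# Step each input in the sign of its gradient to maximize the loss.
with torch.no_grad():
    x_adv = x_adv + epsilon * x_adv.grad.sign()

# Compare accuracy on clean vs. perturbed inputs; with a real trained model,
# accuracy typically drops sharply under attack.
clean_acc = (model(x).argmax(dim=1) == y).float().mean().item()
adv_acc = (model(x_adv).argmax(dim=1) == y).float().mean().item()
print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```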
Dioptra was launched alongside new documents from NIST and its recently created AI Safety Institute, which outline ways to mitigate some of the dangers of AI, such as its potential to generate nonconsensual pornography. This follows the U.K. AI Safety Institute's release of Inspect, a toolset similarly aimed at assessing model capabilities and overall safety. The U.S. and U.K. have an ongoing partnership to jointly develop advanced AI model testing, announced at the U.K.’s AI Safety Summit in Bletchley Park last November.
This initiative also stems from President Joe Biden’s executive order on AI, which mandates that NIST assist with AI system testing. The order further establishes standards for AI safety and security, including requirements that companies developing models notify the federal government and share the results of all safety tests before those models are deployed to the public.
AI benchmarks are notoriously challenging, partly because the most sophisticated AI models today are black boxes, with their infrastructure, training data, and other key details kept secret by their creators. A recent report from the Ada Lovelace Institute, a U.K.-based nonprofit research institute that studies AI, found that evaluations alone aren’t sufficient to determine the real-world safety of an AI model, not least because current policies allow AI vendors to selectively choose which evaluations to conduct.
While NIST doesn’t claim that Dioptra can completely eliminate risk, the agency says the testbed can reveal which types of attacks might impair an AI system and quantify their impact on its performance.
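As a rough illustration of what quantifying that impact can look like (again, a generic sketch rather than Dioptra's own API), the example below flips the labels of a fraction of the training data, a simple form of data poisoning, retrains a model, and reports how test accuracy falls as the poisoned fraction grows. The dataset, model, and flip fractions are assumed purely for demonstration.

```python
# A minimal sketch, not Dioptra itself: measure how a label-flipping
# poisoning attack on the training set degrades test accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_poisoning(flip_fraction: float) -> float:
    """Flip labels on a random fraction of training points, retrain, evaluate."""
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    idx = rng.choice(len(y_poisoned), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]   # binary labels: flip 0 <-> 1
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return accuracy_score(y_test, model.predict(X_test))

for frac in (0.0, 0.1, 0.3):
    print(f"poisoned fraction {frac:.0%}: "
          f"test accuracy {accuracy_with_poisoning(frac):.3f}")
```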
However, a significant limitation of Dioptra is that it works out of the box only on models that can be downloaded and run locally, like Meta’s expanding Llama family. Models gated behind an API, such as OpenAI’s GPT-4, are currently unsupported.
Even so, Dioptra is a useful step toward better understanding and mitigating AI risks, giving developers, businesses, and policymakers a common, open platform for assessing how models hold up under attack.