Meta launches ‘Purple Llama’ AI security suite to fulfill White House promise
On December 7, Meta launched a suite of tools for acquiring and benchmarking generative artificial intelligence (AI) models.
The toolkit, called “Purple Llama,” is designed to help developers build securely using generative AI tools like Llama-2, Meta’s open source model.
Announcing Purple Llama — a new project to help level the playing field for building safe and responsible generative AI experiences.
Purple Llama includes licensed tools, trials, and models for both research and commercial use.
More details ➡️ https://t.co/k4ezDvhpHp pic.twitter.com/6BGZY36eM2
— Meta’s AI (@AIatMeta) December 7, 2023
AI Purple Team Building
According to Meta’s blog post, the “Purple” part of “Purple Llama” refers to a combination of “red teaming” and “blue teaming.”
Red teaming is a paradigm in which developers or internal testers determine whether they can intentionally attack an AI model to produce errors, defects, or unwanted output and interactions. This helps developers build resilience strategies against malicious attacks and protect against security and safety flaws.
Blue teaming, on the other hand, is the exact opposite. Here, developers or testers respond to red team attacks to determine mitigation strategies needed to combat real-world threats in production, consumer, or client-facing models.
Per goal:
“We believe that to truly mitigate the challenges presented by generative AI, we must adopt both offensive (red team) and defensive (blue team) postures. The formation of a Purple Team, comprised of the responsibilities of the Red and Blue Teams, is a collaborative approach to assessing and mitigating potential risks.”
model protection
This release, which Meta claims is “the industry’s first set of cybersecurity safety assessments for large-scale language models (LLMs),” includes:
- Metrics for Quantifying LLM Cybersecurity Risk
- Tool to evaluate the frequency of unsafe code suggestions
- A tool that evaluates LLMs to make it harder to create malicious code or help carry out cyberattacks.
The big idea is to integrate the system into the model pipeline to reduce unwanted output and insecure code while limiting the usefulness of model attacks to cybercriminals and malicious actors.
“With this initial release, we aim to provide tools to help address the risks outlined in the White House promise,” the Meta AI team wrote.
Related: Biden Administration Issues Executive Order on New AI Safety Standards