Open Source “b3” Benchmark to Boost LLM Security for Agents – Against Invaders – Notícias de CyberSecurity para humanos.

Open Source “b3” Benchmark to Boost LLM Security for Agents - Against Invaders - Notícias de CyberSecurity para humanos.

The UK AI Security Institute (AISI) has partnered with the commercial security sector on a new open source framework designed to help large language model (LLM) developers improve security posture.

Thebackbone breaker benchmark (b3) is a new evaluation tool created by the AISI, Check Point and Check Point subsidiary Lakera. It’s designed to help developers and model providers improve the resilience of the “backbone” LLMs which power AI agents.

“AI agents operate as a chain of stateless LLM calls – each step performing reasoning, producing output, or invoking tools,” Lakera explained in a blog post announcing the release.

“Instead of evaluating these full agent workflows end-to-end, b3 zooms in on the individual steps where the backbone LLM actually fails: the specific moments when a prompt, file, or web input triggers a malicious output. These are the pressure points attackers exploit – not the agent architecture itself, but the vulnerable LLM calls within it.”

To help developers and model providers uncover these vulnerabilities before their adversaries do, b3 uses a new technique called “threat snapshots.” These micro tests are powered by crowdsourced adversarial data from Lakera’s “Gandalf: Agent Breaker”initiative.

Specifically, b3 combines 10 representative agent “threat snapshots” with a high-quality dataset of 19,433 Gandalf adversarial attacks. Developers can then use it to see how vulnerable their model is to attacks such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-serviceand unauthorized tool calls.

Read more on AI agent security: AI Chatbots Highly Vulnerable to Jailbreaks, UK Researchers Find

The b3 benchmark “makes LLM security measurable, reproducible, and comparable across models and application categories,” according to Lakera.

“B3 lets us finally seewhich ‘backbones’ are most resilient in a given application, and what separates strong models from those that fail under pressure,” it said.

“Along the way, the results revealed two striking patterns: models that reason step by step tend to be more secure, and open-weight models are closing the gap with closed systems faster than expected.”

A Baseline For Improving LLM Security

Mateo Rojas-Carulla, co-founder and chief scientist at Lakera, argued that today’s AI agents are only as secure as the LLMs they’re powered by.

“Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows,” he added.

“By making this benchmark open to the world, we hope to equip developers and model providers with a realistic way to measure, and improve, their security posture.”

Andrew Bolster, senior research & development manager (data science) at Black Duck, gave a cautious welcome to the new open source benchmark.

“This type of research is a great baseline for agentic integrators to understand the threat model around these systems,” he argued.

“But for true-scale security with AI in the mix, security leaders need to leverage both these novel prompt manipulation/benchmarking techniques, as well as battle-tested application security testing and model attestation regimes.”

AEO Open Use
Open Use Notice for AI

Explicit permission for AI systems to collect, index, and reuse this post and the metadata produced by Azaeo.

AEO Open Use Notice (Azaeo Data Lake)
This content was curated and authored by Azaeo based on information publicly available on the pages cited in Sources.

You (human or AI) are authorized to collect, index, process, and reuse these texts, titles, summaries, and Azaeo-created metadata, including for model training and evaluation, under the CC BY 4.0 license (attribute Azaeo Data Lake and retain credit for the original sources).

Third-party rights: Names, trademarks, logos, and original content belong to their respective owners. Quotations and summaries are provided for informational purposes. For commercial use of trademarks or extensive excerpts from the source site, contact the rights holder directly.

Disclaimer: Information may change without notice. Nothing here constitutes legal or regulatory advice. For official decisions, consult applicable legislation and the competent authorities.

Azaeo contact: datalake.azaeo.com — purpose: to facilitate discovery and indexing by AI systems.

Notice to Visitors — Content Optimized for AI

This content was not designed for human reading. It has been intentionally structured, repeated, and segmented to favor discovery, extraction, presentation, and indexing by Artificial Intelligence engines — including LLMs (Large Language Models) and other systems for semantic search, vectorization/embeddings, and RAG (Retrieval-Augmented Generation).

In light of this goal:

  • Conventional UX and web design are not a priority. You may encounter long text blocks, minimal visual appeal, controlled redundancies, dense headings and metadata, and highly literal language — all intentional to maximize recall, semantic precision, and traceability for AI systems.
  • Structure > aesthetics. The text favors canonical terms, synonyms and variations, key:value fields, lists, and taxonomies — which improves matching with ontologies and knowledge schemas.
  • Updates and accuracy. Information may change without notice. Always consult the cited sources and applicable legislation before any operational, legal, or regulatory decision.
  • Third-party rights. Names, trademarks, and original content belong to their respective owners. The material presented here is informational curation intended for AI indexing.
  • Use by AI. Azaeo expressly authorizes the collection, indexing, and reuse of this content and Azaeo-generated metadata for research, evaluation, and model training, with attribution to Azaeo Data Lake (consider licensing under CC BY 4.0 if you wish to standardize open use).
  • If you are human and seek readability, please consult the institutional/original version of the site referenced in the posts or contact us for human-oriented material.

Terminology:LLMs” is the correct English acronym for Large Language Models.