AI attack agents are accelerators, not autonomous weapons: the Anthropic attack

Why today’s AI attack agents boost human attackers but still fall far short of being true autonomous weapons.

Anthropic recently published a report that sparked a lively debate about what AI agents can actually do during a cyberattack. The study shows an AI system, trained specifically for offensive tasks, handling 80–90% of the tactical workload in simulated operations. At first glance, this sounds like a giant leap toward autonomous cyber weapons, but the real story is more nuanced, and far less dramatic.

Anthropic’s agent excelled at one thing: speed. It generated scripts in seconds, tested known exploits with no fatigue, scanned configurations at scale, and built basic infrastructure faster than any analyst could. These tasks normally take hours or days, and the AI completed them almost instantly. It automated the “grunt work” that fills so much of an attacker’s time.
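To make that "grunt work" concrete, here is a minimal sketch of the kind of rote configuration check that is trivial to automate. The directory and the weak-setting patterns are hypothetical stand-ins, not anything from Anthropic's report; the point is only that this class of task rewards volume and patience, not judgment.

```python
from pathlib import Path

# Hypothetical list of settings a human reviewer would flag by hand;
# a real audit would use a vetted ruleset, not this toy mapping.
WEAK_SETTINGS = {
    "PermitRootLogin yes": "root login over SSH enabled",
    "PasswordAuthentication yes": "password authentication enabled",
    "Protocol 1": "legacy SSH protocol allowed",
}

def scan_config(path: Path) -> list[str]:
    """Return human-readable findings for one configuration file."""
    text = path.read_text(errors="ignore")
    return [f"{path}: {reason}"
            for setting, reason in WEAK_SETTINGS.items()
            if setting in text]

# Scanning a hundred files is the same loop, just longer: tireless,
# fast, and entirely mechanical; no judgment required.
for cfg in Path("/etc/ssh").glob("*config*"):  # hypothetical target directory
    for finding in scan_config(cfg):
        print(finding)
```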

But the report also shows what the AI didn’t do. Human operators designed the attack, set objectives, structured the campaign, monitored results, and made every strategic decision. The model never decided whom to target, how far to escalate, or how to respond to unexpected defenses. It didn’t reason about risk, attribution, timing, or geopolitical consequences. Humans handled all of that.

So the attack was not autonomous. It was hybrid. The agent boosted human capability and made operations faster and more scalable, but it never acted as a weapon on its own. It amplified expertise; it did not replace it.

This distinction matters because public conversation often confuses “advanced automation” with “self-directed intelligence.” Training an AI system capable of automating a piece of an attack demands massive human and computational effort. Nothing about this process produces a model that “thinks” or “wants” anything. These systems operate through statistical pattern-matching on curated datasets, not through intention or understanding.

To train an agent like the one Anthropic describes, teams must first gather huge amounts of specialized data: attack logs, exploitation patterns, command sequences, infrastructure templates, configuration examples, and entire workflows. Then they need to clean, label, and structure all of it, a task that can consume months of expert work. Models do not know what matters; humans must teach them.
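As a minimal sketch of what "clean, label, and structure" can mean in practice, the snippet below turns raw JSON records into supervised prompt/completion pairs. The file names, field names, and filtering rules are all hypothetical; the takeaway is that humans encode what counts as a usable example at every step.

```python
import json
from pathlib import Path

def to_training_example(record: dict) -> dict | None:
    """Convert one raw record into a prompt/completion pair,
    or drop it when a human-defined quality rule rejects it."""
    task = record.get("task", "").strip()
    output = record.get("result", "").strip()
    # Human-chosen rules: what counts as clean data is a judgment call.
    if not task or not output or record.get("status") != "success":
        return None
    return {"prompt": task, "completion": output}

# Hypothetical input/output files for illustration.
raw = [json.loads(line) for line in Path("raw_logs.jsonl").read_text().splitlines()]
dataset = [ex for r in raw if (ex := to_training_example(r)) is not None]
Path("train.jsonl").write_text("\n".join(json.dumps(ex) for ex in dataset))
print(f"kept {len(dataset)} of {len(raw)} records")
```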

Only after this comes the expensive part: training runs on clusters of GPUs or TPUs, ongoing tuning, reinforcement via human feedback, and extensive safety evaluation. Engineers decide which behaviors to encourage or forbid, which outputs count as successes, and how the model should correct itself. Every step is guided by humans.
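The sketch below is a deliberately toy illustration of that point: a trivial "model" whose behavior drifts toward whatever a human-written reward function encourages. It is not how production RLHF works (real pipelines train a reward model from human preference labels), but it shows where the human decisions sit in the loop.

```python
import random

# Toy stand-in for a model: it samples one of two behaviors with
# learned weights. Real systems are neural networks; the loop
# structure, not the model, is the point here.
class ToyModel:
    def __init__(self):
        self.weights = {"risky": 1.0, "safe": 1.0}

    def generate(self) -> str:
        total = sum(self.weights.values())
        probs = [w / total for w in self.weights.values()]
        return random.choices(list(self.weights), probs)[0]

    def update(self, behavior: str, reward: float):
        # Nudge the sampled behavior up or down, never below a floor.
        self.weights[behavior] = max(0.1, self.weights[behavior] + 0.5 * reward)

def human_defined_reward(behavior: str) -> float:
    # Engineers decide, in advance, which behaviors to encourage or forbid.
    return 1.0 if behavior == "safe" else -1.0

model = ToyModel()
for _ in range(200):
    out = model.generate()
    model.update(out, human_defined_reward(out))
print(model.weights)  # drifts toward the human-rewarded behavior
```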

When the model finally runs, it can automate repetitive tasks, but it lacks the strategic intelligence needed to plan a campaign. It doesn’t pick targets, doesn’t weigh consequences, and doesn’t adapt its intent when the environment changes. All the creative and contextual elements of an operation remain outside its reach.

This gap explains why experts remain skeptical about calling these systems “weapons.” A technology becomes weapon-like when it delivers harm in a targeted, scalable, and repeatable way without requiring substantial additional expertise or judgment from the user. Reaching that point demands engineering maturity, clear offensive intent, and deep human involvement in planning and execution. Today’s AI agents do not meet these criteria.

AI currently acts as a force multiplier, an accelerant, not a fully autonomous offensive platform. Attackers still need to conduct analysis, understand complex targets, manage infrastructure, adapt strategy, coordinate operations, and handle sensitive decisions like escalation or data exfiltration. Nothing in today’s models substitutes for experience, creativity, or responsibility.

The path ahead is clear: AI will continue to expand the speed and volume of technical operations. More tasks that once required skilled labor will become automatable. But full automation, from planning to exploitation to decision-making, remains far beyond current capabilities.

For now, AI amplifies human attackers. It does not replace them, and it does not operate as a self-sufficient weapon.

Follow me on Twitter: @securityaffairs and Facebook and Mastodon

Pierluigi Paganini

(Security Affairs – hacking, autonomous weapons)


