Anthropic details cyber espionage campaign orchestrated by AI

Security leaders face a new class of autonomous threat as Anthropic details the first cyber espionage campaign orchestrated by AI.

In a report released this week, the company’s Threat Intelligence team outlined its disruption of a sophisticated operation by a Chinese state-sponsored group – an assessment made with high confidence – dubbed GTG-1002 and detected in mid-September 2025.

The operation targeted approximately 30 entities, including large tech companies, financial institutions, chemical manufacturing companies, and government agencies.

Rather than AI assisting human operators, the attackers successfully manipulated Anthropic’s Claude Code model to function as an autonomous agent to execute the vast majority of tactical operations independently.

This marks a worrying development for CISOs, moving cyber attacks from human-directed efforts to a model where AI agents perform 80-90 percent of the offensive work with humans acting only as high-level supervisors. Anthropic believes this is the first documented case of a large-scale cyberattack executed without substantial human intervention.

AI agents: A new operational model for cyberattacks

The group used an orchestration system that tasked instances of Claude Code to function as autonomous penetration testing agents. These AI agents were directed as part of the espionage campaign to perform reconnaissance, discover vulnerabilities, develop exploits, harvest credentials, move laterally across networks, and exfiltrate data. This enabled the AI to perform reconnaissance in a fraction of the time it would have taken a team of human hackers.

Human involvement was limited to 10-20 percent of the total effort, primarily focused on campaign initiation and providing authorisation at a few key escalation points. For example, human operators would approve the transition from reconnaissance to active exploitation or authorise the final scope of data exfiltration.

The attackers bypassed the AI model’s built-in safeguards, which are trained to avoid harmful behaviours. They did this by jailbreaking the model, tricking it by breaking down attacks into seemingly innocent tasks and by adopting a “role-play” persona. Operators told Claude that it was an employee of a legitimate cybersecurity firm and was being used in defensive testing. This allowed the operation to proceed long enough to gain access to a handful of validated targets.

The technical sophistication of the attack lay not in novel malware, but in orchestration. The report notes the framework relied “overwhelmingly on open-source penetration testing tools”. The attackers used Model Context Protocol (MCP) servers as an interface between the AI and these commodity tools, enabling the AI to execute commands, analyse results, and maintain operational state across multiple targets and sessions. The AI was even directed to research and write its own exploit code for the espionage campaign.

AI hallucinations become a good thing

While the campaign successfully breached high-value targets, Anthropic’s investigation uncovered a noteworthy limitation: the AI hallucinated during offensive operations.

The report states that Claude “frequently overstated findings and occasionally fabricated data”. This manifested as the AI claiming to have obtained credentials that did not work or identifying discoveries that “proved to be publicly available information.”

This tendency required the human operators to carefully validate all results, presenting challenges for the attackers’ operational effectiveness. According to Anthropic, this “remains an obstacle to fully autonomous cyberattacks”. For security leaders, this highlights a potential weakness in AI-driven attacks: they may generate a high volume of noise and false positives that can be identified with robust monitoring.

A defensive AI arms race against new cyber espionage threats

The primary implication for business and technology leaders is that the barriers to performing sophisticated cyberattacks have dropped considerably. Groups with fewer resources may now be able to execute campaigns that previously required entire teams of experienced hackers.

This attack demonstrates a capability beyond “vibe hacking,” where humans remained firmly in control of operations. The GTG-1002 campaign proves that AI can be used to autonomously discover and exploit vulnerabilities in live operations.

Anthropic, which banned the accounts and notified authorities over a ten-day investigation, argues that this development shows the urgent need for AI-powered defence. The company states that “the very abilities that allow Claude to be used in these attacks also make it essential for cyber defense”. The company’s own Threat Intelligence team “used Claude extensively to analyse “the enormous amounts of data generated” during this investigation.

Security teams should operate under the assumption that a major change has occurred in cybersecurity. The report urges defenders to “experiment with applying AI for defense in areas like SOC automation, threat detection, vulnerability assessment, and incident response.”

The contest between AI-driven attacks and AI-powered defence has begun, and proactive adaptation to counter new espionage threats is the only viable path forward.

See also: Wiz: Security lapses emerge amid the global AI race

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events including the Cyber Security Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

Source link

The post Anthropic details cyber espionage campaign orchestrated by AI appeared first on Tokention.