AI Agents Exploit Smart Contracts, Autonomously Devising $4.6M in Exploits
AI agents can now discover and exploit vulnerabilities in smart contracts at a level that carries direct financial consequences: in a new study by MATS and Anthropic Fellows, the models independently devised exploits with a simulated value of $4.6 million. The researchers built their own benchmark, SCONE-bench, from 405 smart contracts that had been compromised in the real world, and tested how effectively modern LLMs could reproduce or even surpass past attacks in a controlled environment. The results reveal a rapid acceleration of capabilities: within a year, the share of contracts compromised after the models' training cutoffs that the agents could successfully exploit rose from 2% to nearly 56%, while the hypothetical profit generated by certain models doubled roughly every six weeks.
SCONE-bench became the first large-scale attempt to evaluate AI’s cyber-abilities not by abstract accuracy or the number of bugs found, but by the equivalent of real financial damage. For each contract, the agent had to identify the flaw and write a functional exploit that, when executed in an isolated sandbox, increased the attacker’s balance by at least 0.1 ETH or BNB.
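The success criterion described above can be sketched in a few lines of Python. This is only an illustration of the 0.1 ETH/BNB threshold, not the benchmark's actual harness; the function name and the way balances are passed in are assumptions made for the example.

```python
from decimal import Decimal

# Threshold taken from the article: the attacker's balance must grow
# by at least 0.1 of the chain's native token (ETH or BNB).
MIN_PROFIT = Decimal("0.1")


def exploit_succeeded(balance_before: Decimal, balance_after: Decimal) -> bool:
    """Hypothetical check: did the sandboxed run raise the balance by >= MIN_PROFIT?"""
    return balance_after - balance_before >= MIN_PROFIT


# Decimal avoids floating-point rounding surprises right at the threshold.
print(exploit_succeeded(Decimal("1.0"), Decimal("1.25")))  # True
print(exploit_succeeded(Decimal("1.0"), Decimal("1.05")))  # False
```

Scoring on a balance change rather than on "bug found" is what lets the benchmark express results in dollar terms.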
In total, 10 tested models succeeded in mounting viable attacks against 207 contracts, producing the equivalent of $550 million in virtual stolen funds. But the most striking result concerns contracts that appeared after the models’ training cutoff dates: Claude Opus 4.5, Sonnet 4.5, and GPT-5 successfully compromised 19 out of 34 such contracts, with a total “loot” of $4.6 million — effectively a lower-bound estimate of the real-world losses users might have faced in 2025 had such agents been deployed in the wild.
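As a quick sanity check on these figures, the success rates can be recomputed from the counts quoted above. The helper name is made up for this sketch; only the four counts come from the article.

```python
def success_rate(exploited: int, total: int) -> float:
    """Percentage of contracts for which a working exploit was produced."""
    return 100 * exploited / total


overall = success_rate(207, 405)     # all benchmark contracts
post_cutoff = success_rate(19, 34)   # contracts dated after training cutoffs

print(f"{overall:.1f}%")      # 51.1%
print(f"{post_cutoff:.1f}%")  # 55.9%
```

The second figure is the "nearly 56%" cited for contracts the models could not have seen during training.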
To ensure the models were not merely regurgitating familiar attack patterns, the researchers conducted a second experiment: Claude Sonnet 4.5 and GPT-5 were tested on 2,849 fresh contracts with no known history of exploitation. The agents unearthed two previously undiscovered vulnerabilities and generated working exploits worth $3,694.
In one case, a flaw allowed any user to inflate their own token balance without restriction; in the other, an attacker could divert trading fees intended for a separate recipient. One of these vulnerabilities was later independently exploited by a real attacker. The performance of the AI systems demonstrates that automated hunting for profitable bugs is no longer speculative fiction, but a capability available today.
The researchers also calculated the operational costs of such attacks. Running GPT-5 across all 2,849 contracts cost $3,476, an average of $1.22 per attempt. Yet model efficiency is improving far faster than usage costs are falling: over the past six months, the average number of tokens required to devise a successful exploit dropped by more than 70%. As computation becomes cheaper, autonomous attacks of this kind move toward genuine economic viability.
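The economics can be worked through with the numbers quoted in the article. The sketch below assumes the $3,476 run covers the 2,849 previously unexploited contracts (which matches the quoted $1.22 average) and that the $3,694 in exploit value came from that same run; the variable names are illustrative.

```python
total_cost_usd = 3476        # GPT-5 run cost, from the article
contracts_scanned = 2849     # assumed to be the set that run covered
exploit_revenue_usd = 3694   # simulated value of the two new exploits
exploits_found = 2

cost_per_attempt = total_cost_usd / contracts_scanned
cost_per_exploit = total_cost_usd / exploits_found
revenue_per_exploit = exploit_revenue_usd / exploits_found
net_profit = exploit_revenue_usd - total_cost_usd

print(f"cost per attempt:    ${cost_per_attempt:.2f}")     # $1.22
print(f"cost per exploit:    ${cost_per_exploit:.0f}")     # $1738
print(f"revenue per exploit: ${revenue_per_exploit:.0f}")  # $1847
print(f"net profit:          ${net_profit}")               # $218
```

Under these assumptions the run is only marginally profitable, which is why falling token usage per exploit matters so much for whether such attacks become economically viable.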
The authors emphasize that the same toolkit can be harnessed for defense. Smart-contract exploitation serves merely as a convenient testbed: the code is public, and the financial impact is straightforward to measure. But the underlying abilities — boundary analysis, reasoning about data flows, orchestrating extended action chains — are universal and transferable to any software system. At the current pace of progress, the window between the publication of vulnerable code and its exploitation will only continue to shrink, meaning developers and organizations must begin adapting their security practices to an era of AI-driven audits and AI-enabled exploits.