Data Leak Flaw: Critical Bug Lets Attackers Trick Claude AI Into Exfiltrating User Data

by Nam Phong · November 4, 2025

A critical vulnerability has been discovered in the Claude chatbot, allowing attackers to trick the AI into transmitting users’ personal data to malicious third parties. The issue was reported by security researcher Johann Rehberger, known online as wunderwuzzi, who demonstrated how an attacker could deceive the model into exfiltrating confidential information to an external account. The incident revealed how new features such as sandbox access and network operations, if insufficiently protected, could become tools for data leakage rather than productivity.

According to Rehberger’s description, the exploit relies on indirect prompt injection: malicious instructions are embedded in a document that the model is later asked to summarize or paraphrase. While processing the content, the assistant follows the hidden directives, saves sensitive data to a file inside its sandbox, and then uses the File API to upload that file to the attacker’s account by substituting the attacker’s access key for the user’s. To slip past security filters, the injected instructions are disguised within harmless-looking code and trivial operations, so the model treats the malicious payload as legitimate.

Anthropic acknowledged that this risk is documented and advised users to monitor service behavior and cancel operations if suspicious activity occurs, a recommendation Rehberger deemed inadequate. The company initially closed his HackerOne report as out of scope, but later admitted a procedural error and confirmed that issues of this kind are covered under its vulnerability disclosure program.

Claude’s network-access capabilities vary by subscription tier: for Pro and Max users, they are enabled by default; for Team and Enterprise plans, they are initially disabled but can be activated by administrators. These expanded permissions allow communication with external APIs, significantly broadening the potential attack surface even under restricted network profiles.

Observations from hCaptcha indicate that such exploit chains are not limited to a single platform. Researchers testing several popular AI products found a consistent fragility in their defenses against prompt injections and jailbreaks. The conclusion is clear: as functionality expands, so too must the rigor of the security mechanisms that validate outbound requests and verify that any API key in use belongs to the current user rather than a third party. Without such checks, the very features designed to enhance AI capability could become severe threats to privacy and data security.
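
To make the idea of request validation and key verification concrete, here is a minimal sketch of an egress check that a sandbox proxy could apply before letting a request leave. It is illustrative only and does not describe how Anthropic or any other vendor actually implements its controls. The names `OutboundRequest` and `is_request_allowed`, the `x-api-key` header default, and the host `api.example.com` are all assumptions made for this example.

```python
import hmac
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class OutboundRequest:
    """Hypothetical view of a request leaving the sandbox."""
    url: str
    headers: dict[str, str]


def is_request_allowed(
    request: OutboundRequest,
    session_api_key: str,
    allowed_hosts: set[str],
    key_header: str = "x-api-key",  # assumed header name for this sketch
) -> bool:
    """Allow a request only if the host is allowlisted and any API key
    it carries matches the key bound to the current user session."""
    host = urlparse(request.url).hostname or ""
    if host not in allowed_hosts:
        return False

    # Header names are case-insensitive, so normalize before lookup.
    headers = {name.lower(): value for name, value in request.headers.items()}
    presented = headers.get(key_header.lower())

    # A key that differs from the session's key suggests data is being
    # routed to someone else's account, which is the pattern described above.
    if presented is not None and not hmac.compare_digest(
        presented.encode(), session_api_key.encode()
    ):
        return False
    return True
```

In this sketch, a request to an allowlisted host that carries a foreign key is rejected even though the destination itself is trusted, which is the gap a host allowlist alone would leave open.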
