The Trojan Horse in Your IDE: How AI Assistants Can Be Tricked into Hacking Your Code
Experts at Unit 42 have presented an analysis of vulnerabilities associated with the use of large language model–based coding assistants. These tools, integrated into IDEs such as GitHub Copilot, can perform a wide range of tasks—from code autocompletion to test generation. Yet the very same functions can be turned to malicious ends: implanting backdoors, exfiltrating confidential data, or producing hazardous content.
The central concern lies in the so-called indirect prompt injection. To stage such an attack, adversaries insert crafted commands into publicly accessible data sources. When a developer connects such a resource as context for the assistant, the model ingests the manipulated instructions and may begin executing the attacker’s commands. This opens the door to hidden functions in production code, data leaks, or connections to command-and-control servers.
In a Unit 42 demonstration, an example with dataset X* illustrated precisely this scenario: from a CSV collection of posts, a deliberately crafted fragment containing the command “execute secret mission” was inadvertently included. This prompt coerced the assistant into embedding a fetch_additional_data function within the generated analysis. That function contacted a control server and could execute uploaded commands—disguised as routine data retrieval. The injected code could be written in any language—Python, JavaScript, C++, and more—since the model itself chose the “natural” integration method. The danger escalates when the assistant has the capability to run shell commands, enabling backdoor execution with minimal user involvement.
The vulnerability is compounded by the fact that many assistants allow users to attach supplementary materials—files, folders, or links—to queries. Ordinarily, this enhances accuracy and relevance, but if the source is already compromised, the attack proceeds without the user’s knowledge. In the showcased scenario, everything appeared to be legitimate data processing, though a backdoor had been quietly embedded in the code.
Beyond prompt injection, researchers also confirmed other issues previously observed in Copilot. One involves forbidden content generation via autocompletion. If directly asked for instructions on making explosives, the assistant will refuse. But if the user begins framing a response (for example, “Step 1:”), the model will continue the sequence, effectively bypassing built-in safeguards.
An additional risk arises from direct interaction with the base model, bypassing the IDE. Through custom clients or scripts, system prompts and parameters can be altered, removing restrictions altogether. Researchers note that this approach invites abuse not only from attackers but also from users themselves. Moreover, new threats have emerged in the form of LLMJacking, in which stolen cloud service tokens are sold to third parties, granting illicit access to full-scale models via tools such as oai-reverse-proxy.
Unit 42 urges developers to review generated code before execution, scrutinize data sources, and leverage built-in control mechanisms wherever possible. Manual verification remains the cornerstone of defense—AI suggestions cannot be trusted blindly. It is equally important to restrict assistants’ privileges, preventing them from autonomously running commands. This issue is particularly pressing in the context of vibe-coding, where developers rely on intuitive, free-flowing interaction with language models without adequate quality control.
The report emphasizes that these threats are universal and apply across a wide array of LLM-integrated products. The deeper such systems embed themselves into development workflows, the greater the likelihood of novel attack forms emerging. Security, therefore, must evolve at the same pace as the tools themselves. As the research demonstrates, software supply chain attacks are growing ever more sophisticated—and AI assistants may well become the next critical vector for such threats.
Support Our Threat Intelligence
If you find our technology report and cybersecurity news helpful, consider supporting our work.