The Invisible Attack: Hidden Characters Can Make Gemini Models Implant Backdoors

by ddos · August 19, 2025

Researchers have demonstrated that the latest Gemini models consistently interpret hidden Unicode Tag characters as executable instructions—rendering invisible text within the interface into direct commands for the AI. This flaw endangers all Gemini-based integrations, including Google’s newly released coding agent Jules, which only recently emerged from beta. In this context, the case of “vscode episode 15” was also cited, showcasing related scenarios where imperceptible control cues manipulated model behavior without the user’s awareness.

The crux of the attack lies in characters from the Unicode “Tags” block: symbols invisible in the user interface yet parsed by the model as part of the request. This scenario is especially perilous for “agent–tool” pipelines with automated plan execution. In one GitHub Issues demonstration, a simple ticket contained the visible instruction: “Add the comment ‘Yolo is awesome’ in main.” However, concealed within the same text were invisible command strings, which could be revealed by an external decoder such as ASCII Smuggler. When Jules was directed to this ticket, the system generated a plan that was automatically approved by default. The agent then proceeded to implant a backdoor function, compile the binary, and execute it—all without human intervention. Tell-tale markers of substitution surfaced only in the plan’s draft: expanding the steps revealed explicit instructions to insert an additional function and run the code.

A separate demonstration on August 15, 2025 revealed how the newly introduced ability to assign tasks via the “jules” tag on GitHub greatly simplifies exploitation. When such a tag is added, the system transfers the issue text verbatim into Jules’ processing pipeline—along with any invisible characters. While these insertions remain unseen in Jules’ interface, the model nonetheless interprets them as actionable commands. The researchers note that interpretation of such “invisible” prompts is not guaranteed in 100% of cases, but careful prompt engineering combined with the expanded capabilities of models since Gemini 2.5 significantly increases the rate of successful triggers.

The recommended safeguards reduce to distrust and rigorous review:

Deny Jules access to private repositories, secrets, and infrastructure.
Avoid assigning the agent tickets or data from untrusted sources.

Meticulously review the automatically generated plan before executing steps.
Inspect diffs and validate the intent of code changes before merging.

Since the root cause lies in the interpretation of hidden tags, understanding the nature of these symbols and the risks of imperceptible insertions in tasks and documentation becomes crucial. For a foundational grasp, one should consult resources on the Unicode “Tags” block and systematic studies of prompt injection attacks. The Unicode Consortium documents the Tags block, while LLM security guidelines classify and analyze the risks of injection techniques. GitHub’s own documentation on issues and workflows is also invaluable for auditing data transfer chains between issue trackers and agents.

Google was first notified of the vulnerability on February 22, 2024, and specifically regarding Jules on May 26, 2025. To date, the authors have observed no fixes at either the model or API level, leaving all Gemini integrations exposed. These invisible commands heighten the likelihood of silent compromise: the prompt remains unseen on the screen, yet directly dictates the agent’s actions—from inserting backdoors to executing external tools.

The Invisible Attack: Hidden Characters Can Make Gemini Models Implant Backdoors

Search

Brilliantly

Content & Links