Invisible Prompts: A New Attack Uses Malicious Images to Hijack Gemini AI

by Nam Phong · August 27, 2025

A new study by researchers at Trail of Bits has revealed a previously unknown vulnerability in the Google Gemini ecosystem and its associated services that enables covert exfiltration of user data through images carrying hidden multimodal prompts. The exploit hinges on a scaling quirk: when the system automatically downscales an image before passing it to the model, instructions that are imperceptible in the original emerge in the lower-resolution version. Because the model reads the downscaled image rather than the original, attackers can trigger actions on the victim’s behalf, including the theft of personal information.

The attack was demonstrated via the Gemini CLI integrated with Zapier MCP. In the default configuration (settings.json), trust=true is set, which automatically approves all MCP calls without user confirmation. During testing, an uploaded image activated a concealed command that exfiltrated data from Google Calendar and forwarded it to the attacker’s email. Similar scenarios were successfully replicated in Vertex AI Studio, Gemini Web, the Gemini API via the llm CLI, Google Assistant on Android, and Genspark, which underscores the systemic nature of the flaw.
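To make the auto-approval risk concrete, here is a minimal sketch of a check for MCP servers that skip confirmation. It assumes settings.json keeps servers under an "mcpServers" key with a per-server boolean "trust" field; that layout, the server names, and the URL are illustrative assumptions and should be checked against the Gemini CLI documentation.

```python
import json


def find_trusted_mcp_servers(settings):
    """Return the names of MCP servers whose tool calls skip confirmation.

    Assumes (not confirmed by the article) that servers live under
    "mcpServers" and carry a boolean "trust" field.
    """
    servers = settings.get("mcpServers", {})
    return sorted(
        name
        for name, config in servers.items()
        if isinstance(config, dict) and config.get("trust") is True
    )


# Hypothetical settings content, for illustration only.
example_settings = json.loads("""
{
  "mcpServers": {
    "zapier": {"url": "https://example.invalid/mcp", "trust": true},
    "local-notes": {"command": "notes-server", "trust": false}
  }
}
""")

flagged = find_trusted_mcp_servers(example_settings)
print(flagged)  # ['zapier']
```

Any server this flags will run tool calls, such as sending email, with no prompt to the user.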

The original image (left) and its scaled version (right), where hidden commands emerge through bicubic interpolation.

The attack exploits effects tied to the Nyquist–Shannon sampling theorem: when the sampling frequency is too low for the detail in the image, reconstruction becomes ambiguous, and the attacker turns the resulting aliasing into a controlled effect. Researchers manipulate the values of so-called high-importance pixels, those with the greatest influence on the brightness of the downscaled output. The payload is concealed in dark regions, with color shifts guided by least-squares optimization. After downscaling, a new contrast emerges: the hidden content is imperceptible in the full-resolution original but easily recognized by the model in the reduced version.
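The aliasing effect is easiest to see in its simplest form. The sketch below is not the bicubic attack from the study; it assumes a toy downscaler that keeps every fourth pixel, and all function names are hypothetical. Only the pixels that this sampler reads are altered, so the payload occupies a small fraction of the full-size image but reappears intact after downscaling.

```python
def embed_for_stride(cover, payload, factor, offset):
    """Copy `cover` and write `payload` into the pixels a stride sampler reads."""
    crafted = [row[:] for row in cover]
    for y, payload_row in enumerate(payload):
        for x, value in enumerate(payload_row):
            crafted[y * factor + offset][x * factor + offset] = value
    return crafted


def downscale_by_stride(image, factor, offset):
    """Toy nearest-neighbour downscale: keep one pixel per factor x factor block."""
    return [row[offset::factor] for row in image[offset::factor]]


DARK, LIT = 10, 60  # grayscale values; the cover image is uniformly dark
cover = [[DARK] * 32 for _ in range(32)]
# An 8x8 checkerboard stands in for hidden text.
payload = [[LIT if (x + y) % 2 == 0 else DARK for x in range(8)] for y in range(8)]

crafted = embed_for_stride(cover, payload, factor=4, offset=2)
changed = sum(a != b for ra, rb in zip(cover, crafted) for a, b in zip(ra, rb))
recovered = downscale_by_stride(crafted, factor=4, offset=2)

print(changed, "of", 32 * 32, "pixels changed")  # 32 of 1024 pixels changed
print(recovered == payload)                      # True
```

Real resamplers such as bilinear or bicubic blend several source pixels per output pixel, so a practical attack has to account for their weights rather than a single sampled position.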

Central to the study is Anamorpher, an open-source tool for generating and analyzing malicious images. It supports three resampling methods (nearest neighbor, bilinear, bicubic) and can tailor attacks to specific libraries, including Pillow, OpenCV, TensorFlow, and PyTorch. Demonstrations showed how Anamorpher leverages bicubic interpolation in OpenCV to embed payloads while preserving the “clean” appearance of the original image. The utility offers a web interface, Python API, and modular backend, enabling researchers to adapt the technique to various resampling implementations.
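Why the attack has to be tailored to a resampler can be sketched with a little linear algebra. The code below is not Anamorpher's implementation. It assumes a one-dimensional, four-tap kernel and solves for the smallest (least-squares) change to four source pixels that makes their weighted sum hit a target value. The two kernels are illustrative: equal weights, and Catmull-Rom bicubic weights at the midpoint between samples. Clipping to the valid pixel range is ignored.

```python
def weighted_sum(weights, pixels):
    """Value a linear resampler would output for these source pixels."""
    return sum(w * p for w, p in zip(weights, pixels))


def min_norm_perturbation(pixels, weights, target):
    """Smallest L2 change to `pixels` so that the weighted sum equals `target`."""
    gap = target - weighted_sum(weights, pixels)
    energy = sum(w * w for w in weights)
    return [gap * w / energy for w in weights]


# Illustrative 1-D kernels (assumptions, not taken from any specific library).
equal_weights = [0.25, 0.25, 0.25, 0.25]
bicubic_midpoint = [-0.0625, 0.5625, 0.5625, -0.0625]

dark_pixels = [10.0, 10.0, 10.0, 10.0]
target = 60.0

delta = min_norm_perturbation(dark_pixels, bicubic_midpoint, target)
crafted = [p + d for p, d in zip(dark_pixels, delta)]

# Largest changes land on the highest-weight ("high-importance") pixels.
print([round(d, 2) for d in delta])
# Hits the target under the kernel it was built for ...
print(round(weighted_sum(bicubic_midpoint, crafted), 6))
# ... but misses it under a different kernel.
print(round(weighted_sum(equal_weights, crafted), 2))
```

The same crafted pixels produce a different value under a different kernel, which is why knowing the target library's resampling behavior matters.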

For defense, the researchers recommend disabling automatic image scaling or strictly limiting permissible dimensions at the service level. If resizing is unavoidable, users should always be shown a preview of the exact version the model will analyze, including in CLI and API workflows. Developers are further urged to prohibit external tool and API calls without explicit user approval and to implement systemic filters against multimodal prompt injection, such as scanning for hidden text within images.
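Two of these recommendations can be sketched in a few lines. This is a minimal illustration, not a complete defense: the function names, the exception type, and the 1024-pixel limit are all hypothetical. The first function rejects oversized uploads instead of silently resizing them; the second runs a tool action only after an explicit confirmation callback agrees.

```python
class ImageRejected(Exception):
    """Raised when an upload would need downscaling before reaching the model."""


def admit_image(width, height, max_side=1024):
    """Accept an image only if it already fits; never resize silently.

    `max_side` is an illustrative limit, not a value from the study.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if max(width, height) > max_side:
        raise ImageRejected(
            f"{width}x{height} exceeds {max_side}px; resize and review it first"
        )
    return (width, height)


def gate_tool_call(tool_name, action, confirm):
    """Run `action` only if `confirm(tool_name)` returns True; otherwise skip it."""
    if not confirm(tool_name):
        return None
    return action()


# Usage: a caller that denies everything never triggers the action.
result = gate_tool_call("send_email", lambda: "sent", confirm=lambda name: False)
print(result)  # None
```

In an interactive tool, `confirm` would prompt the user and show the exact image the model received.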

The authors caution that the attack is especially dangerous for mobile and edge devices, where fixed image sizes and simplified scaling algorithms render the technique even more effective. Future research will explore its implications for voice assistants and other multimodal systems, alongside expanding Anamorpher’s functionality to study new exploitation pathways and defensive strategies.
