The ChatGPhish Phenomenon: Indirect Prompt Injection via AI Summarization
Mechanics of the Summary Vector
A standard webpage can become an effective lure if an AI assistant summarizes its content. New research reveals how an adversary can conceal instructions directly within a website. Consequently, they can trick ChatGPT into displaying a fraudulent warning, hyperlink, or QR code within its response.
Principal researcher Andi Ahmeti from Permiso designated this technique as ChatGPhish. According to his data, the problem manifests when a user requests a summary of a webpage from ChatGPT. Although the tests utilized Firefox, the author does not consider the browser the source of the vulnerability. Instead, the risk stems from how the AI service processes third-party content within a trusted interface.
Executing the Prompt Injection
The scenario relies on embedding a textual instruction within an ordinary webpage. For example, this can include a GitHub README, an article, or a marketing site. Therefore, the visible portion of the page appears entirely legitimate. Meanwhile, a concealed fragment dictates the required response format to the model.
Upon summarizing, ChatGPT generates a standard abstract. Simultaneously, however, it appends an adversary-imposed block mimicking an account notification.
In the demonstration, this malicious block reported the addition of a new device. It then prompted the user to follow a malicious link. For the user, the danger lies in the location of the hyperlink. Because it surfaces directly within the AI response, the victim perceives it as authentic assistant output.
Markdown Weaponization and Passive Telemetry
A distinct variation of this attack leverages Markdown images. If a QR code from an adversary-controlled server enters the response, the interface automatically renders the graphic. Upon scanning, the victim transitions to an external website via their mobile device. Thus, they completely bypass the standard security mechanisms of a desktop browser. These bypassed defenses include hover-based link previews and automated domain verification.
Furthermore, another scenario involves passive surveillance. During image retrieval, the attacker’s server collects the target’s IP address and User-Agent metadata. It also captures the Referer header and the precise timestamp of the interaction. Clearly, this telemetry suffices to confirm that a specific target requested a summary.
Disclosure and Broader Implications
Ahmeti reported that he submitted his findings to OpenAI via Bugcrowd in late April 2026. Initially, engineers failed to reproduce the issue on the first submission. Subsequently, they deemed a secondary report inapplicable, before associating it with a known vulnerability.
On May 29, the author published his research to highlight a more expansive threat vector. Ultimately, phishing, QR redirections, and passive tracking can emerge directly from AI-summarized web content.
Support Our Threat Intelligence
If you find our technology report and cybersecurity news helpful, consider supporting our work.