ChatGPT Leaks Windows Keys, Including Wells Fargo License, Via Clever “Game” Prompt
ChatGPT has once again proven susceptible to unconventional manipulation—this time, the model divulged valid Windows product keys, including one registered to the major financial institution Wells Fargo. The vulnerability was exposed through a peculiar intellectual provocation: a researcher proposed a game-like interaction that cleverly bypassed the system’s protective constraints.
The crux of the vulnerability lay in a deceptively simple yet effective circumvention of the model’s guardrails. ChatGPT 4.0 was invited to play a guessing game in which it had to “think of” a string—specifically, a genuine Windows 10 serial number. The rules stipulated that the model could only respond “yes” or “no” to guesses, and upon hearing the phrase “I give up,” it was to reveal the imagined string. The model, accepting the terms, ultimately disclosed a string that matched a legitimate Windows license key when prompted with the trigger phrase.
The researcher noted that the key flaw resided in how the model interprets the context of interaction. By framing the exchange as a game, the user was able to override embedded safety mechanisms, as the model processed the scenario as permissible under its operational logic.
Among the keys revealed were not only publicly available default ones but also enterprise-level licenses, including at least one associated with Wells Fargo. This breach was likely made possible by the inadvertent inclusion of confidential information in the model’s training dataset. Previous incidents have shown that internal data—such as API keys—can leak into the public domain, for instance via GitHub, and subsequently be ingested during model training.
A second technique employed to evade filters involved the use of HTML tags. The original serial number was “wrapped” inside invisible tags, thereby bypassing keyword-based detection systems. When combined with the game-based context, this method acted as a fully functioning exploit, granting access to information that would otherwise remain restricted.
The incident underscores a fundamental issue in modern language models: despite extensive efforts to build robust safeguards, context and presentation continue to serve as effective tools for filter evasion. To prevent such occurrences in the future, experts recommend enhancing contextual awareness and implementing multi-layered request validation.
The researcher emphasized that this vulnerability could be exploited not only to extract license keys, but also to bypass filters protecting against undesirable content—from adult material to malicious URLs and personally identifiable information. This reveals a pressing need for defenses that are not only stricter, but significantly more adaptable and anticipatory.