AI is Flooding Bug Bounty Programs with Fake Vulnerability Reports: A New “Trust Trap” Emerges
In recent years, the internet has become inundated with content of questionable value—much of it entirely fabricated—generated by large language models. This deluge extends far beyond low-quality text, images, and videos; it now includes sophisticated imitations of analytical reports that have infiltrated the media, social networks, and even official documentation. Alarmingly, the realm of cybersecurity has not been spared from this new form of digital pollution.
Of particular concern is a rising wave of counterfeit vulnerability reports, crafted to appear as legitimate submissions within bug bounty programs. In reality, these documents are the product of language models hallucinating non-existent flaws and presenting them in pseudo-professional formats.
Vlad Ionescu, co-founder of RunSybil—a company developing AI tools for vulnerability discovery—describes this phenomenon as a “trust trap”: many of these reports are technically articulate and convincingly written, yet upon review, the vulnerabilities described prove to be nothing more than figments of an AI’s imagination.
The issue is compounded by the fact that generative models are tuned to comply and to sound helpful: when a user asks for a vulnerability report, the system obliges, regardless of whether the vulnerability actually exists. These fictitious reports are flooding bug bounty platforms, consuming valuable time and resources from the engineers and security experts who must manually validate each one.
There are already real-world examples. Security researcher Harry Sintonen recounted how the Curl project received a bogus report, which he immediately identified as “AI-generated junk.” Similar complaints have emerged from Open Collective, where incoming submissions have been swamped by waves of AI-fabricated reports. One developer from the CycloneDX project even shut down their bug bounty program entirely due to the overwhelming volume of false submissions.
Platforms like HackerOne and Bugcrowd have also noted an increase in false positives and fabricated findings. Michiel Prins of HackerOne reported that many submissions now describe vulnerabilities that are either inconsequential or entirely fictional, leading to their immediate classification as spam. Casey Ellis of Bugcrowd confirmed that while nearly all modern reports exhibit signs of AI involvement, the proportion of nonsensical content has not yet surged dramatically—but, he warns, the situation could deteriorate swiftly.
Some organizations remain hesitant to implement automated filters. Mozilla, for instance, avoids using AI for initial bug triage, fearing that genuine reports might be mistakenly discarded. According to company representative Damiano DeMonte, they have not observed a sharp rise in AI-generated spam, with rejection rates holding steady at about 5–6 reports per month—less than 10% of total submissions.
In response to this growing challenge, new countermeasures are being introduced. HackerOne has unveiled a tool called Hai Triage—a hybrid moderation system that blends machine efficiency with human oversight. AI assistants perform the initial sorting: removing duplicates and flagging reports of potential importance. Final judgments, however, remain in human hands, ensuring a careful balance between speed and accuracy.
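Conceptually, such a hybrid pipeline is straightforward: an automated first pass removes duplicates and prioritizes promising submissions, while every surviving report still lands in a human review queue. The sketch below illustrates the idea in Python; the Report class, the fingerprint-based deduplication, and the keyword heuristic are all illustrative assumptions, not details of HackerOne's actual tool.

```python
from dataclasses import dataclass, field
from hashlib import sha256


@dataclass
class Report:
    report_id: str
    title: str
    body: str
    labels: list[str] = field(default_factory=list)


def fingerprint(report: Report) -> str:
    """Cheap duplicate detection: hash a normalized version of the report text."""
    normalized = " ".join(report.body.lower().split())
    return sha256(normalized.encode()).hexdigest()


def triage(reports: list[Report]) -> list[Report]:
    """First pass: drop exact duplicates and flag reports worth a closer look.

    The keyword check is a stand-in for whatever model or classifier scores
    the submission; the final accept/reject decision stays with a human queue.
    """
    seen: set[str] = set()
    queue: list[Report] = []
    for report in reports:
        fp = fingerprint(report)
        if fp in seen:
            continue  # duplicate of an earlier submission, silently dropped
        seen.add(fp)
        # Placeholder heuristic: a real system would use a trained model here.
        if "proof of concept" in report.body.lower():
            report.labels.append("needs-human-review:priority")
        else:
            report.labels.append("needs-human-review")
        queue.append(report)
    return queue


if __name__ == "__main__":
    submissions = [
        Report("r1", "XSS in search", "Steps and proof of concept attached."),
        Report("r2", "XSS in search", "Steps and proof of concept attached."),
        Report("r3", "Heap overflow", "The model believes this function is unsafe."),
    ]
    for item in triage(submissions):
        print(item.report_id, item.labels)
```

Even in this toy form, the design choice is visible: automation only filters and labels, it never closes a report on its own, which is exactly the balance HackerOne says it is aiming for.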
As generative models continue to be wielded by both attackers and defenders, the future of bug bounty programs may well hinge on one pivotal contest: whether those generating convincing noise can outpace those building the filters to catch it.