GPUHammer: New NVIDIA Vulnerability Threatens AI Models with Data Corruption
NVIDIA has issued a warning about a newly discovered vulnerability affecting its graphics processing units, dubbed GPUHammer. The attack is rooted in the well-known RowHammer technique: rapidly and repeatedly activating one row of DRAM creates electrical disturbance that can flip bits in neighboring rows, allowing a malicious actor to corrupt data belonging to other users without ever writing to it directly.
For the first time, researchers have demonstrated that a RowHammer-style attack is feasible against GPU memory rather than only against CPU-attached DRAM. As a proof of concept, the team used an NVIDIA RTX A6000 graphics card with GDDR6 memory and successfully induced single-bit flips in video memory. Such tampering can compromise data integrity without requiring direct access to the targeted content.
Of particular concern is the finding that even a single bit flip can devastate the performance of artificial intelligence systems: a model trained on ImageNet that previously achieved 80% accuracy plummeted to under 1% following the attack. This revelation transforms GPUHammer from a mere technical anomaly into a formidable weapon capable of undermining AI infrastructure—through manipulation of internal model parameters or poisoning of training datasets.
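Why is a single flip so destructive? Model weights are stored as floating-point values, and a flip that lands in the exponent field changes a weight by many orders of magnitude, an error that then propagates through every subsequent layer. The following is a minimal Python sketch of that effect, using an illustrative float32 weight rather than anything taken from the actual experiments:

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit in the IEEE-754 float32 encoding of `value`."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    return struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))[0]

weight = 0.0423                    # illustrative weight value, not from the paper
corrupted = flip_bit(weight, 30)   # bit 30 = most significant exponent bit
print(weight, "->", corrupted)     # 0.0423 -> roughly 1.4e+37
```

A weight that suddenly jumps to the order of 10^37 dominates every dot product it participates in, which is why one flipped bit can drag a model's accuracy to near zero.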
Unlike CPUs, GPUs often lack built-in safeguards such as instruction-level access controls or parity checks, rendering them especially vulnerable to low-level attacks. This weakness is magnified in shared computing environments—such as cloud platforms and virtual desktops—where a malicious tenant can potentially influence adjacent processes without having explicit access to them, thus introducing significant multi-tenant security risks.
Prior research, including the SpecHammer technique, has combined RowHammer and Spectre vulnerabilities to exploit speculative execution. GPUHammer continues this trajectory, demonstrating that attacks remain viable even in the presence of mitigation mechanisms such as Target Row Refresh (TRR), once thought to be effective countermeasures.
The implications of such vulnerabilities are particularly grave in industries that demand high levels of security and transparency—such as healthcare, finance, and autonomous systems. The introduction of unpredictable data corruption within AI operations may violate regulatory frameworks like ISO/IEC 27001 or EU AI legislation, especially when critical decisions are made based on compromised models.
To mitigate these threats, NVIDIA recommends enabling ECC (Error-Correcting Code) memory with the command nvidia-smi -e 1; the current ECC status can be verified with nvidia-smi -q | grep ECC. In some cases, ECC may be enabled selectively, for example only on training nodes or on workloads deemed mission-critical. Additionally, administrators should monitor system logs for corrected memory errors to detect potential intrusions at an early stage.
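For continuous monitoring, the same information that nvidia-smi prints can be polled programmatically. Below is a rough sketch of such a watchdog in Python; the query field names come from nvidia-smi --help-query-gpu and may vary between driver versions, and the script assumes a single GPU:

```python
import subprocess

# ECC mode plus the volatile (since last reset) corrected-error counter.
FIELDS = "ecc.mode.current,ecc.errors.corrected.volatile.total"

def ecc_status():
    """Return (ecc_mode, corrected_error_count) for the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={FIELDS}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()[0]
    mode, corrected = (field.strip() for field in out.split(","))
    return mode, corrected

if __name__ == "__main__":
    mode, corrected = ecc_status()
    print(f"ECC mode: {mode}, corrected errors since reset: {corrected}")
    # A steadily rising corrected-error count on a GDDR6 card is worth
    # investigating: it may point to deliberate disturbance rather than
    # random faults.
```

Wiring a check like this into existing fleet monitoring gives early warning that ECC is silently absorbing bit flips, which is exactly the signal NVIDIA suggests watching for.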
It is important to note that enabling ECC on the A6000 GPU reduces machine learning performance by approximately 10% and decreases usable memory capacity by 6.25%. However, newer GPU models such as the H100 and RTX 5090 are not susceptible to GPUHammer, as they feature on-die hardware error correction mechanisms.
Compounding the concern is a related development called CrowHammer, recently unveiled by researchers at NTT Social Informatics Laboratories and CentraleSupélec. This attack enabled recovery of the private key of Falcon, a post-quantum signature algorithm selected for NIST standardization. The researchers showed that a single, well-placed bit flip can leak enough information to reconstruct the key when an attacker can observe several hundred million signatures, and that with a handful of additional flips, far fewer signatures are needed.
Together, these findings underscore the urgent need to rethink AI security from the ground up. It is no longer sufficient to secure only the data pipeline; protection must extend to hardware-level vulnerabilities, including those embedded in the architecture of video memory itself.