Google Unveils Private AI Compute: Cloud Power for Gemini with On-Device Privacy
Google has unveiled a new data-protection framework, Private AI Compute, designed to process artificial-intelligence requests in the cloud without exposing any personal information. According to the company, the technology creates a sealed, cryptographically fortified environment that allows users to harness the power of cloud-based Gemini models while preserving the confidentiality of all data involved.
Private AI Compute is presented as an isolated, “hardened” computational platform that merges the performance of the cloud with the level of security traditionally associated with strictly on-device processing. Its architecture is built on Trillium TPUs and Titanium Intelligence Enclaves, which pair high-end computational throughput with robust cryptographic guarantees.
To achieve this, Google relies on a hardware infrastructure composed of trusted nodes powered by AMD processors and equipped with a hardware-based Trusted Execution Environment (TEE). This mechanism encrypts system memory and isolates it from the host environment, preventing access to running tasks even by administrators. Only verified and attested workloads are permitted to run inside the trusted enclave, while attempts to extract data physically are blocked at the architectural level.
A defining element of Private AI Compute is its support for mutual attestation and end-to-end encryption between trusted nodes. This ensures that a user’s data is decrypted and processed exclusively within the protected perimeter, fully isolated from Google’s broader infrastructure. Each component authenticates every other using cryptographic validation, and keys are released only after the node proves its integrity against internal reference values. If any parameter fails to match, the connection is terminated, preventing the flow of information into untrusted segments.
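The "release keys only after integrity is proven" step can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the measurement names, the reference values, and the functions `verify_evidence` and `release_session_key` are hypothetical and do not reflect Google's actual attestation format or API.

```python
import hashlib
import hmac
import secrets


def digest(blob: bytes) -> str:
    """Return a hex SHA-256 digest, standing in for a hardware measurement."""
    return hashlib.sha256(blob).hexdigest()


# Hypothetical reference values; a real system would obtain these from
# signed, internally published measurements rather than hard-coded strings.
REFERENCE_VALUES = {
    "firmware": digest(b"firmware-v1"),
    "kernel": digest(b"kernel-v1"),
    "workload": digest(b"inference-server-v1"),
}


class AttestationError(Exception):
    """Raised when a node's evidence does not match the reference values."""


def verify_evidence(evidence: dict[str, str]) -> None:
    """Check every expected measurement; any mismatch aborts the exchange."""
    for name, expected in REFERENCE_VALUES.items():
        reported = evidence.get(name, "")
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(reported, expected):
            raise AttestationError(f"measurement mismatch: {name}")


def release_session_key(evidence: dict[str, str]) -> bytes:
    """Hand out a fresh key only after the evidence has been verified."""
    verify_evidence(evidence)
    return secrets.token_bytes(32)
```

In this sketch a node reporting a tampered kernel measurement raises `AttestationError` and never receives a key, which mirrors the "connection is terminated" behavior described above.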
The interaction model follows a multilayered sequence. The client initiates a secure Noise-protocol connection with a frontend server and performs mutual attestation. The server’s authenticity is then validated within an encrypted Oak session, preventing any possibility of channel tampering.
Afterward, the server establishes an encrypted ALTS channel to communicate with services inside the scalable inference pipeline, while the models process requests on protected TPU platforms. The entire system is intentionally ephemeral: all data, queries, and computations are destroyed immediately after the session ends. Even an attacker who later obtained elevated privileges could not recover information from past sessions.
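The ephemeral-session idea can be sketched as a scope whose working data is wiped when the session closes. This is only an illustration under stated assumptions: `ephemeral_session` and `handle_request` are hypothetical names, the "inference" is a placeholder, and zeroing a Python `bytearray` does not guarantee that no other copies exist in memory, unlike the hardware-backed isolation the article describes.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_session():
    """Yield a mutable scratch buffer that is wiped when the session ends."""
    buffer = bytearray()
    try:
        yield buffer
    finally:
        # Overwrite the contents in place, then drop them, so nothing
        # from this session remains reachable through the buffer.
        for i in range(len(buffer)):
            buffer[i] = 0
        buffer.clear()


def handle_request(prompt: bytes) -> int:
    """Process one request; only the derived result leaves the session."""
    with ephemeral_session() as scratch:
        scratch.extend(prompt)
        # Placeholder for model inference: here we just measure the input.
        result = len(scratch)
    return result
```

The design choice shown is that cleanup lives in a `finally` clause, so the wipe runs even if request processing raises an exception.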
To prevent unauthorized access at any stage, Google embedded a wide array of safeguards into the environment:
• minimization of the number of components that must be trusted, to preserve confidentiality;
• use of Confidential Federated Compute for collecting anonymized aggregate statistics;
• full encryption of all client–server communications;
• binary authorization, ensuring that only signed and validated configurations may run;
• isolation of user data within virtual machines;
• protection of memory and I/O channels via IOMMU to prevent physical exfiltration;
• complete removal of shell access on TPU hosts;
• routing of all inbound traffic through third-party IP relays to obscure the request’s origin;
• and a fully separated authentication and authorization system relying on anonymous tokens, detached from AI-request processing.
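Of the safeguards listed above, binary authorization lends itself to a short sketch: a launcher refuses anything that does not carry a valid signature. The code below is a simplified stand-in, not Google's mechanism. It uses an HMAC with a shared demo key where a production system would use asymmetric signatures, and `SIGNING_KEY`, `sign`, `is_authorized`, and `launch` are all hypothetical names.

```python
import hashlib
import hmac

# Hypothetical demo key. A real deployment would verify asymmetric
# signatures against a trusted public key instead of sharing a secret.
SIGNING_KEY = b"demo-signing-key"


def sign(artifact: bytes) -> str:
    """Produce a hex tag over the artifact (stand-in for a real signature)."""
    return hmac.new(SIGNING_KEY, artifact, hashlib.sha256).hexdigest()


def is_authorized(artifact: bytes, signature: str) -> bool:
    """Return True only if the signature matches this exact artifact."""
    return hmac.compare_digest(sign(artifact), signature)


def launch(artifact: bytes, signature: str) -> str:
    """Refuse to start any binary or configuration that fails verification."""
    if not is_authorized(artifact, signature):
        raise PermissionError("artifact is not signed by the trusted key")
    return "started"
```

Because the tag covers the whole artifact, appending even a single flag to a previously approved configuration invalidates its signature and `launch` raises `PermissionError`.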
Security was independently assessed by NCC Group, which audited the platform from April through September 2025. Auditors uncovered several issues, including a timing side channel in the IP-relay module that, under certain conditions, could be used to deanonymize users. Google classified the risk as low, citing the inherent “noise” produced by a multi-tenant infrastructure, which makes correlating any individual request exceedingly difficult.
Additionally, three flaws were found in the attestation mechanism that could lead to denial-of-service conditions or protocol errors. Google stated that it is working to remediate all identified weaknesses. Despite the reliance on proprietary hardware and the system’s centralization around Borg Prime, the auditors noted a high degree of protection against unauthorized data access, including from potential insider threats.
At its core, Private AI Compute follows the same philosophy as similar initiatives from other major vendors. Apple’s Private Cloud Compute and Meta’s Private Processing also enable cloud-based AI queries without compromising user privacy. Google emphasizes that its design fuses the computational power of cloud infrastructure with the guarantees of hardware-level encryption and transparent attestation. User devices connect to the secure enclave through remote cryptographic verification and encrypted channels, while Gemini models process data inside a sealed environment inaccessible to Google or any external actor.
In essence, this new architecture lays the foundation for a next-generation model of secure cloud computing — one in which the power of vast datacenters coexists with the principles of on-device privacy, and control over all data remains firmly in the hands of its owner.