NVIDIA Debuts Inference Platforms with L4 Tensor Core GPU and H100 NVL GPU

NVIDIA has announced four inference platforms and introduced two new GPUs for accelerated inference, the NVIDIA L4 Tensor Core GPU and the NVIDIA H100 NVL GPU. The company is working with partners such as Google Cloud, D-ID, and Cohere to speed up the development of generative AI services.

The platforms are built on NVIDIA's Ada and Hopper GPUs or on the Grace Hopper Superchip, and they include the newly introduced NVIDIA L4 Tensor Core GPU and NVIDIA H100 NVL GPU, so that each platform can be matched to its workload. They target AI video generation, image generation, large language model deployment, and recommendation system inference.

According to NVIDIA, the L4 Tensor Core GPU delivers 120 times the AI video performance of CPUs while using 99% less energy. It handles a broad range of workloads, with enhanced video decoding and transcoding, video streaming, augmented reality, and generative AI video.

The NVIDIA H100 NVL GPU is aimed at deploying large language models (LLMs) such as ChatGPT. It has 94GB of memory and a Transformer Engine for acceleration, and NVIDIA says it delivers up to 12 times the inference performance of the previous-generation A100 GPU when running GPT-3 at data center scale.

Google Cloud is the first cloud provider to offer instances powered by the NVIDIA L4 Tensor Core GPU, and it is integrating the GPU into its machine learning platform, Vertex AI. Early adopters include Descript, which uses generative AI to help creators produce video and podcast content, and WOMBO, which offers an AI-powered text-to-digital-art app called Dream.

Other adopters of NVIDIA’s inference platforms include Kuaishou, the generative AI technology platform D-ID, AI production studio Seyhan Lee, and language AI company Cohere.

The Grace Hopper Superchip and the NVIDIA H100 NVL GPU are expected to be available in the second half of the year. The NVIDIA L4 Tensor Core GPU is available in private preview on Google Cloud and is also offered in systems from Advantech, ASUS, Cisco, Dell, Fujitsu, GIGABYTE, HPE, Lenovo, QCT, and Supermicro.