Unparalleled AI and graphics performance for the data center.
The Most Powerful Universal GPU
Experience breakthrough multi-workload performance with the NVIDIA L40S GPU. Combining powerful AI compute with best-in-class graphics and media acceleration, the L40S GPU is built to power the next generation of data center workloads—from generative AI and large language model (LLM) inference and training to 3D graphics, rendering, and video.
Highlights
Universal Performance
Tensor Performance
1466
TFLOPS¹
RT Core Performance
212
TFLOPS
Single-Precision Performance
91.6
TFLOPS
Powered by the NVIDIA Ada Lovelace Architecture
Fourth-Generation Tensor Cores
Hardware support for structural sparsity and optimized TF32 format provides out of-the-box performance gains for faster AI and data science model training. Accelerate AI-enhanced graphics capabilities with DLSS to upscale resolution with better performance in select applications.
Third-Generation RT Cores
Enhanced throughput and concurrent ray-tracing and shading capabilities improve ray-tracing performance, accelerating renders for product design and architecture, engineering, and construction workflows. See lifelike designs in action with hardware-accelerated motion blur and stunning real-time animations.
CUDA Cores
Accelerated single-precision floating point (FP32) throughput and improved power efficiency significantly boost performance for workflows like 3D model development and computer-aided engineering (CAE) simulation. Use enhanced 16-bit math capabilities (BF16) for mixed-precision workloads.
Transformer Engine
Transformer Engine dramatically accelerates AI performance and improves memory utilization for both training and inference. Harnessing the power of the Ada Lovelace fourth-generation Tensor Cores, Transformer Engine intelligently scans the layers of transformer architecture neural networks and automatically recasts between FP8 and FP16 precisions to deliver faster AI performance and accelerate training and inference.
Efficiency and Security
L40S GPU is optimized for 24/7 enterprise data center operations and designed, built, tested, and supported by NVIDIA to ensure maximum performance, durability, and uptime. The L40S GPU meets the latest data center standards, are Network Equipment-Building System (NEBS) Level 3 ready, and features secure boot with root of trust technology, providing an additional layer of security for data centers.
DLSS 3
L40S GPU enables ultra-fast rendering and smoother frame rates with NVIDIA DLSS 3. This breakthrough frame-generation technology leverages deep learning and the latest hardware innovations within the Ada Lovelace architecture and the L40S GPU, including fourth-generation Tensor Cores and an Optical Flow Accelerator, to boost rendering performance, deliver higher frames per second (FPS), and significantly improve latency.
Specifications
NVIDIA L40S GPU
PU Architecture | NVIDIA Ada Lovelace architecture |
GPU Memory | 48GB GDDR6 with ECC |
Memory Bandwidth | 864GB/s |
Interconnect Interface | PCIe Gen4 x16: 64GB/s bidirectional |
NVIDIA Ada Lovelace Architecture-Based CUDA® Cores | 18,176 |
NVIDIA Third-Generation RT Cores | 142 |
NVIDIA Fourth-Generation Tensor Cores | 568 |
RT Core Performance TFLOPS | 212 |
FP32 TFLOPS | 91.6 |
TF32 Tensor Core TFLOPS | 183 I 366* |
BFLOAT16 Tensor Core TFLOPS | 362.05 I 733* |
FP16 Tensor Core | 362.05 I 733* |
FP8 Tensor Core | 733 I 1,466* |
Peak INT8 Tensor TOPS Peak INT4 Tensor TOPS |
733 I 1,466* 733 I 1,466* |
Form Factor | 4.4″ (H) x 10.5″ (L), dual slot |
Display Ports | 4x DisplayPort 1.4a |
Max Power Consumption | 350W |
Power Connector | 16-pin |
Thermal | Passive |
Virtual GPU (vGPU) Software Support | Yes |
vGPU Profiles Supported | See virtual GPU licensing guide |
NVENC I NVDEC | 3x l 3x (includes AV1 encode and decode) |
Secure Boot With Root of Trust | Yes |
NEBS Ready | Level 3 |
Multi-Instance GPU (MIG) Support | No |
NVIDIA® NVLink® Support | No |