The NVIDIA L40S is a high-performance, data-centre-class GPU built on the Ada Lovelace architecture. It is designed for universal workloads, from generative AI (inference and training) and large language models (LLMs) to 3D graphics, real-time rendering, and video applications. With 48 GB of GDDR6 memory and a broad precision range (FP32 down to FP8 and INT8/INT4), it offers an excellent balance of AI compute, graphics acceleration and media/streaming capabilities.
It is ideal for enterprises and service providers looking for a single-GPU solution to support multimodal AI workloads, visualization, virtual workstations, cloud graphics and inference deployments.
Here is a comprehensive summary of the main technical specifications:
| Specification | Value |
| --- | --- |
| GPU Architecture | NVIDIA Ada Lovelace |
| Memory | 48 GB GDDR6 with ECC |
| Memory Bandwidth | 864 GB/s |
| CUDA® Cores | 18,176 |
| RT Cores (3rd Gen) | 142 |
| Tensor Cores (4th Gen) | 568 |
| Peak FP32 Performance | ~91.6 TFLOPS |
| Tensor Performance (TF32 / FP16 / FP8) | Up to ~366 TFLOPS (TF32) / ~733 TFLOPS (FP16) / ~1,466 TFLOPS (FP8), with sparsity |
| RT Core Performance | ~209–212 TFLOPS |
| Interface | PCI Express Gen4 x16 |
| Form Factor | Dual-slot, full-height, full-length (FHFL); ≈4.4″ H × 10.5″ L |
| Max Power Consumption | 350 W |
| Cooling Solution | Passive (relies on server chassis airflow) |
| Display Outputs | 4 × DisplayPort 1.4a (typically disabled by default in server mode) |
| Virtual GPU (vGPU) Support | Yes, for virtual workstations and virtualised environments |
| Multi-Instance GPU (MIG) | Not supported |
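As a sanity check, the headline memory-bandwidth and FP32 figures in the table follow from first principles. The sketch below recomputes them; note that the 384-bit bus width, 18 Gbps GDDR6 data rate, and ~2,520 MHz boost clock are assumptions commonly quoted for this card, not values stated in the table above.

```python
# Back-of-envelope check of the L40S headline numbers.
# Assumed (not from the table): 384-bit bus, 18 Gbps GDDR6, 2,520 MHz boost.
BUS_WIDTH_BITS = 384        # assumed memory bus width
GDDR6_GBPS_PER_PIN = 18     # assumed effective data rate per pin
CUDA_CORES = 18_176         # from the table
BOOST_CLOCK_GHZ = 2.52      # assumed boost clock

# Bandwidth = (bus width in bytes) x (per-pin data rate)
bandwidth_gbs = (BUS_WIDTH_BITS / 8) * GDDR6_GBPS_PER_PIN
print(f"Memory bandwidth: {bandwidth_gbs:.0f} GB/s")   # 864 GB/s

# FP32 throughput = cores x 2 ops/cycle (fused multiply-add) x clock
fp32_tflops = CUDA_CORES * 2 * BOOST_CLOCK_GHZ / 1_000
print(f"Peak FP32: {fp32_tflops:.1f} TFLOPS")          # ~91.6 TFLOPS
```

Both results match the table, which is a useful cross-check when comparing spec sheets from different vendors.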