The NVIDIA L40S is a high-performance, data-centre-class GPU built on the Ada Lovelace architecture. It is designed for universal workloads: generative AI (inference and training), large language models (LLMs), 3D graphics, real-time rendering, and video applications. With 48 GB of GDDR6 memory and a broad precision range (FP32, TF32, FP16, FP8, and INT8/INT4), it offers an excellent balance of AI compute, graphics acceleration, and media/streaming capability.

It is ideal for enterprises and service providers looking for a single-GPU solution for multimodal AI workloads, visualization, virtual workstations, cloud graphics, and inference deployments.
Here is a comprehensive summary of the main technical specifications:
| Specification | Value |
| --- | --- |
| GPU Architecture | NVIDIA Ada Lovelace | 
| Memory | 48 GB GDDR6 with ECC | 
| Memory Bandwidth | 864 GB/s | 
| CUDA® Cores | 18,176 | 
| RT Cores (3rd Gen) | 142 | 
| Tensor Cores (4th Gen) | 568 | 
| Peak FP32 Performance | ~91.6 TFLOPS | 
| Tensor Performance (TF32, FP16, FP8) | Up to ~366 TFLOPS (TF32) / ~733 TFLOPS (FP16) / ~1,466 TFLOPS (FP8) with sparsity support | 
| RT Core Performance | ~209-212 TFLOPS | 
| Interface | PCI Express Gen4 x16 | 
| Form Factor | Dual-slot FHFL (full-height, full-length; ≈4.4″ H × 10.5″ L) | 
| Max Power Consumption | 350 W | 
| Cooling Solution | Passive (relies on server chassis airflow) | 
| Display Outputs | 4 × DisplayPort 1.4a (typically disabled by default in server mode) | 
| Virtual GPU (vGPU) Support | Yes – for virtual workstations and virtualised environments | 
| Multi-Instance GPU (MIG) | Not supported |
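As a rough sanity check, the headline memory-bandwidth and FP32 figures in the table can be reproduced from the card's clock and bus parameters. Note that the 384-bit bus width, 18 Gbps per-pin data rate, and ~2.52 GHz boost clock used below are assumptions drawn from public listings, not values stated in the table above:

```python
# Back-of-envelope check of the L40S headline figures from the spec table.
# Bus width, per-pin data rate, and boost clock are ASSUMED values from
# public listings; only the CUDA core count comes from the table above.

MEM_BUS_BITS = 384          # GDDR6 bus width (assumed)
MEM_DATA_RATE = 18e9        # bits/s per pin (assumed 18 Gbps)
CUDA_CORES = 18_176         # from the table
BOOST_CLOCK_HZ = 2.52e9     # boost clock (assumed ~2.52 GHz)

# Bandwidth = bus width in bytes x data rate per pin
bandwidth_gbs = MEM_BUS_BITS / 8 * MEM_DATA_RATE / 1e9

# Peak FP32 = cores x 2 ops/clock (fused multiply-add) x clock
fp32_tflops = CUDA_CORES * 2 * BOOST_CLOCK_HZ / 1e12

print(f"Memory bandwidth: {bandwidth_gbs:.0f} GB/s")   # ~864 GB/s
print(f"Peak FP32:        {fp32_tflops:.1f} TFLOPS")   # ~91.6 TFLOPS
```

Both results line up with the table (864 GB/s and ~91.6 TFLOPS), which is a useful cross-check when comparing vendor spec sheets.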