NVIDIA L40S

Brand: NVIDIA
Availability: In Stock
  • Overview

    The NVIDIA L40S is a high-performance, data-centre-class GPU built on the Ada Lovelace architecture. It is designed for universal workloads, ranging from generative AI (inference and training) and large language models (LLMs) to 3D graphics, real-time rendering and video applications. With 48 GB of GDDR6 memory and support for precisions from FP32 down to FP8, INT8 and INT4, it balances AI compute, graphics acceleration and media/streaming capabilities.
    It is ideal for enterprises and service providers looking for a single-GPU solution to support multimodal AI workloads, visualization, virtual workstations, cloud graphics and inference deployments.
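
    As a rough sizing aid, the sketch below estimates whether a model's weights fit in the 48 GB of memory at different precisions. It is a back-of-the-envelope calculation only: the bytes-per-parameter table, the 80% headroom factor and the function names are illustrative assumptions, and the estimate ignores KV cache, activations and framework overhead.

```python
# Rough weight-memory estimate for a single 48 GB GPU.
# Assumptions (not from the product page): decimal gigabytes, weights only,
# and a hypothetical 80% headroom factor to leave room for everything else.
GPU_MEMORY_GB = 48

# Storage cost per parameter for each precision, in bytes.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "INT8": 1.0, "INT4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Memory needed for the weights alone, in decimal GB."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


def fits_on_gpu(params_billions: float, precision: str, headroom: float = 0.8) -> bool:
    """True if the weights fit within the chosen fraction of GPU memory."""
    return weight_memory_gb(params_billions, precision) <= GPU_MEMORY_GB * headroom


if __name__ == "__main__":
    for size in (7, 13, 70):
        for precision in ("FP16", "FP8", "INT4"):
            need = weight_memory_gb(size, precision)
            verdict = "fits" if fits_on_gpu(size, precision) else "does not fit"
            print(f"{size}B @ {precision}: {need:.1f} GB of weights, {verdict}")
```

    Under these assumptions, a 70-billion-parameter model only fits on one card when its weights are quantised to about 4 bits, which is why the low-precision formats matter for single-GPU inference.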

     

    Key Features

    • Ada Lovelace GPU Architecture: built for modern AI and graphics workloads.
    • 4th-generation Tensor Cores: support for structural sparsity and low-precision (FP8, INT8, INT4) compute to accelerate AI training and inference.
    • 3rd-generation RT (Ray-Tracing) Cores: high ray-tracing throughput for real-time rendering, digital twins, XR and architecture/engineering workflows.
    • 48 GB of GDDR6 memory with ECC (error-correcting code) to handle large models, rich textures and high-resolution scenes.
    • High memory bandwidth (864 GB/s) to feed data-intensive pipelines.
    • Passive cooling, dual-slot form factor suitable for data-centre server integration.
    • Enterprise-ready features: Secure Boot with root of trust, NEBS Level 3 ready, virtualization (vGPU) support.
    • Multi-purpose support: not only AI but also professional visualization, virtual workstations, cloud gaming/graphics, rendering workflows.
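
    To illustrate what structural sparsity means in practice, the sketch below applies a 2:4 pattern, in which two of every four consecutive weights are zeroed. The 2:4 pattern is the one NVIDIA documents for its sparse Tensor Cores; this page does not name it, so treat that as an assumption. The function name is hypothetical, and real workflows would use NVIDIA's own tooling rather than plain Python.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four weights.

    Illustrative only: shows the 2:4 structured-sparsity pattern on a flat list.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")

    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(value if i in keep else 0.0 for i, value in enumerate(group))
    return pruned


if __name__ == "__main__":
    dense = [0.1, -0.9, 0.4, 0.05, 1.0, 2.0, -3.0, 0.5]
    print(prune_2_of_4(dense))
```

    Because exactly half of the values in each group are zero, the hardware can skip them, which is where the "with sparsity" throughput figures come from.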

     

    Specifications

    Main technical specifications:

    Specification: Value
    GPU Architecture: NVIDIA Ada Lovelace
    Memory: 48 GB GDDR6 with ECC
    Memory Bandwidth: 864 GB/s
    CUDA® Cores: 18,176
    RT Cores (3rd Gen): 142
    Tensor Cores (4th Gen): 568
    Peak FP32 Performance: ~91.6 TFLOPS
    Tensor Performance (TF32, FP16, FP8): up to ~366 TFLOPS (TF32) / ~733 TFLOPS (FP16) / ~1,466 TFLOPS (FP8), all with sparsity
    RT Core Performance: ~209-212 TFLOPS
    Interface: PCI Express Gen4 x16
    Form Factor: full-height, full-length (FHFL), dual-slot (≈4.4″ H × 10.5″ L)
    Max Power Consumption: 350 W
    Cooling Solution: passive (relies on server chassis airflow)
    Display Outputs: 4 × DisplayPort 1.4a (typically disabled by default in server mode)
    Virtual GPU (vGPU) Support: yes, for virtual workstations and virtualised environments
    Multi-Instance GPU (MIG): not supported (as of specification)
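
    Several of the figures above can be cross-checked with simple arithmetic. The sketch below is a sanity check, not an official derivation: the per-SM core counts, the 2.52 GHz boost clock, the 384-bit memory bus and the 18 Gbps data rate are assumptions taken from public listings rather than from this page, and should be verified against NVIDIA's datasheet.

```python
# Sanity-check the listed L40S figures from a few underlying parameters.
# All constants below are assumptions NOT stated on this page.
SM_COUNT = 142                # assumes one RT core per SM (142 RT cores listed)
CUDA_CORES_PER_SM = 128       # assumed for Ada Lovelace
TENSOR_CORES_PER_SM = 4       # assumed for Ada Lovelace
BOOST_CLOCK_GHZ = 2.52        # assumed boost clock
MEMORY_BUS_BITS = 384         # assumed memory interface width
MEMORY_DATA_RATE_GBPS = 18    # assumed GDDR6 per-pin data rate

cuda_cores = SM_COUNT * CUDA_CORES_PER_SM
tensor_cores = SM_COUNT * TENSOR_CORES_PER_SM

# One fused multiply-add per core per clock counts as 2 floating-point operations.
peak_fp32_tflops = cuda_cores * 2 * BOOST_CLOCK_GHZ / 1000

# Bits per second across the bus, divided by 8 to get bytes.
memory_bandwidth_gbs = MEMORY_BUS_BITS * MEMORY_DATA_RATE_GBPS / 8

# The listed Tensor figures are "with sparsity"; dense throughput is half of each.
sparse_tflops = {"TF32": 366, "FP16": 733, "FP8": 1466}
dense_tflops = {name: value / 2 for name, value in sparse_tflops.items()}

if __name__ == "__main__":
    print(f"CUDA cores:       {cuda_cores}")
    print(f"Tensor cores:     {tensor_cores}")
    print(f"Peak FP32:        {peak_fp32_tflops:.1f} TFLOPS")
    print(f"Memory bandwidth: {memory_bandwidth_gbs:.0f} GB/s")
    print(f"Dense Tensor:     {dense_tflops}")
```

    Under these assumptions the derived values match the listed 18,176 CUDA cores, 568 Tensor Cores, ~91.6 TFLOPS and 864 GB/s.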

Omega One Company



Copyright © 2025 Omega One Company All Rights Reserved.