Category Archives: AI and ML

CVE‑2025‑23360 – NVIDIA NeMo Framework contains a vulnerability (12th Mar 2025)

Preface: The sequence ~/ by itself is not a relative path traversal; it simply refers to the home directory of the current user. However, when combined with ../ segments, it can become part of a relative path traversal.

Relative path traversal involves using sequences like ../ to navigate up the directory hierarchy. For example, ~/ refers to the home directory and ../ moves up one directory level, so a path such as ~/../ resolves to the parent directory of the home directory. When an application joins attacker-supplied input containing such segments onto a base path, the resulting path can escape the intended directory, which is the essence of relative path traversal.
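A minimal defensive sketch in Python (the paths and function name are illustrative, not taken from the NVIDIA advisory): resolve the joined path and verify it is still contained in the intended base directory before any file write.

```python
# Defensive sketch (illustrative; not NVIDIA's code): reject any
# user-supplied relative path that escapes the intended base directory.
import os

def safe_join(base: str, user_path: str) -> str:
    base = os.path.realpath(base)
    target = os.path.realpath(os.path.join(base, user_path))
    # realpath() collapses ../ segments; the containment check then
    # catches any path that climbed out of the base directory.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"path traversal attempt: {user_path!r}")
    return target

safe_join("/srv/models", "llama/config.yaml")   # OK
safe_join("/srv/models", "../../etc/passwd")    # raises ValueError
```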

Background: NVIDIA NeMo is an end-to-end platform designed for developing and deploying generative AI models. This includes large language models (LLMs), vision language models (VLMs), video models, and speech AI. NeMo offers tools for data curation, fine-tuning, retrieval-augmented generation (RAG), and inference, making it a comprehensive solution for creating enterprise-ready AI models. Here are some key capabilities of NeMo LLMs:

  1. Customization: NeMo allows you to fine-tune pre-trained models to suit specific enterprise needs. This includes adding domain-specific knowledge and skills, and continuously improving the model with reinforcement learning from human feedback (RLHF).
  2. Scalability: NeMo supports large-scale training and deployment across various environments, including cloud, data centers, and edge devices. This ensures high performance and flexibility for different use cases.
  3. Foundation Models: NeMo offers a range of pre-trained foundation models, such as GPT-8, GPT-43, and GPT-530, which can be used for tasks like text classification, summarization, creative writing, and chatbots.
  4. Data Curation: The platform includes tools for processing and curating large datasets, which helps improve the accuracy and relevance of the models.
  5. Integration: NeMo can be integrated with other NVIDIA AI tools and services, providing a comprehensive ecosystem for AI development.

Vulnerability details: NVIDIA NeMo Framework contains a vulnerability where a user could cause a relative path traversal issue by means of an arbitrary file write. A successful exploit of this vulnerability may lead to code execution and data tampering.

Official announcement: Please see the official link for details

https://nvidia.custhelp.com/app/answers/detail/a_id/5623

CVE-2024-0141: NVIDIA Hopper HGX for 8-GPU contains a vulnerability in GPU vBIOS (10th Mar 2025)

Last official update on February 28, 2025 at 3:28 PM

Preface: Hopper PPCIe is limited to HGX 8-way systems, where the eight GPUs and four NVSwitches are passed through to one VM. Other topologies are not supported.

Background: The GPU vBIOS can communicate through IOCTL (Input/Output Control) calls. IOCTL is a system call for device-specific input/output operations and other operations which cannot be expressed by regular system calls. In the context of GPU drivers, IOCTLs are used to interact with the GPU hardware, including tasks like memory management, command submission, and mode setting.

Neither CUDA Interprocess Communication (IPC) nor developer tools such as NVIDIA Nsight for profiling are supported in PPCIe mode.

When an IOCTL contains privileged functionality and is exposed unnecessarily, attackers may be able to access this functionality by invoking the IOCTL.
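For illustration, here is a minimal, harmless use of the ioctl mechanism from Python, querying the terminal size via TIOCGWINSZ. GPU drivers expose their own device-specific request codes on their /dev nodes in exactly the same way; this sketch is generic and deliberately not tied to the vBIOS interface.

```python
# Generic illustration of the ioctl mechanism (not the GPU interface):
# query the terminal window size through the tty driver.
import fcntl
import struct
import sys
import termios

# Buffer for the driver to fill in: four unsigned shorts.
buf = struct.pack("HHHH", 0, 0, 0, 0)

# TIOCGWINSZ is the device-specific request code understood by the tty
# driver; GPU drivers define their own codes for memory management,
# command submission, mode setting, and so on.
res = fcntl.ioctl(sys.stdout.fileno(), termios.TIOCGWINSZ, buf)
rows, cols, _, _ = struct.unpack("HHHH", res)
print(f"terminal: {rows} rows x {cols} columns")
```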

Vulnerability details: NVIDIA Hopper HGX for 8-GPU contains a vulnerability in the GPU vBIOS that may allow a malicious actor with tenant level GPU access to write to an unsupported registry causing a bad state. A successful exploit of this vulnerability may lead to denial of service.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5561

CVE-2024-0114: NVIDIA Hopper HGX for 8-GPU contains a vulnerability in the HGX Management Controller (HMC) (7th March 2025)

Preface: NVIDIA collaborates with Supermicro for their server solutions, including the use of Supermicro’s BMC (Baseboard Management Controller) in certain systems. Supermicro provides a range of server solutions optimized for NVIDIA’s platforms.

Background: The NVIDIA Hopper HGX for 8 GPUs has several standout features:

High Performance: It hosts eight H100 Tensor Core GPUs, which are designed for AI and high-performance computing (HPC) workloads.

Advanced Connectivity: Each H100 GPU connects to four third-generation NVSwitches, enabling a fully connected topology. This setup allows any H100 GPU to communicate with any other H100 GPU concurrently at a bidirectional speed of 900 GB/s.

Enhanced Bandwidth: The NVLink ports provide more than 14 times the bandwidth of the current PCIe Gen4 x16 bus (roughly 64 GB/s bidirectional, versus 900 GB/s over NVLink).

Vulnerability details: NVIDIA Hopper HGX for 8-GPU contains a vulnerability in the HGX Management Controller (HMC) that may allow a malicious actor with administrative access on the BMC to access the HMC as an administrator. A successful exploit of this vulnerability may lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5561

CVE-2024-36353: CROSS-PROCESS GPU MEMORY DISCLOSURE (27-02-2025)

Preface: Regarding its use in HPC clusters, the Radeon PRO V710 is indeed suitable. It is supported by AMD’s ROCm platform, which is optimized for HPC and AI workloads. Additionally, it is used in Azure’s NVads V710 v5-series virtual machines, which are designed for GPU-accelerated applications, including HPC.

Background: The AMD Radeon™ PRO V710 provides 28 GB of GDDR6 as its global memory. This memory is connected via a 224-bit memory interface and operates at an effective speed of 18 Gbps. It is used for storing data that the GPU processes, such as textures, frame buffers, and other computational data.

The NVIDIA Container Toolkit is specifically designed to work with NVIDIA GPUs and their CUDA framework. It is not compatible with AMD GPUs. For AMD GPUs, you should use the ROCm (Radeon Open Compute) platform, which provides similar functionality for containerized environments.

OpenCL (Open Computing Language) in AMD ROCm (Radeon Open Compute) serves as a framework for writing programs that execute across heterogeneous platforms, including CPUs, GPUs, and other processors. Specifically, in the context of AMD ROCm, OpenCL allows developers to harness the computational power of AMD GPUs for high-performance, data-parallel computing tasks.

Vulnerability details: Insufficient clearing of GPU global memory could allow a malicious process running on the same GPU to read leftover memory values, potentially leading to loss of confidentiality.
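The real fix belongs in AMD's driver stack, but as defense in depth an application can clear its own buffers before releasing them. A minimal sketch using pyopencl (an assumption on my part; any OpenCL binding works the same way):

```python
# Defense-in-depth sketch: zero a GPU buffer before releasing it, so
# stale contents cannot be read by a later allocation elsewhere.
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()          # pick an available OpenCL device
queue = cl.CommandQueue(ctx)

n_bytes = 1 << 20                       # 1 MiB scratch buffer
buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=n_bytes)

# ... GPU work using `buf` would happen here ...

# Overwrite the buffer with zeros before it is released.
zeros = np.zeros(n_bytes, dtype=np.uint8)
cl.enqueue_copy(queue, buf, zeros)
queue.finish()                          # make sure the clear completes
buf.release()
```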

Official announcement: Please refer to the link for details https://www.amd.com/en/resources/product-security/bulletin/amd-sb-6019.html

CVE-2024-0148: NVIDIA Jetson Linux and IGX OS image contains a vulnerability in the UEFI firmware RCM boot mode (25-02-2025)

Preface: NVIDIA IGX Orin software is used by a variety of organizations, particularly those in industrial and medical environments. This platform is designed to support AI applications at the edge, providing high performance, advanced functional safety, and security.

Some specific use cases include:

  • Industrial Automation: Companies use IGX Orin to enhance manufacturing processes with AI-driven automation and predictive maintenance.
  • Healthcare: Medical institutions leverage IGX Orin for AI-powered diagnostics, medical imaging, and patient monitoring.
  • Robotics: Robotics companies utilize IGX Orin for developing intelligent robots that can operate safely alongside humans.

The platform’s versatility and robust support make it suitable for any organization looking to deploy AI solutions in demanding environments.

Background: The NVIDIA IGX Orin Developer Kit runs the Holopack 2.0 Developer Preview software. Holopack is a comprehensive solution for end-to-end GPU accelerated AI application development and testing. Holopack supports two GPU modes:

iGPU – Holopack deploys drivers and libraries to support the NVIDIA Ampere architecture GPU integrated on the NVIDIA IGX Orin module.

dGPU – Holopack deploys drivers and libraries to support an optional NVIDIA RTX A6000 discrete GPU connected to the PCIe slot.

Its high-performance, low-power computing for deep learning and computer vision makes Jetson the ideal platform for compute-intensive projects. The Jetson platform includes a variety of Jetson modules with the NVIDIA JetPack™ SDK.

Vulnerability details: NVIDIA Jetson Linux and IGX OS image contains a vulnerability in the UEFI firmware RCM boot mode, where an unprivileged attacker with physical access to the device could load untrusted code. A successful exploit might lead to code execution, escalation of privileges, data tampering, denial of service, and information disclosure. The scope of the impacts can extend to other components.

Remark: The UEFI supply chain allows many of these shared libraries to be integrated in various ways, including compiled from source, licensed for modification and reuse, and finally shipped as dynamically or statically linked executables.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5617

When the Sumerian advanced civilization met Cray HPC (24-02-2025)

Preface: Is linear algebra used in real life? One area with a great deal of ongoing research is sparse matrix analysis, which comes up constantly in real-world applications of linear algebra. As for buzzwords, popular topics like machine learning, neural networks, and computer graphics all use huge amounts of linear algebra.

Since a box’s length, width, and height are independent of one another, space has three dimensions. In the technical language of linear algebra, space is three-dimensional because any point in it can be described by a linear combination of three independent vectors, as the short sketch below makes concrete.
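A tiny worked instance of that claim in Python with NumPy (the basis vectors are arbitrary choices for illustration):

```python
# Express a point as a linear combination of three independent basis
# vectors: the linear-algebra sense in which space is 3-dimensional.
import numpy as np

basis = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]]).T    # columns = independent vectors
point = np.array([2.0, 3.0, 4.0])

coeffs = np.linalg.solve(basis, point)   # unique because basis is full rank
print(coeffs)                            # point == basis @ coeffs
```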

In Einstein’s special relativity theory we live in 4 dimensional spacetime. Though the way we normally “imagine” the world, we tend to believe that we live in a 3 dimensional Newtonian space with a separate absolute time dimension.

Introduction: AI calculations often rely on various mathematical techniques, including linear algebra, Fourier transforms, and sparse matrix operations.

Some of the key math libraries in ROCm include:

  • rocBLAS: A library for basic linear algebra subprograms.
  • rocFFT: A library for fast Fourier transforms.
  • rocRAND: A library for random number generation.
  • rocSOLVER: A library for solving linear algebra problems.
  • rocSPARSE: A library for sparse matrix operations.

These libraries are optimized for AMD hardware and provide similar functionality to NVIDIA’s cuBLAS, cuFFT, cuRAND, etc., making it easier for developers to port their applications between different hardware platforms.
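As a rough porting illustration, here is a sketch assuming a CuPy build with its experimental ROCm support, where dense products dispatch to rocBLAS and FFTs to rocFFT; on an NVIDIA build the very same code hits cuBLAS and cuFFT instead.

```python
# Sketch assuming a CuPy wheel built for ROCm (experimental support),
# which routes dense linear algebra to rocBLAS and FFTs to rocFFT.
import cupy as cp

a = cp.random.rand(1024, 1024, dtype=cp.float32)
b = cp.random.rand(1024, 1024, dtype=cp.float32)

c = a @ b                  # GEMM, dispatched to rocBLAS on a ROCm build
spectrum = cp.fft.fft(a)   # FFT, dispatched to rocFFT on a ROCm build

cp.cuda.Stream.null.synchronize()   # wait for the GPU work to finish
print(float(c.sum()), spectrum.shape)
```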

What does ROCm stand for? ROCm initially stood for Radeon Open Compute platform; however, due to Open Compute being a registered trademark, ROCm is no longer an acronym — it is simply AMD’s open-source stack designed for GPU compute.

Official reference: If you are interested in ROCm, please refer to the following link – https://rocm.docs.amd.com/en/docs-5.7.1/reference/gpu_libraries/math.html

CVE‑2024‑53870, CVE‑2024‑53871, CVE‑2024‑53872, CVE‑2024‑53873, CVE‑2024‑53874, CVE‑2024‑53875, CVE‑2024‑53876, CVE‑2024‑53877, CVE‑2024‑53878 and CVE‑2024‑53879 (21-02-2025)

Released on February 18, 2025

Preface: In NVIDIA CUDA, cuobjdump and nvdisasm are two binary utilities used for examining and disassembling CUDA binaries (cubin files).

cuobjdump

  • Purpose: It can disassemble CUDA binaries and extract PTX (Parallel Thread Execution) code from host binaries, executables, object files, static libraries, and external fatbinary files.
  • Usage: cuobjdump is versatile as it accepts both cubin files and host binaries.
  • Features: It provides basic disassembly and extraction capabilities but lacks advanced display options and control flow analysis.

nvdisasm

  • Purpose: It is specifically designed to disassemble cubin files.
  • Usage: Unlike cuobjdump, nvdisasm only accepts cubin files.
  • Features: It offers richer output options, including advanced display options and control flow analysis.

These tools are essential for developers who need to inspect and debug the compiled CUDA code.
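A minimal sketch of how the two utilities are typically driven, here wrapped in Python; the file name kernel.cubin is a placeholder, and -sass / -cfg are standard options of the respective tools:

```python
# Inspect a cubin with the CUDA binary utilities (kernel.cubin is a
# placeholder; both tools must be on PATH from a CUDA toolkit install).
import subprocess

CUBIN = "kernel.cubin"

# cuobjdump accepts host binaries as well as cubin files; -sass prints
# the disassembled SASS for the embedded GPU code.
subprocess.run(["cuobjdump", "-sass", CUBIN], check=True)

# nvdisasm accepts only cubin files; -cfg emits a control-flow graph in
# DOT format, one of its richer output options.
subprocess.run(["nvdisasm", "-cfg", CUBIN], check=True)
```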

Background: Parallel processing is a method in computing of running two or more processors (CPUs) to handle separate parts of an overall task. Breaking up different parts of a task among multiple processors helps reduce the time needed to run a program. GPUs render images more quickly than a CPU because of their parallel processing architecture, which allows them to perform multiple calculations across streams of data simultaneously. The CPU is the brain of the operation, responsible for giving instructions to the rest of the system, including the GPU(s).

NVIDIA CUDA provides a simple C/C++ based interface. The CUDA compiler exploits the parallelism built into the CUDA programming model as it compiles your program into executable code.
CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA for developing software that runs on its parallel processors. It serves as an alternative to running simulations on traditional CPUs.
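A minimal sketch of the programming model, using Numba's CUDA JIT (a Python binding, rather than the native C/C++ interface mentioned above): every thread computes one element of the output, which is exactly the data-parallel style the compiler exploits.

```python
# Vector-add kernel via Numba's CUDA JIT (requires numba and a CUDA GPU).
import numpy as np
from numba import cuda

@cuda.jit
def add(a, b, out):
    i = cuda.grid(1)            # global thread index
    if i < out.size:            # guard against out-of-range threads
        out[i] = a[i] + b[i]

n = 1 << 20
a = np.ones(n, dtype=np.float32)
b = np.ones(n, dtype=np.float32)
out = np.zeros(n, dtype=np.float32)

threads = 256
blocks = (n + threads - 1) // threads   # enough blocks to cover n
add[blocks, threads](a, b, out)         # Numba copies the arrays for us
print(out[:4])                          # [2. 2. 2. 2.]
```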

Vulnerability details:

The following two design flaws are associated with these CVEs:

CVE‑2024‑53870, CVE‑2024‑53871, CVE‑2024‑53872, CVE‑2024‑53873, CVE‑2024‑53874, CVE‑2024‑53875, CVE‑2024‑53876, CVE‑2024‑53877, CVE‑2024‑53878 and CVE‑2024‑53879

NVIDIA CUDA toolkit for Linux and Windows contains a vulnerability in the cuobjdump binary, where a user could cause a crash by passing a malformed ELF file to cuobjdump. A successful exploit of this vulnerability might lead to a partial denial of service.

NVIDIA CUDA toolkit for all platforms contains a vulnerability in the nvdisasm binary, where a user could cause an out-of-bounds read by passing a malformed ELF file to nvdisasm. A successful exploit of this vulnerability might lead to a partial denial of service.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5594

CVE-2024-0112 – about NVIDIA Jetson AGX Orin™ and NVIDIA IGX Orin software (17th Feb 2025)

The original security advisory was issued on February 11, 2025.

Preface: NVIDIA IGX Orin software is used by a variety of organizations, particularly those in industrial and medical environments. This platform is designed to support AI applications at the edge, providing high performance, advanced functional safety, and security.

Some specific use cases include:

  • Industrial Automation: Companies use IGX Orin to enhance manufacturing processes with AI-driven automation and predictive maintenance.
  • Healthcare: Medical institutions leverage IGX Orin for AI-powered diagnostics, medical imaging, and patient monitoring.
  • Robotics: Robotics companies utilize IGX Orin for developing intelligent robots that can operate safely alongside humans.

The platform’s versatility and robust support make it suitable for any organization looking to deploy AI solutions in demanding environments.

Background: The NVIDIA IGX Orin Developer Kit runs the Holopack 2.0 Developer Preview software. Holopack is a comprehensive solution for end-to-end GPU accelerated AI application development and testing. Holopack supports two GPU modes:

iGPU – Holopack deploys drivers and libraries to support the NVIDIA Ampere architecture GPU integrated on the NVIDIA IGX Orin module.

dGPU – Holopack deploys drivers and libraries to support an optional NVIDIA RTX A6000 discrete GPU connected to the PCIe slot.

Its high-performance, low-power computing for deep learning and computer vision makes Jetson the ideal platform for compute-intensive projects. The Jetson platform includes a variety of Jetson modules with the NVIDIA JetPack™ SDK.

Vulnerability details: NVIDIA Jetson AGX Orin™ and NVIDIA IGX Orin software contain a vulnerability where an attacker can cause an improper input validation issue by escalating certain permissions to a limited degree. A successful exploit of this vulnerability might lead to code execution, denial of service, data corruption, information disclosure, or escalation of privilege.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5611

CVE‑2024‑53880: NVIDIA Triton Inference Server contains a vulnerability in the model loading API (14th Feb 2025)

Original release date: February 11, 2025

Preface: The NVIDIA Triton Inference Server API supports both HTTP/REST and GRPC protocols. These protocols allow clients to communicate with the Triton server for various tasks such as model inferencing, checking server and model health, and managing model metadata and statistics.
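A quick sketch of that client surface, assuming the tritonclient Python package, a Triton server on localhost:8000, and a placeholder model name (resnet50); note that load_model() is only permitted when the server runs with explicit model control.

```python
# Sketch of the Triton HTTP client API; the server address and the
# model name "resnet50" are placeholders, not values from the advisory.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

print(client.is_server_live())     # server health
print(client.is_server_ready())

# Model management calls; the server must be started with
# --model-control-mode=explicit for load_model() to be allowed.
client.load_model("resnet50")
print(client.get_model_metadata("resnet50"))
```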

Background: NVIDIA Triton™ Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise, is open-source software that standardizes AI model deployment and execution across every workload.

The Asynchronous Server Gateway Interface (ASGI) is a calling convention for web servers to forward requests to asynchronous-capable Python frameworks and applications. It is built as a successor to the Web Server Gateway Interface (WSGI).

Vulnerability details: NVIDIA Triton Inference Server contains a vulnerability in the model loading API, where a user could cause an integer overflow or wraparound error by loading a model with an extra-large file size that overflows an internal variable. A successful exploit of this vulnerability might lead to denial of service.
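To see the bug class concretely, here is a sketch of an integer wraparound: if an internal size field were a 32-bit unsigned value, a sufficiently large file size would wrap to a small number and could slip past size checks. The field width and numbers are hypothetical; the advisory does not publish Triton's internals.

```python
# Hypothetical illustration of integer wraparound (not Triton's code):
# a u32 size field silently truncates a file size larger than 2**32.
import ctypes

file_size = 2**32 + 1024                   # model file "size" in bytes
stored = ctypes.c_uint32(file_size).value  # what a 32-bit field holds
print(file_size, "->", stored)             # 4294968320 -> 1024
```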

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5612

CVE-2025-23359 – NVIDIA Container Toolkit for Linux contains a Time-of-Check Time-of-Use (TOCTOU) vulnerability (11th Feb 2025)

Preface: In software development, time-of-check to time-of-use (TOCTOU, TOCTTOU or TOC/TOU) is a class of software bugs caused by a race condition involving the checking of the state of a part of a system (such as a security credential) and the use of the results of that check.
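A schematic example of the bug class in Python (the paths and checks are illustrative, not the Toolkit's code): the window between the check and the use is exactly where an attacker can swap the file.

```python
# Schematic TOCTOU: the gap between checking a file and using it.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "config")
with open(path, "w") as f:
    f.write("key=value\n")

# Racy pattern: the file's state can change between islink() and open().
if not os.path.islink(path):
    data = open(path).read()      # a swapped-in symlink would be followed

# Safer pattern: open first with O_NOFOLLOW, then use only that handle.
fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
data = os.read(fd, 4096)
os.close(fd)
print(data.decode())
```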

Background: The NVIDIA container stack is architected so that it can be targeted to support any container runtime in the ecosystem. The components of the stack include:

The NVIDIA Container Runtime (nvidia-container-runtime)

The NVIDIA Container Runtime Hook (nvidia-container-toolkit / nvidia-container-runtime-hook)

The NVIDIA Container Library and CLI (libnvidia-container1, nvidia-container-cli)

The components of the NVIDIA container stack are packaged as the NVIDIA Container Toolkit.

The NVIDIA Container Toolkit is a key component in enabling Docker containers to leverage the raw power of NVIDIA GPUs. This toolkit allows for the integration of GPU resources into your Docker containers.

Remark: The Podman command can be used with remote services using the --remote flag. Connections can be made using local UNIX domain sockets or SSH.

Vulnerability details: NVIDIA Container Toolkit for Linux contains a Time-of-Check Time-of-Use (TOCTOU) vulnerability when used with default configuration, where a crafted container image could gain access to the host file system. A successful exploit of this vulnerability might lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5616