Category Archives: AI and ML

CVE-2025-33203 – Design weakness of NVIDIA NeMo Agent Toolkit UI for Web. Another preventive approach. (28th Nov 2025)

Preface: While web vulnerabilities can lead to various cyberattacks, they don’t directly or exclusively cause ransomware attacks. Request-forgery flaws exploit misplaced trust: CSRF abuses the trust a website has in a user’s browser to perform unauthorized actions, while SSRF (the class at issue here) abuses the trust a server places in the URLs it is asked to fetch. Ransomware, by contrast, involves malware that encrypts a victim’s data and demands payment.

Background: NeMo Agent Toolkit UI is the official frontend user interface component for NeMo Agent Toolkit, an open-source library for building AI agents and workflows.

Prerequisites

  • NeMo Agent Toolkit installed and configured
  • Git
  • Node.js (v18 or higher)
  • npm or Docker

While Node.js v18 itself neither prevents nor causes request-forgery issues, it is crucial to implement proper protections in applications built on this version. Node.js v18 is now End-of-Life (EOL), meaning it no longer receives security updates, which makes implementing robust application-level security measures even more critical.

Vulnerability details: CVE-2025-33203 – NVIDIA NeMo Agent Toolkit UI for Web contains a vulnerability in the chat API endpoint where an attacker may cause a Server-Side Request Forgery. A successful exploit of this vulnerability may lead to information disclosure and denial of service.
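NVIDIA has not published the vulnerable code path, so the following is only a generic sketch of the defensive class: validating a user-supplied URL before the server fetches it on the user’s behalf. The allowlist, function name, and endpoint details are illustrative assumptions, not NeMo Agent Toolkit code.

```python
# Hypothetical SSRF guard for a chat-style endpoint that fetches user-supplied
# URLs. A generic sketch of the defense class, not NeMo Agent Toolkit code.
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}   # assumption: only web URLs are legitimate
ALLOWED_HOSTS = {"api.example.com"}   # hypothetical allowlist

def is_safe_url(url: str) -> bool:
    """Reject URLs that could reach internal services (basic SSRF guard)."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        return False
    if parsed.hostname not in ALLOWED_HOSTS:
        return False
    # Resolve the host and refuse private, loopback, and link-local addresses,
    # so allowlisted names that point at internal ranges are also rejected.
    try:
        infos = socket.getaddrinfo(parsed.hostname, parsed.port or 443)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True

if __name__ == "__main__":
    print(is_safe_url("https://api.example.com/v1/data"))  # allowed host
    print(is_safe_url("http://169.254.169.254/latest/"))   # metadata IP: rejected
```

A guard like this is usually combined with network-level egress filtering, since hostname checks alone can be bypassed via redirects or DNS rebinding.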

Affected Product: NeMo Agent Toolkit

Platforms or OS: All platforms

Affected Versions: All versions prior to 1.3.0

Updated Version: 1.3.0

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5726

AI developers, please do not underestimate the CVE-2025-33187 (NVIDIA DGX Spark GB10) vulnerability (26th Nov 2025)

Updated 11/21/2025 04:36 PM

Preface: NVIDIA DGX Spark will be used by AI developers, researchers, and data scientists who need to prototype and deploy large AI models on their desktop, including those working with agentic AI, LLMs, and robotics.

The NVIDIA DGX Secure Root of Trust (SRoT), more commonly referred to as the Hardware Root of Trust (HRoT), is a foundational security component embedded in the system’s hardware, including the main GPUs and the BlueField Data Processing Units (DPUs).

The term “NVIDIA DGX SROOT” refers to the Secure Root of Trust (SROOT) firmware component within the NVIDIA DGX Spark personal AI supercomputer. It is a security feature designed to ensure the integrity of the system’s secure boot process and certificate management.

Background: The DGX Spark runs on NVIDIA DGX OS, a customized Ubuntu Linux distribution that includes a full-stack NVIDIA AI software ecosystem. The NVIDIA GB10 is a Superchip that integrates separate CPU and GPU dies in a single package, and the operating system is not embedded within the CPU die itself. Instead, the OS is installed on external NVMe storage, and the system uses unified memory accessible by both dies.

The OS and related software stack are stored on external NVMe solid-state drives (SSDs), not on the CPU die. The DGX Spark workstation typically includes up to 4 TB of NVMe storage.

Nvidia SROOT, however, is an internal firmware component of the Nvidia DGX Spark GB10 systems; it runs directly on the system’s hardware rather than from the external storage.

Vulnerability details: CVE-2025-33187 – NVIDIA DGX Spark GB10 contains a vulnerability in SROOT, where an attacker could use privileged access to gain access to SoC protected areas. A successful exploit of this vulnerability might lead to code execution, information disclosure, data tampering, denial of service, or escalation of privileges.

Official announcement: Please refer to the link for details – https://nvd.nist.gov/vuln/detail/CVE-2025-33187

Cyber security focus for NVIDIA Isaac-GR00T – About CVE-2025-33183 and CVE-2025-33184 (24th Nov 2025)

Updated 11/19/2025 09:03 AM

Preface: NVIDIA Isaac is an AI robot development platform consisting of NVIDIA-accelerated libraries, application frameworks, and AI models that accelerate the development of AI robots such as autonomous mobile robots (AMRs), arms and manipulators, and humanoids.

NVIDIA Isaac GR00T N1 is the world’s first open foundation model for generalized humanoid robot reasoning and skills. This cross-embodiment model takes multimodal input, including language and images, to perform manipulation tasks in diverse environments.

Background:

Isaac GR00T N1.5 uses vision and text transformers to encode the robot’s image observations and text instructions. The architecture handles a varying number of views per embodiment by concatenating image token embeddings from all frames into a sequence, followed by language token embeddings.

A token is the most fundamental data unit in a text document, essential for enabling AI to understand and process information. In Natural Language Processing (NLP), tokenization refers to breaking down larger texts into smaller, manageable pieces called tokens.
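As a toy illustration of tokenization and of the sequence layout described above (image token embeddings from all frames first, then language tokens), consider the sketch below. Every name and number in it is invented for illustration; GR00T’s real tokenizers and encoders are learned transformer components.

```python
# Toy illustration only: GR00T's real tokenizers/encoders are learned models.
def tokenize(text: str) -> list[str]:
    """Split text into whitespace-delimited tokens (a minimal NLP tokenizer)."""
    return text.lower().split()

vocab: dict[str, int] = {}

def encode(tokens: list[str]) -> list[int]:
    """Map tokens to integer IDs, growing the vocabulary on the fly."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

# Pretend each camera frame contributes a fixed number of "image tokens";
# real models emit patch embeddings from a vision transformer.
IMAGE_TOKEN_ID = -1
image_tokens = [IMAGE_TOKEN_ID] * 4 * 2   # 2 frames x 4 tokens per frame

language_tokens = encode(tokenize("Pick up the red cube"))

# Image token embeddings from all frames come first, then language tokens,
# mirroring the sequence layout described above.
sequence = image_tokens + language_tokens
print(sequence)  # [-1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, 3, 4]
```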


The System 2 reasoning module is a pre-trained Vision-Language Model (VLM) that runs on an NVIDIA GPU. It processes the robot’s visual perception and language instruction to interpret the environment and understand the task goal. A Diffusion Transformer, trained with action flow-matching, then serves as the System 1 action module. It cross-attends to the VLM output tokens and employs embodiment-specific encoders and decoders to handle variable state and action dimensions for motion generation.

Vulnerability details: See below –

CVE-2025-33183 – NVIDIA Isaac-GR00T for all platforms contains a vulnerability in a Python component, where an attacker could cause a code injection issue. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.

CVE-2025-33184 – NVIDIA Isaac-GR00T for all platforms contains a vulnerability in a Python component, where an attacker could cause a code injection issue. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.
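NVIDIA does not identify the affected component or input path, so the snippet below only illustrates the general weakness class (code injection in Python): evaluating attacker-influenced strings with eval() versus parsing them safely with ast.literal_eval(). It is not Isaac-GR00T code.

```python
# Generic illustration of a Python code-injection weakness; this is NOT the
# actual Isaac-GR00T code, whose details NVIDIA has not published.
import ast

untrusted = "__import__('os').system('echo pwned')"  # attacker-controlled string

# Dangerous pattern: eval() executes arbitrary expressions.
# eval(untrusted)  # would run the attacker's command

# Safer pattern: ast.literal_eval() only accepts Python literals.
try:
    ast.literal_eval(untrusted)
except (ValueError, SyntaxError) as exc:
    print(f"rejected non-literal input: {exc}")

print(ast.literal_eval("[1, 2, 3]"))  # literals still parse fine
```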

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5725

CVE-2025-33185: About NVIDIA AIStore  (17th Nov 2025)

Official update: 10th Nov 2025 05:39 AM

Preface: The core design objective of NVIDIA AIStore (AIS) is to provide a high-performance, linearly scalable, and flexible storage solution specifically optimized for large-scale AI/ML and data analytics workloads. NVIDIA AIStore (AIS) provides secure access via a standalone Authentication Server (AuthN) that uses OAuth 2.0 compliant JSON Web Tokens (JWT) for token-based authentication.

The AuthN server is part of the broader NVIDIA AIStore project, which is publicly available on GitHub. It provides token-based secure access using the JSON Web Tokens (JWT) framework.

Background: The security of a signed JWT relies on a secret key (for HMAC algorithms like HS256) or a public/private key pair (for RSA or ECDSA). This key is used to create a digital signature that ensures the token’s integrity and authenticity—proving it has not been tampered with. If the application’s source code, configuration files, or version control system contains this secret key in plain text, it violates the principle of confidentiality for credentials. An attacker who discovers this hard-coded secret can sign forged tokens that the server will accept as genuine.
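To see why a leaked signing secret is so damaging, the hedged sketch below uses the PyJWT library: anyone holding the HS256 secret can mint tokens that the verifier accepts. The secret string and claims are invented for this example and have nothing to do with AIStore’s actual key material.

```python
# Illustration with PyJWT (pip install PyJWT) of why hard-coded HMAC secrets
# are dangerous. The secret and claims below are invented for this example.
import jwt

SECRET = "hard-coded-secret"  # imagine this string committed to a public repo

# An attacker who finds the secret can forge an admin token...
forged = jwt.encode({"sub": "attacker", "role": "admin"}, SECRET, algorithm="HS256")

# ...and the server, verifying with the same secret, will accept it.
claims = jwt.decode(forged, SECRET, algorithms=["HS256"])
print(claims)  # {'sub': 'attacker', 'role': 'admin'}
```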

Vulnerability details: NVIDIA AIStore contains a vulnerability in AuthN. A successful exploit of this vulnerability might lead to escalation of privileges, information disclosure, and data tampering.

Impacts: Escalation of privileges, information disclosure, data tampering

Remediation: Update to v3.31

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5724

CVE-2025-33202: About NVIDIA Triton Inference Server (14th Nov 2025)

Official update: 11/10/2025 05:39 AM

Preface: Clients can communicate with Triton using an HTTP/REST protocol, a gRPC protocol, or an in-process C API (or its C++ wrapper). Triton supports HTTP/REST and gRPC, both of which involve complex header parsing.

In the context of the Open Inference Protocol (OIP), also known as the KServe V2 protocol, compliant inference servers must be capable of parsing incoming requests and serializing outgoing responses according to the protocol’s defined message formats.

Background: To define a parser that filters the payload for Triton using the KServe V2 (Open Inference Protocol), you need to handle the following (a minimal sketch follows the list):

Key Considerations

1. Protocol Compliance – The parser must understand the OIP message format:

  - Inference Request: includes inputs, outputs, parameters.

  - Inference Response: includes model_name, outputs, parameters.

  - Data can be in JSON (for REST) or Protobuf (for gRPC).

2. Filtering Logic – Decide what you want to filter:

  - Specific tensor names?

  - Certain data types (e.g., FP32, INT64)?

  - Large payloads (e.g., skip tensors above a size threshold)?

  - Security checks (e.g., reject malformed headers)?

3. Shared Memory Handling – If shared memory is used, the parser should:

  - Validate shared_memory_region references.

  - Ensure the payload does not redundantly include tensor data when shared memory is specified.
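Below is a minimal sketch of such a filter for the JSON/REST form of a KServe V2 inference request. The size threshold and rejection policy are assumptions for illustration; Triton’s real parser is native code and its limits are not public.

```python
# Minimal sketch of a KServe V2 (Open Inference Protocol) request filter.
# The threshold and policy below are illustrative, not Triton's actual limits.
import json
import math

MAX_ELEMENTS = 1_000_000  # hypothetical per-tensor size threshold

def filter_request(raw: bytes) -> dict:
    """Parse a V2 REST inference request and reject oversized/invalid tensors."""
    req = json.loads(raw)
    for tensor in req.get("inputs", []):
        if not {"name", "datatype", "shape"} <= tensor.keys():
            raise ValueError(f"malformed input tensor: {tensor}")
        elements = math.prod(tensor["shape"])
        if elements > MAX_ELEMENTS:
            raise ValueError(f"tensor {tensor['name']!r} too large: {elements} elements")
        # If a shared-memory region is referenced, inline data must be absent.
        params = tensor.get("parameters", {})
        if "shared_memory_region" in params and "data" in tensor:
            raise ValueError(f"tensor {tensor['name']!r} mixes shared memory and inline data")
    return req

if __name__ == "__main__":
    body = json.dumps({
        "inputs": [{"name": "x", "datatype": "FP32", "shape": [1, 4],
                    "data": [0.1, 0.2, 0.3, 0.4]}]
    }).encode()
    print(filter_request(body))
```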

Vulnerability details: NVIDIA Triton Inference Server for Linux and Windows contains a vulnerability where an attacker could cause a stack overflow by sending extra-large payloads. A successful exploit of this vulnerability might lead to denial of service.

Official announcement: Please see the official link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5723

CVE-2025-23361 and CVE-2025-33178: NVIDIA NeMo Framework contains vulnerabilities (13th Nov 2025)

Preface: The advantages of using Hydra and OmegaConf for configuration management are flexibility, reproducibility, and scalability. Hydra’s ability to override configurations at runtime from the command line and compose them from multiple sources makes it highly flexible.

NeMo uses Hydra/OmegaConf for configuration management, which supports interpolation and sometimes dynamic evaluation.
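The snippet below shows OmegaConf interpolation and a custom resolver. It is a generic illustration of why dynamic evaluation inside configuration files is a sensitive surface, not a reproduction of the NeMo vulnerabilities themselves.

```python
# Generic OmegaConf illustration (pip install omegaconf); not the NeMo bug itself.
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "model": {"hidden": 1024},
    # Interpolation: this value is computed when accessed, not when loaded.
    "ffn": "${model.hidden}",
})
print(cfg.ffn)  # 1024

# Custom resolvers run Python code during resolution. A resolver that executed
# attacker-influenced strings would be a code-injection sink.
OmegaConf.register_new_resolver("double", lambda x: int(x) * 2)
cfg2 = OmegaConf.create({"n": "${double:21}"})
print(cfg2.n)  # 42
```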

Background: NVIDIA NeMo is an end-to-end platform designed for developing and deploying generative AI models. This includes large language models (LLMs), vision language models (VLMs), video models, and speech AI. NeMo offers tools for data curation, fine-tuning, retrieval-augmented generation (RAG), and inference, making it a comprehensive solution for creating enterprise-ready AI models. Here are some key capabilities of NeMo LLMs:

  1. Customization: NeMo allows you to fine-tune pre-trained models to suit specific enterprise needs. This includes adding domain-specific knowledge and skills, and continuously improving the model with reinforcement learning from human feedback (RLHF).
  2. Scalability: NeMo supports large-scale training and deployment across various environments, including cloud, data centers, and edge devices. This ensures high performance and flexibility for different use cases.
  3. Foundation Models: NeMo offers a range of pre-trained foundation models, such as GPT-8, GPT-43, and GPT-530, which can be used for tasks like text classification, summarization, creative writing, and chatbots.
  4. Data Curation: The platform includes tools for processing and curating large datasets, which helps improve the accuracy and relevance of the models.
  5. Integration: NeMo can be integrated with other NVIDIA AI tools and services, providing a comprehensive ecosystem for AI development.

Vulnerability details:

CVE-2025-23361: NVIDIA NeMo Framework for all platforms contains a vulnerability in a script, where malicious input created by an attacker may cause improper control of code generation. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.

CVE-2025-33178: NVIDIA NeMo Framework for all platforms contains a vulnerability in the BERT services component where malicious data created by an attacker may cause a code injection. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.

Ref: CVE-2025-33178 in the BERT services component is conceptually similar to CVE-2025-23361 in the LLM pretraining workflow. Both share the same underlying weakness: unsanitized dynamic code generation or execution based on user-controlled input.

Official announcement: Please see the official link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5718

CVE-2025-12863: About the libxml2 XML parsing library (11th Nov 2025)

Preface: Libxml2, a C library for parsing and manipulating XML documents, can be relevant in machine learning contexts when dealing with data stored or exchanged in XML format. While not a machine learning library itself, libxml2, or its Python binding lxml, serves as a foundational tool for data preparation and feature engineering.

Ref: The “difference” between Libxml and Libxml2 is that they are essentially the same thing, with “libxml2” being the official and specific name for the library.

Background: Moving nodes between XML documents can happen in machine learning workflows, especially during data integration, comparison, or transformation.  

When would you be at risk?

You’d be at risk if:

  • You use libxml2 or lxml to move nodes from one document to another (e.g., merging XML trees).
  • The underlying library internally calls xmlSetTreeDoc() during such operations.
  • The original document gets freed while namespace pointers still reference it.

Vulnerability details: A flaw was found in the xmlSetTreeDoc() function of the libxml2 XML parsing library. This function is responsible for updating document pointers when XML nodes are moved between documents. Due to improper handling of namespace references, a namespace pointer may remain linked to a freed memory region when the original document is destroyed. As a result, subsequent operations that access the namespace can lead to a use-after-free condition, causing an application crash.
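The lxml sketch below shows the kind of cross-document node move the flaw concerns, together with a defensive deep copy so that no pointers into the source document survive if it is later freed. Whether a given installation is affected depends on the underlying libxml2 version; the XML content here is made up.

```python
# Illustration of moving a node between XML documents with lxml
# (pip install lxml). lxml wraps libxml2; appending an element that belongs
# to another tree *moves* it, which is the operation CVE-2025-12863 concerns.
import copy
from lxml import etree

src = etree.fromstring('<src xmlns:n="urn:ns"><n:item>data</n:item></src>')
dst = etree.fromstring("<dst/>")

node = src[0]

# Defensive pattern: deep-copy the node instead of moving it, so no pointers
# into the source document survive if the source is later freed.
dst.append(copy.deepcopy(node))

print(etree.tostring(dst).decode())
```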

Official announcement: Please refer to the link for details.

https://www.tenable.com/cve/CVE-2025-12863

https://access.redhat.com/security/cve/cve-2025-12863

Security Bulletin:  About NVIDIA RunAI – CVE-2025-33176 (6th Nov 2025)

Preface: NVIDIA Run:ai is a Kubernetes-native platform for managing and optimizing AI workloads. Prior to its acquisition by NVIDIA in December 2024, Run:ai was an Israeli software startup founded in 2018 that focused on AI infrastructure management. Its core product is Kubernetes-based software for managing and optimizing GPU workloads, providing dynamic orchestration and flexible resource allocation to improve utilization and accelerate the AI development lifecycle in hybrid, on-premises, and cloud environments. Run:ai does not change the role of machine learning itself; rather, it addresses the key challenges of AI workload management and GPU orchestration throughout the broader Machine Learning Operations (MLOps) lifecycle.

Background: NVIDIA Run:ai is a Kubernetes-based platform that acts as a GPU orchestrator to maximize the efficiency of AI and machine learning workloads. It addresses challenges like managing expensive GPU resources by enabling dynamic, policy-based scheduling, allowing for the sharing of GPUs across teams and projects, and optimizing workload performance for training, tuning, and inference. Run:ai integrates into existing hybrid or on-premises AI infrastructure to improve GPU utilization and accelerate AI development cycles.

NVIDIA DGX Cloud primarily leverages Kubernetes (K8s) for orchestrating and managing AI workloads.

While DGX systems historically used Docker for containerization, DGX Cloud, particularly for advanced AI workloads and resource management, relies on Kubernetes for its scalability, high-performance computing capabilities, and efficient GPU resource allocation. This is often integrated with other NVIDIA software like NVIDIA NeMo and NVIDIA Run:ai, and deployed on cloud services such as Amazon Elastic Kubernetes Service (Amazon EKS).

Vulnerability details: NVIDIA RunAI for all platforms contains a vulnerability where a user could cause an improper restriction of communications channels on an adjacent network. A successful exploit of this vulnerability might lead to escalation of privileges, data tampering, and information disclosure.

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5719

AMD ID: AMD-SB-7055 – RDSEED Failure on AMD “Zen 5” Processors (27th Oct 2025)

Preface: The main consequence of an RDSEED failure on AMD Zen 5 processors is that software may unknowingly consume low-quality random values that the hardware reports as valid, undermining cryptography and other sensitive tasks and potentially leading to corrupted data. This has led to Linux kernel patches that temporarily disable RDSEED on affected Zen 5 CPUs until AMD provides a permanent hardware or firmware fix.

Background: RDSEED is a CPU instruction that provides high-entropy random numbers directly from a hardware entropy source, such as the Intel Digital Random Number Generator. It is designed to be used to seed other pseudo-random number generators (PRNGs) for cryptographic applications, ensuring a secure and unpredictable starting point.

Vulnerability Details: AMD was notified of a bug in “Zen 5” processors that may cause the RDSEED instruction to return 0 at a rate inconsistent with randomness while incorrectly signaling success (CF=1), indicating a potential misclassification of failure as success. This issue was initially reported publicly via the Linux kernel mailing list and was not submitted through AMD’s Coordinated Vulnerability Disclosure (CVD) process.

AMD has determined that the 16-bit and 32-bit forms of the RDSEED instruction on “Zen 5” processors are affected. The 64-bit form of RDSEED is not affected. AMD plans to release mitigations for this vulnerability.

Official announcement: Please refer to the link for details –

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7055.html

Security Bulletin: NVIDIA ConnectX and BlueField (CVE‑2025-23299) – October 2025 (24th Oct 2025)

Preface: Nvidia BlueField is a line of data processing units (DPUs) designed and produced by Nvidia, initially developed by Mellanox Technologies. DOCA is a consistent and essential resource across all existing and future generations of BlueField DPU and SuperNIC products.

Background: The NVIDIA cloud-native supercomputing platform leverages the NVIDIA BlueField DPU architecture with high-speed, low-latency networking. The DPU enables native cloud services that let multiple users securely share resources without loss in application performance. HPC and AI communication frameworks and libraries play a critical role in determining application performance. Due to their latency- and bandwidth-sensitive nature, offloading these libraries from the host CPU or GPU to the BlueField DPU creates the highest degree of overlap for parallel progression of communication and computation.

DOCA BlueMan dashboard is the web-based interface for managing and monitoring an NVIDIA BlueField DPU (Data Processing Unit).

Vulnerability details: NVIDIA BlueField and ConnectX contain a vulnerability in the management interface that may allow a malicious actor with high privilege access to execute arbitrary code.

Reference:

While Python itself is memory-safe, the real risk comes from the following (a short illustration follows the list):

  • YAML parsing libraries (like PyYAML) that allow arbitrary object deserialization.
  • C-based extensions or native bindings used by Python that may not enforce memory safety.
  • Improper validation of YAML configuration passed into privileged services like DTS.
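As a generic illustration of the first bullet (not BlueField/DTS code), PyYAML’s safe_load refuses the object-construction tags that the unsafe loaders would honor:

```python
# Generic PyYAML illustration (pip install PyYAML); not BlueField/DTS code.
import yaml

# This YAML document asks the loader to construct an arbitrary Python object.
malicious = "!!python/object/apply:os.system ['echo pwned']"

# yaml.safe_load only builds plain data types and rejects the tag outright.
try:
    yaml.safe_load(malicious)
except yaml.constructor.ConstructorError as exc:
    print(f"rejected: {exc.problem}")

# yaml.unsafe_load (or load with yaml.UnsafeLoader) WOULD execute os.system here.
print(yaml.safe_load("region: us-east\nreplicas: 3"))  # plain data is fine
```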

Official announcement: Please see the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5684