Category Archives: Potential Risk of CVE

CVE-2025-33185: About NVIDIA AIStore  (17th Nov 2025)

Official Updated 10th Nov 2025 05:39 AM

Preface: The core design objective of NVIDIA AIStore (AIS) is to provide a high-performance, linearly scalable, and flexible storage solution specifically optimized for large-scale AI/ML and data analytics workloads. NVIDIA AIStore (AIS) provides secure access via a standalone Authentication Server (AuthN) that uses OAuth 2.0 compliant JSON Web Tokens (JWT) for token-based authentication.

The AuthN server is part of the broader NVIDIA AIStore project, which is publicly available on GitHub. It provides token-based secure access using the JSON Web Tokens (JWT) framework.

Background: The security of a signed JWT relies on a secret key (for HMAC algorithms like HS256) or a public/private key pair (for RSA or ECDSA). This key is used to create a digital signature that ensures the token’s integrity and authenticity—proving it has not been tampered with. If the application’s source code, configuration files, or version control system contains this secret key in plain text, it violates the principle of confidentiality for credentials. An attacker who discovers this hard-coded secret can forge valid tokens, impersonate any user, and bypass authentication entirely.
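To make the risk concrete, here is a minimal stdlib-only sketch of HS256 signing and verification. The secret and claims are hypothetical; the point is that anyone holding the hard-coded secret can mint tokens the server will accept as genuine.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses base64url without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(claims: dict, secret: bytes) -> str:
    # Header and payload are only base64url-encoded, not encrypted.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify_hs256(token: str, secret: bytes) -> bool:
    header, payload, sig = token.split(".")
    expected = b64url(hmac.new(secret, f"{header}.{payload}".encode(),
                               hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)

# With a hard-coded secret, an attacker forges an "admin" token that
# passes verification exactly like a legitimately issued one.
LEAKED_SECRET = b"hard-coded-secret"   # hypothetical leaked value
forged = sign_hs256({"sub": "attacker", "role": "admin"}, LEAKED_SECRET)
assert verify_hs256(forged, LEAKED_SECRET)
```

This is why remediation for this class of flaw involves rotating the secret and moving it out of the codebase, not merely patching the verification logic.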

Vulnerability details: NVIDIA AIStore contains a vulnerability in AuthN. A successful exploit of this vulnerability might lead to escalation of privileges, information disclosure, and data tampering.

Impacts: Escalation of privileges, information disclosure, data tampering

Remediation: Update to v3.31

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5724

CVE-2025-33202: About NVIDIA Triton Inference Server (14th Nov 2025)

Official Updated 10th Nov 2025 05:39 AM

Preface: Clients can communicate with Triton using either an HTTP/REST protocol, a GRPC protocol, or by an in-process C API or its C++ wrapper. Triton supports HTTP/REST and gRPC, both of which involve complex header parsing.

The Open Inference Protocol (OIP), also known as the KServe V2 protocol, defines a standardized interface for model inference, which implies that compliant inference servers must be capable of parsing incoming requests and serializing outgoing responses according to the protocol’s defined message formats.

Background: To define a parser that filters the payload for Triton using the KServe V2 (Open Inference Protocol), you need to handle the following:

Key Considerations

1. Protocol Compliance – The parser must understand the OIP message format:

- Inference Request: includes inputs, outputs, parameters.

- Inference Response: includes model_name, outputs, parameters.

- Data can be JSON (for REST) or Protobuf (for gRPC).

2. Filtering Logic – Decide what you want to filter:

- Specific tensor names?

- Certain data types (e.g., FP32, INT64)?

- Large payloads (e.g., skip tensors above a size threshold)?

- Security checks (e.g., reject malformed headers)?

3. Shared Memory Handling – If shared memory is used, the parser should:

- Validate shared_memory_region references.

- Ensure the payload does not redundantly include tensor data when shared memory is specified.
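The considerations above can be sketched as a small validator for the JSON (REST) form of a KServe V2 inference request. The allowed datatypes and the size threshold are illustrative policy choices, not Triton defaults; the field names (inputs, name, datatype, shape, data, parameters, shared_memory_region) follow the protocol.

```python
import math

ALLOWED_DTYPES = {"FP32", "INT64", "BYTES"}   # example policy, not a Triton default
MAX_ELEMENTS = 1_000_000                      # example size threshold

def filter_inference_request(req: dict) -> list[str]:
    """Return a list of policy violations for a KServe V2 JSON request."""
    errors = []
    for tensor in req.get("inputs", []):
        name = tensor.get("name", "<unnamed>")
        # Filtering logic: datatype allow-list.
        if tensor.get("datatype") not in ALLOWED_DTYPES:
            errors.append(f"{name}: datatype {tensor.get('datatype')} not allowed")
        # Filtering logic: reject tensors above a size threshold.
        elements = math.prod(tensor.get("shape", [0]))
        if elements > MAX_ELEMENTS:
            errors.append(f"{name}: {elements} elements exceeds threshold")
        # Shared-memory rule: region reference and inline data are exclusive.
        params = tensor.get("parameters", {})
        if "shared_memory_region" in params and "data" in tensor:
            errors.append(f"{name}: inline data present despite shared memory region")
    return errors

req = {"inputs": [{"name": "input0", "datatype": "FP32",
                   "shape": [1, 3], "data": [1.0, 2.0, 3.0]}]}
assert filter_inference_request(req) == []
```

A guard like this sits in front of the server's own parser, so oversized or malformed payloads are rejected before they reach deeper, more fragile parsing code.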

Vulnerability details: NVIDIA Triton Inference Server for Linux and Windows contains a vulnerability where an attacker could cause a stack overflow by sending extra-large payloads. A successful exploit of this vulnerability might lead to denial of service.

Official announcement: Please see the official link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5723

CVE-2025-23361 and CVE-2025-33178: NVIDIA NeMo Framework contains vulnerabilities (13th Nov 2025)

Preface: The advantages of using Hydra and OmegaConf for configuration management are flexibility, reproducibility, and scalability. Hydra’s ability to override configurations at runtime from the command line and compose them from multiple sources makes it highly flexible.

NeMo uses Hydra/OmegaConf for configuration management, which supports interpolation and sometimes dynamic evaluation.
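Dynamic evaluation of config values is exactly the class of weakness behind the CVEs discussed below: if a user-controlled string ever reaches eval() or similar code generation, it executes. A minimal stdlib contrast (no Hydra/OmegaConf required; the malicious value is hypothetical):

```python
import ast

malicious = "__import__('os').getcwd()"   # stands in for attacker-controlled config

# Unsafe pattern: eval() executes arbitrary expressions from the config.
assert eval(malicious)  # attacker code runs; here it merely reads the cwd

# Safer pattern: ast.literal_eval accepts only literals and raises otherwise.
def parse_config_value(raw: str):
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        raise ValueError(f"rejected non-literal config value: {raw!r}")

assert parse_config_value("[1, 2, 3]") == [1, 2, 3]
```

The same principle applies to template-based code generation: treat configuration as data, never as executable source.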

Background: NVIDIA NeMo is an end-to-end platform designed for developing and deploying generative AI models. This includes large language models (LLMs), vision language models (VLMs), video models, and speech AI. NeMo offers tools for data curation, fine-tuning, retrieval-augmented generation (RAG), and inference, making it a comprehensive solution for creating enterprise-ready AI models. Here are some key capabilities of NeMo LLMs:

  1. Customization: NeMo allows you to fine-tune pre-trained models to suit specific enterprise needs. This includes adding domain-specific knowledge and skills, and continuously improving the model with reinforcement learning from human feedback (RLHF).
  2. Scalability: NeMo supports large-scale training and deployment across various environments, including cloud, data centers, and edge devices. This ensures high performance and flexibility for different use cases.
  3. Foundation Models: NeMo offers a range of pre-trained foundation models, such as GPT-8B, GPT-43B, and GPT-530B, which can be used for tasks like text classification, summarization, creative writing, and chatbots.
  4. Data Curation: The platform includes tools for processing and curating large datasets, which helps improve the accuracy and relevance of the models.
  5. Integration: NeMo can be integrated with other NVIDIA AI tools and services, providing a comprehensive ecosystem for AI development.

Vulnerability details:

CVE-2025-23361: NVIDIA NeMo Framework for all platforms contains a vulnerability in a script, where malicious input created by an attacker may cause improper control of code generation. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.

CVE-2025-33178: NVIDIA NeMo Framework for all platforms contains a vulnerability in the BERT services component where malicious data created by an attacker may cause a code injection. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.

Ref: CVE-2025-33178 in the BERT services component is conceptually similar to CVE-2025-23361 in the LLM pretraining workflow. Both share the same underlying weakness: unsanitized dynamic code generation or execution based on user-controlled input.

Official announcement: Please see the official link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5718

CVE-2025-12907: About Devtools in Google Chrome  (12th Nov 2025)

Preface: Google Chrome comes with DevTools built directly into the browser by default. These are a comprehensive set of web developer tools that allow users to inspect and debug web pages, analyze network activity, monitor performance, and much more.

You can use the snippets option available in the Sources tab in Chrome DevTools. Just type out your code as you do in a code editor and hit Ctrl+Enter (or click on the Run snippet button available there) to execute the code, and the output displays in the console at the bottom.

Background: In Chrome DevTools, Copy as cURL is a feature in the Network panel that lets you right-click on a network request and copy it as a curl command. This command replicates the HTTP request outside the browser, including:

  • URL
  • HTTP method (GET, POST, etc.)
  • Headers (cookies, authorization tokens, etc.)
  • Payload (for POST/PUT requests)
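Because a copied command carries live cookies and bearer tokens, pasting it anywhere untrusted leaks credentials. The four components above map directly onto a request object; a sketch using Python's urllib (URL, token, and body are placeholders, and no network call is made):

```python
import urllib.request

# The same four components DevTools copies: URL, method, headers, payload.
req = urllib.request.Request(
    url="https://example.com/api/items",
    method="POST",
    headers={"Cookie": "session=abc123",          # session cookie
             "Authorization": "Bearer token123"},  # bearer token
    data=b'{"query": "demo"}',                     # POST payload
)
assert req.get_method() == "POST"
assert req.get_header("Authorization") == "Bearer token123"
```

Anything that can read this object (or the equivalent curl string) can replay the request as the logged-in user, which is why copied requests should be scrubbed before sharing.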

Vulnerability details: Insufficient validation of untrusted input in Devtools in Google Chrome prior to 140.0.7339.80 allowed a remote attacker to execute arbitrary code via user action in Devtools. (Chromium security severity: Low)

This CVE describes an input-validation flaw in Chrome DevTools that could allow code execution or unauthorized access under certain conditions. Based on the advisory, the flaw likely involves:

  • Improper isolation or authentication when DevTools is exposed.
  • Attackers exploiting CDP endpoints without proper permission checks.
Best-Practice Checklist
  1. Always update to Chrome ≥ 140.0.7339.80.
  2. Enable --remote-debugging-port only when needed, and always pair it with a non-default --user-data-dir.
  3. Avoid exposing the CDP port; use localhost only or protected proxies like devtoolium.
  4. Audit DevTools features (e.g., “Copy as cURL”) before exposing them via CDP.
  5. For automation, consider Chrome for Testing for safer, isolated CDP usage.

Official announcement: Please refer to the link for details –

https://www.tenable.com/cve/CVE-2025-12907

CVE-2025-12863: About the libxml2 XML parsing library (11th Nov 2025)

Preface: Libxml2, a C library for parsing and manipulating XML documents, can be relevant in machine learning contexts when dealing with data stored or exchanged in XML format. While not a machine learning library itself, libxml2, or its Python binding lxml, serves as a foundational tool for data preparation and feature engineering.

Ref: The “difference” between Libxml and Libxml2 is that they are essentially the same thing, with “libxml2” being the official and specific name for the library.

Background: Moving nodes between XML documents can happen in machine learning workflows, especially during data integration, comparison, or transformation.  

When would you be at risk?

You’d be at risk if:

  • You use libxml2 or lxml to move nodes from one document to another (e.g., merging XML trees).
  • The underlying library internally calls xmlSetTreeDoc() during such operations.
  • The original document gets freed while namespace pointers still reference it.
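For illustration, here is the node-move operation itself using Python's stdlib ElementTree (which is not affected by this libxml2 flaw). In libxml2/lxml this is the step where xmlSetTreeDoc() rewires document pointers, and where the stale namespace reference can arise:

```python
import xml.etree.ElementTree as ET

src = ET.fromstring("<catalog><item id='1'>widget</item></catalog>")
dst = ET.fromstring("<merged/>")

# Moving a node: detach from the source tree, attach to the destination.
# In libxml2, xmlSetTreeDoc() updates the node's document pointer here;
# CVE-2025-12863 concerns a namespace pointer left behind when the
# original document is later freed.
item = src.find("item")
src.remove(item)
dst.append(item)

assert src.find("item") is None
assert dst.find("item").get("id") == "1"
```

The practical mitigation on the libxml2/lxml side is to keep the source document alive until all moved nodes (and their namespaces) have been reconciled, and to apply the upstream fix once available.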

Vulnerability details: A flaw was found in the xmlSetTreeDoc() function of the libxml2 XML parsing library. This function is responsible for updating document pointers when XML nodes are moved between documents. Due to improper handling of namespace references, a namespace pointer may remain linked to a freed memory region when the original document is destroyed. As a result, subsequent operations that access the namespace can lead to a use-after-free condition, causing an application crash.

Official announcement: Please refer to the link for details.

https://www.tenable.com/cve/CVE-2025-12863

https://access.redhat.com/security/cve/cve-2025-12863

CVE-2025-62161: About Youki, a container runtime written in Rust (10th Nov 2025)

Preface: Amazon, Google, Microsoft, Apple, Cloudflare, Coursera, Discord, Dropbox, and Figma are among the large companies employing Rust for various purposes, including cloud infrastructure, operating system components, web services, and low-level security components.

Youki is gaining increasing attention as a container runtime, especially in the Rust ecosystem, but it has not yet achieved the widespread adoption of mature runtimes such as runc or crun.

runc is a common and fundamental component in Docker environments. runc is the default low-level container runtime used by Docker Engine. Kubernetes, through its Container Runtime Interface (CRI), interacts with a CRI-compliant runtime like containerd or CRI-O. So, while Kubernetes doesn’t directly interact with runc, it relies on higher-level runtimes that use runc for the low-level container operations.

Background: Youki is a new container runtime written in Rust. It has the potential to be faster and use less memory than runc: while runc’s memory usage during container initialization can be around 2.2–3 MB, Youki aims to reduce this footprint.

Rust may not be the primary language for data scientists, but it plays a vital role in providing the underlying infrastructure for training and deploying large AI models.

Vulnerability details: Youki is a container runtime written in Rust. In versions 0.5.6 and below, the initial validation of the source /dev/null is insufficient, allowing container escape when youki utilizes bind mounting the container’s /dev/null as a file mask.

Remedy: This issue is fixed in version 0.5.7.
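The underlying fix class is validating that the mount source really is the null character device rather than trusting the path string. A hedged Python sketch of that check (the device numbers are Linux-specific, and this is not youki's actual Rust patch):

```python
import os
import stat

def is_real_dev_null(path: str) -> bool:
    """Check whether a mount source is genuinely the null character device.

    Merely matching the string "/dev/null" is insufficient: inside a
    container the path may be a symlink or a crafted replacement file,
    which is the gap behind this class of container-escape bug.
    """
    st = os.lstat(path)                      # lstat: do not follow symlinks
    if not stat.S_ISCHR(st.st_mode):         # must be a character device...
        return False
    major, minor = os.major(st.st_rdev), os.minor(st.st_rdev)
    return (major, minor) == (1, 3)          # ...with Linux's null-device numbers

# On Linux hosts, is_real_dev_null("/dev/null") returns True, while any
# regular file or symlink masquerading as /dev/null returns False.
```

A runtime that performs this check before bind-mounting a file mask refuses the crafted source instead of mounting attacker-controlled content into the container.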

Official announcement: Please refer to the link for details –

https://www.tenable.com/cve/CVE-2025-62161

About CVE-2025-47365: Qualcomm integrated with automotive platform (7th Nov 2025)

Published: 11/03/2025

Preface: GM’s Ultra Cruise system is supposed to be a more capable iteration of its Super Cruise ADAS that was first introduced in 2017.

To dig into the weeds a bit, the Ultra Cruise compute, which is about the size of two laptops stacked together, is made up of two Snapdragon SA8540P SoCs and one SA9000P AI accelerator to deliver low-latency control functions on 16-core CPUs and AI compute of more than 300 Tera operations per second for camera, radar and lidar processing.

Background: The Snapdragon Ride Platform accelerates the shift to software-defined vehicles by empowering developers to create AI-driven automated driving solutions. The system-on-chip, called Snapdragon Ride Platform, was developed for advanced driver assistance systems (ADAS) and automated driving. It’s one of a suite of cloud-connected platforms introduced by Qualcomm.

The Qualcomm Cloud AI 100/AIC100 family of products (including SA9000P – part of Snapdragon Ride) are PCIe adapter cards which contain a dedicated SoC ASIC for the purpose of efficiently running Artificial Intelligence (AI) Deep Learning inference workloads. They are AI accelerators.

Qualcomm’s Snapdragon SA8540P SoCs and SA9000P AI accelerator use the QNX Neutrino RTOS for safety-critical functions in automated driving systems.

Vulnerability details: Integer Overflow or Wraparound in Automotive Platform

Description – Memory corruption while processing large input data from a remote source via a communication interface.
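CWE-190 wraparound typically looks like a length check that a huge attacker-supplied value defeats. Python integers do not wrap, so the sketch below masks to 32 bits to reproduce the failure, then shows a guarded version; the field names and sizes are illustrative, not Qualcomm's code:

```python
U32_MAX = 0xFFFFFFFF

def unsafe_total(offset: int, length: int) -> int:
    # 32-bit wraparound: a huge attacker-supplied length wraps past zero,
    # so the "total" can pass a bounds check while the copy runs off the end.
    return (offset + length) & U32_MAX

def checked_total(offset: int, length: int, buf_size: int) -> int:
    # Guard: reject before the arithmetic can wrap, then bound-check.
    if length > U32_MAX - offset or offset + length > buf_size:
        raise ValueError("length field would overflow or exceed buffer")
    return offset + length

buf_size = 4096
offset, length = 4090, U32_MAX - 4000    # hostile length from the wire
assert unsafe_total(offset, length) < buf_size   # wrapped: the check is fooled
```

In C this is the classic `if (offset + length > buf_size)` bug; the fix is to compare against the remaining space (`length > buf_size - offset`) or use checked-add intrinsics.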

Official announcement: Please refer to link for details –

https://docs.qualcomm.com/product/publicresources/securitybulletin/november-2025-bulletin.html

Security Bulletin:  About NVIDIA RunAI – CVE-2025-33176 (6th Nov 2025)

Preface: NVIDIA Run:ai is a Kubernetes-native platform for managing and optimizing AI workloads, acquired by NVIDIA in 2024. It provides dynamic orchestration for GPU resources, supporting flexible resource allocation to improve resource utilization and accelerate the AI development lifecycle in hybrid, on-premises, and cloud environments.

Prior to its acquisition by NVIDIA in December 2024, Run:ai was an Israeli software startup founded in 2018, focused on AI infrastructure management. Its core product is Kubernetes-based software for managing and optimizing GPU workloads for AI applications, helping enterprises utilize hardware more efficiently across on-premises, cloud, and hybrid environments. Run:ai’s role has not changed: it focuses on the key challenges of AI workload management and GPU orchestration throughout the broader Machine Learning Operations (MLOps) lifecycle.

Background: NVIDIA Run:ai is a Kubernetes-based platform that acts as a GPU orchestrator to maximize the efficiency of AI and machine learning workloads. It addresses challenges like managing expensive GPU resources by enabling dynamic, policy-based scheduling, allowing for the sharing of GPUs across teams and projects, and optimizing workload performance for training, tuning, and inference. Run:ai integrates into existing hybrid or on-premises AI infrastructure to improve GPU utilization and accelerate AI development cycles.

NVIDIA DGX Cloud primarily leverages Kubernetes (K8s) for orchestrating and managing AI workloads.

While DGX systems historically used Docker for containerization, DGX Cloud, particularly for advanced AI workloads and resource management, relies on Kubernetes for its scalability, high-performance computing capabilities, and efficient GPU resource allocation. This is often integrated with other NVIDIA software like NVIDIA NeMo and NVIDIA Run:ai, and deployed on cloud services such as Amazon Elastic Kubernetes Service (Amazon EKS).

Vulnerability details: NVIDIA RunAI for all platforms contains a vulnerability where a user could cause an improper restriction of communications channels on an adjacent network. A successful exploit of this vulnerability might lead to escalation of privileges, data tampering, and information disclosure.
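The advisory gives no implementation detail, but "improper restriction of communication channels on an adjacent network" generally means a service reachable by neighboring hosts when it should be local-only. A generic stdlib illustration of the hardening, not Run:ai's actual fix:

```python
import socket

def open_control_socket(bind_all: bool = False) -> socket.socket:
    """Bind a TCP listener; loopback-only unless explicitly opened up."""
    # 0.0.0.0 exposes the channel to the adjacent network;
    # 127.0.0.1 restricts it to the local host.
    host = "0.0.0.0" if bind_all else "127.0.0.1"
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))        # port 0: let the OS pick a free port
    sock.listen(1)
    return sock

sock = open_control_socket()    # default: unreachable from neighboring hosts
assert sock.getsockname()[0] == "127.0.0.1"
sock.close()
```

The same principle applies in Kubernetes: control-plane endpoints should be restricted with network policies rather than exposed cluster-wide by default.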

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5719

CVE-2025-47353: About Automotive Software platform based on QNX (5th Nov 2025)

Preface: An automotive cockpit is the driver’s compartment, integrating all the controls and information displays needed to operate a vehicle, including the steering wheel, dashboard, instruments, and central displays.

Background: To install QNX on a Qualcomm SA8775P chip, you will need the specific QNX Board Support Package (BSP) for that platform, as it contains the necessary hardware-specific software to get the OS running. The installation process will involve loading the BSP onto the chip, which provides the essential startup code and device drivers needed to run QNX. Since this is an automotive-grade chip used for cockpit and driving functions, it’s likely an OEM or a specialized automotive development partner would be handling this process.

QNX handles guest virtual machine (GVM) memory by using a hypervisor to manage the host physical memory, which the guest sees as contiguous. The hypervisor allocates memory to a guest and uses services like “smmuman” to program the IOMMU/SMMU to protect the guest’s memory from the host and other guests. The hypervisor maps host physical memory to the guest’s guest-physical memory, and can use shared memory regions for efficient inter-VM communication.

Vulnerability details:

Title – Exposed Dangerous Method or Function in Automotive Software platform based on QNX

Description – Memory corruption while processing request sent from GVM.

Technology Area – Automotive Software platform based on QNX

Vulnerability Type – CWE-749: Exposed Dangerous Method or Function

Access Vector – Local

Security Rating – High

CVSS Rating – High

CVSS Score – 7.8

Due to a lack of detailed supplier information, the following analysis is inferred from observation:

Attack Surface:

  • Shared memory + IPC exposed to guest.

Threats:

  • Buffer overflow → hypervisor memory corruption.
  • Capability spoofing → unauthorized access.

Mitigations:

  • Mediator validation.
  • Capability-based security.
  • Immutable protocol with checksum.
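The last mitigation, an immutable (fixed-layout) protocol with an integrity check, can be sketched as follows. The key, header layout, and field meanings are hypothetical; a real hypervisor mediator would use per-channel keys and hardware-assisted validation:

```python
import hashlib
import hmac
import struct

KEY = b"per-boot mediator key"          # hypothetical shared key
HEADER = struct.Struct("<II")            # fixed layout: message type, payload length

def pack(msg_type: int, payload: bytes) -> bytes:
    body = HEADER.pack(msg_type, len(payload)) + payload
    # Append an HMAC tag so the mediator can detect tampered frames.
    return body + hmac.new(KEY, body, hashlib.sha256).digest()

def unpack(frame: bytes) -> tuple[int, bytes]:
    body, tag = frame[:-32], frame[-32:]
    if not hmac.compare_digest(tag, hmac.new(KEY, body, hashlib.sha256).digest()):
        raise ValueError("integrity check failed: frame rejected")
    msg_type, length = HEADER.unpack_from(body)
    payload = body[HEADER.size:]
    if len(payload) != length:           # declared length must match actual size
        raise ValueError("length mismatch: possible truncation or spoofing")
    return msg_type, payload

frame = pack(1, b"request")
assert unpack(frame) == (1, b"request")
```

The fixed header plus length cross-check is what makes the protocol "immutable": a guest cannot smuggle extra bytes or a spoofed capability past the mediator without invalidating the tag.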

Official announcement: Please refer to the link for details –

https://docs.qualcomm.com/product/publicresources/securitybulletin/november-2025-bulletin.html

TEE.fail: Another perspective on how intercepting DDR5 memory bus can compromise a trusted execution environment. (4th Nov 2025)

Preface: The recent research, released in a paper titled “TEE.fail: Breaking Trusted Execution Environments via DDR5 Memory Bus Interposition”, does not change Intel’s previous out of scope statement for these types of physical attacks.

Background: Intel SGX protects memory by creating encrypted “enclaves,” which are isolated, private regions within an application’s address space. These enclaves are stored in the Enclave Page Cache (EPC) within the processor’s reserved memory (PRM), and the CPU encrypts data as it is written to the EPC and decrypts it on the fly as it is read back into the CPU, preventing unauthorized access from even privileged software like the operating system or hypervisor.

When the CPU writes data to memory, the memory controller uses the plaintext and address as input to deterministically encrypt it. Writing the same plaintext to the same address will always produce the same ciphertext. Attackers cannot directly read a victim’s secret messages through aliasing mechanisms. However, a doctoral researcher at KU Leuven in Belgium claims it is possible to capture a victim’s ciphertext and later replay it at the same physical location, where it decrypts into valid but outdated plaintext.
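The determinism is what makes ciphertext capture useful. The toy cipher below is not Intel's construction (real memory encryption uses AES-XTS with an address tweak) and is itself insecure; it only demonstrates the property: equal plaintext at an equal address yields equal ciphertext, so an interposer can build a ciphertext-to-plaintext dictionary.

```python
import hashlib

def toy_encrypt(plaintext: bytes, address: int, key: bytes) -> bytes:
    # Conceptual stand-in for address-tweaked deterministic encryption.
    # No randomness is involved, so identical inputs at the same address
    # always yield identical ciphertext.
    tweak = address.to_bytes(8, "little")
    pad = hashlib.sha256(key + tweak).digest()[:len(plaintext)]
    return bytes(p ^ k for p, k in zip(plaintext, pad))

key = b"memory-controller-key"
c1 = toy_encrypt(b"SECRET01", 0x1000, key)
c2 = toy_encrypt(b"SECRET01", 0x1000, key)
c3 = toy_encrypt(b"SECRET01", 0x2000, key)
assert c1 == c2        # same data, same address: identical ciphertext
assert c1 != c3        # different address: the tweak changes the ciphertext
```

This is why the WireTap dictionary attack targets low-entropy data: with few possible plaintexts per address, observed ciphertexts map back to plaintexts quickly.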

Ref: ACPI Machine Language (AML) is the platform independent code that ACPI utilizes. ASL is ACPI source language. It is a more human-readable form of the byte code that is AML.

For example, the OS has a driver for an Embedded Controller device, and AML can talk to that OS driver. ACPI can also reserve certain hardware resources so that AML can use them directly, and the OS knows it is not allowed to use them.

Vulnerability details: Independent researchers have separately published methods to attack Intel® Software Guard Extensions (Intel® SGX) with a physical interposer device.

In the WireTap paper, researchers from Georgia Tech and Purdue University applied a passive interposer to read ciphertext memory of low entropy data to create a ciphertext-to-plain-text dictionary.

In the Battering RAM paper, researchers from KU Leuven and University of Birmingham developed a custom interposer to actively alias memory and gain arbitrary read/write access into Intel SGX-protected memory.​

Both research teams assume a physical adversary has direct access to the hardware with a memory bus interposer.

Official announcement: Please refer to the link for details – https://www.intel.com/content/www/us/en/security-center/announcement/intel-security-announcement-2025-10-28-001.html