Category Archives: AI and ML

CVE-2025-33179 and CVE-2025-33180: About NVIDIA Cumulus Linux and NVOS products (25-02-2026)

Preface: NVIDIA's InfiniBand and Spectrum Ethernet switches are based on technology from Mellanox Technologies. The Spectrum switch ASIC portfolio, originally developed by Mellanox for high-performance Ethernet networking, was rebranded under NVIDIA and is now a core component of NVIDIA's networking division. NVIDIA completed the acquisition of Mellanox Technologies, a major supplier of high-performance interconnect technology (switches, NICs), in April 2020 for approximately $7 billion. This strategic move enhanced NVIDIA's data center networking capabilities, specifically in InfiniBand and Ethernet, to support AI and high-performance computing.

Background: Cumulus Linux is optimized for Ethernet fabrics, while NVOS/Onyx is largely utilized in high-performance InfiniBand environments.

-Key switches supporting NVIDIA Cumulus Linux include:

  • Spectrum-4: SN5600, SN5600D, SN5400
  • Spectrum-2/3: SN3700, SN3700C, SN4600, SN4700
  • Spectrum-1: SN2700, SN2100, SN2745 

Example: The Spectrum-4 series (including the SN5600, SN5600D, and SN5400) is a line of physical Ethernet switches (hardware).

Use Cases: Ideal for hyperscale cloud data centers and enterprise AI networks, emphasizing scalability and full customizability.

-NVOS (NVIDIA Onyx) or similar OS typically supports:

Quantum/Quantum-2 InfiniBand: Switches designed for high-performance AI, such as the Quantum-2 series.

Use Cases: Focused on High-Performance Computing (HPC) and large-scale AI training clusters (AI Factories), particularly environments utilizing NVLink for GPU interconnects.

Note: As of early 2026, NVIDIA is focusing on standardizing the management commands (NVUE) across both systems to reduce the complexity of automation workflows when transitioning between different operating systems.

Cumulus Linux (Native Linux): When you SSH in, you land in a standard Debian Linux bash shell. You configure the switch using the NVUE (NVIDIA User Experience) object model via the nv command (e.g., nv set interface swp1…).

Vulnerability Note: The CVEs (CVE-2025-33179/33180) specifically target the NVUE API and CLI engine found in Cumulus Linux 5.x and later.
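The class of flaw behind CVE-2025-33180 (command injection through a management interface) can be sketched generically. The helper functions below are hypothetical illustrations of the vulnerable and mitigated patterns, not NVIDIA code; only the `nv show interface` CLI syntax itself comes from NVUE:

```python
import shlex

def build_nv_command_unsafe(interface: str) -> str:
    # Anti-pattern: user input is concatenated into a shell command
    # line. An attacker who controls `interface` can append "; <cmd>"
    # and have it run with the privileges of the command runner.
    return "nv show interface " + interface

def build_nv_command_safe(interface: str) -> str:
    # Mitigation sketch: validate the value against a strict allow-list
    # and shell-quote it before it reaches any privileged runner.
    if not interface.replace("-", "").isalnum():
        raise ValueError(f"rejected interface name: {interface!r}")
    return "nv show interface " + shlex.quote(interface)
```

The unsafe variant happily produces `nv show interface swp1; id`, while the safe variant rejects the same input outright.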

Vulnerability details:

CVE-2025-33179 NVIDIA Cumulus Linux and NVOS products contain a vulnerability in the NVUE interface, where a low-privileged user could run an unauthorized command. A successful exploit of this vulnerability might lead to escalation of privileges.

CVE-2025-33180 NVIDIA Cumulus Linux and NVOS products contain a vulnerability in the NVUE interface, where a low-privileged user could inject a command. A successful exploit of this vulnerability might lead to escalation of privileges.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5722

AMD-SB-3043: side-channel analysis for privacy applications on confidential VMs (24th Feb 2026)

Preface: AMD-SB-3042 is a formal advisory for a specific vulnerability, while AMD-SB-3043 is an advisory regarding an analytical tool (SNPeek) used to detect such vulnerabilities.

Background: When a Zen 5 system runs AMD SEV-SNP, guest memory encryption is handled by the processor hardware rather than by the hypervisor (for example VMware ESXi or Hyper-V). As a result, any traffic a host-side tool such as SNPeek can collect in plaintext is limited to data the guest has intentionally left unencrypted. Details are shown below:

Limitations of SNPeek: If a tool like SNPeek is used to intercept traffic on the host side, it can only see data marked as “Shared” (memory the guest deliberately exposes so the hypervisor can assist with network or disk I/O). Data in private memory appears to SNPeek only as ciphertext; the hypervisor cannot read its plaintext content at all.

Potential risk warning: Despite strong hardware encryption, the recently discovered StackWarp (CVE-2025-29943) vulnerability shows that a malicious hypervisor could still influence the execution path of Zen 5 virtual machines by manipulating the CPU’s internal “Stack Engine.” While this does not mean it can directly “read” encrypted memory, it can mount indirect attacks.

AMD-SB-3043: Analytical Framework (SNPeek)

  • Nature: A bulletin regarding a research framework and toolkit for evaluating side-channel risks in Confidential VMs (CVM).
  • Core Content: Describes the SNPeek open-source toolkit.
  • Function & Purpose:
    • SNPeek is not a single vulnerability but an automated analysis pipeline that uses machine learning to assess how sensitive a CVM application is to side-channel attacks.
    • It helps developers quantify how much information an application might leak when running in encrypted environments like SEV-SNP.
    • It provides configurable attack primitives to help developers locate “weak points” in their code and guides the implementation of mitigations (e.g., oblivious memory access).

Official details and announcement: AMD’s assessment is that all side-channel techniques demonstrated in the paper fall within the category of already known, documented, and out-of-scope behaviors according to the published SEV/SNP threat model. AMD has introduced features on Zen 5 processors—specifically Ciphertext Hiding and PMC Virtualization—that address the ciphertext visibility and PMC-based leakage paths highlighted by the researchers.

AMD recommends software developers employ existing best practices, including constant-time algorithms, and avoiding secret-dependent data accesses where appropriate. Please refer to the link for details – https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3043.html
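AMD's recommendation of constant-time algorithms can be illustrated with a minimal, self-contained Python sketch (a conceptual example, not AMD code): an early-exit comparison leaks timing information, while a constant-time comparison does not.

```python
import hmac

def naive_compare(secret: bytes, guess: bytes) -> bool:
    # Early-exit comparison: the loop stops at the first mismatch, so
    # the running time leaks how many leading bytes were guessed right.
    if len(secret) != len(guess):
        return False
    for a, b in zip(secret, guess):
        if a != b:
            return False
    return True

def constant_time_compare(secret: bytes, guess: bytes) -> bool:
    # hmac.compare_digest examines every byte regardless of mismatches,
    # removing the secret-dependent timing signal.
    return hmac.compare_digest(secret, guess)
```

Both functions return the same answers; only their timing behavior differs, which is exactly the property side-channel analysis probes for.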

Edge TPU (an ASIC accelerator developed by Google) – Episode 1 (23rd Feb 2026)

Preface: PyCoral is a library specific to the Edge TPU. While TensorFlow Lite (TFLite) models can run on a standard CPU, PyCoral is the dedicated library used to delegate supported operations to the Edge TPU hardware.

PyCoral API: This is a Python library specifically designed by Google to run inference on Coral Edge TPU hardware, such as the Coral USB Accelerator or M.2 modules. It is built on top of TensorFlow Lite.

Nvidia H100: This is a high-end data center GPU based on the Hopper architecture. It uses Nvidia’s proprietary software stack, including the CUDA toolkit, TensorRT, and the Transformer Engine to accelerate AI workloads.

Background: It is accurate to say that foundational memory management principles—specifically allocation and copying (malloc/new, memcpy)—are the basis for both CUDA/TensorRT and Coral API inference, though they operate on different memory spaces.

  • CUDA/TensorRT (GPU-centric): Uses cudaMalloc and cudaMemcpy to manage dedicated GPU device memory.
  • PyCoral API/TFLite (CPU-centric/Edge): Primarily uses malloc or new for CPU-based input/output buffers and memcpy to manage memory within host memory, even when interacting with the Edge TPU.

In both cases, efficient management of data movement between host (CPU) and device (GPU/TPU) is key, making memory allocation and copying the common denominator.
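The shared allocate-then-copy pattern can be mimicked in plain Python. This is a conceptual analogue only (real CUDA code uses cudaMalloc/cudaMemcpy, and TFLite uses C buffers); the `stage_input` helper is a hypothetical name:

```python
def stage_input(host_data: bytes, buffer_size: int) -> bytearray:
    # Conceptual analogue of malloc + memcpy on the host side:
    # allocate a fixed-size input buffer, then copy the frame into it.
    if len(host_data) > buffer_size:
        raise ValueError("input larger than the allocated buffer")
    buf = bytearray(buffer_size)        # plays the role of malloc/new
    buf[: len(host_data)] = host_data   # plays the role of memcpy
    return buf
```

The size check is the part real C/C++ code must get right by hand; forgetting it is how buffer overflows enter inference pipelines.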

PyCoral API (pycoral module): This is a Python library built on top of the TensorFlow Lite Python API (tflite_runtime). It provides convenience functions and additional features (like model pipelining and on-device transfer learning) to simplify development with Python.

Coral C++ API (libcoral): This is a C++ library built on top of the TensorFlow Lite C++ API. It offers the same functionality as the PyCoral API but for C++ applications.

Cyber security focus: The most common vulnerability occurs when developers call .get() to obtain the raw pointer, and then continue to use that raw pointer after the std::unique_ptr has gone out of scope or been destroyed. Is C++ TPU programming exposed to this issue? Please refer to the recommendations in the diagram for details.
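Python cannot reproduce a genuine C++ use-after-free, but the ownership mistake can be mimicked with a hypothetical sketch: an owning object (playing the role of std::unique_ptr) hands out a raw reference via get(), and the raw reference is then used after the owner has torn the resource down.

```python
class OwningHandle:
    # Plays the role of std::unique_ptr: it owns the resource and tears
    # it down when the owner goes away (close() stands in for the
    # destructor running at the end of scope).
    def __init__(self):
        self._resource = {"open": True}

    def get(self):
        # Like unique_ptr::get(): returns a raw, non-owning reference.
        return self._resource

    def close(self):
        self._resource["open"] = False

def run_inference(resource):
    # Using a raw reference after its owner is gone is the bug pattern;
    # in C++ this is undefined behavior, here we can at least detect it.
    if not resource["open"]:
        raise RuntimeError("use-after-free: raw handle outlived its owner")
    return "ok"
```

In C++ the equivalent mistake does not raise a tidy exception; it silently reads freed memory, which is why holding the unique_ptr (or a copy of the data) for the lifetime of the raw pointer matters.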

CVE-2025-33239 and CVE-2025-33240: Regarding NVIDIA Megatron Bridge (20th Feb 2026)

Preface: Artificial intelligence is both harmful and beneficial. Why is it harmful? Fundamentally, it reduces opportunities for low-skilled jobs. Speakers chant slogans like “smart living, increased productivity,” yet the underlying problems are difficult to conceal, and you can follow the latest developments in artificial intelligence through online newspapers and articles. Today, when you seek answers from artificial intelligence, the answers it provides may not be the truth! Why have humans been able to survive and thrive on Earth for thousands of years? The answer is: survival of the fittest.

Background: Megatron-Core and Megatron-LM are open-source tools that are typically used together to train LLMs at scale across GPUs. Megatron-Core expands the capability of Megatron-LM.

NeMo Megatron Bridge is utilized by AI researchers, infrastructure engineers, and developers focused on high-performance training and fine-tuning of large language models (LLMs) and foundation models, particularly those bridging the Hugging Face ecosystem with NVIDIA’s Megatron-Core. The NVIDIA H100 GPU introduced support for a new data type, FP8 (8-bit floating point), enabling higher throughput for matrix multiplies and convolutions. Megatron Bridge uses the NVIDIA TransformerEngine (TE) to leverage speedups from FP8.

While NVIDIA developed Megatron Bridge to facilitate checkpoint conversion between NVIDIA NeMo and other deep learning frameworks, OpenAI utilizes its own internal infrastructure. As of 2026, NVIDIA Megatron Bridge is primarily used by large enterprises, Cloud Service Providers (CSPs), and Sovereign AI initiatives that need to train or deploy open-source models (such as Llama 3, Mistral, or Qwen) at massive scale on NVIDIA hardware.

Vulnerability details:

CVE-2025-33239 NVIDIA Megatron Bridge contains a vulnerability in a data merging tutorial, where malicious input could cause a code injection. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.

CVE-2025-33240 NVIDIA Megatron Bridge contains a vulnerability in a data shuffling tutorial, where malicious input could cause a code injection. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.
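The pattern behind both CVEs (tutorial code that treats data fields as executable code) can be sketched generically in Python. The parser names below are hypothetical illustrations, not Megatron Bridge APIs:

```python
import ast

def parse_record_unsafe(field: str):
    # Anti-pattern behind many data-pipeline injections: evaluating a
    # field from an untrusted dataset executes whatever it contains.
    return eval(field)  # never do this with external data

def parse_record_safe(field: str):
    # ast.literal_eval accepts only Python literals (numbers, strings,
    # lists, dicts, ...) and raises on anything executable, so a
    # malicious record cannot turn into code.
    return ast.literal_eval(field)
```

Swapping eval for a literal-only parser (or a real format parser such as JSON) is the standard fix for this class of code injection.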

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5781

CVE-2025-33245: NVIDIA NeMo 2.0+ shifts away from pickle (19th Feb 2026)

Preface: NeMo 2.0 is NVIDIA’s major modernization of the NeMo ecosystem.

Two things to remember about NeMo 2.0:

1. NeMo 2.0 is the training & model building framework.

It focuses on:

  • Model architectures (LLMs, ASR, TTS, multimodal)
  • Training pipelines
  • NeMo Run + NeMo-based microservices
  • Distributed GPU/accelerated workflows

2. NeMo Guardrails and NeMo Curator are NOT part of the NeMo 2.0 training stack.

They live adjacent to NeMo 2.0, serving two different lifecycle phases.

Background: NeMo 1.x modules (ASR collections, VAD, etc.) used pickle because they relied heavily on Python multiprocessing and Python objects.

NeMo 2.0 is moving toward language- and framework-agnostic formats.

Instead of pickle, NeMo 2.0 favors:

  • Safetensors (for weights)
  • JSON / YAML (for metadata)
  • Parquet (for curated datasets)
  • NumPy / torch tensors loaded explicitly
  • Hugging Face-compatible formats

These formats are:

  • Safe
  • Portable across hardware and OS
  • Usable by non-Python systems
  • Compatible with cloud trust boundaries

NeMo Curator and NeMo Guardrails are designed to avoid pickle entirely

Even though older NeMo components still used pickle internally:

  • NeMo Curator does not ingest pickle data
  • NeMo Guardrails never used pickle at all
  • NeMo 2.0 framework minimizes it or removes it

This aligns with modern security guidance for LLM infrastructure.
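Why pickle is singled out can be demonstrated in a few lines of standard-library Python: a pickle byte stream may embed a function call that pickle.loads executes during deserialization, whereas JSON can only reconstruct inert data. The Payload class is a deliberately harmless demonstration, not NeMo code:

```python
import json
import pickle

class Payload:
    def __reduce__(self):
        # pickle records "call eval(<string>)" in the byte stream, and
        # pickle.loads() obediently performs that call. The expression
        # here is harmless; an attacker would use os.system instead.
        return (eval, ("len('pwned')",))

# The serialized blob looks like ordinary data...
malicious_bytes = pickle.dumps(Payload())

# ...but a JSON round-trip of comparable metadata is inert:
# json.loads can only ever rebuild numbers, strings, lists and dicts.
safe_roundtrip = json.loads(json.dumps({"lr": 0.001, "layers": 24}))
```

Loading `malicious_bytes` runs the embedded call, which is precisely why untrusted checkpoints in pickle format amount to remote code execution, and why safetensors/JSON/Parquet are preferred.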

Vulnerability details: CVE-2025-33245 NVIDIA NeMo Framework contains a vulnerability where malicious data could cause remote code execution. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5762

CVE-2026-2242: Vulnerabilities in janet-lang may also affect partner devices. (11th Feb 2026)

Preface: Janet has a small footprint: it fits in environments where 2.5 MB of RAM is considered “plenty”. Because Janet is written in C and compiles to a small binary (roughly 200–300 KB), it is frequently used on ARM Cortex-based systems. While Janet does not run on the GPU (it is a CPU-bound language), it is often used as the control/orchestration layer on heterogeneous AI platforms.

Background: Janet can be used to manage the data pipeline, calling into C/C++ libraries that handle the heavy GPU lifting via CUDA. If your code allows external scripts or users to submit code dynamically, it may rely on Janet’s built-in eval-string function. Is that vulnerable to CVE-2026-2242?

My speculation: Using eval-string does expose your Jetson pipeline to CVE‑2026‑2242, because:

CVE‑2026‑2242 is triggered during compilation of Janet code, and eval-string compiles code dynamically. If a malicious user submits a specially-crafted Janet expression that enters the vulnerable path inside:

janetc_if → specials.c

then Janet may perform an out‑of‑bounds read, which can cause:

  • interpreter crash
  • denial of service
  • undefined behavior inside the Janet process

Even though the CVE requires “local execution,” allowing remote users to submit code and then calling eval-string makes that local execution possible. Therefore, the Jetson pipeline becomes exploitable.
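My suggested mitigation (a hypothetical sketch, not part of Janet or any NVIDIA SDK) is to gate anything resembling eval-string behind a strict allow-list, so untrusted submissions never reach the interpreter's compiler:

```python
import re

# Only a tiny, known-safe expression grammar (digits, whitespace and
# arithmetic punctuation) is allowed through; everything else is
# rejected before it can reach an eval-string-style endpoint.
SAFE_EXPR = re.compile(r"[\d\s+\-*/().]+")

def submit_code(expr: str) -> str:
    if not SAFE_EXPR.fullmatch(expr):
        raise PermissionError(f"rejected untrusted code: {expr!r}")
    return expr  # only now would it be handed to the evaluator
```

A real deployment would tailor the grammar to its use case; the point is that arbitrary source text from remote users should never be compiled as-is.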

Vulnerability details: A vulnerability has been identified in janet-lang Janet up to 1.40.1. It affects the function janetc_if in the file src/core/specials.c. Manipulation can lead to an out-of-bounds read. The attack must be launched locally. The exploit has been publicly disclosed and may be utilized. The fix is commit c43e06672cd9dacf2122c99f362120a17c34b391; it is advisable to apply the patch to correct this issue.

Official announcement: Please refer to the link for more details – https://www.tenable.com/cve/CVE-2026-2242

CVE-2026-25592: An Arbitrary File Write vulnerability has been identified in Microsoft’s Semantic Kernel .NET SDK (10th Feb 2026)

Preface: Microsoft has launched several products (Semantic Kernel, Microsoft.Extensions.AI, and Azure.OpenAI), which initially caused confusion for developers. Furthermore, the Semantic Kernel is currently being “upgraded” to the new Microsoft Agent Framework, leading some developers to question future support for the Semantic Kernel.

Python, with its rich libraries and large open-source community, remains the “universal language” in artificial intelligence research and data science. LangChain, as the primary alternative, is also based on Python. Python remains dominant in the field of artificial intelligence/machine learning.

Background: Microsoft developed Semantic Kernel as an open-source SDK to bridge conventional programming languages (C#, Python, Java) with advanced LLMs, enabling developers to build enterprise-grade, agentic AI applications. It simplifies orchestrating complex AI workflows, allows swapping models without rewriting code, and ensures secure, compliant integration of AI with existing systems.

Semantic Kernel uses plugins (formerly skills or functions) to extend its capabilities beyond its core prompt engineering functionality and integrate with external services, data sources, and API.

  • Semantic Kernel acts as an orchestrator, using the LLM to decide which plugins to use and when, effectively automating complex tasks that involve multiple steps and tools. The LLM determines the necessary sequence of actions to fulfill a user’s request.
  • Plugins allow the LLM to interact with real-world applications and data. For example, a plugin could retrieve real-time weather information, search a database, book a flight, or send an email.

The airline and travel industry is beginning to use Microsoft Semantic Kernel to build intelligent, AI-powered applications, particularly for automating customer service and booking processes. Developers are using Semantic Kernel to build dialogue agents that can understand complex booking instructions, such as “book the cheapest flight from Hong Kong to Tokyo”, and handle the booking process independently.

Vulnerability details: CVE-2026-25592 Semantic Kernel is an SDK used to build, orchestrate, and deploy AI agents and multi-agent systems. Prior to version 1.70.0, an Arbitrary File Write vulnerability existed in Microsoft’s Semantic Kernel .NET SDK, specifically within the SessionsPythonPlugin. The problem has been fixed in Microsoft.SemanticKernel.Core version 1.70.0. As a mitigation, users can create a Function Invocation Filter which checks the arguments being passed to any calls to DownloadFileAsync or UploadFileAsync and ensures the provided localFilePath is allow-listed.
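The allow-list idea behind the mitigation can be sketched in Python (the real filter would be a C# Function Invocation Filter; `ALLOWED_ROOT` and `check_local_file_path` are hypothetical names for illustration):

```python
import os

ALLOWED_ROOT = "/var/app/uploads"  # hypothetical allow-listed directory

def check_local_file_path(local_file_path: str) -> str:
    # Resolve symlinks and ".." segments first, then verify the result
    # still lives under the allow-listed root; this blocks traversal
    # such as "/var/app/uploads/../../etc/passwd".
    resolved = os.path.realpath(local_file_path)
    root = os.path.realpath(ALLOWED_ROOT)
    if os.path.commonpath([resolved, root]) != root:
        raise PermissionError(f"path outside allow-list: {local_file_path!r}")
    return resolved
```

Resolving before comparing is the essential step; comparing raw string prefixes can be defeated by `..` segments or symlinks.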

Official announcement: Please refer to the link for details – https://nvd.nist.gov/vuln/detail/CVE-2026-25592

CVE-2025-33220 only applies to NVIDIA vGPU deployments running on hypervisors, such as TKGI clusters on vSphere. (2 Feb 2026)

Preface: When comparing VMware TKGI, Docker, and Kubernetes (K8s) for CUDA (NVIDIA’s parallel computing platform) workflows, the “best” choice depends on your scale and infrastructure.

Choose Docker – if you are a data scientist doing local model development.

Choose Native Kubernetes – if you are building a large-scale AI platform on physical hardware (Bare-metal) for maximum performance.

Choose VMware TKGI – if you need high availability, vGPU flexibility, and are already heavily invested in the VMware ecosystem.

Background: CVE‑2025‑33220 lives in the hypervisor’s vGPU Manager, not in:

  • Docker
  • Containerd
  • Kubernetes
  • NVIDIA Container Runtime
  • NVIDIA Docker runtime
  • PyTorch/TensorFlow workloads
  • CUDA libraries inside containers

CVE‑2025‑33220 requires:

  1. Freeing an object inside the hypervisor
  2. A later operation accessing that SAME freed internal heap structure
  3. The hypervisor NOT realizing the handle is stale
  4. A malformed RM object relationship or command sequence
  5. Conditions normal CUDA applications never generate

If there is no hypervisor-based vGPU, there is no attack surface, because:

  • The ioctl path stops at the bare‑metal NVIDIA GPU driver
  • There is no vGPU Manager backend
  • No vGPU protocol messages are generated
  • No hypervisor memory structures exist to exploit

The CVE is triggered only under very specific hypervisor‑internal states that normal or even “weird order” RMAPI usage will never produce.

Vulnerability details: CVE-2025-33220 – NVIDIA vGPU software contains a vulnerability in the Virtual GPU Manager, where a malicious guest could cause heap memory access after the memory is freed. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, data tampering, denial of service, or information disclosure.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5747

In-depth analysis of CVE-2025-33234: NVIDIA runx contains this vulnerability. (30-01-2026)

Preface: According to the NVIDIA security bulletin for runx updated January 23, 2026 – A vulnerability (CVE‑2025‑33234) was found “during End of Support, but prior to End of Life.”

Background: An ML hyperparameter sweep is the automated process of systematically testing different combinations of hyperparameter values for a machine learning model to identify the configuration that yields the best performance. Hyperparameters are external settings that control the learning process, such as the learning rate or number of hidden layers, and must be set before training begins.

NVIDIA runx (Deep Learning Library) – This is a lightweight Python tool used for experiment management and hyperparameter sweeps.

Vulnerability details: CVE-2025-33234 NVIDIA runx contains a vulnerability where an attacker could cause a code injection. A successful exploit of this vulnerability might lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5764

Appendix: runx is a local Python tool for organizing your experiments, while NVIDIA Run:ai is a cloud-based infrastructure for managing hardware. To move from a local developer environment using runx to an enterprise environment using Run:ai, you must transition from a “local script” workflow to a “containerized” workflow.

Realistic examples where Run:ai WOULD be impacted (conceptual, not exploit details)

Scenario 1

runx injects malicious commands into a Python file your training code imports →
you copy the folder into Docker →
Run:ai runs that code →
the malicious code executes.

Scenario 2

runx generates a poisoned submit_cmd.sh or config file →
you include it in the Docker image →
your entrypoint or tools accidentally run it.

Scenario 3

runx modifies your dataset or preprocessing script →
container uses that script →
it runs inside Run:ai.

Reminder: These examples show pollution can travel only if you physically package it into the container.

CVE-2025-13952: Advanced driver assistance systems (ADAS) and smartphones stay alert! (27-1-2026)

Preface: The Unified Shading Cluster (USC) is a fundamental part of the PowerVR Rogue architecture and subsequent series. The USC acts as the central shader core, where vertex, fragment, and compute tasks are executed on unified hardware. It is central to PowerVR’s Tile-Based Deferred Rendering (TBDR) architecture. To write shader code for Imagination Technologies’ PowerVR architectures (often referred to in technical documentation alongside the USC, or Unified Shading Cluster, and Volcano shader cores), you use the OpenGL ES Shading Language (GLSL ES).

Background: In the context of Imagination Technologies (IMG) GPUs, libusc (Unified Shading Cluster library) is a critical component of the GPU Driver Development Kit (DDK), primarily serving as the back-end compiler library for shader programs.

Is it possible to embed a .frag shader in an HTML web page?

The most common way is to place the shader code inside a <script> tag with a custom type attribute, such as x-shader/x-fragment. Browsers do not execute these scripts because they don’t recognize the type, but the text remains accessible via the Document Object Model (DOM).

The design flaw mentioned in CVE-2025-13952 typically occurs if:

  • User-controlled input is inserted into these <script> tags without sanitization.
  • Your code later injects shader text back into the DOM using innerHTML or evaluates it as JavaScript.
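The sanitization gap described above can be checked server-side before the page is assembled. The validator below is a hypothetical sketch; its rule reflects how HTML parsers terminate script blocks:

```python
def validate_shader_text(user_shader: str) -> str:
    # An HTML parser ends a <script type="x-shader/x-fragment"> block
    # at the first "</script" it sees, so user text containing that
    # sequence can break out of the inert block and inject live markup.
    # Reject such input before it is ever written into the page.
    if "</script" in user_shader.lower():
        raise ValueError("shader text attempts to close the script block")
    return user_shader
```

Ordinary GLSL source contains no `</script` sequence, so legitimate shaders pass through untouched while break-out attempts are refused.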

Vulnerability details: A web page containing unusual GPU shader code, loaded from the Internet into the GPU compiler process, triggers a write use-after-free crash in the GPU shader compiler library. On certain platforms, where the compiler process has system privileges, this could enable further exploits on the device. The shader code contained in the web page executes a path in the compiler that held onto an out-of-date pointer referencing a freed memory object.

Official announcement: Please refer to the link for details – https://www.tenable.com/cve/CVE-2025-13952