Category Archives: AI and ML

CVE-2026-33579: OpenClaw 2026.3.28 or later also addresses a CVSS 9.9 token rotation race condition flaw allowing full admin access and remote code execution (9th April 2026)

Preface: Unlike ChatGPT, which is a conversational chatbot, OpenClaw is designed to act. It receives a high-level goal, breaks it down into structured tasks, calls APIs, executes shell commands, and iterates until the objective is complete.

Installing OpenClaw (formerly ClawdBot) to work with OpenAI on a smartphone that already has WhatsApp is intended to provide autonomous, proactive, and secure personal AI assistance directly within a messaging interface.

Background: OpenClaw’s primary design objective is to transition AI from a passive, conversational interface into a proactive, action-oriented autonomous agent that can independently execute multi-step workflows across a user’s local operating system and external cloud services.

It is architected as an “AI Gateway” or agent runtime rather than a standalone model, serving as the “hands” for an artificial brain by connecting large language models (LLMs) to real-world tools, files, and messaging platforms.

Older versions of OpenClaw might have stored permissions in a “sticky” historical field (the legacy role fields). Without a check that blocks fallback to those legacy fields:

•You might revoke an agent’s access in the new dashboard.

•The system might see “no active tokens” and accidentally “fall back” to old settings.

•The agent would regain access you intended to take away.

Vulnerability details: OpenClaw before 2026.3.28 contains a privilege escalation vulnerability in the /pair approve command path that fails to forward caller scopes into the core approval check. A caller with pairing privileges but without admin privileges can approve pending device requests asking for broader scopes including admin access by exploiting the missing scope validation in extensions/device-pair/index.ts and src/infra/device-pairing.ts.
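The missing check described above can be illustrated with a minimal sketch. It is written in Python rather than OpenClaw's TypeScript, and every name in it (PairingRequest, approve_pairing, the "pairing" and "admin" scope strings) is hypothetical and not taken from the OpenClaw source. It only shows the general idea of forwarding the caller's scopes into the approval decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairingRequest:
    """A pending device request and the scopes it asks for (hypothetical shape)."""
    device_id: str
    requested_scopes: frozenset


def approve_pairing(caller_scopes: frozenset, request: PairingRequest) -> bool:
    """Approve only if the caller may pair AND already holds every requested scope."""
    if "pairing" not in caller_scopes:
        return False
    # The step the advisory describes as missing: compare the requested
    # scopes against the caller's own scopes before approving.
    return request.requested_scopes <= caller_scopes


admin_request = PairingRequest("device-1", frozenset({"pairing", "admin"}))
print(approve_pairing(frozenset({"pairing"}), admin_request))           # False
print(approve_pairing(frozenset({"pairing", "admin"}), admin_request))  # True
```

With a subset test like this, a caller holding only pairing privileges cannot hand out admin access it does not hold itself.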

Official announcement: Please refer to the link for details –

https://nvd.nist.gov/vuln/detail/CVE-2026-33579

CVE-2026-24164 and CVE-2026-24165: About BioNeMo Framework (6th April 2026)

Preface: DNA models like DNABERT and Evo2 are genomic foundation models (gLMs), which treat DNA sequences written in four letters (A (adenine), C (cytosine), G (guanine), and T (thymine)) as a “language” to learn the fundamental rules, patterns, and “syntax” governing life.

Similar to how Large Language Models (LLMs) like GPT are pre-trained on vast amounts of text to understand English, these DNA models are pre-trained on billions to trillions of base pairs (nucleotides) from diverse species to understand the “grammar” of genomes, including the 98% that is non-coding.

Background: For a DNA repository, NVIDIA BioNeMo (the life sciences extension of NeMo) handles the heavy lifting of transforming raw genetic sequences into “usable intelligence”. It is used for more than just simple normalization; it provides a specialized pipeline for pre-training, fine-tuning, and analyzing genomic data.

Here is how the workflow typically functions for DNA data:

1. Data Preparation & Preprocessing

Instead of generic text normalization, BioNeMo uses specialized scripts to prepare genomic data (like the GRCh38 human genome) for AI.

•Chunking: Breaking long chromosomal sequences into manageable segments (e.g., 512 nucleotides).

•Tokenization: Converting DNA “letters” (A, C, G, T) into numerical tokens. Advanced models like DNABERT-2 use Byte Pair Encoding (BPE) to process sequences up to 5x more efficiently than older methods.

•Standardization: Organizing raw genomic data into structured formats like FASTA or CSV that the training framework can ingest.

2. Categorization & Functional Prediction – details not described here

3. Downstream Analysis – details not described here
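The chunking and tokenization steps in part 1 can be sketched in a few lines of plain Python. This is not BioNeMo code. The function names and the vocabulary are assumptions for illustration, and the tokenizer is a simple per-nucleotide mapping rather than the Byte Pair Encoding used by DNABERT-2.

```python
def chunk_sequence(sequence: str, size: int = 512) -> list[str]:
    """Split a long DNA string into consecutive segments of at most `size` bases."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


# Hypothetical vocabulary: one integer per nucleotide, with N for anything unknown.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def tokenize(chunk: str) -> list[int]:
    """Map each base to its token id; unrecognized symbols become the N token."""
    return [VOCAB.get(base, VOCAB["N"]) for base in chunk.upper()]


genome = "ACGT" * 300                      # 1,200 bases of toy data
chunks = chunk_sequence(genome)
print([len(chunk) for chunk in chunks])    # [512, 512, 176]
print(tokenize("acgtx"))                   # [0, 1, 2, 3, 4]
```

A real pipeline would read the sequence from a FASTA file and would usually handle overlapping windows and reverse complements, which this sketch leaves out.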

Vulnerability details:

CVE-2026-24164 – NVIDIA BioNeMo contains a vulnerability where a user could cause a deserialization of untrusted data. A successful exploit of this vulnerability might lead to code execution, denial of service, information disclosure, and data tampering.

CVE-2026-24165 – NVIDIA BioNeMo contains a vulnerability where a user could cause a deserialization of untrusted data. A successful exploit of this vulnerability might lead to code execution, denial of service, information disclosure, and data tampering.
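The bulletin does not say which serialization format is involved. Assuming, purely for illustration, that the data is a Python pickle, one common defensive pattern is an unpickler that refuses to resolve any global outside a small allowlist. The allowlist and function names below are hypothetical, and this is a sketch of the general technique, not NVIDIA's fix.

```python
import builtins
import io
import pickle

# Assumption for this sketch: only these plain built-in types may be named in a payload.
SAFE_BUILTINS = {"dict", "list", "set", "tuple", "frozenset"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves allowlisted built-ins and rejects every other global."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize `data` with the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps({"bases": ["A", "C", "G", "T"]})))
try:
    restricted_loads(pickle.dumps(print))  # payload references the global builtins.print
except pickle.UnpicklingError as error:
    print("rejected:", error)
```

Plain containers, strings, and numbers load normally because they never go through find_class, while any payload that names a callable is rejected before it can run.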

Official announcement: Please refer to link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5808

CVE-2026-24148, CVE-2026-24154 and CVE-2026-24153: About NVIDIA Jetson (2nd April 2026)

Preface: NVIDIA JetPack and Jetson Linux (formerly L4T – Linux for Tegra) are the foundational software stacks for NVIDIA Jetson AI modules. Jetson Linux provides the essential BSP (bootloader, Linux kernel, Ubuntu rootfs, drivers), while JetPack SDK bundles this with developer tools, libraries (CUDA, TensorRT), and APIs for AI, computer vision, and robotics.

Background: Leaving the initrd and root file system (rootfs) unencrypted creates a significant security gap against local physical attacks. In a standard industrial or autonomous deployment, physical access is often the most direct threat to a machine’s integrity.

The Security Gap: Local Physical Access

When a Jetson device is left with its default, unencrypted configuration, an attacker with physical access can easily bypass system protections:

Because the bootloader cannot read encrypted files directly, it must first mount an unencrypted partition containing the kernel and initrd images. Without signing or encryption, these critical files can be replaced via a malicious USB or NVMe drive.

Ref: nvluks-srv-app is an NVIDIA Jetson Linux user-space application used to retrieve a unique, secure passphrase from the Trusted Execution Environment (TEE) to unlock encrypted partitions at boot time. It enables disk encryption on Jetson devices by facilitating secure communication between the normal operating system and the hardware-backed security services (OP-TEE).

Vulnerability details:

CVE-2026-24154 NVIDIA Jetson Linux has a vulnerability in initrd, where an unprivileged attacker with physical access could inject incorrect command line arguments. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, denial of service, data tampering, and information disclosure.

CVE-2026-24148 NVIDIA Jetson for JetPack contains a vulnerability in the system initialization logic, where an unprivileged attacker could cause the initialization of a resource with an insecure default. A successful exploit of this vulnerability might lead to information disclosure of encrypted data, data tampering, and partial denial of service across devices sharing the same machine ID.

CVE-2026-24153 NVIDIA Jetson Linux has a vulnerability in initrd, where the nvluks trusted application is not disabled. A successful exploit of this vulnerability might lead to information disclosure.

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5797

CVE-2026-5164: A flaw has been found in virtio-win. Don’t underestimate this; the field of artificial intelligence also needs virtio-win! (1st April 2026)

Preface: While NVIDIA CUDA provides powerful parallel processing capabilities on both Linux and Windows, developers still need to run Windows on top of Linux using virtio-win in several specific environments:

For example: Windows-Exclusive HPC Applications

Many specialized scientific and engineering applications are only developed for Windows and cannot be easily recompiled for Linux.

Background: Any code path that reaches RhelDoUnMap() must validate the descriptor count of the user's unmap request to prevent buffer overflows. This function is part of the virtio-win drivers used in Red Hat Enterprise Linux environments.

Key Components of virtio-win include Network (NetKVM), Storage (viostor / virtio-scsi), Memory Balloon (balloon), Serial (virtio-serial), Graphics (virtio-gpu), Input (virtio-input) and Guest Agent (qemu-ga). The RhelDoUnMap() function is part of the virtio-win driver suite, specifically within the VioStor (Virtio Storage) driver.

Vulnerability details: A flaw was found in virtio-win. The RhelDoUnMap() function does not properly validate the number of descriptors provided by a user during an unmap request. A local user could exploit this input validation vulnerability by supplying an excessive number of descriptors, leading to a buffer overrun. This can cause a system crash, resulting in a Denial of Service (DoS).
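The class of bug described here, trusting a caller-supplied descriptor count, can be illustrated with a small parser. The real driver is written in C. The Python sketch below uses a made-up request layout (a 2-byte count followed by 12-byte descriptors) and a hypothetical MAX_DESCRIPTORS limit, so it shows the validation pattern only and not the virtio-win code.

```python
import struct

MAX_DESCRIPTORS = 16                 # hypothetical capacity of the fixed-size buffer
HEADER = struct.Struct("<H")         # descriptor count (assumed layout)
DESCRIPTOR = struct.Struct("<QI")    # starting block, number of blocks (assumed layout)


def parse_unmap_request(payload: bytes) -> list[tuple[int, int]]:
    """Return the descriptors in `payload`, rejecting counts that do not fit."""
    if len(payload) < HEADER.size:
        raise ValueError("payload shorter than the header")
    (count,) = HEADER.unpack_from(payload, 0)
    # Check 1: the claimed count must fit the buffer the descriptors are copied into.
    if count > MAX_DESCRIPTORS:
        raise ValueError(f"{count} descriptors exceed the limit of {MAX_DESCRIPTORS}")
    # Check 2: the payload must actually contain that many descriptors.
    if len(payload) < HEADER.size + count * DESCRIPTOR.size:
        raise ValueError("payload is truncated")
    return [
        DESCRIPTOR.unpack_from(payload, HEADER.size + index * DESCRIPTOR.size)
        for index in range(count)
    ]


request = HEADER.pack(2) + DESCRIPTOR.pack(0, 8) + DESCRIPTOR.pack(100, 16)
print(parse_unmap_request(request))  # [(0, 8), (100, 16)]
```

Both checks happen before any descriptor is read, which is the ordering a fixed-size kernel buffer needs.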

Official announcement: Please refer to the link for details –

https://nvd.nist.gov/vuln/detail/cve-2026-5164

CVE-2026-24157 and CVE-2026-24159: To protect your system, NVIDIA recommends updating to NeMo Framework version 2.6.2 or later (30th March 2026)

Preface: NVIDIA NeMo is a widely adopted, end-to-end framework for building, customizing, and deploying generative AI models (LLMs) and conversational AI agents. It is primarily used to tailor open-source models—such as Llama, Mistral, and Google Gemma—using proprietary enterprise data.

Ollama, Mistral, and Google Gemma represent a powerful ecosystem for running local, open-weight Large Language Models (LLMs). Ollama acts as the engine to run models, while Mistral and Gemma are two of the most popular, high-performing model families designed to be efficient enough to run on personal computers.

Background: The restore_from() method is core functionality used to load local checkpoint files with the .nemo extension.

Key Details of restore_from() –

  • Purpose: Fully restores a model instance, including its weights and configuration, from a local .nemo file for evaluation, inference, or fine-tuning.
  • File Structure: A .nemo file is an archive (specifically a tar.gz file) containing the model’s weights and a model_config.yaml file that defines its architecture.
  • Usage: It is called directly from the model’s base class (e.g., ASRModel.restore_from(restore_path="path/to/file.nemo")).
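Because a .nemo file is a tar archive, its contents can be listed with the standard library before it is ever handed to restore_from(). The sketch below is an assumption-laden illustration: the helper names and the weights.bin member are hypothetical, and the demo archive is built in memory rather than taken from a real checkpoint. It does not make loading the contents safe. It only shows how to refuse archives with suspicious paths or links.

```python
import io
import tarfile


def list_archive_members(fileobj) -> list[str]:
    """Return member names, refusing archives with unsafe paths or links."""
    names = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            parts = member.name.split("/")
            if member.name.startswith("/") or ".." in parts:
                raise ValueError(f"unsafe path in archive: {member.name}")
            if member.issym() or member.islnk():
                raise ValueError(f"link in archive: {member.name}")
            names.append(member.name)
    return names


def build_demo_archive(names) -> io.BytesIO:
    """Create a small in-memory tar.gz so the example runs without a real checkpoint."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


print(list_archive_members(build_demo_archive(["model_config.yaml", "weights.bin"])))
```

Inspecting the member list is only a first filter. Checkpoints from untrusted sources should still not be loaded on an unpatched NeMo version.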

Vulnerability details: (see below)

CVE-2026-24157 NVIDIA NeMo Framework contains a vulnerability in checkpoint loading where an attacker could cause remote code execution. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure and data tampering.

CVE-2026-24159 NVIDIA NeMo Framework contains a vulnerability where an attacker may cause remote code execution. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure and data tampering.

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5800

CVE-2026-24141: NVIDIA Model Optimizer for Windows and Linux contains a vulnerability in the ONNX quantization feature. (27th Mar 2026)

Preface: A design limitation has been discovered in the ONNX quantization function of the NVIDIA Model Optimizer for Windows and Linux. However, confusingly, the ONNX function appears to only work on Windows/RTX (not Linux). What is the actual design limitation?

The confusion often stems from the fact that while ONNX is the primary deployment format for Windows/RTX, the quantization process (where the vulnerability often lies) frequently occurs on Linux development servers.

Background: Why does the vulnerability affect both Linux and Windows?

Although ONNX is the target format for Windows AI PC applications, the NVIDIA Model Optimizer (ModelOpt) library is cross-platform.

•Linux as the “Factory”: Most developers use powerful Linux servers (with A100/H100 GPUs) to run the ModelOpt quantization scripts. They generate the optimized ONNX model on Linux and then “ship” it to Windows clients. Therefore, the vulnerability exists in the Linux-based conversion tools.

Vulnerability details: NVIDIA Model Optimizer for Windows and Linux contains a vulnerability in the ONNX quantization feature, where a user could cause unsafe deserialization by providing a specially crafted input file. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, data tampering, and information disclosure. (Initial release – March 24, 2026)

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5798

Remedy for CVE-2025-33244: securely deserialize data for use with APEX or PyTorch (26th Mar 2026)

Preface: The core design goal of NVIDIA Apex is to achieve mixed precision training, which mainly involves a combination of 16-bit (FP16) and 32-bit (FP32) floating-point arithmetic.

Background: In NVIDIA APEX, handling FP16 and FP32 data is primarily managed through the Automatic Mixed Precision (AMP) module. You don’t need to manually cast your data.

  • Use FP32 if you are doing scientific simulations that require extreme precision or if your model fails to converge using lower bit-depths.
  • Use FP16 for Inference (running a finished model on a phone or server) or when training Large Language Models (LLMs) to save massive amounts of time and electricity.

The current gold standard is FP16, which balances speed and memory.

Scientific simulations (such as those simulating black holes or aircraft airflow) require extremely high numerical stability to prevent errors from accumulating over time; while LLM training is more like finding a probability distribution of the “general direction,” where speed and model size are more important than accuracy to the 15th decimal place.
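The accumulation problem mentioned above can be observed without a GPU. The sketch below uses Python's struct module, whose "e" and "f" format codes store a value as IEEE 754 half and single precision. It is only a numerical illustration, and the helper names are hypothetical. It does not use APEX or AMP.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_fp16(value: float) -> float:
    return round_trip(value, "<e")   # 16-bit half precision


def to_fp32(value: float) -> float:
    return round_trip(value, "<f")   # 32-bit single precision


def accumulate(cast, step: float = 0.001, count: int = 10_000) -> float:
    """Add `step` to a running total `count` times, rounding after every addition."""
    total = 0.0
    for _ in range(count):
        total = cast(total + cast(step))
    return total


print("2049 stored as FP16:", to_fp16(2049.0))
print("2049 stored as FP32:", to_fp32(2049.0))
print("FP16 running sum:", accumulate(to_fp16))
print("FP32 running sum:", accumulate(to_fp32))
```

The exact sum would be 10. The FP32 total lands close to it, while the FP16 total stalls well short because the step eventually becomes smaller than half the gap between neighboring FP16 values. This is one reason mixed precision schemes keep sensitive accumulations in FP32.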

Vulnerability details: NVIDIA APEX for Linux contains a vulnerability where an unauthorized attacker could cause a deserialization of untrusted data. This vulnerability affects environments that use PyTorch versions earlier than 2.6. A successful exploit of this vulnerability might lead to code execution, denial of service, escalation of privileges, data tampering, and information disclosure.

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5782

Best Practices:

•Weight Files: Always convert and store your .pth or .bin files as .safetensors.

•API Inputs: Prefer Protocol Buffers (Protobuf) or JSON for real-time requests.

•Integrity Checks: Before deserializing, verify the file’s SHA-256 hash to ensure it hasn’t been tampered with during transit.
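The integrity check in the last bullet can be sketched with the standard library alone. The function names, the model.bin file name, and the demo contents are hypothetical. In practice the expected hash would come from a trusted channel such as a signed release note, not from the same place as the file.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weight files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 matches the published value."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Demo with a throwaway file standing in for a downloaded weight file.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "model.bin")
    with open(demo_path, "wb") as handle:
        handle.write(b"pretend these are model weights")
    published = hashlib.sha256(b"pretend these are model weights").hexdigest()
    ok = verify_file(demo_path, published)
    tampered = verify_file(demo_path, "0" * 64)

print(ok, tampered)  # True False
```

A matching hash shows the file is the one the publisher hashed. It says nothing about whether that file is safe to deserialize, so it complements rather than replaces formats like .safetensors.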

CVE-2025-33179 and CVE-2025-33180: About NVIDIA Cumulus Linux and NVOS products (25th February 2026)

Preface: NVIDIA InfiniBand and Spectrum switches are both based on technology from Mellanox Technologies. The Spectrum switch ASIC portfolio, originally developed by Mellanox for high-performance Ethernet networking, was rebranded under NVIDIA and is now a core component of NVIDIA’s networking division. NVIDIA completed the acquisition of Mellanox Technologies, a major supplier of high-performance interconnect technology (switches, NICs), in April 2020 for approximately $7 billion. This strategic move enhanced NVIDIA’s data center networking capabilities, specifically in InfiniBand and Ethernet, to support AI and high-performance computing.

Background: Cumulus Linux is optimized for Ethernet fabrics, while NVOS/Onyx is largely utilized in high-performance InfiniBand environments.

-Key switches supporting NVIDIA Cumulus Linux include:

  • Spectrum-4: SN5600, SN5600D, SN5400
  • Spectrum-2/3: SN3700, SN3700C, SN4600, SN4700
  • Spectrum-1: SN2700, SN2100, SN2745 

Example: The Spectrum-4 series (including the SN5600, SN5600D, and SN5400) is a line of physical Ethernet switches (hardware).

Use Cases: Ideal for hyperscale cloud data centers and enterprise AI networks, emphasizing scalability and full customizability.

-NVOS (NVIDIA Onyx) or similar OS typically supports:

Quantum/Quantum-2 InfiniBand: Switches designed for high-performance AI, such as the Quantum-2 series.

Use Cases: Focused on High-Performance Computing (HPC) and large-scale AI training clusters (AI Factories), particularly environments utilizing NVLink for GPU interconnects.

Note: As of early 2026, NVIDIA is focusing on standardizing the management commands (NVUE) across both systems to reduce the complexity of automation workflows when transitioning between different operating systems.

Cumulus Linux (Native Linux): When you SSH in, you land in a standard Debian Linux bash shell. You configure the switch using the NVUE (NVIDIA User Experience) object model via the nv command (e.g., nv set interface swp1…).

Vulnerability Note: The CVEs (CVE-2025-33179/33180) specifically target the NVUE API and CLI engine found in Cumulus Linux 5.x and later.

Vulnerability details:

CVE-2025-33179 NVIDIA Cumulus Linux and NVOS products contain a vulnerability in the NVUE interface, where a low-privileged user could run an unauthorized command. A successful exploit of this vulnerability might lead to escalation of privileges.

CVE-2025-33180 NVIDIA Cumulus Linux and NVOS products contain a vulnerability in the NVUE interface, where a low-privileged user could inject a command. A successful exploit of this vulnerability might lead to escalation of privileges.
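Command injection of the kind named in CVE-2025-33180 typically arises when user input is pasted into a shell string. The sketch below shows the usual defensive pattern for automation scripts that wrap a CLI: validate identifiers against a strict pattern and build an argument list instead of a shell string. The function name, the interface pattern, and the exact nv arguments are assumptions for illustration, and the command is only built, never executed.

```python
import re

# Hypothetical allowlist: front-panel ports named swp followed by 1 to 3 digits.
INTERFACE_NAME = re.compile(r"swp[0-9]{1,3}")


def build_nv_command(interface: str, description: str) -> list[str]:
    """Build an argument vector for an nv call without involving a shell."""
    if not INTERFACE_NAME.fullmatch(interface):
        raise ValueError(f"rejected interface name: {interface!r}")
    # Each element is passed as one argument, so characters such as ';'
    # in the description are data, not shell syntax.
    return ["nv", "set", "interface", interface, "description", description]


print(build_nv_command("swp1", "uplink; reboot"))
try:
    build_nv_command("swp1; reboot", "uplink")
except ValueError as error:
    print(error)
```

A list like this would be handed to subprocess.run() without shell=True, so no shell ever parses the user-supplied text. This protects the calling script only. The flaws in the NVUE interface itself are addressed by the vendor update.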

Official announcement: Please refer to the link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5722

AMD-SB-3043: side-channel analysis for privacy applications on confidential VMs (24th Feb 2026)

Preface: AMD-SB-3042 is a formal advisory for a specific vulnerability, while AMD-SB-3043 is an advisory regarding an analytical tool (SNPeek) used to detect such vulnerabilities.

Background: When an AMD Zen 5 processor runs SEV-SNP, memory encryption is handled by the processor itself rather than by a traditional hypervisor such as VMware or Hyper-V, so the only plaintext SNPeek can collect on the host side is unencrypted (shared) data. Details are shown below:

Limitations of SNPeek

If using a tool like SNPeek to intercept traffic on the host side:

  • It can only see data marked as “Shared” (usually for use by the hypervisor to assist with network or disk I/O).
  • Data in Private Memory is always encrypted to SNPeek; the hypervisor cannot read its plaintext content at all.

Potential Risk Warning

Despite strong hardware encryption, the recently discovered StackWarp (CVE-2025-29943) vulnerability shows that a malicious hypervisor could still influence the execution path of Zen 5 virtual machines by manipulating the CPU’s internal “Stack Engine.” While this doesn’t mean it can directly “read” encrypted memory, it can achieve indirect attacks.

AMD-SB-3043: Analytical Framework (SNPeek)

  • Nature: A bulletin regarding a research framework and toolkit for evaluating side-channel risks in Confidential VMs (CVM).
  • Core Content: Describes the SNPeek open-source toolkit.
  • Function & Purpose:
    • SNPeek is not a single vulnerability but an automated analysis pipeline that uses machine learning to assess how sensitive a CVM application is to side-channel attacks.
    • It helps developers quantify how much information an application might leak when running in encrypted environments like SEV-SNP.
    • It provides configurable attack primitives to help developers locate “weak points” in their code and guides the implementation of mitigations (e.g., oblivious memory access).

Official details and announcement: AMD’s assessment is that all side-channel techniques demonstrated in the paper fall within the category of already known, documented, and out-of-scope behaviors according to the published SEV/SNP threat model. AMD has introduced features on Zen 5 processors, specifically Ciphertext Hiding and PMC Virtualization, that address the ciphertext visibility and hardware performance counter (HPC) based leakage paths highlighted by the researchers.

AMD recommends software developers employ existing best practices, including constant-time algorithms, and avoiding secret-dependent data accesses where appropriate. Please refer to the link for details – https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3043.html
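The constant-time recommendation can be illustrated with a comparison routine. In the sketch below, leaky_equals is a deliberately naive, hypothetical helper whose running time depends on where the first mismatch occurs, while constant_time_equals delegates to hmac.compare_digest from the Python standard library, which is designed to avoid that dependency. It illustrates the principle only and is not code from the bulletin or the SNPeek toolkit.

```python
import hmac


def leaky_equals(secret: bytes, guess: bytes) -> bool:
    """Early-exit comparison: the loop stops at the first differing byte."""
    if len(secret) != len(guess):
        return False
    for expected, actual in zip(secret, guess):
        if expected != actual:
            return False   # timing reveals how many leading bytes matched
    return True


def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    """Comparison whose duration does not depend on where the inputs differ."""
    return hmac.compare_digest(secret, guess)


token = b"s3cret-token"
print(leaky_equals(token, b"s3cret-tokeX"), constant_time_equals(token, b"s3cret-tokeX"))
print(leaky_equals(token, token), constant_time_equals(token, token))
```

Both functions return the same answers. The difference is only in what an observer measuring time or memory access patterns can learn, which is the kind of leakage a side-channel analysis tries to quantify.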

Edge TPU (an ASIC accelerator developed by Google) – Episode 1 (23rd Feb 2026)

Preface: PyCoral is specific to Edge TPU processing. While TensorFlow Lite (TFLite) can run on a standard CPU, PyCoral is the dedicated library used to delegate those operations to the Edge TPU hardware.

PyCoral API: This is a Python library specifically designed by Google to run inference on Coral Edge TPU hardware, such as the Coral USB Accelerator or M.2 modules. It is built on top of TensorFlow Lite.

Nvidia H100: This is a high-end data center GPU based on the Hopper architecture. It uses Nvidia’s proprietary software stack, including the CUDA toolkit, TensorRT, and the Transformer Engine to accelerate AI workloads.

Background: Foundational memory management principles, specifically allocation and copying (malloc/new, memcpy), are the basis for both CUDA/TensorRT and Coral API inference, though they operate on different memory spaces.

  • CUDA/TensorRT (GPU-centric): Uses cudaMalloc and cudaMemcpy to manage dedicated GPU device memory.
  • PyCoral API/TFLite (CPU-centric/Edge): Primarily uses malloc or new for CPU-based input/output buffers and memcpy to manage memory within host memory, even when interacting with the Edge TPU.

In both cases, efficient management of data movement between host (CPU) and device (GPU/TPU) is key, making memory allocation and copying the common denominator.

PyCoral API (pycoral module): This is a Python library built on top of the TensorFlow Lite Python API (tflite_runtime). It provides convenience functions and additional features (like model pipelining and on-device transfer learning) to simplify development with Python.

Coral C++ API (libcoral): This is a C++ library built on top of the TensorFlow Lite C++ API. It offers the same functionality as the PyCoral API but for C++ applications.

Cyber security focus: The most common vulnerability occurs when developers call .get() to obtain the raw pointer and then continue to use that raw pointer after the std::unique_ptr has gone out of scope or been destroyed. Is C++ TPU programming related to this issue? Please refer to the recommendations in the diagram for details.