CVE-2025-23307: NVIDIA NeMo Curator for all platforms contains a vulnerability (28th Aug 2025)

Preface: NeMo Curator, part of the NVIDIA NeMo software suite for managing the AI agent lifecycle, is a Python library specifically designed for fast and scalable data processing and curation for generative AI use cases such as foundation language model pretraining, text-to-image model training, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT).

Background: To install the NeMo Curator library, run the following commands:

  • git clone https://github[.]com/NVIDIA/NeMo-Curator[.]git
  • cd NeMo-Curator
  • pip install --extra-index-url https://pypi[.]nvidia[.]com ".[cuda12x]"

Data download: The download pipeline in NeMo Curator consists of the following classes (a minimal sketch follows the list):

  • DocumentDownloader: Abstract class for downloading remote data to disk.
  • DocumentIterator: Abstract class for reading dataset raw records from the disk.
  • DocumentExtractor: Abstract class for extracting text records, as well as any relevant metadata from the records on the disk.
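
Below is a minimal sketch of a custom download pipeline built on these abstract classes. It is illustrative only: the import path (nemo_curator.download) and the method names (download, iterate, extract) follow the library’s documented pattern, but should be verified against the NeMo Curator version in use.

    from nemo_curator.download import (
        DocumentDownloader,
        DocumentExtractor,
        DocumentIterator,
    )

    class MyDownloader(DocumentDownloader):
        def download(self, url: str) -> str:
            # Fetch the remote file and return the local path it was saved to.
            ...

    class MyIterator(DocumentIterator):
        def iterate(self, file_path: str):
            # Yield raw records (e.g., one JSON object per line) from disk.
            ...

    class MyExtractor(DocumentExtractor):
        def extract(self, content: str) -> dict:
            # Return the text record plus any relevant metadata.
            ...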

Vulnerability details: NVIDIA NeMo Curator for all platforms contains a vulnerability where a malicious file created by an attacker could allow code injection. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.

Ref: The vulnerability arises when malicious files, such as JSONL files, are loaded by NeMo Curator. If these files are crafted to exploit weaknesses in how NeMo Curator parses or processes them, they can inject executable code. Typical attack patterns include (see the sketch after this list):

  • Embedded malicious payloads in JSONL files.
  • JSON injection attacks exploiting parsing logic.
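
As a generic illustration of this attack class (hypothetical code, not NeMo Curator’s actual implementation): a loader that evaluates records as Python executes attacker-controlled payloads, whereas a strict JSON parser treats the same bytes as inert data.

    import json

    malicious_line = '__import__("os").system("echo pwned")'

    # Dangerous pattern: treating a record as Python code executes the payload.
    # eval(malicious_line)            # would run the shell command

    # Safe pattern: a JSON parser only ever produces data, never executes it.
    try:
        record = json.loads(malicious_line)
    except json.JSONDecodeError:
        print("rejected non-JSON record")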

Official announcement: Please see the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5690

CVE-2025-9386: It impacts any deployment of AppNeta’s tcpreplay tool (27th Aug 2025)

Preface: AppNeta, now part of Broadcom, is a SaaS-based network performance monitoring solution that provides IT and network operations teams with end-to-end visibility into application performance and network issues from the end-user perspective.

Why do developers need to customize Tcpreplay?

  • Testing Firewalls and IDS/IPS: Tcpreplay allows you to replay captured traffic through network devices like firewalls and intrusion detection/prevention systems.
  • Tuning Flow Expiry: You can optimize flow timeout settings to improve the accuracy of flow analysis and tuning for flow-based products.

Background: Tcpreplay is a suite of free, open-source command-line tools for replaying and editing network traffic captured in pcap files, which are created by tools like tcpdump and Wireshark. It’s used to test network devices such as intrusion detection systems (IDS), routers, and firewalls by replaying real-world traffic at specific speeds, or to simulate traffic for debugging and performance analysis.

AppNeta, on the other hand, is a commercial network performance monitoring solution with a comprehensive feature set, including bandwidth monitoring, application management, and capacity management. Its relationship with Tcpreplay is direct: AppNeta maintains the open-source Tcpreplay project, which is hosted under the appneta organization on GitHub; this is why the CVE is attributed to AppNeta.

Vulnerability details: A vulnerability has been found in AppNeta tcpreplay up to 4.5.1. The affected element is the function get_l2len_protocol in the file get.c of the tcprewrite component. Manipulation of crafted input leads to a use-after-free. The attack must be carried out locally. The exploit has been disclosed to the public and may be used.

Remedy: Upgrading to version 4.5.2-beta3 is sufficient to resolve this issue. You should upgrade the affected component.

Official announcement: Please see the link for details –

https://nvd.nist.gov/vuln/detail/CVE-2025-9386

CVE-2025-4609: About Mojo in Google Chrome on Windows (26-8-2025)

Published: 2025-08-22

Updated: 2025-08-21

Preface: In Chrome, a sandbox is a security feature that isolates web content and processes into restricted environments, while serialization is the process of converting an object’s state into a format (like a byte stream) that can be stored or transferred. The sandbox focuses on security and isolation, whereas serialization is a data handling mechanism used within various Chrome features, such as interacting with hardware via the Serial API.

Background: In Chrome, an unprivileged environment is a restricted sandbox where web content and extensions run with limited access to the operating system and local resources, designed to prevent malicious code from causing harm. Chrome uses sandboxing, process isolation, and a permissions system to enforce this isolation. Content running in an unprivileged environment generally cannot install programs, view or delete user data, or directly interact with system-level features unless the user explicitly grants specific permissions.

Ref: The “ipcz_driver” layer is part of Chromium’s IPC plumbing: it is the layer through which ipcz, the transport implementation underlying Chromium’s Mojo bindings, performs platform-level operations such as creating transports and passing handles between processes (for example, between a sandboxed renderer and the browser process). It is designed as a high-bandwidth communication layer and supports zero-copy transfer of data between different parts of the browser’s architecture.

Vulnerability details: Incorrect handle provided in unspecified circumstances in Mojo in Google Chrome on Windows prior to 136.0.7103.113 allowed a remote attacker to potentially perform a sandbox escape via a malicious file.

Official announcement: Please see the link for details –

https://www.tenable.com/cve/CVE-2025-4609

CVE-2025-27072: Buffer Copy Without Checking Size of Input in Automotive Vehicle Networks (25-08-2025)

Official announcement: 08/04/2025

Quote: I chose a Qualcomm product affected by this vulnerability as an example. The Snapdragon Ride™ Flex SoC, including the SA9000P series, does not run on a single embedded OS, but rather supports mixed-criticality operating systems such as those provided by Qualcomm’s partners or the automakers themselves.

Preface: To set up Audio Video Bridging (AVB) on a Qualcomm SA9000P, you’ll need to enable the AVB stack and configure the appropriate settings for your specific hardware and software. While specific commands will depend on the operating system and software development kit (SDK) for the SA9000P, the general process involves using command-line tools or the provided SDK to enable the relevant protocols, such as Time-Sensitive Networking (TSN) and Multiple Stream Reservation Protocol (MSRP), which are part of the AVB standard.

Background: The Qualcomm SA9000P itself is a SoC (System on Chip), which typically does not include internal flash storage for OS images. Instead, the kernel and OS images are usually stored on external non-volatile memory connected to the SoC. Common storage options include:

  • eMMC (embedded MultiMediaCard)
  • UFS (Universal Flash Storage)
  • SPI NOR/NAND Flash
  • SD Card (for development purposes)

These storage devices are mounted on the development board or production hardware that integrates the SA9000P.

Vulnerability details: Information disclosure while processing a packet at EAVB BE side with invalid header length.

Vulnerability Type: CWE-120 Buffer Copy Without Checking Size of Input (‘Classic Buffer Overflow’)
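
CWE-120 covers copying data according to an attacker-controlled length field without checking it against the buffer that actually holds the data. The Python sketch below is purely illustrative (it is not Qualcomm’s code): the unsafe parser trusts the declared header length and discloses adjacent data sharing the same buffer, while the safe parser validates the length against the received frame first.

    import struct

    # Simulated device memory: the packet shares a buffer with unrelated,
    # sensitive data (as can happen with shared DMA or packet pools).
    payload = b"hello-avb"
    buffer = struct.pack(">H", 1024) + payload + b"SECRET-KEY-MATERIAL"

    def parse_unsafe(buf: bytes) -> bytes:
        # CWE-120 pattern: trust the declared length without validation.
        (declared_len,) = struct.unpack_from(">H", buf, 0)
        return buf[2:2 + declared_len]    # over-reads into adjacent data

    def parse_safe(buf: bytes, frame_len: int) -> bytes:
        (declared_len,) = struct.unpack_from(">H", buf, 0)
        if declared_len > frame_len - 2:  # validate before copying
            raise ValueError("invalid header length")
        return buf[2:2 + declared_len]

    print(parse_unsafe(buffer))  # leaks b'hello-avbSECRET-KEY-MATERIAL'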

Speculation: If a vulnerability (like CVE-2025-27072) allows processing of malformed packets that access invalid memory regions, and those regions are shared, then:

  1. Other subsystems (e.g., audio DSP, camera ISP, or modem) could be impacted.
  2. Sensitive data in shared buffers could be exposed or corrupted.
  3. System stability could degrade due to memory corruption.
  4. Security boundaries between subsystems could be violated.

Official announcement: Please refer to the link for details –

https://nvd.nist.gov/vuln/detail/CVE-2025-27072

CVE-2025-5115: About Eclipse Jetty (22nd Aug 2025)

Published: 2025-08-20

Updated: 2025-08-19

Preface: Apache Knox uses Eclipse Jetty as its embedded web server. When you deploy and execute Apache Knox, it uses Jetty to handle incoming HTTP requests and provide its various features, such as authentication, authorization, and routing to backend Hadoop services.

Ref: Is Hadoop used in AI? Hadoop ecosystems help with the processing of data and model training operations for machine learning applications.

Background: How does Jetty consume resources? Apache Knox leverages Eclipse Jetty as its embedded web server. Apache Knox is a reverse proxy and API gateway that provides a single point of secure access for Apache Hadoop services. It is written in Java and relies heavily on the Java runtime for its functionality.

Is the Exploit Related to HTTP Response Buffer Size?

Not directly. The vulnerability does not exploit the size of the HTTP response buffer itself. Instead, it targets the processing logic of incoming HTTP/2 frames. However:

  • If Jetty is configured with large buffers or many concurrent streams, the impact of the exploit can be amplified.
  • The server may allocate response buffers unnecessarily if it begins processing a request before realizing it’s invalid.

Vulnerability details: In Eclipse Jetty, versions <=9.4.57, <=10.0.25, <=11.0.25, <=12.0.21, <=12.1.0.alpha2, an HTTP/2 client may trigger the server to send RST_STREAM frames, for example by sending frames that are malformed or that should not be sent in a particular stream state, therefore forcing the server to consume resources such as CPU and memory.
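
For context (an illustrative sketch grounded in RFC 7540, not an exploit for this CVE): every HTTP/2 frame starts with a 9-byte header holding a 24-bit payload length, a frame type, flags, and a 31-bit stream identifier. A frame that is well-formed but wrong for the stream state, such as a WINDOW_UPDATE with an increment of 0 on an open stream, is a stream error that the server must answer with RST_STREAM, which is exactly the kind of work an attacker can force repeatedly.

    import struct

    def h2_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
        # Encode an HTTP/2 frame: 24-bit length, type, flags, 31-bit stream id.
        header = struct.pack(">I", len(payload))[1:]   # 24-bit length
        header += struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
        return header + payload

    # WINDOW_UPDATE (type 0x8) with a zero increment on stream 1: a stream
    # error (PROTOCOL_ERROR) that a compliant server answers with RST_STREAM.
    bogus = h2_frame(0x8, 0, 1, struct.pack(">I", 0))
    print(bogus.hex())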

Official announcement: Please see the link for details –

https://www.tenable.com/cve/CVE-2025-5115

https://github.com/jetty/jetty.project/pull/13449

CVE-2025-38553: Linux kernel’s net/sched subsystem (Fixed) – 21st Aug 2025

Preface: While Kubernetes doesn’t directly expose net/sched as a configurable API, its network management and QoS features often rely on or interact with net/sched at the underlying Linux kernel level to achieve desired network behavior for containerized applications.

Background: net/sched is the Linux kernel subsystem responsible for traffic control (tc). It manages how packets are queued and scheduled for transmission on network interfaces using qdiscs (queueing disciplines). The default qdisc is typically pfifo_fast or fq_codel depending on the kernel version and distribution.

Vulnerability details: The vulnerability CVE-2025-38553 affects the Linux kernel’s net/sched subsystem, specifically the netem qdisc. It arises when multiple netem instances are added to the same qdisc tree, which can lead to:

  • Soft lockups
  • Out-of-memory (OOM) errors
  • Infinite loops during packet dequeueing

The root cause is flawed duplication logic in netem_enqueue, especially when a netem is nested within another netem in a qdisc hierarchy. The fix restricts the addition of a duplicating netem if another netem already exists in the tree.
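
A hypothetical reproduction sketch (requires root, the iproute2 tc tool, and a test interface; eth0 is assumed):

    import subprocess

    def tc(*args: str) -> None:
        # Run a tc(8) command; raises CalledProcessError on failure.
        subprocess.run(["tc", *args], check=True)

    # A plain netem qdisc at the root of the tree.
    tc("qdisc", "add", "dev", "eth0", "root", "handle", "1:",
       "netem", "delay", "10ms")

    # Adding a *duplicating* netem beneath an existing netem creates the
    # nested configuration behind CVE-2025-38553; patched kernels reject
    # this second command instead of risking loops and OOM.
    tc("qdisc", "add", "dev", "eth0", "parent", "1:", "handle", "2:",
       "netem", "duplicate", "100%")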

Official announcement: Please see the link for details –

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=09317dfb681ac5a96fc69bea0c54441cf91b8270

AMD responds to known potential side-channel attacks in SEV-SNP (20-08-2025)

Official Revision Date: 2025-08-12

Preface: AMD SEV-SNP is a confidential computing hardware technology present in AMD EPYC processors from generation 3 and newer. It is based on hardware virtualization extensions and achieves isolation through added protections such as full memory encryption.

SEV-SNP is not solely located in the firmware. While the firmware plays a crucial role in SEV-SNP (Secure Encrypted Virtualization-Secure Nested Paging), it’s a combination of hardware, firmware, and software working together to provide the security features. The firmware initializes the SEV-SNP context and performs attestation, but the core functionality also relies on the AMD processor’s hardware and the guest operating system’s software components. For example, SEV-SNP is supported on AMD EPYC processors starting with the AMD EPYC 7003 series. AMD SEV-SNP offers powerful and flexible support for the isolation of a guest virtual machine from an untrusted host operating system, which is very useful in public cloud and other untrusted-host scenarios.

Background: SEV-SNP is designed to prevent software-based integrity attacks and reduce risk associated with compromised memory integrity. The basic principle of SEV-SNP integrity is that if a VM is able to read a private (encrypted) page of memory, it must always read the value it last wrote.

AMD SEV-SNP allows the hypervisor to move encrypted guest pages, including swapping pages to disk, but this capability can also be exploited through ciphertext side-channel attacks, where the hypervisor monitors ciphertext changes to infer guest data.

The findings of the two research teams:

AMD has received reports from two research groups detailing methods by which a malicious hypervisor could potentially execute a side channel attack against a running secure encrypted virtualization – secure nested paging (SEV-SNP) guest.

The first report, titled “Relocate + Vote: Exploiting Ciphertext Side-Channels using Sparsity Information,” was submitted by researchers at the Toronto System Security Lab of the University of Toronto. 

A subsequent report from researchers at ETH Zurich titled “Chosen Plaintext Oracle against SEV-SNP,” outlines a similar exploitation technique that also leverages the ability to move or swap guest pages. 

Official announcement: For more information, please refer to the link:

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3021.html

Overview of Transformer-based language models (19-08-2025)

Technical Highlights: The Megatron-LM codebase efficiently trains models from 2 billion to 462 billion parameters across thousands of GPUs; in a published benchmark, it trained a 462B-parameter model on 6144 H100 GPUs, achieving up to 47% Model FLOP Utilization (MFU).

GPT-4, the latest iteration in OpenAI’s Generative Pre-trained Transformer series, significantly scales up the parameter count compared to its predecessors. While GPT-2 had 1.5 billion parameters and GPT-3 boasted 175 billion, GPT-4 is estimated to have a staggering 1.76 trillion parameters.

Parameters in Artificial Intelligence:

Parameters in AI are the variables that the model learns during training. They are the internal variables that the model uses to make predictions or decisions. In a neural network, the parameters include the weights and biases of the neurons. Parameters are used in AI to determine the output of the model for a given input. During training, the model adjusts its parameters to minimize the difference between its predictions and the actual values. This is typically done using an optimization algorithm, such as gradient descent. Gradient descent is a fundamental optimization algorithm in artificial intelligence, particularly in machine learning and deep learning. It’s used to minimize a function, often the cost function of a model, by iteratively adjusting the model’s parameters in the direction of the steepest descent.
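
To make this concrete, here is a minimal gradient-descent sketch (illustrative only) that learns a weight and a bias for a noisy line by repeatedly stepping both parameters against the gradient of a mean-squared-error cost:

    import numpy as np

    # Toy data drawn from y = 3x + 2 plus noise.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 100)
    y = 3 * x + 2 + rng.normal(0, 0.1, 100)

    w, b = 0.0, 0.0   # the model's parameters: weight and bias
    lr = 0.1          # learning rate (step size)

    for _ in range(500):
        err = (w * x + b) - y
        grad_w = 2 * np.mean(err * x)   # dMSE/dw
        grad_b = 2 * np.mean(err)       # dMSE/db
        w -= lr * grad_w                # step along steepest descent
        b -= lr * grad_b

    print(f"learned w={w:.2f}, b={b:.2f}")   # approaches 3 and 2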

About Megatron-LM: The Megatron-LM codebase, developed by NVIDIA, is widely used for training large parameter models, particularly Large Language Models (LLMs), due to its specialized features and optimizations designed for large-scale distributed training. Megatron-LM is a GPU-optimized framework developed by NVIDIA for training transformer models at scale. It supports models ranging from a few billion to hundreds of billions of parameters.

Outline of the core techniques (a toy sketch follows the list):

  • Intra-layer (tensor) model parallelism
  • Pipeline parallelism
  • Efficient communication primitives using NCCL
  • Mixed precision training (FP16/BF16)
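
As a toy illustration of tensor (intra-layer) parallelism, the sketch below splits a linear layer’s weight matrix column-wise across two simulated devices, in the style of Megatron’s column-parallel layers, and checks that gathering the partial outputs reproduces the single-device result (numpy arrays stand in for GPUs):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))    # a batch of activations
    W = rng.normal(size=(8, 6))    # the full weight matrix

    # Column-parallel split: each "device" owns half of W's output columns.
    W0, W1 = np.split(W, 2, axis=1)
    y0 = x @ W0                    # computed on device 0
    y1 = x @ W1                    # computed on device 1

    # An all-gather (here: a concatenation) reassembles the full output.
    y = np.concatenate([y0, y1], axis=1)
    assert np.allclose(y, x @ W)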

Ref: The NVIDIA Collective Communications Library (NCCL) is a library developed by NVIDIA that provides optimized routines for multi-GPU and multi-node communication. It’s designed to accelerate collective communication patterns, like all-reduce, broadcast, reduce, and all-gather, which are crucial for deep learning frameworks and other parallel computing applications using NVIDIA GPUs.

Ref: FP16 and BF16 are both 16-bit floating-point formats used in AI training to improve performance and efficiency, but they differ in their dynamic range and precision. FP16, also known as half precision, offers higher precision for smaller values but has a limited range. BF16, or Brain Floating Point, has a wider dynamic range, making it more suitable for large-scale models where numerical stability is crucial, even at the cost of some precision.
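
The range difference is easy to demonstrate (a minimal sketch; assumes PyTorch is installed). FP16’s largest finite value is about 65504, so anything above it overflows to infinity, while BF16 keeps FP32’s 8-bit exponent and represents the same value, only more coarsely:

    import torch

    v = torch.tensor(70000.0)
    print(v.to(torch.float16))    # inf: exceeds FP16's ~65504 maximum
    print(v.to(torch.bfloat16))   # 70144: in range, but coarsely rounded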

Official details: For details, please refer to the link – https://github.com/NVIDIA/Megatron-LM

CVE-2025-23305 and CVE-2025-23306: About NVIDIA Megatron-LM (18-08-2025)

Official Updated 08/11/2025 06:16 AM

Preface: GPT-4 offers several key benefits, including improved accuracy, longer context handling, and the ability to process both text and image inputs. It also exhibits stronger guardrails, leading to more reliable and ethical outputs. Additionally, GPT-4 excels in various tasks like professional and academic benchmarks, creative writing, and adapting to user needs.

Background: The Megatron-LM codebase is a framework for training large, powerful transformer language models at scale, developed by NVIDIA. It focuses on efficient, model-parallel (tensor and pipeline) and multi-node pre-training of transformer-based models like GPT, BERT, and T5 using mixed precision.

The Megatron-LM codebase efficiently trains models from 2B to 462B parameters across thousands of GPUs; a benchmarked 462B-parameter run on 6144 H100 GPUs achieved up to 47% Model FLOP Utilization (MFU).

GPT-4, the latest iteration in OpenAI’s Generative Pre-trained Transformer series, significantly scales up the parameter count compared to its predecessors. While GPT-2 had 1.5 billion parameters and GPT-3 boasted 175 billion, GPT-4 is estimated to have a staggering 1.76 trillion parameters.

While the 462B benchmark demonstrates the capability of the Megatron-LM framework to train very large models on H100 clusters, the exact number of H100 GPUs used to train GPT-4 is not publicly disclosed. GPT-4 was developed by OpenAI, and they have not released the specific hardware configurations used for its training.

Vulnerability details:

CVE-2025-23305       NVIDIA Megatron-LM for all platforms contains a vulnerability in the tools component, where an attacker may exploit a code injection issue. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.

CVE-2025-23306       NVIDIA Megatron-LM for all platforms contains a vulnerability in the megatron/training/arguments.py component, where an attacker could cause a code injection issue by providing a malicious input. A successful exploit of this vulnerability may lead to code execution, escalation of privileges, information disclosure, and data tampering.
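
NVIDIA’s advisory does not publish the exact vulnerable code path. A common pattern behind code injection in argument-parsing modules, shown purely as a hypothetical illustration (not Megatron-LM’s actual code), is evaluating user-supplied strings instead of parsing them:

    import argparse
    import ast

    parser = argparse.ArgumentParser()

    # Anti-pattern: type=eval executes arbitrary Python from the command
    # line, e.g. --sizes-unsafe "__import__('os').system('id')".
    parser.add_argument("--sizes-unsafe", type=eval, default=None)

    # Safer: ast.literal_eval accepts only Python literals (lists, ints, ...).
    parser.add_argument("--sizes", type=ast.literal_eval, default=None)

    args = parser.parse_args(["--sizes", "[1024, 2048, 4096]"])
    print(args.sizes)   # [1024, 2048, 4096]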

Official announcement: For more information, please refer to the link

https://nvidia.custhelp.com/app/answers/detail/a_id/5685

CVE-2025-23298: About NVIDIA Merlin Transformers4Rec (15th Aug 2025)

Official Updated 08/11/2025 06:15 AM

Preface: While the Bible doesn’t specifically mention artificial intelligence, it reminds us that human knowledge and capabilities will increase dramatically in the last days (Daniel 12:4). Building and training neural networks is a cornerstone of modern artificial intelligence, enabling breakthroughs in fields such as computer vision, natural language processing, and robotics.

Background: NVIDIA Merlin Transformers4Rec is a Python library designed for building sequential and session-based recommender systems, leveraging the power of Transformer architectures, particularly for use with PyTorch. It is part of the broader NVIDIA Merlin ecosystem, which provides end-to-end GPU-accelerated solutions for recommender systems.

Transformers4Rec is pre-installed in the merlin-pytorch container that is available from the NVIDIA GPU Cloud (NGC) catalog.

NVIDIA Merlin PyTorch container, available on NVIDIA NGC (NVIDIA GPU Cloud), includes the necessary components for GPU acceleration, including the CUDA Toolkit.

The Merlin PyTorch container allows users to do preprocessing and feature engineering with NVTabular, train a deep-learning-based recommender system model with PyTorch, and serve the trained model on Triton Inference Server.

Ref: NVTabular and RAPIDS (cuDF/cuML) for preprocessing and feature engineering.

Vulnerability details: NVIDIA Merlin Transformers4Rec for all platforms contains a vulnerability in a Python dependency, where an attacker could cause a code injection issue. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure, and data tampering.
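
The advisory does not name the dependency, but a frequent source of code injection in Python ML stacks is deserializing untrusted model artifacts with pickle. A hedged sketch of the risk and a common mitigation (assuming PyTorch; this is not Transformers4Rec’s actual code):

    import torch

    # Create a benign checkpoint for the demo.
    torch.save({"w": torch.zeros(2, 2)}, "ckpt.pt")

    # Risky default: torch.load unpickles, so a crafted checkpoint can run
    # arbitrary code at load time.
    # state = torch.load("ckpt.pt")

    # Mitigation (PyTorch >= 1.13): weights_only=True restricts loading to
    # tensors and simple containers, rejecting arbitrary pickled objects.
    state = torch.load("ckpt.pt", weights_only=True)
    print(state["w"].shape)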

Official announcement: Please see the link for details

https://nvidia.custhelp.com/app/answers/detail/a_id/5683

antihackingonline.com