CVE-2025-23294: NVIDIA WebDataset for all platforms contains a vulnerability (14th Aug 2025)

Official advisory updated: 08/11/2025 06:15 AM

Preface: WebDataset is a PyTorch IterableDataset implementation designed for efficient access to large datasets stored in POSIX tar archives. It focuses on sequential/streaming data access, which offers substantial performance advantages in environments where local storage is limited or I/O bottlenecks are a concern. WebDataset is particularly well-suited for very large-scale training, as it minimizes the need for local storage and allows for efficient data loading from various sources, including cloud storage.

Background: NVIDIA WebDataset refers to the integration of WebDataset with NVIDIA technologies like DALI or NeMo, rather than a separate NVIDIA-specific installation. Installing WebDataset itself is straightforward, as it is a Python library.
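
As a quick illustration (a minimal sketch only; the shard filenames and sample keys below are assumptions, not taken from the advisory), a typical WebDataset pipeline streams training samples straight out of POSIX tar shards:

  # Minimal WebDataset sketch; shard names and sample keys are illustrative.
  import webdataset as wds

  dataset = (
      wds.WebDataset("shards-{000..009}.tar")  # brace-expanded list of tar shards
      .decode("pil")                           # decode image payloads to PIL images
      .to_tuple("jpg", "cls")                  # select the fields each sample needs
  )

  for image, label in dataset:
      pass  # hand each (image, label) pair to the training loop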

  • DALI is a portable, open-source software library for decoding and augmenting images, videos, and speech to accelerate deep learning applications.

DALI itself doesn’t extract .tar files directly — instead, it processes data streamed from tarballs via WebDataset or other loaders.

  • NVIDIA NeMo is a framework for building and deploying generative AI models, particularly those used in conversational AI like speech recognition and natural language processing.

It may extract or stream data depending on the configuration, but tarball handling is abstracted behind the data pipeline.

Vulnerability details: CVE-2025-23294 – NVIDIA WebDataset for all platforms contains a vulnerability where an attacker could execute arbitrary code with elevated permissions. A successful exploit of this vulnerability might lead to escalation of privileges, data tampering, information disclosure, and denial of service.

Official announcement: Please see the link for details

https://nvidia.custhelp.com/app/answers/detail/a_id/5658

A safe mode bypass vulnerability in Keras versions 3.0.0 through 3.10.0 (13th Aug 2025)

Preface: Deep learning in AI generally learns much faster than humans in specific, narrow tasks, especially those involving large datasets and complex computations. However, humans still excel at general intelligence, creative problem-solving, and learning with limited data.

Perhaps AI does not have these advantages yet!

Background: Keras 3.0 is a major rewrite of the Keras deep learning API, designed to provide a unified and flexible platform for building and deploying deep learning models. Its most significant feature is its multi-backend architecture, allowing users to run Keras workflows on top of various popular deep learning frameworks.

TensorFlow is a comprehensive, low-level machine learning framework capable of building and training models directly. However, Keras plays a crucial role as its official high-level API, providing several benefits that make deep learning development significantly easier and more efficient within the TensorFlow ecosystem.

Does the Lambda layer still work in Keras 3.0? Yes, the Lambda layer continues to be available and functional in Keras 3.0. In machine learning, and specifically within deep learning frameworks like Keras or TensorFlow, a Lambda layer is a layer that lets you wrap arbitrary expressions or functions as a layer in your neural network model.
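
For illustration only (a minimal sketch, not taken from the advisory), a Lambda layer wraps an arbitrary Python expression and is serialized into the .keras archive when the model is saved, which is exactly why untrusted archives are risky:

  # Minimal Keras 3 Lambda layer sketch (illustrative only).
  import keras
  from keras import layers

  model = keras.Sequential([
      keras.Input(shape=(4,)),
      layers.Lambda(lambda x: x * 2.0),  # arbitrary expression wrapped as a layer
      layers.Dense(1),
  ])
  model.save("lambda_model.keras")  # the Lambda is serialized into the archive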

Vulnerability details: A safe mode bypass vulnerability in the `Model.load_model` method in Keras versions 3.0.0 through 3.10.0 allows an attacker to achieve arbitrary code execution by convincing a user to load a specially crafted `.keras` model archive.
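
On the loading side, Keras exposes a safe_mode flag on load_model that is intended to block deserialization of arbitrary code; this CVE describes a bypass of that guard, so the practical advice is to upgrade past 3.10.0 and to avoid loading .keras archives from untrusted sources. A minimal sketch (the filename is hypothetical):

  # Illustrative only: safe_mode is the guard that CVE-2025-8747 is reported to bypass.
  import keras

  # Keep safe_mode enabled (it is the default) and still treat the archive as untrusted.
  model = keras.saving.load_model("untrusted_model.keras", safe_mode=True)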

Official announcement: Please see the link for details

https://www.tenable.com/cve/CVE-2025-8747

CVE-2025-6573: About Imagination’s PowerVR DDK (12th Aug 2025)

Preface: PowerVR is a brand of graphics processing unit (GPU) IP (intellectual property) developed by Imagination Technologies. In the context of Android, PowerVR GPUs are integrated into mobile System-on-Chips (SoCs) by various manufacturers, providing the graphics processing capabilities for Android devices. It’s a key competitor to Adreno (Qualcomm) and Mali (Arm) GPUs in the Android market.

Background: The Android SDK and Imagination’s PowerVR DDK are both software development kits, but they serve different purposes. The Android SDK is a comprehensive set of tools for developing Android applications, while the PowerVR DDK is a specialized kit for optimizing and integrating graphics rendering with Imagination Technologies’ PowerVR GPUs.

A DDK is a set of tools and libraries provided by an operating system vendor to facilitate the development of device drivers and kernel modules. Kernel modules are pieces of code that can be loaded into the operating system kernel at runtime, extending its functionality without requiring a full system reboot. This is common in Linux and Android kernel development.

The PowerVR DDK (Driver Development Kit) Native Lib C Framework refers to the foundational libraries and tools provided by Imagination Technologies to facilitate the development of graphics applications and drivers for systems utilizing PowerVR GPUs.

Vulnerability details: Kernel software installed and running inside an untrusted/rich execution environment (REE) could leak information from the trusted execution environment (TEE).

  • The scratch buffer (pui8FWScratchBuf) is used by the GPU firmware for temporary data.
  • If this buffer is mapped or accessible from REE, malicious or compromised kernel software could read or overwrite data that should be protected within the TEE.

Official announcement: Please refer to the link for details

https://nvd.nist.gov/vuln/detail/CVE-2025-6573

AMD responds to ETH Zurich researchers’ technical findings (11th Aug 2025)

Preface: The AMD K10 architecture, first launched in 2007, is no longer adequate for modern computing needs. While it was a significant step in AMD’s processor development, it has been superseded by newer architectures such as Zen, which offer significant performance and efficiency improvements.

Background: The “AMD Zen stack engine” generally refers to the AMD Zen microarchitecture and its various generations used in AMD processors. Zen utilizes a modular structure, with the basic building block being the CPU Complex (CCX). Each CCX contains multiple cores (e.g., four cores in early Zen generations) that share a large L3 cache.

Technical details: The stack engine maintains a speculative stack-address delta register in the front end that is updated directly by push/pop instructions; that delta is dispatched with the stack memory uop and added to the original stack address register during address generation in the load/store units.

The stack engine is not predictive in nature and as such does not open up new transient execution windows. However, it might still leak information under speculation. The following two main scenarios were analyzed:

First, researchers from ETH Zurich checked whether the stack engine offset is reset when the CPU corrects a branch misprediction. They found that the offset is reset to zero on Zen 3-4, while Zen 5 appears to retain an offset. They were not able to conclusively determine the effect on the other architectures due to excessive noise introduced by the misspeculation.

Second, researchers from ETH Zurich aimed to detect stack engine sync operations that occur only on the speculative path and are later squashed. Using performance monitoring counters (PMCs), they confirmed that sync operations are indeed observable under transient execution on Zen 3-5. An attacker might theoretically combine this behavior with classical indirect branch target injection to build a call-depth disclosure gadget in a cross-thread attack. However, they note that such an attack would only slightly expand the capabilities of a cross-thread attacker.

Workaround: AMD continues to recommend that software developers follow existing best practices, including constant-time algorithms and avoiding secret-dependent data accesses or control flows, to help mitigate the potential vulnerability.

Official announcement: Please refer to the link for details –

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7045.html

CVE-2025-0932: Arm fixes userspace vulnerability in Mali GPU driver (8th Aug 2025)

Preface: The Valhall family of Mali GPUs uses the same top-level architecture as the previous generation Bifrost GPUs. The Valhall family uses a unified shader core architecture.

The Arm 5th generation GPU architecture, including the Immortalis and Mali GPUs, represents a modern design for mobile and other client devices.

Background: ioctl (Input/Output Control) is the primary syscall used by userspace GPU drivers to communicate with the kernel-space driver. It allows sending custom commands and structured data to the driver.

Typical ioctl operations in Mali drivers include:

  • MALI_IOCTL_ALLOC_MEM: Allocate GPU-accessible memory
  • MALI_IOCTL_FREE_MEM: Free previously allocated memory
  • MALI_IOCTL_SUBMIT_JOB: Submit a GPU job (e.g., shader execution)
  • MALI_IOCTL_WAIT_JOB: Wait for job completion
  • MALI_IOCTL_MAP_MEM: Map memory to userspace

The path bifrost-drivers/driver/product/kernel/drivers/gpu/arm indicates that the code within this directory is part of the kernel-space drivers for Arm Mali Bifrost GPUs.
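
As a rough illustration of how a userspace process drives a GPU kernel driver through ioctl (a minimal sketch; the device node, request code, and argument layout below are placeholders invented for this example, not the real Mali kbase interface):

  # Illustrative sketch only: the device path, ioctl number, and argument
  # structure are placeholders, not the actual Mali driver interface.
  import fcntl
  import os
  import struct

  DEVICE = "/dev/mali0"          # placeholder device node
  FAKE_ALLOC_MEM = 0xC0104D00    # placeholder ioctl request code

  fd = os.open(DEVICE, os.O_RDWR)
  try:
      # Pack a hypothetical "allocate memory" argument (size, flags).
      arg = bytearray(struct.pack("QQ", 4096, 0))
      fcntl.ioctl(fd, FAKE_ALLOC_MEM, arg)  # kernel driver fills in the result
  finally:
      os.close(fd)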

Vulnerability details: Use After Free vulnerability in Arm Ltd Bifrost GPU Userspace Driver, Arm Ltd Valhall GPU Userspace Driver, Arm Ltd Arm 5th Gen GPU Architecture Userspace Driver allows a non-privileged user process to perform valid GPU processing operations, including via WebGL or WebGPU, to gain access to already freed memory.

Scope of impact: This issue affects Bifrost GPU Userspace Driver: from r48p0 through r49p3, from r50p0 through r51p0; Valhall GPU Userspace Driver: from r48p0 through r49p3, from r50p0 through r54p0; Arm 5th Gen GPU Architecture Userspace Driver: from r48p0 through r49p3, from r50p0 through r54p0.

Official announcement: Please see the link for details –

https://nvd.nist.gov/vuln/detail/CVE-2025-0932

https://developer.arm.com/documentation/110626/latest

AMD response to EDK2 SMM MCE Enablement Issue (7th Aug 2025)

Preface: While it’s technically possible to update UEFI firmware from within a Linux user space environment, it’s not a common or recommended practice. Most UEFI updates are designed to be installed through specific utilities provided by the motherboard manufacturer, often requiring a bootable medium or a dedicated Windows application.

Background: EDK II, also known as EDK2, is an open-source firmware development environment for the Unified Extensible Firmware Interface (UEFI) and Platform Initialization (PI) specifications. It’s a modern, feature-rich, cross-platform environment developed by the TianoCore project. Think of it as the official development environment for UEFI applications and a core component of many platforms’ firmware.

TianoCore is an open-source community focused on developing and promoting the Unified Extensible Firmware Interface (UEFI). It provides a firmware development environment, primarily known as EDK II, which is used for building UEFI firmware, drivers, and applications. TianoCore is a reference implementation of UEFI and is widely adopted by the industry.

Technical details: A researcher reported a bug in the open-source EDK2 system management interrupt (SMI) entry code when a machine check exception (MCE) occurs near the start of the SMI handler. An attacker who can inject an MCE could cause execution to jump to an attacker-controlled interrupt handler, leading to arbitrary code execution.

Ref: On AMD EPYC processors, System Management Mode (SMM) is an operating mode of the x86 cores, while the System Management Unit (SMU) is a distinct block of logic on the processor die that handles system and power management tasks.

The System Management Unit (SMU) contains a mailbox function to facilitate communication between the SMU and other system components, including the CPU and operating system. This mailbox acts as a communication channel for sending commands and data, and receiving responses, enabling the SMU to perform its tasks related to system management, power management, and hardware control.

Official announcement: Please refer to the following link for details –

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7043.html

CVE-2025-23318 and CVE-2025-23319: About NVIDIA Triton Inference Server (6th Aug 2025)

Preface: NVIDIA’s security advisories released on August 4, 2025 (e.g., CVE-2025-23318, CVE-2025-23319) are specifically related to the Python backend, the Triton backend for Python. The goal of the Python backend is to let you serve models written in Python with Triton Inference Server without having to write any C++ code.

Background: NVIDIA Triton Inference Server is an open-source inference serving software that streamlines the deployment and execution of AI models from various deep learning and machine learning frameworks. It achieves this flexibility through a modular system of backends. 

Each backend within Triton is responsible for executing models from a specific framework. When an inference request arrives for a particular model, Triton automatically routes the request to the necessary backend for execution. 

Key backend frameworks supported by Triton include:

  • TensorRT: NVIDIA’s high-performance deep learning inference optimizer and runtime.
  • TensorFlow: A popular open-source machine learning framework.
  • PyTorch: Another widely used open-source machine learning library.
  • ONNX: An open standard for representing machine learning models.
  • OpenVINO: Intel’s toolkit for optimizing and deploying AI inference.
  • Python: A versatile backend that can execute models written directly in Python and also serves as a dependency for other backends.
  • RAPIDS FIL (Forest Inference Library): For efficient inference of tree models (e.g., XGBoost, LightGBM, Scikit-Learn).

This modular backend architecture allows Triton to provide a unified serving solution for a wide range of AI models, regardless of the framework they were trained in.
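
To make the Python backend concrete, here is a minimal sketch of a model.py following the documented TritonPythonModel interface (the tensor names INPUT0/OUTPUT0 and the add-one logic are assumptions of this example):

  # Minimal Triton Python backend sketch (model.py); tensor names are illustrative.
  import numpy as np
  import triton_python_backend_utils as pb_utils


  class TritonPythonModel:
      def execute(self, requests):
          responses = []
          for request in requests:
              # Read the input tensor supplied by the client.
              in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
              data = in0.as_numpy()

              # Trivial "model": add one to every element.
              out0 = pb_utils.Tensor("OUTPUT0", (data + 1).astype(np.float32))
              responses.append(pb_utils.InferenceResponse(output_tensors=[out0]))
          return responses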

Vulnerability details:

CVE-2025-23318: NVIDIA Triton Inference Server for Windows and Linux contains a vulnerability in the Python backend, where an attacker could cause an out-of-bounds write. A successful exploit of this vulnerability might lead to code execution, denial of service, data tampering, and information disclosure.

CVE-2025-23319: NVIDIA Triton Inference Server for Windows and Linux contains a vulnerability in the Python backend, where an attacker could cause an out-of-bounds write by sending a request. A successful exploit of this vulnerability might lead to remote code execution, denial of service, data tampering, or information disclosure.

Official announcement: Please see the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5687

CVE-2025-23310: The NVIDIA Triton Inference Server for Windows and Linux suffers from a stack buffer overflow due to specially crafted input. (5th Aug 2025)

Preface: The NVIDIA Triton Inference Server API supports both HTTP/REST and GRPC protocols. These protocols allow clients to communicate with the Triton server for various tasks such as model inferencing, checking server and model health, and managing model metadata and statistics.

Background: NVIDIA Triton™ Inference Server, part of the NVIDIA AI platform and available with NVIDIA AI Enterprise, is open-source software that standardizes AI model deployment and execution across every workload.

The Asynchronous Server Gateway Interface (ASGI) is a calling convention for web servers to forward requests to asynchronous-capable Python frameworks and applications. It was designed as a successor to the Web Server Gateway Interface (WSGI).

NVIDIA Triton Inference Server integrates a built-in web server to expose its functionality and allow clients to interact with it. This web server is fundamental to how Triton operates and provides access to its inference capabilities on both Windows and Linux environments.
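
For illustration, a minimal client-side sketch using the tritonclient HTTP library (the server address, model name, and tensor names are assumptions of this example) shows the kind of request path that the built-in web server exposes:

  # Illustrative client sketch; URL, model name, and tensor names are placeholders.
  import numpy as np
  import tritonclient.http as httpclient

  client = httpclient.InferenceServerClient(url="localhost:8000")
  print(client.is_server_live())  # health check over HTTP/REST

  # Build an inference request for a hypothetical model with one FP32 input.
  data = np.zeros((1, 4), dtype=np.float32)
  inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
  inp.set_data_from_numpy(data)
  result = client.infer(model_name="example_model", inputs=[inp])
  print(result.as_numpy("OUTPUT0"))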

Vulnerability details: CVE-2025-23310 – NVIDIA Triton Inference Server for Windows and Linux contains a vulnerability where an attacker could cause stack buffer overflow by specially crafted inputs. A successful exploit of this vulnerability might lead to remote code execution, denial of service, information disclosure, and data tampering.

Official announcement: Please refer to the link for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5687

3I/ATLAS, who are you? (4th Aug 2025)

Quote: Hawking advised against active attempts to contact alien civilizations, which could be dangerous; he argued that an advanced alien race might view humans much as humans view bacteria, which could lead to catastrophic consequences if they discovered Earth.

Ref: https://pmc.ncbi.nlm.nih.gov/articles/PMC11462274/

Background: 3I/ATLAS, also known as C/2025 N1 and previously as A11pl3Z, is an interstellar comet discovered by the Asteroid Terrestrial-impact Last Alert System station at Río Hurtado, Chile on 1 July 2025, when it was entering the inner Solar System at a distance of 4.5 astronomical units from the Sun.

Technical details: A team of researchers has presented the wild theory that an interstellar object might be hostile “alien technology” that could reach Earth in fall 2025. Below is the speculation, based on the evidence they cite.

– 3I/ATLAS’s orbital plane lies virtually in the ecliptic, though retrograde (i = 175.11°)

– 3I/ATLAS is too large to be an asteroid

– 3I/ATLAS shows no evidence of cometary outgassing.

Ref: Cometary outgassing provides the energy to push a comet away from the Sun. As a comet approaches the Sun, its icy nucleus warms, causing ices to sublimate and release gas and dust, forming a coma. This outgassing exerts a force on the comet, pushing it in the opposite direction of the escaping gas, which is a major factor in the comet’s trajectory.

– 3I/ATLAS approaches unusually close to Venus, Mars and Jupiter

– 3I/ATLAS achieves perihelion on the opposite side of the Sun from Earth

Remark: The Earth revolves around the Sun in an elliptical orbit, and its closest point to the Sun is called perihelion.

– The optimal point to do a reverse Solar Oberth and stay bound to the Sun is at perihelion.
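
Ref: This follows from standard orbital mechanics (not from the paper itself): a burn of size Δv at speed v changes the specific orbital energy by ΔE = v·Δv + Δv²/2, so the same Δv yields the largest energy change where the object moves fastest, which is at perihelion.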

– 3I/ATLAS’s incoming radiant made it hard to detect sooner

The incoming radiant of comet 3I/ATLAS aligning with the Galactic Center, a bright and crowded region of the sky, made it difficult to detect, according to an Instagram post. This unusual entry path, coupled with the comet’s potential “silent propulsion” (lacking typical outgassing) and close encounters with planets, contributed to its delayed detection.

Technical papers announcement: The technical paper was published on the preprint server arXiv on July 16, 2025. For more information, please refer to the link – https://arxiv.org/abs/2507.12213

Yahoo headline from July 27, 2025 – https://www.yahoo.com/news/articles/possibly-hostile-alien-object-could-023132776.html

CVE-2025-54574: About Squid. Stay alert! (4th Aug 2025)

NVD Published Date: 08/01/2025

NVD Last Modified: 08/01/2025

Preface: While HTTP/1.0 is largely obsolete, HTTP/1.1 remains in widespread use, despite the newer HTTP/2 and HTTP/3 protocols. Though HTTP/1.1 has been updated in recent years, its core functionality is still foundational for much of the web.

Does processing Uniform Resource Names consume memory?

Yes, processing Uniform Resource Names (URNs) can consume memory. While URNs themselves are symbolic names and don’t directly represent the resource’s location or data, they need to be processed to resolve them, which often involves memory allocation for parsing, data storage, and potential redirection handling.

Background: Squid Proxy is a caching proxy, and that’s a key aspect of how it functions. It’s not just a proxy that forwards requests; it also stores copies of frequently accessed web content locally. This caching behavior significantly speeds up subsequent requests for the same content, making it faster and more efficient than a simple forwarding proxy.

A “Trivial-HTTP response,” often abbreviated as THTTP, refers to a convention for encoding resolution service requests and responses using the HTTP/1.0 or HTTP/1.1 protocols, as defined in RFC 2169.

Squid Proxy is primarily developed using C++. While it utilizes some C language components and libraries, the dominant language in its codebase is C++.

Ref: STCB, in the context of Squid cache, refers to the StoreEntry data structure, which is a key component of how Squid caches web content in memory. It’s a relatively small amount of metadata associated with each cached object, stored in memory to speed up access and retrieval.

Vulnerability details: Squid is a caching proxy for the Web. In versions 6.3 and below, Squid is vulnerable to a heap buffer overflow and possible remote code execution attack when processing URN due to incorrect buffer management. This has been fixed in version 6.4. To work around this issue, disable URN access permissions.
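
One way to express that workaround in squid.conf, sketched from the advisory’s guidance to deny URN access (verify against your own configuration before applying):

  # Deny the urn: protocol until Squid is upgraded to 6.4 or later.
  acl URN proto URN
  http_access deny URN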

Official announcement: Please see the link for details –

https://nvd.nist.gov/vuln/detail/CVE-2025-54574
