Category Archives: Application Development

CVE-2024-0132: NVIDIA Container Toolkit 1.16.1 or earlier contains a Time-of-check Time-of-Use (TOCTOU) vulnerability (25th Sep 2024)

Preface: In software development, time-of-check to time-of-use (TOCTOU, TOCTTOU or TOC/TOU) is a class of software bugs caused by a race condition involving the checking of the state of a part of a system (such as a security credential) and the use of the results of that check.
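Below is a minimal sketch (in Go, not taken from the NVIDIA Container Toolkit) of the classic check-then-use window the preface describes; the path is hypothetical.

```go
// toctou_sketch.go - a generic time-of-check/time-of-use window.
// Illustrative only; the path "/tmp/shared/config" is hypothetical and the
// code is unrelated to the NVIDIA Container Toolkit internals.
package main

import (
	"fmt"
	"os"
)

func main() {
	const path = "/tmp/shared/config" // hypothetical attacker-influenced path

	// Time of check: confirm the target is a regular file, not a symlink.
	info, err := os.Lstat(path)
	if err != nil || !info.Mode().IsRegular() {
		fmt.Println("check failed, refusing to open")
		return
	}

	// ...race window: a party who controls the directory can swap the file
	// for a symlink to a host path between Lstat and Open...

	// Time of use: this open may now follow the swapped-in symlink.
	f, err := os.Open(path)
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	defer f.Close()
	fmt.Println("opened", path, "- contents may not be what was checked")
}
```

A common mitigation is to open the file first and then validate the already-open descriptor (for example with f.Stat()), so the check and the use refer to the same object.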

Background: The NVIDIA container stack is architected so that it can be targeted to support any container runtime in the ecosystem. The components of the stack include:

-The NVIDIA Container Runtime (nvidia-container-runtime)

-The NVIDIA Container Runtime Hook (nvidia-container-toolkit / nvidia-container-runtime-hook)

-The NVIDIA Container Library and CLI (libnvidia-container1, nvidia-container-cli)

The components of the NVIDIA container stack are packaged as the NVIDIA Container Toolkit.

The NVIDIA Container Toolkit is a key component in enabling Docker containers to leverage the raw power of NVIDIA GPUs. This toolkit allows for the integration of GPU resources into your Docker containers.

Vulnerability details: NVIDIA Container Toolkit 1.16.1 or earlier contains a Time-of-check Time-of-Use (TOCTOU) vulnerability when used with default configuration where a specifically crafted container image may gain access to the host file system. This does not impact use cases where CDI is used. A successful exploit of this vulnerability may lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5582

CVE-2024-34731: last week's CVE, today's story. (29th Aug 2024)

Preface: A race condition vulnerability is a software bug in which the unexpected results of unsynchronized concurrent operations can be exploited by malicious entities.

In a UNIX-like system, a race condition becomes a privilege escalation vulnerability when an attacker manipulates the window between the imposition of a security control and the use of a service. Such a vulnerability is the result of interference between multiple concurrent threads running in the system and sharing the same resources.
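As a generic illustration only (unrelated to the Android code discussed below), the Go sketch here shows why threads that share a resource need synchronization: without the lock, the read-modify-write on the counter is a data race and the result becomes unpredictable.

```go
// race_sketch.go - a generic shared-resource race and its mutex fix.
// Illustrative only; not related to TranscodingResourcePolicy.cpp.
package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)

	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				mu.Lock() // remove the lock and `go run -race` reports a data race
				counter++ // shared read-modify-write
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("counter =", counter) // deterministically 4000 with the lock
}
```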

Background: TranscodingResourcePolicy is a component of the Android platform/frameworks/av package that manages resource policies for transcoding operations. Transcoding is the process of converting media files from one format to another.

Vulnerability details: In multiple functions of TranscodingResourcePolicy.cpp, there is a possible memory corruption due to a race condition. This could lead to local escalation of privilege with no additional execution privileges needed. User interaction is not needed for exploitation.

Official announcement: Please refer to the link for details – https://nvd.nist.gov/vuln/detail/CVE-2024-34731

RHSA-2024:4982 – Security Advisory – OpenShift API for Data Protection (OADP) – Security Fix – golang: net/netip – CVE-2024-24790 (2nd Aug 2024)

Preface: The IPv4-mapped IPv6 address format allows the IPv4 address of an IPv4 node to be represented as an IPv6 address. The IPv4 address is encoded into the low-order 32 bits of the IPv6 address, and the high-order 96 bits hold the fixed prefix 0:0:0:0:0:FFFF.

Background: OpenShift API for Data Protection (OADP) enables you to back up and restore application resources, persistent volume data, and internal container images to external backup storage. OADP enables both file system-based and snapshot-based backups for persistent volumes.

Package netip defines an IP address type that’s a small value type. Building on that Addr type, the package also defines AddrPort (an IP address and a port) and Prefix (an IP address and a bit length prefix).

Compared to the net.IP type, Addr type takes less memory, is immutable, and is comparable (supports == and being a map key).

Vulnerability details: This advisory provides a security fix for the Go net/netip package used by OADP components; the specific flaw is listed below.

Security Fixes from Bugzilla: golang: net/netip: Unexpected behavior from Is methods for IPv4-mapped IPv6 addresses (CVE-2024-24790)
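For illustration, here is a small Go sketch of the behaviour behind CVE-2024-24790, assuming a Go toolchain that is still affected: the Is* helpers can disagree between an IPv4-mapped IPv6 address and its unmapped IPv4 form, so calling Unmap() before such checks is a reasonable defensive habit.

```go
// netip_mapped_sketch.go - IPv4-mapped IPv6 addresses and the netip Is* helpers.
package main

import (
	"fmt"
	"net/netip"
)

func main() {
	// ::ffff:127.0.0.1 is the IPv4-mapped IPv6 form of the loopback address.
	mapped := netip.MustParseAddr("::ffff:127.0.0.1")

	fmt.Println("Is4In6:             ", mapped.Is4In6())             // true
	fmt.Println("IsLoopback (mapped):", mapped.IsLoopback())         // false on affected Go versions
	fmt.Println("IsLoopback (unmap): ", mapped.Unmap().IsLoopback()) // true

	// Defensive pattern: normalise before any address-class check.
	addr := mapped.Unmap()
	if addr.IsLoopback() || addr.IsPrivate() {
		fmt.Println("treat", addr, "as an internal address")
	}
}
```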

Official announcement: Please refer to the website for details – https://access.redhat.com/errata/RHSA-2024:4982

Regarding CVE-2024-0108: The manufacturer did not disclose much detail. Is the situation below exactly what the CVE refers to? (25/07/2024)

Preface: What is an example of autonomous AI?

Autonomous intelligence is artificial intelligence (AI) that can act without human intervention, input, or direct supervision. It’s considered the most advanced type of artificial intelligence. Examples may include smart manufacturing robots, self-driving cars, or care robots for the elderly.

Background: What is Jetson AGX Xavier used for?

As the world’s first computer designed specifically for autonomous machines, Jetson AGX Xavier has the performance to handle the visual odometry, sensor fusion, localization and mapping, obstacle detection, and path-planning algorithms that are critical to next-generation robots.

Vulnerability details: NVIDIA Jetson Linux contains a vulnerability in NvGPU where error handling paths in GPU MMU mapping code fail to clean up a failed mapping attempt. A successful exploit of this vulnerability may lead to denial of service, code execution, and escalation of privileges.

Official announcement: Please refer to the official announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5555

CVE-2024-6960: H2O Model Deserialization RCE (21st July 2024)

Preface: TensorFlow provides a flexible framework for deep learning tasks, but may not be as optimized as H2O for handling large datasets.

Background: H2O uses Iced classes as the primary means of moving Java Objects around the cluster.

Auto-serializer base-class using a delegator pattern (the faster option is to byte-code gen directly in all Iced classes, but this requires all Iced classes go through a ClassLoader).

Iced is a marker class, and Freezable is the companion marker interface. Marked classes have 2-byte integer type associated with them, and an auto-genned delegate class created to actually do byte-stream and JSON serialization and deserialization. Byte-stream serialization is extremely dense (includes various compressions), and typically memory-bandwidth bound to generate.

Vulnerability details: The H2O machine learning platform uses “Iced” classes as the primary means of moving Java Objects around the cluster. The Iced format supports inclusion of serialized Java objects. When a model is deserialized, any class is allowed to be deserialized (no class whitelist). An attacker can construct a crafted Iced model that uses Java gadgets and leads to arbitrary code execution when imported to the H2O platform.

Official announcement: Please refer to the official announcement for details – https://nvd.nist.gov/vuln/detail/CVE-2024-6960

A critical step in exploiting a buffer overflow is determining the offset at which important program control information is overwritten. In the Linux kernel, the CVE-2024-41011 vulnerability has been resolved. (18-07-2024)

Preface: The PAGE_SIZE macro defined in the Linux kernel source determines the page size. Its definition is in the kernel header file /usr/src/kernels/5.14.0-22.el9.x86_64/include/asm-generic/page.h.

Background: MMIO stands for Memory-Mapped Input/Output. In Linux, MMIO is a mechanism used by devices to interface with the CPU that involves mapping their control registers and buffers directly into the processor’s memory address space.

This enables the CPU to access device registers and exchange data with devices using load and store instructions, just as if they were conventional memory locations. Graphics cards, network interfaces, and storage controllers all employ MMIO to effectively conduct input and output tasks.

Vulnerability details: drm/amdkfd: don’t allow mapping the MMIO HDP page with large pages. We don’t get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space.
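The exposure described above only matters on hosts whose base page size is larger than 4 KiB. The small Go sketch below (purely illustrative, unrelated to the kernel fix itself) prints the page size of the system it runs on.

```go
// pagesize_sketch.go - print the host page size; the amdkfd issue above only
// becomes a concern where this value exceeds the 4K register window.
package main

import (
	"fmt"
	"os"
)

func main() {
	ps := os.Getpagesize()
	fmt.Printf("host page size: %d bytes\n", ps)
	if ps > 4096 {
		fmt.Println("pages larger than 4 KiB: mapping a 4K MMIO window with a full page exposes extra space")
	}
}
```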

Official announcement: Please refer to the official announcement for details – https://nvd.nist.gov/vuln/detail/CVE-2024-41011

CVE-2024-0102: About the NVIDIA® CUDA® Toolkit. If you remember, a similar incident happened in April of this year. I believe this is a weakness of a similar design. (11 July 2024)

Preface: OpenAI revealed that the project cost $100 million, took 100 days, and used 25,000 NVIDIA A100 GPUs. Each server equipped with these GPUs uses approximately 6.5 kW, so an estimated 50 GWh of energy is consumed during training.

Background: Parallel processing is a method in computing of running two or more processors (CPUs) to handle separate parts of an overall task. Breaking up different parts of a task among multiple processors helps reduce the amount of time needed to run a program. GPUs render images more quickly than a CPU because of their parallel processing architecture, which allows them to perform multiple calculations across streams of data simultaneously. The CPU is the brain of the operation, responsible for giving instructions to the rest of the system, including the GPU(s).

NVIDIA CUDA provides a simple C/C++ based interface. The CUDA compiler leverages the parallelism built into the CUDA programming model as it compiles your program into code. CUDA is a parallel computing platform and programming interface model created by NVIDIA for the development of software that runs on parallel processors. It serves as an alternative to running simulations on traditional CPUs.

Vulnerability details: NVIDIA CUDA Toolkit for all platforms contains a vulnerability in nvdisasm, where an attacker can cause an out-of-bounds read issue by deceiving a user into reading a malformed ELF file. A successful exploit of this vulnerability might lead to denial of service.
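nvdisasm is closed source, so the Go sketch below is only a generic illustration of the defensive check any parser of untrusted ELF input should perform: validate that section offsets and sizes stay inside the file before reading them. The input file name is hypothetical.

```go
// elf_bounds_sketch.go - defensive parsing of an untrusted ELF file:
// reject section headers that claim bytes outside the file.
// Generic illustration only; not how nvdisasm is implemented.
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

func main() {
	path := "input.elf" // hypothetical untrusted input

	fi, err := os.Stat(path)
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fileSize := uint64(fi.Size())

	f, err := elf.Open(path)
	if err != nil {
		fmt.Println("not a parseable ELF:", err)
		return
	}
	defer f.Close()

	for _, s := range f.Sections {
		if s.Type == elf.SHT_NOBITS { // e.g. .bss occupies no file bytes
			continue
		}
		if s.Offset > fileSize || s.Size > fileSize-s.Offset {
			fmt.Printf("section %q claims bytes outside the file - reject\n", s.Name)
			return
		}
	}
	fmt.Println("all section headers stay within the file")
}
```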

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5548

Get a closer look at CVE-2024-39920: About the “SnailLoad” issue (5-Jul-2024)

NVD Published Date: 07/03/2024

Preface: How is RTT measured in TCP? It is measured as the time from sending a packet to receiving an acknowledgment packet from the target host.
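As a rough illustration only (not the kernel’s per-segment RTT estimator), the Go sketch below approximates the RTT by timing a TCP three-way handshake; the target host and port are assumptions.

```go
// rtt_sketch.go - approximate RTT by timing a TCP connect (three-way handshake).
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	target := "example.com:443" // hypothetical target; substitute any reachable service

	start := time.Now()
	conn, err := net.DialTimeout("tcp", target, 3*time.Second)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()

	// SYN -> SYN/ACK -> ACK completes within roughly one round trip,
	// so the elapsed time approximates the RTT to the target.
	fmt.Printf("approx. RTT to %s: %v\n", target, time.Since(start))
}
```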

Background: A new technology standard called “RFC 9293”, the updated Transmission Control Protocol (TCP) specification, was released on August 18, 2022.

Highlight:

-Acknowledgment Number:  32 bits – If the ACK control bit is set, this field contains the value of the next sequence number the sender of the segment is expecting to receive.  Once a connection is established, this is always sent.

-There are also methods of “fingerprinting” that can be used to infer the host TCP implementation (operating system) version or platform information. These collect observations of several aspects, such as the options present in segments, the ordering of options, the specific behaviors in the case of various conditions, packet timing, packet sizing, and other aspects of the protocol that are left to be determined by an implementer, and can use those observations to identify information about the host and implementation.

Vulnerability details: The TCP protocol in RFC 9293 has a timing side channel that makes it easier for remote attackers to infer the content of one TCP connection from a client system (to any server), when that client system is concurrently obtaining TCP data at a slow rate from an attacker-controlled server, aka the “SnailLoad” issue. For example, the attack can begin by measuring RTTs via the TCP segments whose role is to provide an ACK control bit and an Acknowledgment Number.

Official announcement: For detail, please refer to link – https://nvd.nist.gov/vuln/detail/CVE-2024-39920

CVE-2024-31335 GPU – PowerVR: Wrong order of operations in DevmemIntUnmapPMR2 may lead to temporarily dangling PTEs. Users of the AI accelerators called Neural Network Accelerators (NNA), stay alert! (7 June 2024)

Official Posted: 31st May 2024

Preface: PowerVR is not limited to 2D and 3D rendering; it also covers video encoding, decoding, and associated image processing. Imagination Technologies also develops AI accelerators called Neural Network Accelerators (NNA). The IMG Series4 is a revolutionary neural network accelerator (NNA) for the automotive industry that enables ADAS and autonomous driving.

PowerVR accelerators are not manufactured by PowerVR, but instead their IP blocks of integrated circuit designs and patents are licensed to other companies.

Remark: An IP block is a reusable unit of logic, cell, or chip layout design and can be used as a building block for various chip and logic designs. As NXP puts it, making this technology available opens up the opportunity for chip designers to leverage such building blocks in a wide assortment of on-chip solutions.

Background: What is the DDK? To build the Android kernel and other kernel artifacts (modules, boot images, etc.), Android provides a framework called “Kleaf”. For Android 14+, Kleaf is strongly recommended. One part of Kleaf is the Driver Development Kit (DDK), which is used to build external modules.

Vulnerability details: CVE-2024-31335 – GPU – PowerVR: Wrong order of operations in DevmemIntUnmapPMR2 may lead to temporarily dangling PTEs.

Software installed and run as a non-privileged user may conduct improper GPU system calls to trigger use-after-free kernel exceptions.

Official announcement: For detail, please refer to the link – https://www.imaginationtech.com/gpu-driver-vulnerabilities/#may24

CVE-2024-0103 – NVIDIA Triton Inference Server for Linux hit by an Incorrect Initialization of Resource vulnerability (31-05-2024)

Preface: AI-powered systems analyse the severity of a vulnerability, its potential impact, and its exploitability, and prioritise patches based on the criticality of the vulnerability. Perhaps one day AI will include self-diagnostics and apply the remedy by itself!

Background: NVIDIA Triton Inference Server is open-source software that helps standardize model deployment and delivers fast and scalable AI in production.

Vulnerability details:

CVE-2024-0103 Information disclosure

NVIDIA Triton Inference Server for Linux contains a vulnerability where a user may cause an incorrect Initialization of resource by network issue. A successful exploit of this vulnerability may lead to information disclosure.

Ref: For example, the minimum packet size is 60 bytes (the card typically adds a frame checksum to this, making the minimum packet size on the line 64 bytes). If you only have 40 bytes, then it will still transmit 60 bytes.

That is because the card transmits the 40 bytes you send plus the next 20 bytes that happen to be sitting in the buffer beyond the 40 you intended to send.

If you haven’t explicitly initialized that area, those 20 bytes might well be data leftover from a previously sent packet, which may have belonged to some other connection. Or that memory could have previously been a data page for some program that was recently running (and hence could contain a password, or an encryption key or just about any kind of sensitive information).
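A generic sketch of the mitigation implied above, in Go: zero-pad short payloads up to the 60-byte minimum so that stale buffer contents never leave the host. It is illustrative only and unrelated to the Triton code base.

```go
// pad_sketch.go - zero-pad a short payload to the 60-byte minimum Ethernet
// frame length so no leftover buffer contents are transmitted as padding.
package main

import "fmt"

const minFrameLen = 60 // minimum frame length before the 4-byte FCS is appended

func padFrame(payload []byte) []byte {
	if len(payload) >= minFrameLen {
		return payload
	}
	padded := make([]byte, minFrameLen) // make() zero-initializes the slice
	copy(padded, payload)
	return padded
}

func main() {
	frame := padFrame([]byte("a short payload goes here"))
	fmt.Printf("frame length: %d bytes; padding bytes are all zero\n", len(frame))
}
```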

Official details: For detail, please refer to link – https://nvidia.custhelp.com/app/answers/detail/a_id/5546