Category Archives: AI and ML

CVE-2024-47561: Apache Avro Java SDK: Arbitrary Code Execution when reading Avro Data (Java SDK)

Preface: Kafka understands only byte arrays. Kafka acts as a broker that transmits data over the network between producers and consumers, but it needs a mechanism for converting data into a format that producers, the broker, and consumers can all understand.
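As a hedged illustration of that mechanism (a minimal sketch using the third-party kafka-python package; the broker address and the topic name "events" are hypothetical), a serializer turns Python objects into the byte arrays Kafka actually stores, and a deserializer reverses the step on the consumer side:

```python
# Minimal sketch, assuming the kafka-python package and a local broker.
# The broker address and topic name are hypothetical.
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer: serialize Python dicts to JSON bytes before they reach the broker.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda obj: json.dumps(obj).encode("utf-8"),
)
producer.send("events", {"user": "alice", "action": "login"})
producer.flush()

# Consumer: deserialize the byte arrays back into Python objects.
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
```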

Background: Apache Avro is a powerful data serialization framework that provides many useful features. It uses the AVRO file format, which is a compact binary format suitable for evolving data schemas. For example, it supports schema enforcement and schema transformations, which are essential for data integrity and compatibility.
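A minimal sketch with the official avro Python package shows schema parsing and a round trip through the compact binary encoding (the schema, field names, and values are hypothetical; the CVE below is in the Java SDK's Schema.Parser, for which the Python parse call merely stands in):

```python
# Minimal sketch using the official "avro" Python package. The schema and
# values are hypothetical; CVE-2024-47561 itself is in the Java SDK's
# schema parser, shown here only to illustrate the operation at issue.
import io
import avro.schema
from avro.io import DatumWriter, DatumReader, BinaryEncoder, BinaryDecoder

SCHEMA_JSON = """
{"type": "record", "name": "User",
 "fields": [{"name": "name", "type": "string"},
            {"name": "age", "type": "int"}]}
"""

# Schema parsing: the step that must never be fed untrusted input on a
# vulnerable Java SDK.
schema = avro.schema.parse(SCHEMA_JSON)

# Encode a record into Avro's compact binary format...
buf = io.BytesIO()
DatumWriter(schema).write({"name": "alice", "age": 30}, BinaryEncoder(buf))

# ...and decode it back, with the schema enforcing types on the way out.
buf.seek(0)
record = DatumReader(schema).read(BinaryDecoder(buf))
```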

Vulnerability details: Schema parsing in the Java SDK of Apache Avro 1.11.3 and previous versions allows bad actors to execute arbitrary code. Users are recommended to upgrade to version 1.11.4 or 1.12.0, which fix this issue.

Official announcement: Please refer to the link for details – https://lists.apache.org/thread/c2v7mhqnmq0jmbwxqq3r5jbj1xg43h5x

CVE-2024-0123, CVE-2024-0124 & CVE-2024-0125 Interference from the development of supercomputers and artificial intelligence (3rd Oct 2024)

Preface: OpenAI revealed that the project cost $100 million, took 100 days, and used 25,000 NVIDIA A100 GPUs. Each server equipped with these GPUs uses approximately 6.5 kW, so an estimated 50 GWh of energy is consumed during training.
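Sanity-checking the figure (assuming, hypothetically, 8 GPUs per server, as in a DGX A100): 25,000 GPUs ÷ 8 ≈ 3,125 servers; 3,125 × 6.5 kW ≈ 20.3 MW; over 100 days (2,400 hours) that gives 20.3 MW × 2,400 h ≈ 49 GWh, consistent with the 50 GWh estimate.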

Background: Parallel processing is a method in computing of running two or more processors (CPUs) to handle separate parts of an overall task. Breaking up different parts of a task among multiple processors helps reduce the time needed to run a program. GPUs render images more quickly than a CPU because of their parallel processing architecture, which allows them to perform multiple calculations across streams of data simultaneously. The CPU is the brain of the operation, responsible for giving instructions to the rest of the system, including the GPU(s).

NVIDIA CUDA provides a simple C/C++-based interface. The CUDA compiler leverages the parallelism built into the CUDA programming model as it compiles your program into executable code.
CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA for developing software that runs on parallel processors. It serves as an alternative to running simulations on traditional CPUs.
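To make the programming model concrete, here is a hedged sketch of a CUDA kernel driven from Python via the third-party Numba package rather than the C/C++ toolchain (array size and launch configuration are arbitrary choices for the example); each GPU thread computes one element of the result in parallel:

```python
# Minimal sketch of the CUDA programming model via the Numba package.
# Requires an NVIDIA GPU and CUDA drivers; sizes are arbitrary.
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    # Each GPU thread handles one element of the arrays in parallel.
    i = cuda.grid(1)
    if i < out.size:
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.ones(n, dtype=np.float32)
b = np.ones(n, dtype=np.float32)
out = np.zeros(n, dtype=np.float32)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
# Host arrays are copied to the device and back automatically.
vector_add[blocks, threads_per_block](a, b, out)
```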

Vulnerability details:

CVE-2024-0123 – NVIDIA CUDA Toolkit for Windows and Linux contains a vulnerability in the nvdisasm command line tool where an attacker may cause an improper input validation issue by tricking the user into running nvdisasm on a malicious ELF file. A successful exploit of this vulnerability may lead to denial of service. (CWE-1285 – Improper Validation of Specified Index, Position, or Offset in Input)

CVE-2024-0124 – NVIDIA CUDA Toolkit for Windows and Linux contains a vulnerability in the nvdisasm command line tool, where a user can cause nvdisasm to read freed memory by running it on a malformed ELF file. A successful exploit of this vulnerability might lead to a limited denial of service. (CWE-416 – Use After Free)

CVE-2024-0125 – NVIDIA CUDA Toolkit for Windows and Linux contains a vulnerability in the nvdisasm command line tool, where a user can cause a NULL pointer dereference by running nvdisasm on a malformed ELF file. A successful exploit of this vulnerability might lead to a limited denial of service. (CWE-476 – NULL Pointer Dereference)
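All three issues are triggered by feeding nvdisasm a malformed ELF file. A hedged mitigation sketch (the wrapper and its ELF-magic pre-check are illustrative only, and no substitute for patching):

```python
# Illustrative only: refuse files that do not even start with the ELF magic
# before handing them to nvdisasm (assumed to be on PATH). This does not
# make a vulnerable nvdisasm safe; upgrading the CUDA Toolkit does.
import subprocess
import sys

ELF_MAGIC = b"\x7fELF"

def run_nvdisasm(path: str) -> None:
    with open(path, "rb") as f:
        if f.read(4) != ELF_MAGIC:
            sys.exit(f"{path}: not an ELF file, refusing to disassemble")
    subprocess.run(["nvdisasm", path], check=True)

if __name__ == "__main__":
    run_nvdisasm(sys.argv[1])
```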

Official announcement: Please refer to the vendor announcement for details –

https://nvidia.custhelp.com/app/answers/detail/a_id/5577

CVE-2024-0116: NVIDIA Triton Inference Server contains a vulnerability where a user may cause an out-of-bounds read (2nd Oct 2024)

Preface: Some systems that implement malloc() may not release memory back to the operating system right away, creating the appearance of a memory leak where there is none.

Background: Triton Inference Server provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Triton supports an HTTP/REST and GRPC protocol that allows remote clients to request inferencing for any model being managed by the server.

Vulnerability details: NVIDIA Triton Inference Server contains a vulnerability where a user may cause an out-of-bounds read issue by releasing a shared memory region while it is in use. A successful exploit of this vulnerability may lead to denial of service.
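The flaw is a lifecycle-ordering problem: a shared memory region is released while the server may still be reading it. A hedged sketch with the tritonclient Python package (model name, region name, shapes, and sizes are hypothetical) shows the safe ordering, unregistering only after the inference call has returned:

```python
# Hedged sketch using the tritonclient package; model and region names,
# shapes, and sizes are hypothetical. The point is the ordering: the region
# is unregistered/destroyed only AFTER the inference request completes.
import numpy as np
import tritonclient.http as httpclient
import tritonclient.utils.shared_memory as shm

client = httpclient.InferenceServerClient("localhost:8000")

data = np.zeros(16, dtype=np.float32)
byte_size = data.nbytes
handle = shm.create_shared_memory_region("input_region", "/input_shm", byte_size)
shm.set_shared_memory_region(handle, [data])
client.register_system_shared_memory("input_region", "/input_shm", byte_size)

inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_shared_memory("input_region", byte_size)
result = client.infer("my_model", inputs=[inp])  # region still registered here

# Only after the request has completed is it safe to release the region.
client.unregister_system_shared_memory("input_region")
shm.destroy_shared_memory_region(handle)
```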

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5565

CVE-2024-0132: NVIDIA Container Toolkit 1.16.1 or earlier contains a Time-of-check Time-of-Use (TOCTOU) vulnerability (25th Sep 2024)

Preface: In software development, time-of-check to time-of-use (TOCTOU, TOCTTOU or TOC/TOU) is a class of software bugs caused by a race condition involving the checking of the state of a part of a system (such as a security credential) and the use of the results of that check.
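The classic example is a filesystem check/use race. A minimal sketch (the path is hypothetical): the vulnerable pattern checks a path and then opens it in two separate steps, leaving a window in which an attacker can swap the file; the safer pattern opens first and validates the descriptor actually in use:

```python
# Minimal TOCTOU sketch; the path is hypothetical.
import os
import stat

path = "/tmp/report.txt"

# Vulnerable: the file can be replaced (e.g. by a symlink) between the
# access() check (time-of-check) and the open() (time-of-use).
if os.access(path, os.R_OK):
    data = open(path, "rb").read()

# Safer: open first, then validate the already-open descriptor, so the
# check and the use refer to the same object.
fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
info = os.fstat(fd)
if stat.S_ISREG(info.st_mode):
    data = os.read(fd, info.st_size)
os.close(fd)
```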

Background: The NVIDIA container stack is architected so that it can be targeted to support any container runtime in the ecosystem. The components of the stack include:

-The NVIDIA Container Runtime (nvidia-container-runtime)

-The NVIDIA Container Runtime Hook (nvidia-container-toolkit / nvidia-container-runtime-hook)

-The NVIDIA Container Library and CLI (libnvidia-container1, nvidia-container-cli)

The components of the NVIDIA container stack are packaged as the NVIDIA Container Toolkit.

The NVIDIA Container Toolkit is a key component in enabling Docker containers to leverage the raw power of NVIDIA GPUs. This toolkit allows for the integration of GPU resources into your Docker containers.

Vulnerability details: NVIDIA Container Toolkit 1.16.1 or earlier contains a Time-of-check Time-of-Use (TOCTOU) vulnerability when used with the default configuration, where a specifically crafted container image may gain access to the host file system. This does not impact use cases where CDI is used. A successful exploit of this vulnerability may lead to code execution, denial of service, escalation of privileges, information disclosure, and data tampering.

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5582

CVE-2024-39928: Vulnerability in Apache Linkis – Spark EngineConn (25-09-2024)

Preface: Apache Linkis is a computation middleware that acts as a layer between upper-level applications and underlying engines, such as Apache Spark, Apache Hive and Apache Flink. It started as an Apache Incubator project in 2021 and graduated to a Top Level Project in January 2023.

Background: Linkis provides standardized interfaces (REST, JDBC, WebSocket etc.) to easily connect to various underlying engines (Spark, Presto, Flink, etc.), and acts as a proxy between the upper applications layer and underlying engines layer.

Ref: Py4J is a popular library integrated within PySpark that lets Python programs dynamically interface with JVM objects (such as RDDs). Apache Spark comes with an interactive shell for Python, as it does for Scala; the Python shell is known as “PySpark”. To use PySpark you must have Python installed on your machine.
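For context, a minimal PySpark sketch (the app name and data are arbitrary); under the hood, the Python driver talks to the JVM through a Py4J gateway protected by exactly the kind of startup token this CVE concerns:

```python
# Minimal PySpark sketch; app name and data are arbitrary. The Python
# process communicates with the JVM via a token-protected Py4J gateway.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("demo").getOrCreate()
df = spark.createDataFrame([("alice", 30), ("bob", 25)], ["name", "age"])
df.filter(df.age > 26).show()
spark.stop()
```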

Vulnerability details: In Apache Linkis <= 1.5.0, the Spark EngineConn is affected by a weak-randomness vulnerability: the random string used as the token when starting Py4J is generated with Commons Lang’s RandomStringUtils, which is not cryptographically secure.
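The weakness is that RandomStringUtils draws by default from java.util.Random, a predictable PRNG, so the token can be guessed. A hedged Python analogue of the wrong and right choices (token length is arbitrary):

```python
# Python analogue of the flaw; token length is arbitrary.
import random
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

# Predictable: Python's Mersenne Twister state can be recovered from
# observed outputs, much as Commons Lang's default RandomStringUtils
# draws from the non-cryptographic java.util.Random.
weak_token = "".join(random.choice(ALPHABET) for _ in range(16))

# Cryptographically secure: unpredictable to an attacker.
strong_token = secrets.token_hex(16)
```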

Remedy: Users are recommended to upgrade to version 1.6.0, which fixes this issue.

Official announcement: Please refer to the vendor announcement for details – https://lists.apache.org/thread/g664n13nb17rsogcfrn8kjgd8m89p8nw

CVE-2024-8375: A vulnerability has been found in Google DeepMind Reverb object deserialization. (20-09-2024)

Preface: As companies and researchers leave TensorFlow and move to PyTorch, Google seems interested in moving its products to JAX to solve some of TensorFlow’s pain points, such as the complexity of the API and the complexity of training on custom chips such as TPUs.

PyTorch optimizes performance by taking advantage of Python’s native support for asynchronous execution. In TensorFlow, you have to manually code and fine-tune every operation to be performed on a specific device to allow for decentralized training.

Background: TensorFlow is an open-source Python library designed by Google for developing machine learning models and deep learning neural networks.

Reverb is an efficient and easy-to-use data storage and transport system designed for machine learning research. Reverb is primarily used as an experience replay system for distributed reinforcement learning algorithms but the system also supports multiple data structure representations such as FIFO, LIFO, and priority queues.

Vulnerability details: This is a retroactive issue for the already-fixed security vulnerability. The Reverb Server stores Tensors represented by protos. These protos contain type information as well as a string field called “tensor_content”. When the Reverb client communicates with the server, it unpacks these protos by turning them back into tensors.

Reverb supports the VARIANT datatype, which is supposed to represent an arbitrary object in C++. When a tensor proto of type VARIANT is unpacked, memory is first allocated to store the entire tensor, and a constructor is called on each instance. Afterwards, Reverb copies the content of tensor_content into the previously mentioned pre-allocated memory, which results in the bytes in tensor_content overwriting the vtable pointers of all the objects that were previously allocated.

Reverb exposes two relevant gRPC endpoints: InsertStream and SampleStream. By default, neither is authenticated and there is no authorization. An attacker can use InsertStream to insert a malicious tensor into the server’s database; when a client next calls SampleStream, it unpacks the tensor into RAM, and when any method on the corrupted object is called (including its destructor) the attacker gains control of the program counter.
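A hedged sketch with Reverb's Python client (the server address, table name, and payload are hypothetical) shows how little an attacker needs: both endpoints accept unauthenticated callers by default, so anyone who can reach the port can insert data and trigger sampling:

```python
# Hedged sketch using the dm-reverb Python client; the address, table name,
# and payload are hypothetical. Both calls hit the unauthenticated
# InsertStream / SampleStream gRPC endpoints described above.
import reverb

client = reverb.Client("target-host:8000")

# InsertStream: anyone who can reach the port can write to the table.
client.insert([1.0, 2.0, 3.0], priorities={"my_table": 1.0})

# SampleStream: unpacking the sampled tensors is where a malicious
# VARIANT proto would corrupt memory on a vulnerable version.
for sample in client.sample("my_table", num_samples=1):
    print(sample)
```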

Official announcement: Please refer to the vendor announcement for details – https://www.tenable.com/cve/CVE-2024-8375

CVE-2024-44961: Forward soft recovery errors to userspace (5th Sep 2024)

Preface: The AMD Radeon Instinct™ MI50 server accelerator, designed on the world’s first 7nm FinFET process, brings customers a full feature set based on the industry’s newest technologies. The MI50 is AMD’s workhorse accelerator offering, ideal for large-scale deep learning. It delivers up to 26.5 TFLOPS of native half-precision (FP16) or up to 13.3 TFLOPS of single-precision (FP32) peak floating-point performance, plus INT8 support. Combined with 16GB or 32GB of high-bandwidth HBM2 ECC memory, the MI50 gives customers the finely balanced performance needed for enterprise-class, mid-range compute capable of training complex neural networks for a variety of demanding deep learning applications in a cost-effective design.

Background: The drm/amdgpu driver supports all AMD Radeon GPUs based on the Graphics Core Next (GCN), Radeon DNA (RDNA), and Compute DNA (CDNA) architectures.

CDNA (Compute DNA) is a compute-centered graphics processing unit (GPU) microarchitecture designed by AMD for datacenters.

AMD CDNA architecture is supported by AMD ROCm™, an open software stack that includes a broad set of programming models, tools, compilers, libraries, and runtimes for AI and HPC solution development targeting AMD Instinct accelerators.

Vulnerability details: In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Forward soft recovery errors to userspace. As discussed before [1], soft recovery should be forwarded to userspace, or we can get into a really bad state where apps keep submitting hanging command buffers, cascading into a hard reset.

[1] https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/

(cherry picked from commit 434967aadbbbe3ad9103cc29e9a327de20fdba01)

Official announcement: Please refer to the website for details – https://nvd.nist.gov/vuln/detail/CVE-2024-44961

CVE-2024-0110: Supercomputer and AI development Interlude (II) (30th Aug 2024)

Preface: OpenAI revealed that the project cost $100 million, took 100 days, and used 25,000 NVIDIA A100 GPUs. Each server equipped with these GPUs uses approximately 6.5 kW, so an estimated 50 GWh of energy is consumed during training.

Background: Parallel processing is a method in computing of running two or more processors (CPUs) to handle separate parts of an overall task. Breaking up different parts of a task among multiple processors helps reduce the time needed to run a program. GPUs render images more quickly than a CPU because of their parallel processing architecture, which allows them to perform multiple calculations across streams of data simultaneously. The CPU is the brain of the operation, responsible for giving instructions to the rest of the system, including the GPU(s).

NVIDIA CUDA provides a simple C/C++-based interface. The CUDA compiler leverages the parallelism built into the CUDA programming model as it compiles your program into executable code.
CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA for developing software that runs on parallel processors. It serves as an alternative to running simulations on traditional CPUs.

Vulnerability details:

CVE-2024-0110: NVIDIA CUDA Toolkit contains a vulnerability in the `cuobjdump` command where a user may cause an out-of-bounds write by passing in a malformed ELF file. A successful exploit of this vulnerability may lead to code execution or denial of service.

CWE-787 (Out-of-bounds Write) – Code execution, denial of service (Severity: Medium)

Ref: The integer overflow may result in an out-of-bounds write.
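The root-cause pattern, stated generically: a size or offset read from the file is trusted in an allocation or copy, and an overflowed or oversized value lands the write out of bounds. A hedged Python sketch of the kind of bounds check a parser needs before honoring ELF header fields (field offsets follow the 64-bit little-endian ELF layout; this is illustrative, not cuobjdump's code):

```python
# Hedged sketch: validate ELF64 section-header fields against the real file
# size before using them, instead of trusting them blindly.
import struct
import sys

def section_table_in_bounds(path: str) -> bool:
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < 64 or data[:4] != b"\x7fELF":
        return False
    # ELF64 (little-endian): e_shoff at offset 40 (8 bytes),
    # e_shentsize at 58 and e_shnum at 60 (2 bytes each).
    e_shoff, = struct.unpack_from("<Q", data, 40)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 58)
    # Python integers cannot wrap, but in C this very multiplication
    # could overflow and defeat a naive check.
    table_end = e_shoff + e_shentsize * e_shnum
    return table_end <= len(data)

if __name__ == "__main__":
    print(section_table_in_bounds(sys.argv[1]))
```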

Official announcement: Please refer to the vendor announcement for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5564

CVE-2024-43855: md/raid5 – recheck if reshape has finished with device_lock held. From a technical point of view, it also impacts Red Hat clusters. (18 Aug 2024)

Preface: LVM version 2, or LVM2, is the default for Red Hat Enterprise Linux, which uses the device mapper driver contained in the 2.6 kernel. LVM2, which is almost completely compatible with the earlier LVM1 version, can be upgraded from versions of Red Hat Enterprise Linux running the 2.4 kernel.

The Clustered Logical Volume Manager (CLVM) is a set of clustering extensions to LVM. These extensions allow a cluster of computers to manage shared storage (for example, on a SAN) using LVM.

Background: With a mutex, a thread that tries to lock or acquire a mutex which is not available goes to sleep until the mutex becomes available. A spinlock behaves differently: it is a very simple single-holder lock. If a process attempts to acquire a spinlock and it is unavailable, the process keeps trying (spinning) until it can acquire the lock. This simplicity makes for a small and fast lock.
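A hedged Python sketch of the two behaviors (user-space illustration only; a real kernel spinlock is a few atomic CPU instructions, not a Python loop):

```python
# User-space illustration only; a real spinlock is implemented with atomic
# CPU instructions, not a Python loop.
import threading

mutex = threading.Lock()

def with_mutex():
    # Blocking acquire: if the lock is held, this thread sleeps until the
    # holder releases it, and the scheduler runs something else meanwhile.
    with mutex:
        pass  # critical section

def with_spinlock():
    # Spinning acquire: keep retrying without sleeping, burning CPU until
    # the lock becomes available.
    while not mutex.acquire(blocking=False):
        pass  # spin
    try:
        pass  # critical section
    finally:
        mutex.release()
```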

Vulnerability details: A deadlock occurs when mddev is being suspended while a flush bio is in progress. It is a complex issue:

T1. the first flush is at the ending stage; it clears ‘mddev->flush_bio’ and tries to submit data, but is blocked because mddev is suspended by T4.

T2. the second flush sets ‘mddev->flush_bio’, and attempts to queue md_submit_flush_data(), which is already running (T1) and won’t execute again if on the same CPU as T1.

T3. the third flush increments active_io and tries to flush, but is blocked because ‘mddev->flush_bio’ is not NULL (it was set by T2).

T4. mddev_suspend() is called and waits for active_io (incremented by T3) to decrease to 0.

The root issue is the non-atomic increment/decrement of active_io during the flush process.
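The non-atomic counter problem is easy to reproduce in miniature. A hedged Python sketch (thread and iteration counts are arbitrary, and the read-modify-write is deliberately split to expose the race): without a lock, concurrent increments and decrements of a shared counter lose updates, which is exactly the kind of inconsistency that lets active_io be misjudged:

```python
# Hedged sketch; thread and iteration counts are arbitrary. The split
# read-modify-write exposes the race a bare counter update suffers from.
import threading

counter = 0
lock = threading.Lock()

def unsafe_inc_dec(n: int) -> None:
    global counter
    for _ in range(n):
        tmp = counter      # read
        counter = tmp + 1  # write: another thread's update can be lost here
        tmp = counter
        counter = tmp - 1

def safe_inc_dec(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:  # the update is atomic with respect to other threads
            counter += 1
        with lock:
            counter -= 1

threads = [threading.Thread(target=unsafe_inc_dec, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("unsynchronized counter ended at", counter, "(expected 0)")
```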

Official announcement: For details, please refer to link –

https://nvd.nist.gov/vuln/detail/CVE-2024-43855

CVE-2024-5914 Cortex XSOAR: Improper Neutralization of Special Elements used in a Command (16th Aug 2024)

Preface: You can hold YAML content in files with any extension: .yml, .yaml or indeed anything else.

Background: Cortex XSOAR combines security orchestration, incident management, and interactive investigation into a seamless experience. The orchestration engine is designed to automate security product tasks and weave in human analyst tasks and workflows. Cortex XSOAR is powered by DBot, which learns from real-life analyst interactions and past investigations to help SOC teams with analyst assignment suggestions, playbook enhancements, and best next steps for investigations. With Cortex XSOAR, security teams can build future-proof security operations to reduce MTTR, create a consistent and audited incident management process, and increase analyst productivity.

Remark: dBot is the Databank’s new AI-Powered Assistant.

Common Scripts are scripts containing shared code (functions, variables, etc.) that can be embedded when writing your own Automation scripts and Integrations. The common scripts appear on the Automation page, but are used to enhance the API in other scripts and integrations.

Vulnerability details: A command injection issue in Palo Alto Networks Cortex XSOAR CommonScripts Pack allows an unauthenticated attacker to execute arbitrary commands within the context of an integration container.
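Command injection in general terms: building a shell command by string concatenation lets metacharacters in attacker-controlled input become new commands. A hedged, generic Python sketch (the ping use case and hostname are hypothetical, not XSOAR code):

```python
# Hedged, generic illustration of command injection; this is not XSOAR
# code. The hostname is attacker-controlled input.
import subprocess

host = "example.com; touch /tmp/pwned"  # malicious input

# Vulnerable: the string is handed to a shell, so ";" starts a new command.
subprocess.run(f"ping -c 1 {host}", shell=True)

# Safer: no shell; the argument list is passed verbatim, so ";" is just
# part of an (invalid) hostname, not a command separator.
subprocess.run(["ping", "-c", "1", host])
```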

Official announcement: Please refer to the website for details – https://security.paloaltonetworks.com/CVE-2024-5914