Category Archives: System

CVE-2023-31315: AMD SMM Lock Bypass (21-Aug-2024)

Preface: AMD EPYC™ Processors power the highest-performing x86 servers for the modern data center, on prem and in cloud environments, across industries.

Background: Model-specific registers (MSR) are control registers provided by the processor implementation so that system software can interact with a variety of features, including performance monitoring, checking processor status, debugging, program tracing or toggling specific CPU features.

Intel and AMD may use the same MSR for the same feature, such as the IA32_LSTAR MSR register.

When it came to the Intel Pentium processor, Intel officially introduced two instructions, RDMSR and WRMSR, for reading and writing the MSR temporary register. At this time, MSR was officially introduced. When the RDMSR and WRMSR instructions were introduced, the CPUID instruction was also introduced. This instruction is used to indicate which functions are available in a specific CPU chip, or whether the MSR registers corresponding to these functions exist. The software can query a certain function through the CPUID instruction. Whether these functions are supported on the current CPU.

Vulnerability details: Improper validation in a model specific register (MSR) could allow a malicious program with ring0 access to modify SMM configuration while SMI lock is enabled, potentially leading to arbitrary code execution.

Ref: Researchers from IOActive have reported that it may be possible for an attacker with ring 0 access to modify the configuration of System Management Mode (SMM) even when SMM Lock is enabled.

Official announcement: Please refer to the link for details – https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7014.html

CVE-2023-52910 – iommu/iova: Fix alloc iova overflows issue (21-08-2024)

Preface: Modern hardware provides an I/O memory management unit (IOMMU) that mediates direct memory accesses (DMAs) by I/O devices in the same way that a processor’s MMU mediates memory accesses by instructions.

Background: With IOMMU, when the device performs DMA access to memory, the system returns to the device driver no longer a physical address, but a virtual address. This address is generally called IOVA. When the device accesses memory, IOMMU converts this virtual address into a physical address. But when iommu bypass is used, the device can also directly use the physical address for DMA.

Vulnerability details: This issue occurs in the following two situations

-The first iova size exceeds the domain size. When initializing iova domain, iovad->cached_node is assigned as iovad->anchor. For example, the iova domain size is 10M, start_pfn is 0x1_F000_0000, and the iova size allocated for the first time is 11M.

-The node with the largest iova->pfn_lo value in the iova domain is deleted, iovad->cached_node will be updated to iovad->anchor, and then the alloc iova size exceeds the maximum iova size that can be allocated in the domain.

Official announcement: Please refer to the url for details – https://nvd.nist.gov/vuln/detail/CVE-2023-52910

CVE-2024-44070: FRRouting (FRR) – bgpd – ensure the hash works  (18th Aug 2024)

Preface: As Time Goes By , OSS (Open Source Software) for use by cost-conscious commercial companies. It is quite popular in cloud.

Background: FRRouting (FRR) is a free and open source Internet routing protocol suite for Linux and Unix platforms. It implements BGP, OSPF, RIP, IS-IS, PIM, LDP, BFD, Babel, PBR, OpenFabric and VRRP, with alpha support for EIGRP and NHRP.

FRR’s seamless integration with native Linux/Unix IP networking stacks makes it a general purpose routing stack applicable to a wide variety of use cases including connecting hosts/VMs/containers to the network, advertising network services, LAN switching and routing, Internet access routers, and Internet peering.

Vulnerability details: An issue was discovered in FRRouting (FRR) through 10.1. bgp_attr_encap in bgpd/bgp_attr.c does not check the actual remaining stream length before taking the TLV value.

Official announcement: For details, please refer to link – https://www.tenable.com/cve/CVE-2024-44070

CVE-2024-43855: md/raid5 – recheck if reshape has finished with device_lock held. From technical point of view, it also impact RedHat cluster. (18 Aug 2024)

Preface: LVM version 2, or LVM2, is the default for Red Hat Enterprise Linux, which uses the device mapper driver contained in the 2.6 kernel. LVM2, which is almost completely compatible with the earlier LVM1 version, can be upgraded from versions of Red Hat Enterprise Linux running the 2.4 kernel.

The Clustered Logical Volume Manager (CLVM) is a set of clustering extensions to LVM. These extensions allow a cluster of computers to manage shared storage (for example, on a SAN) using LVM.

Background: In the Mutex concept, when the thread is trying to lock or acquire the Mutex which is not available then that thread will go to sleep until that Mutex is available. Whereas in Spinlock it is different. The spinlock is a very simple single-holder lock. If a process attempts to acquire a spinlock and it is unavailable, the process will keep trying (spinning) until it can acquire the lock. This simplicity creates a small and fast lock.

Vulnerability details: Deadlock occurs when mddev is being suspended while some flush bio is in progress. It is a complex issue.

T1. the first flush is at the ending stage, it clears ‘mddev->flush_bio’ and tries to submit data, but is blocked because mddev is suspended by T4.

T2. the second flush sets ‘mddev->flush_bio’, and attempts to queue md_submit_flush_data(), which is already running (T1) and won’t execute again if on the same CPU as T1.

T3. the third flush inc active_io and tries to flush, but is blocked because ‘mddev->flush_bio’ is not NULL (set by T2).

T4. mddev_suspend() is called and waits for active_io dec to 0 which is inc by T3.

The root issue is non-atomic inc/dec of active_io during flush process.

Official announcement: For details, please refer to link –

https://nvd.nist.gov/vuln/detail/CVE-2024-43855

CVE-2024-21969: (AMD security focus) Whispering Pixels – Exploiting Uninitialized Register Accesses in Modern GPUs (14th Aug 2024)

Preface: The new AMD Radeon Instinct MI50 hints at the capabilities of AMD’s future GPUs. A study proof MI50 is capable scientific and ML applications.

Background: The proliferation of graphics processing units (GPUs) has brought unprecedented computing power.

Multiple register-based vulnerabilities found across different GPU implementations.

So-called whisper pixels. The vulnerability poses unique challenges to an adversary due to opaque scheduling and register remapping algorithms present in the GPU firmware, complicating the reconstruction of leaked data.

GPU Programming: An application has to use vendor- provided libraries in order to translate a shader from its high-level source code to an architecture-dependent binary code. Vendors provide these libraries for a variety of high-level languages.

Vulnerability details: Improper clearing of GPU registers could allow a malicious shader to read left-over pixel data leading to loss of confidentiality.

Mitigation: AMD plans to create a new operating mode designed to prevent processes from running in parallel on the GPU, and to clear registers between processes on supported products.

Official announcement: Please refer to the website for details – https://www.amd.com/en/resources/product-security/bulletin/amd-sb-6013.html

CVE-2024-38160: Windows Network Virtualization Remote Code Execution Vulnerability (13th Aug 2024)

Preface: A HEAP-Based buffer overflow vulnerability occurs when a program writes more data to a heap-allocated memory buffer than the buffer is designed to hold.

Background: Microsoft provides network virtualization in Hyper-V with Windows Server 2016 and 2019. With this feature, workloads on Hyper-V can connect to virtualized Layer 2 networks and traffic is routed between virtual networks in Hyper-V or to and from the physical network via gateways.

Vulnerability details: Windows Network Virtualization Remote Code Execution Vulnerability.

My speculation: The new SDN features starting from windows server 2016. Because Network Controller uses Representational State Transfer (REST) on its northbound interface with JavaScript Object Notation (JSON) payloads.

As a matter of fact, it is possible to exploit heap overflow techniques in the JavaScript interpreter. Are the vulnerabilities reported by Microsoft related to this factor?

Official announcement: Please refer to the url for details – https://nvd.nist.gov/vuln/detail/CVE-2024-38160

CVE-2024-42257: In the Linux kernel, the vulnerability related to ext4 has been resolved (12th Aug 2024)

Published: 2024-08-08

Updated: 2024-08-07

Preface: Ext4 is a series of backwards-compatible extensions of Ext2. It is also the file system of most Linux distributions. Ext4 is supported on other operating systems including Windows, Free BSD, macOS and KolibriOS (read-only)

Background: Ext4 – The fourth generation extension file system is a log file system under the Linux system and is the successor version of the ext3 file system.

Advantage

-Has the largest single file size and volume file system size

-Supports all bytes except NULL and ‘/.’

-You can convert Ext3 file system to Ext4

-Includes advanced features such as stretching, directory indexing,

-delayed allocation and disk defragmentation

-Unlimited subdirectories

Disadvantage

-No data security provided

-Difficulty creating snapshots on different volumes

-Use more disk space

Vulnerability details: In the Linux kernel, the following vulnerability has been resolved: ext4: use memtostr_pad() for s_volume_name As with the other strings in struct ext4_super_block, s_volume_name is not NUL terminated. The other strings were marked in commit 072ebb3bffe6 (“ext4: add nonstring annotations to ext4.h”). Using strscpy() isn’t the right replacement for strncpy(); it should use memtostr_pad() instead.

Ref: Failure to properly null-terminate a character sequence that is passed to a library function that expects a string can result in buffer overflows and the execution of arbitrary code with the permissions of the vulnerable process. Null-termination errors can also result in unintended information disclosure.

Official announcement: Please refer to the link for details – https://www.tenable.com/cve/CVE-2024-42257

CVE-2024-7589: Race condition vulnerability occurs when integration of blacklisted in OpenSSH in FreeBSD. (11-Aug-2024)

Preface: SSH clients are designed for direct user interaction, providing a command-line interface (CLI) or graphical user interface (GUI) on the initiating device. The SSHD operates as a background process, running silently in the background without any user intervention.

Background: How do I make my SSH connection more stable?

SSH servers often have an idle timeout period, after which they automatically disconnect idle sessions. To prevent premature disconnections, consider modifying the server’s idle timeout setting. To modify the idle timeout: Locate the SSH server configuration file, typically located at /etc/ssh/sshd_config .

Vulnerability details: A signal handler in sshd(8) may call a logging function that is not async-signal-safe. The signal handler is invoked when a client does not authenticate within the LoginGraceTime seconds (120 by default). This signal handler executes in the context of the sshd(8)’s privileged code, which is not sandboxed and runs with full root privileges. This issue is another instance of the problem in CVE-2024-6387 addressed by FreeBSD-SA-24:04.openssh.

The faulty code in this case is from the integration of blacklistd in OpenSSH in FreeBSD. As a result of calling functions that are not async-signal-safe in the privileged sshd(8) context, a race condition exists that a determined attacker may be able to exploit to allow an unauthenticated remote code execution as root.

Official announcement: Please refer to the link for details https://www.tenable.com/cve/CVE-2024-7589

CVE-2024-1067 was released in May 2024.  Are CVE-2024-2937 and CVE-2024-4607, released in August 2024, caused by similar design flaws? (7 Aug 2024)

Preface: May 2021 to May 2023 (ARM GPU evolution)

Valhall 3rd Gen – On May 25, 2021, Arm announced their Valhall 3rd Gen GPU Architecture (as part of TCS21), including the Mali-G710, Mali-G510, and Mali-G310 GPUs.

Valhall 4th Gen – On June 28, 2022, Arm announced their Valhall 4th Gen GPU Architecture (as part of TCS22), including the Immortalis-G715, Mali-G715, and Mali-G615 GPUs.

5th Gen – On May 29, 2023, Arm announced their 5th Gen Arm GPU Architecture (as part of TCS23), including the Immortalis-G720, Mali-G720 and Mali-G620 GPUs.

Background: The New 5th Gen Arm GPU Architecture

The 5th Gen GPU architecture introduces a key feature called Deferred Vertex Shading (DVS), which revolutionizes data flow within the GPU and expands the number of GPU cores, reaching up to 16 cores for enhanced performance.

The Arm 5th Gen GPU architecture is the most efficient GPU architecture Arm has ever created, designed with CPU and system architecture in mind. It redefines parts of the graphics pipeline to significantly reduce memory bandwidth, thus improving total system efficiency and power.

Technical reference: It solves the bandwidth problem of the traditional model because the fragment shader reads a small block each time and puts it on the chip. It does not need to read the memory frequently until the final operation is completed and then writes it to the memory. You can even further reduce memory reads and writes by compressing tiles. In addition, when some areas of the image are fixed, the function can be called to determine whether the tiles are the same to reduce repeated rendering.

Vulnerability details: Judging from the descriptions of the two different vulnerabilities, they appear to be the same (see below):

https://nvd.nist.gov/vuln/detail/CVE-2024-2937

Use After Free vulnerability in Arm Ltd Bifrost GPU Kernel Driver, Arm Ltd Valhall GPU Kernel Driver, Arm Ltd Arm 5th Gen GPU Architecture Kernel Driver allows a local non-privileged user to make improper GPU memory processing operations to gain access to already freed memory.This issue affects Bifrost GPU Kernel Driver: from r41p0 through r49p0; Valhall GPU Kernel Driver: from r41p0 through r49p0; Arm 5th Gen GPU Architecture Kernel Driver: from r41p0 through r49p0.

https://nvd.nist.gov/vuln/detail/CVE-2024-4607

Use After Free vulnerability in Arm Ltd Bifrost GPU Kernel Driver, Arm Ltd Valhall GPU Kernel Driver, Arm Ltd Arm 5th Gen GPU Architecture Kernel Driver allows a local non-privileged user to make improper GPU memory processing operations to gain access to already freed memory.This issue affects Bifrost GPU Kernel Driver: from r41p0 through r49p0; Valhall GPU Kernel Driver: from r41p0 through r49p0; Arm 5th Gen GPU Architecture Kernel Driver: from r41p0 through r49p0.

Affected Products

Bifrost GPU kernel driver versions : r41p0 (inclusive) ~ r49p0 (inclusive)

Valhall GPU Kernel Driver versions: r41p0 (inclusive) ~ r49p0 (inclusive)

Arm 5th Generation GPU Architecture Kernel Driver versions: r41p0 (inclusive) ~ r49p0 (inclusive)

CVE-2024-41012: filelock- Remove locks reliably when fcntl/close race is detected (24/07/2024)

In the Linux kernel, design weakness (CVE-2024-41012) has been resolved.

Preface: The GFP acronym stands for “get free pages”, the underlying memory allocation function. Diversity of the allocation APIs combined with the numerous GFP flags makes the question “How should I allocate memory?” not that easy to answer, although very likely you should use. kzalloc(<size>, GFP_KERNEL);

Background: You can build a 64-bit POSIX-compliant tick-less kernel with a Linux-compatible syscall implementation using Go.

Vulnerability details: When fcntl_setlk() races with close(), it removes the created lock with do_lock_file_wait(). However, LSMs can allow the first do_lock_file_wait() that created the lock while denying the second do_lock_file_wait() that tries to remove the lock. In theory (but AFAIK not in practice), posix_lock_file() could also fail to remove a lock due to GFP_KERNEL allocation failure (when splitting a range in the middle).

After the bug has been triggered, use-after-free reads will occur in lock_get_status() when userspace reads /proc/locks. This can likely be used to read arbitrary kernel memory, but can’t corrupt kernel memory. This only affects systems with SELinux / Smack / AppArmor / BPF-LSM in enforcing mode and only works from some security contexts.

Official announcement: Please refer to the official announcement for details – https://nvd.nist.gov/vuln/detail/CVE-2024-41012