Preface: High-performance computing is a method of processing large amounts of data and performing complex calculations at high speed. HPC is well suited for AI, which uses large data sets and complex models. HPC and AI combined have use cases in the following areas: Predictive Analytics. Physics and Modeling.
IO-Heavy HPC Computing: Requires systems that can read/write and store large amounts of data on disks. This type of computing includes systems that provide fast NVMe implementations for local IO or as part of a parallel file system.
Background: What is metadata for NVMe? Similar to SCSI / SAS devices, the NVMe standard supports the addition of 8 bytes (called metadata or protection information (PI)) to each data sector to ensure data integrity during data transfer.
NVMe protocol defines commands that utilize Physical Region Pages (PRP)/Scatter Gather Lists (SGL) to denote a data buffer location in host memory. The data buffer may be represented using single or multiple PRP/SGL entries similar to a linked list. Associated information for a command including PRP/SGL may be formed before the command is issued to the SSD for execution. The SSD, while executing the command, may fetch the associated PRP/SGL and perform data movement related to the command.
However, NVMe has no separate field to encode the metadata length expected (except when using SGLs). Because of that we can’t allow to transfer arbitrary metadata, as a metadata buffer that is shorted than what the device expects for the command will lead to arbitrary kernel (if bounce buffering) or userspace (if not) memory corruption.
Vulnerability details: A buffer overflow vulnerability was found in the NVM Express (NVMe) driver in the Linux kernel. An unprivileged user could specify a small meta buffer and let the device perform larger Direct Memory Access (DMA) into the same buffer, overwriting unrelated kernel memory, causing random kernel crashes and memory corruption.
Official announcement: Please refer to the link for details – https://nvd.nist.gov/vuln/detail/CVE-2023-6238