CVE-2022-28200 NVIDIA DGX A100 contains a vulnerability in SBIOS in the BiosCfgTool (3rd July 2022)

Preface: Collecting and storing data is one of the most important steps of the AI workflow. AI analytics refers to a subset of business intelligence that uses machine learning techniques to discover insights, find new patterns and discover relationships in the data. In practice, AI analytics is the process of automating much of the work that data analysts typically perform.

Background: There are two models of the NVIDIA DGX A100 system: the NVIDIA DGX A100 640GB system and the NVIDIA DGX A100 320GB system. FLOPS and MIPS are both used to measure a computer’s number-crunching performance. The DGX A100, as the most recent iteration is named, is capable of five petaflops of FP16 performance, or 2.5 petaflops TF32, and 156 teraflops FP64. It also runs at 10 petaops (not flops) with INT8.
The NVIDIA DGX A100 system comes with a system BIOS with optimized settings for the DGX system. There may be situations where the settings would need to be changed, such as changes in the boot order, changes to enable PXE booting, or changes in the BMC network settings.
Connect to the DGX A100 console using either a direct connection or a remote connection through the BMC.
Ref: DGX OS Server software installs Docker Engine which uses the 172.17.xx.xx sub-net by default for Docker containers. If the DGX A100 system is on the same subnet, you will not be able to establish a network connection to the DGX A100 system.
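The subnet conflict described above can be checked before deployment. The following is a minimal sketch, assuming the Docker default range is 172.17.0.0/16 (the /16 prefix is my reading of the 172.17.xx.xx range quoted above) and using a hypothetical helper name, conflicts_with_docker:

```python
import ipaddress

# Assumption: 172.17.xx.xx corresponds to the 172.17.0.0/16 network.
DOCKER_DEFAULT_SUBNET = ipaddress.ip_network("172.17.0.0/16")


def conflicts_with_docker(address, subnet=DOCKER_DEFAULT_SUBNET):
    """Return True if the given IPv4 address falls inside the Docker subnet."""
    return ipaddress.ip_address(address) in subnet


if __name__ == "__main__":
    # Example addresses are illustrative only.
    for candidate in ("172.17.5.10", "10.0.0.5"):
        status = "conflict" if conflicts_with_docker(candidate) else "ok"
        print(candidate, status)
```

If the management address of the system reports a conflict, either the system or the Docker bridge would need to move to a different subnet.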

Vulnerability details: NVIDIA DGX A100 contains a vulnerability in SBIOS in the BiosCfgTool, where a local user with elevated privileges can read and write beyond intended bounds in SMRAM, which may lead to code execution, escalation of privileges, denial of service, and information disclosure. The scope of impact can extend to other components.

My observation: The vendor has not disclosed the technical details, so what follows is based on the existing design. It is worth checking whether the “REDFISH” server is affected by the following weakness. This SSH design weakness was found this year.
My speculation is based on the following.
By default, Redfish support is enabled in the DGX A100 BMC and the BIOS.

  1. After first-boot setup, click Launch KVM. The DGX A100 console appears in your browser.
  2. After the system has been configured, you can also establish an SSH connection to the DGX A100 OS through the network port.

If the circumstances below occur, the consequences would be similar to those described in the CVE.

curl would reuse a connection even if the subsequent transfer would have changed one or more of these options.
TLS options
• CURLOPT_SSL_OPTIONS (since 7.25.0)
• CURLOPT_CRLFILE (since 7.19.0)
• CURLOPT_TLSAUTH_USERNAME (since 7.21.4)
• CURLOPT_TLSAUTH_PASSWORD (since 7.21.4)
• CURLOPT_PROXY_SSL_OPTIONS (since 7.52.0)
• CURLOPT_PROXY_CRLFILE (since 7.52.0)
• CURLOPT_PROXY_TLSAUTH_USERNAME (since 7.52.0)
• CURLOPT_PROXY_TLSAUTH_PASSWORD (since 7.52.0)
SSH options
• CURLOPT_SSH_PUBLIC_KEYFILE (since 7.16.1)
• CURLOPT_SSH_PRIVATE_KEYFILE (since 7.16.1)
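
To illustrate why ignoring these options during connection reuse matters, here is a toy model in Python. It is not libcurl's implementation; the ConnKey and ConnectionPool names and the two option fields are hypothetical stand-ins for the options listed above. A lax pool matches connections on host and port only, while a strict pool also compares the credential-related options:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConnKey:
    """Hypothetical description of a requested connection."""
    host: str
    port: int
    tls_auth_username: Optional[str] = None
    ssh_private_keyfile: Optional[str] = None


class ConnectionPool:
    """Toy pool that hands out integer connection ids."""

    def __init__(self, strict):
        self.strict = strict
        self._conns = {}
        self._next_id = 0

    def _lookup_key(self, key):
        # Strict mode compares every field; lax mode ignores the options.
        return key if self.strict else (key.host, key.port)

    def get(self, key):
        lookup = self._lookup_key(key)
        if lookup not in self._conns:
            self._next_id += 1
            self._conns[lookup] = self._next_id
        return self._conns[lookup]


if __name__ == "__main__":
    first = ConnKey("bmc.example", 22, ssh_private_keyfile="/keys/alice")
    second = ConnKey("bmc.example", 22, ssh_private_keyfile="/keys/bob")
    lax = ConnectionPool(strict=False)
    strict = ConnectionPool(strict=True)
    print("lax reuse:", lax.get(first) == lax.get(second))
    print("strict reuse:", strict.get(first) == strict.get(second))
```

In the lax pool the second transfer rides on a connection that was authenticated with the first transfer's key, which is the kind of unintended reuse described above.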

Official announcement – NVIDIA has released a security update for NVIDIA DGX A100 firmware. This update addresses issues that may lead to information disclosure, denial of service, or escalation of privileges.
https://nvidia.custhelp.com/app/answers/detail/a_id/5367

Solution: To protect your system, download and install this firmware update through the NVIDIA Enterprise Support Portal.
