Preface:The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.
Background: The NVIDIA host engine, nv-hostengine, is a thin wrapper around the DCGM shared library. Its main job is to instantiate the DCGM library as a persistent standalone process, including appropriate management of the monitoring and management activities.
Vulnerability details: NVIDIA DCGM contains a vulnerability in nvhostengine, where a network user can cause detection of error conditions without action, which may lead to limited code execution, some denial of service, escalation of privileges, and limited impacts to both data confidentiality and integrity.
My observations and details
Step1. The NVIDIA host engine, nv-hostengine, is a thin wrapper around the DCGM shared library. Its main job is to instantiate the DCGM library as a persistent standalone process, including appropriate management of the monitoring and management activities.
Step2. Starting the Prometheus Client
The script dcgm_prometheus.py is provided as a fully functional Prometheus client that will publish timeseries data on a port to be read (scraped) by Prometheus. By default, this script will publish common fields read from a DCGM instance running locally every 10 seconds to localhost:8000. Information on controlling what is published, how often, and on what port will be in the section on customization.
On the client side, start this script. It can either connect to a standalone host engine or run on embedded in the script.
To start an embedded host engine and check that it is publishing:
$ python dcgm_prometheus[.]py -e
$ curl localhost:8000 > tmp[.]txt
Step3. Whether below details is the design weakness of this vulnerability because vendor did not explain in details.
dcgm_prometheus.py error AttributeError: ‘DcgmPrometheus’ object has no attribute ‘m_publishFieldIds’
Remark: When class-validator is used to validate user-input, the attributes in the user-input object will be transformed into the validation class instance.
However, the transforming procedure will overwrite the internal attribute of validation class instance (e.g., constructor attribute) if the attacker injects an attribute with the same name into user-input. Once this internal attribute being overwritten, class-validator will be bypassed.
Official announcement: Reference link for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5328