Preface: Nvidia acquires Bright Computing, maker of Bright Cluster Manager software that controls the configuration of clustered HPC systems, including Nvidia’s own DGX servers and HGX systems manufactured by OEMs and ODMs, as well as clusters from other manufacturers.
Background: NVIDIA Base Command Manager provides cluster management software for streamlining cluster provisioning, workload management, and infrastructure monitoring. It provides all the tools for deploying and managing an AI data center.
Note: Base Command Manager 10 is licensed on a per-GPU base. This differs from the node-base licensing model of Bright Cluster Manager. Customers with active support subscriptions using Bright Cluster Manager 9.2 and earlier can upgrade to Base Command Manager 10 by exchanging their current licenses for GPU-based Base Command Manager 10 licenses at no cost.
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. SLURM is a vital component of a parallel computer for executing parallel jobs. It simplifies the job execution by hiding all the complexities.
Vulnerability details: NVIDIA Base Command Manager and Bright Cluster Manager for Linux contain an insecure temporary file vulnerability. A successful exploit of this vulnerability might lead to denial of service.
Official announcement: Please see the link below for details – https://nvidia.custhelp.com/app/answers/detail/a_id/5600