Preface: NVIDIA NeMo is a widely adopted, end-to-end framework for building, customizing, and deploying generative AI models, including large language models (LLMs), and conversational AI agents. It is primarily used to tailor open-source models—such as Llama, Mistral, and Google Gemma—with proprietary enterprise data.
Ollama, Mistral, and Google Gemma represent a powerful ecosystem for running local, open-weight Large Language Models (LLMs). Ollama acts as the engine to run models, while Mistral and Gemma are two of the most popular, high-performing model families designed to be efficient enough to run on personal computers.
Background: The restore_from() method is a core piece of NeMo functionality used to load local checkpoint files with the .nemo extension.
Key Details of restore_from() –
- Purpose: Fully restores a model instance, including its weights and configuration, from a local .nemo file for evaluation, inference, or fine-tuning.
- File Structure: A .nemo file is an archive (specifically a tar.gz file) containing the model's weights and a model_config.yaml file that defines its architecture.
- Usage: It is called directly from the model's base class (e.g., ASRModel.restore_from(restore_path="path/to/file.nemo")).
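To illustrate the file structure described above, the sketch below builds a minimal .nemo-style tar.gz archive with Python's standard library and lists its members. This is an illustration only, not a loadable NeMo checkpoint; the member names and placeholder contents are assumptions mirroring the layout described, and the actual restore_from() call is shown in a comment because it requires nemo_toolkit to be installed.

```python
import io
import tarfile

def build_toy_nemo_archive(path: str) -> None:
    """Create a minimal .nemo-style tar.gz archive (illustration only,
    not a real NeMo checkpoint)."""
    members = {
        "model_config.yaml": b"target: toy\n",  # architecture definition
        "model_weights.ckpt": b"\x00\x01",      # placeholder weight bytes
    }
    with tarfile.open(path, "w:gz") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

def list_archive_members(path: str) -> list[str]:
    """Return the member names of a .nemo-style archive."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()

archive = "toy.nemo"
build_toy_nemo_archive(archive)
print(list_archive_members(archive))

# With nemo_toolkit installed, a real checkpoint would be restored like:
#   from nemo.collections.asr.models import ASRModel
#   model = ASRModel.restore_from(restore_path="path/to/file.nemo")
```

Listing the archive members this way makes the config-plus-weights layout visible without deserializing anything.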
Vulnerability details: (see below)
CVE-2026-24157 NVIDIA NeMo Framework contains a vulnerability in checkpoint loading where an attacker could cause remote code execution. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure and data tampering.
CVE-2026-24159 NVIDIA NeMo Framework contains a vulnerability where an attacker may cause remote code execution. A successful exploit of this vulnerability might lead to code execution, escalation of privileges, information disclosure and data tampering.
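Because both issues involve deserialization during checkpoint loading, one defensive measure (a sketch of my own, not an official NVIDIA control) is to audit an untrusted .nemo archive before ever passing it to restore_from(). The suffix list and warning strings below are assumptions for illustration; passing the audit does not prove a file is safe, it only means these specific red flags were absent.

```python
import io
import tarfile

# Member extensions that commonly carry pickled data; Python's pickle
# deserialization can execute arbitrary code on load.
PICKLE_SUFFIXES = (".pkl", ".pickle", ".pt", ".ckpt", ".bin")

def audit_nemo_archive(path: str) -> list[str]:
    """Return warnings for a .nemo-style tar.gz archive without
    deserializing any member: flags likely pickle payloads and
    path-traversal member names."""
    warnings = []
    with tarfile.open(path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.name.startswith("/") or ".." in member.name.split("/"):
                warnings.append(f"suspicious path: {member.name}")
            if member.name.endswith(PICKLE_SUFFIXES):
                warnings.append(f"possible pickle payload: {member.name}")
    return warnings

def make_demo_archive(path: str) -> None:
    """Build a tiny archive with a pickle-named member for demonstration."""
    data = b"x"
    with tarfile.open(path, "w:gz") as tar:
        info = tarfile.TarInfo("model_weights.ckpt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

make_demo_archive("untrusted.nemo")
print(audit_nemo_archive("untrusted.nemo"))
```

The safer operational rule remains the simplest one: only restore .nemo checkpoints obtained from sources you trust, and keep NeMo patched per the advisory.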
Official announcement: Please refer to the link for details –