Category Archives: AI and ML

CVE-2023-22886: Improper Input Validation vulnerability in Apache Software Foundation Apache Airflow JDBC Provider (30th June 2023)

Preface: Airflow is a platform to programmatically author, schedule, and monitor workflows. Specifically, it is used in Machine Learning to create pipelines.

Background: Apache Airflow™ is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. It is most suitable for pipelines that change slowly, are related to a specific time interval, or are pre-scheduled. It’s a popular solution that many data engineers rely on for building their data pipelines. Note that Airflow is batch-oriented and is not designed for processing ongoing data streams in real time. It has been used to run SQL, machine learning models, and more.

Apache Airflow is a Python-based platform to programmatically author, schedule and monitor workflows. It is well-suited to machine learning for building pipelines, managing data and training models.

You can use Apache Airflow to schedule pipelines that extract data from multiple sources and run Spark jobs or other data transformations, as well as machine learning model training.

Vulnerability details: Improper Input Validation vulnerability in Apache Software Foundation Apache Airflow JDBC Provider. The [Connection URL] parameters of an Airflow JDBC Provider connection had no restrictions, which made it possible to carry out RCE attacks via different types of JDBC drivers and obtain Airflow server permissions. This issue affects Apache Airflow JDBC Provider versions before 4.0.0.

Recommendation: For security purposes, you should avoid building the connection URLs based on user input. For user name and password values, use the connection property collections. Restrict direct usage of driver params via extras for JDBC connection.
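
The recommendation above can be sketched in Python. This is only an illustration under stated assumptions, not the Airflow provider's code: the function names build_jdbc_url and connection_properties, the PostgreSQL URL template, and the validation patterns are all hypothetical. The idea is that user-supplied values are checked against strict patterns, no driver parameters can be appended to the URL, and credentials travel in a separate property collection.

```python
import re

# Hypothetical allowlists for this sketch; tighten them for a real deployment.
_HOST_RE = re.compile(r"[A-Za-z0-9.-]{1,253}")
_DB_RE = re.compile(r"[A-Za-z0-9_]{1,63}")


def build_jdbc_url(host, port, database):
    """Build a JDBC URL from a fixed template, rejecting anything unexpected."""
    if not _HOST_RE.fullmatch(host):
        raise ValueError("invalid host")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError("invalid port")
    if not _DB_RE.fullmatch(database):
        raise ValueError("invalid database name")
    # No query string is ever appended, so callers cannot inject driver parameters.
    return f"jdbc:postgresql://{host}:{port}/{database}"


def connection_properties(user, password):
    """Keep credentials out of the URL; pass them as a separate property map."""
    return {"user": user, "password": password}
```

A database name such as "sales?socketFactory=evil" fails the pattern check and raises ValueError instead of reaching the driver.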

Remedy: To configure driver parameters (driver path and driver class), you can use the following methods:

  1. Supply them as constructor arguments when instantiating the hook.
  2. Set the “driver_path” and/or “driver_class” parameters in the “hook_params” dictionary when creating the hook using SQL operators.
  3. Set the “driver_path” and/or “driver_class” extra in the connection and correspondingly enable the “allow_driver_path_in_extra” and/or “allow_driver_class_in_extra” options in the “providers.jdbc” section of the Airflow configuration.
  4. Patch the “JdbcHook.default_driver_path” and/or “JdbcHook.default_driver_class” values in the “local_settings.py” file.

Official announcement: For details, please refer to the link – https://github.com/advisories/GHSA-mm87-c3x2-6f89

Can you foresee how much AI and machine learning infrastructure there will be in the next few years? (28th June 2023)

Preface: ChatGPT Memory uses the Redis vector database to store an embedded conversation history of past user-bot interactions.
The first interaction between the user and bot is critical to the user experience, said Microsoft.
There are 1,482 companies currently using OpenAI, including Adobe and Schneider Electric.

Background: ChatGPT, whose full name is Chat Generative Pre-trained Transformer, is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. The program uses large language models based on the GPT-3.5 and GPT-4 architectures and is trained with reinforcement learning.
The OpenAI API provides a suite of artificial intelligence (AI) models designed for application developers. It enables users to create AI applications that understand natural-language semantics and generate natural text, translate natural language into programming code, create images from text captions, and classify images.

FastAPI is a Python web framework based on the Starlette microframework. With deep support for asyncio, FastAPI is indeed very fast.
FastAPI also distinguishes itself with features like automatic OpenAPI (OAS) documentation for your API, easy-to-use data validation tools, and more.
Integrating OpenAI APIs into a FastAPI application makes it easy to call them through the Swagger UI.
FastAPI is a modern Python web framework for building APIs quickly and efficiently. By leveraging FastAPI’s features and integrating OpenAI’s APIs, developers can build applications with powerful AI capabilities such as language translation, sentiment analysis, text summarization, question-answering, and more.

How to install OpenAI in Python
Step 1: Sign up for an OpenAI API key. Visit the OpenAI website to register for an account; if you already have an active ChatGPT account, you can use the same account to sign in.
If you are a Linux user, give it a try.
How to install OpenAI on Linux?
Step 2: Upgrade pip and install the openai library.
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade openai

….
For details, please refer to the official link – https://openai.com/

About CVE-2023-3220 An issue was discovered in the Linux kernel through 6.1-rc8 (20th June 2023)

Preface: AI Engines are built from the ground up to be software programmable and hardware adaptable. There are two distinct design flows for any developer to unleash the performance of these compute engines with the ability to compile in minutes and rapidly explore different microarchitectures.
Today’s technology is capable of on-device intelligence powered by the AI Engine. The 3rd generation AI Engine enables on-device intelligence and simplifies how pictures and videos are taken.

Background: The Qualcomm Robotics RB3 development kit includes the purpose-built robotics-focused DragonBoard™ 845c development board, based on the Qualcomm® SDA845 processor and compliant with the 96Boards open hardware specification to support a broad range of mezzanine-board expansions.
The development board supports Linux and Robot Operating System (ROS), while also including support for the Qualcomm® Neural Processing software development kit (SDK) for advanced on-device AI, the Qualcomm® Computer Vision Suite, the Qualcomm® Hexagon DSP SDK, and AWS RoboMaker.

Vulnerability details: An issue was discovered in the Linux kernel through 6.1-rc8. dpu_crtc_atomic_check in drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c lacks a check of the return value of kzalloc(), which can cause a NULL pointer dereference.

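Ref: The kzalloc() function behaves like kmalloc(), with one difference: the allocated memory is cleared to zero when the allocation succeeds.
Both functions return NULL when the allocation fails, so the return value must be checked before the pointer is used. Every successful kzalloc() must also be paired with a corresponding kfree() to release the memory.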

Official announcement: For details, please refer to the link – https://nvd.nist.gov/vuln/detail/CVE-2023-3220

CVE-2023-29403: Are you falling into this Go runtime design weakness? (9th June 2023)

Preface: Go is garbage collected rather than relying on manual memory management, which makes it unsuitable for writing a kernel.
Golang is useful for carrying out programming for scalable servers and large software systems. The Golang programming language was built to fill in the gaps of C++ and Java that Google came across while working with its servers and distributed systems.

Background: Go does have an extensive library, called the runtime, that is part of every Go program. The runtime library implements garbage collection, concurrency, stack management, and other critical features of the Go language.
The Go scheduler is part of the Go runtime, and the Go runtime is built into your application. This means the Go scheduler runs in user space, above the kernel.
For Go to “call the kernel directly”, it relies on the so-called ABI of the hardware and OS combination. For example, on Linux, making a syscall requires filling a set of CPU registers with certain values, making some other arrangements, and then issuing a system-call instruction such as SYSENTER.
The 64-bit x86 Linux ABI supports the following entry points:

  • SYSCALL from 64-bit code;
  • interrupt 0x80 from 32- and 64-bit code;
  • SYSENTER from 32-bit code.

Vulnerability details: On Unix platforms, the Go runtime does not behave differently when a binary is run with the setuid/setgid bits. This can be dangerous in certain cases, such as when dumping memory state, or assuming the status of standard I/O file descriptors. If a setuid/setgid binary is executed with standard I/O file descriptors closed, opening any files can result in unexpected content being read or written with elevated privileges. Similarly, if a setuid/setgid program is terminated, either via panic or signal, it may leak the contents of its registers.
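
The closed-descriptor problem described above can be illustrated with a defensive pattern in Python. This is a sketch of the general idea, not the Go runtime's code: the helper names is_fd_open and ensure_fds_open are hypothetical. If descriptor 0, 1, or 2 is closed at startup, the next file the program opens silently takes that number, so the sketch fills any closed descriptor with the null device first.

```python
import errno
import os


def is_fd_open(fd):
    """Return True if the descriptor refers to an open file."""
    try:
        os.fstat(fd)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return False
        raise
    return True


def ensure_fds_open(fds=(0, 1, 2)):
    """Point every closed descriptor in fds at the null device.

    Returns the list of descriptors that had to be repaired.
    """
    repaired = []
    for fd in fds:
        if not is_fd_open(fd):
            null_fd = os.open(os.devnull, os.O_RDWR)
            if null_fd != fd:
                # The kernel handed out a different number; move it into place.
                os.dup2(null_fd, fd)
                os.close(null_fd)
            repaired.append(fd)
    return repaired
```

A privileged program would call ensure_fds_open() before opening any other file, so that later writes to standard output or standard error cannot land in an unrelated file.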

Official announcement: For details, please refer to the link – https://pkg.go.dev/vuln/GO-2023-1840

About CVE-2023-33962 – JStachio fails to escape single quotes in HTML (31st May 2023)

Preface: About 20 years ago, people already knew that Java could be unsafe. Perhaps because of technology trends, designs adopted the Java language regardless. Its flexibility and ease of use can lead people to neglect cyber security awareness. While Java is considered relatively safe because it is a server-side language, there are still multiple ways to attack and access secure code you’d like to remain private.

Background: The Spring Framework is an application framework and inversion of control container for the Java platform. The framework’s core features can be used by any Java application, but there are extensions for building web applications on top of the Java EE platform.
Mustache is a logicless template engine and it is helpful for creating dynamic content like HTML and configuration files.
If your models are type-based and not just a Map, then JStachio is a good choice.

Ref: End users only need the JVM and JRE to execute an application program. The JRE provides the class libraries needed for execution, while the JVM is a component of the JRE that translates bytecode into machine language and handles other minor tasks. Each JVM server can have a maximum of 256 threads to run Java applications.

Vulnerability details: Prior to version 1.0.1, JStachio fails to escape single quotes ' in HTML, allowing an attacker to inject malicious code. This vulnerability can be exploited by an attacker to execute arbitrary JavaScript code in the context of other users visiting pages that use this template engine. This can lead to various consequences, including session hijacking, defacement of web pages, theft of sensitive information, or even the propagation of malware.

Solution: Version 1.0.1 contains a patch for this issue.

Workaround: To mitigate this vulnerability, the template engine should properly escape special characters, including single quotes. Common practice is to escape ' as &#39;. As a workaround, users can avoid this issue by using only double quotes " for HTML attributes.
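
The effect of the missing escape can be shown with a short Python sketch. It does not use JStachio; it relies on the standard library's html.escape, and the function names render_title and render_title_without_quote_escape are hypothetical. The second function imitates an escaper that handles &, <, > and " but forgets the single quote.

```python
import html


def render_title(user_text):
    """Render a single-quoted attribute with full escaping."""
    # quote=True also escapes " and ' (the latter as &#x27;, equivalent to &#39;).
    return "<a title='" + html.escape(user_text, quote=True) + "'>profile</a>"


def render_title_without_quote_escape(user_text):
    """Imitate an escaper that forgets the single quote."""
    partial = (
        user_text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
    )
    return "<a title='" + partial + "'>profile</a>"


payload = "x' onmouseover='alert(1)"
safe = render_title(payload)
unsafe = render_title_without_quote_escape(payload)
```

In the unsafe output the payload closes the title attribute and adds its own onmouseover attribute; in the safe output the quote stays inside the attribute value as a character reference.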

Official announcement: For details, please refer to the link – https://nvd.nist.gov/vuln/detail/CVE-2023-33962

CVE-2023-32067 – c-ares is vulnerable to denial of service (30th May 2023)

Preface: c-ares is a C library for asynchronous DNS requests. When using c-ares, you usually only need to include the ares.h header file, which pulls in the library's related header files.

Background: For some asynchronous DNS requests, Node.js uses a C library called c-ares.
So-called Async DNS: this feature bypasses the normal operating system mechanisms for resolving domain names and lets the application (for example, a browser) resolve names itself. In this mode, DNS requests are sent directly to the configured DNS servers and some third-party providers.

Vulnerability details: c-ares is an asynchronous resolver library. c-ares is vulnerable to denial of service. If a target resolver sends a query, the attacker forges a malformed UDP packet with a length of 0 and returns it to the target resolver. The target resolver erroneously interprets the 0 length as a graceful shutdown of the connection. This issue has been patched in version 1.19.1.

Do you have questions about the technical side of a UDP packet with a length of 0? The following explains it.
Writing a datagram of length 0 is acceptable. In the case of UDP, this results in an IP datagram containing an IP header (normally 20 bytes for IPv4 and 40 bytes for IPv6), an 8-byte UDP header, and no data. This also means that a return value of 0 from recvfrom is acceptable for a datagram protocol: It does not mean that the peer has closed the connection, as does a return value of 0 from read on a TCP socket. Since UDP is connectionless, there is no such thing as closing a UDP connection.
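
The behavior described above can be observed with a small loopback experiment in Python. This is a sketch rather than c-ares code: the function name zero_length_roundtrip is hypothetical, and it assumes a loopback interface is available. It sends an empty datagram followed by a normal one and shows that the receiver gets an empty payload first and can keep reading afterwards.

```python
import socket


def zero_length_roundtrip():
    """Send an empty datagram, then a normal one, over loopback UDP."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
        receiver.settimeout(2.0)         # never block forever in this demo
        address = receiver.getsockname()

        sender.sendto(b"", address)      # zero-length datagram: headers only
        sender.sendto(b"ping", address)  # the socket remains usable

        first, _ = receiver.recvfrom(512)
        second, _ = receiver.recvfrom(512)
        return first, second
    finally:
        sender.close()
        receiver.close()
```

A resolver that treats the empty first read as end-of-connection would stop here and never see the second datagram, which is the mistake the advisory describes.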

Official announcement: For details, please refer to the link – https://nvd.nist.gov/vuln/detail/CVE-2023-32067

About CVE-2023-33252 – Today’s cyber security concerns also apply to the future digital world of artificial intelligence (22nd May 2023)

Preface: Given the continuous development of artificial intelligence, using blockchain technology for network communication will become a must.
A blockchain is a distributed database or ledger shared among nodes in a computer network. Blockchains are best known for their key role in maintaining a secure and decentralized record of transactions in cryptocurrency systems, but they are not limited to cryptocurrency use.

Background: IDEN3 is NOT an ICO (Initial Coin Offerings). It has no token at all. It is an open source permissionless identity layer built on top of Ethereum that we expect many projects will be able to use as a foundational layer for their own identity solution.
What are Initial Coin Offerings? ICOs are another form of cryptocurrency that businesses use in order to raise capital. Through ICO trading platforms, investors receive unique cryptocurrency “tokens” in exchange for their monetary investment in the business.

Ref: Circom is a compiler written in Rust for compiling circuits written in the circom language. The compiler outputs the representation of the circuit as constraints and everything needed to compute different ZK proofs.

Vulnerability details: iden3 snarkjs through 0.6.11 allows double spending because there is no validation that the publicSignals length is less than the field modulus.
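
One way to read the advisory, and this is an assumption rather than something stated in the text above, is that each public signal value should be checked to be smaller than the field modulus. Values are reduced modulo the field order during verification, so an integer s and s + r are treated as the same field element; a system that tracks spent nullifiers by their integer encoding could then accept the same nullifier twice. The Python sketch below illustrates that range check. The function name public_signals_in_field is hypothetical and this is not the snarkjs implementation; the constant is the order of the BN254 scalar field.

```python
# Order r of the BN254 (alt_bn128) scalar field.
FIELD_MODULUS = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def public_signals_in_field(public_signals):
    """Return True only if every signal is a canonical field element.

    Signals may be ints or decimal strings; both are accepted here.
    """
    return all(0 <= int(signal) < FIELD_MODULUS for signal in public_signals)


nullifier = 12345
alias = nullifier + FIELD_MODULUS  # different integer, same field element
```

A verifier wrapper would reject the proof before verification whenever public_signals_in_field returns False.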

References:
https://github.com/iden3/snarkjs/commits/master/src/groth16_verify.js
https://github.com/iden3/snarkjs/tags

About Artificial Intelligence (NVIDIA® DGX-1): Closer look at CVE‑2023‑0209 (11th May 2023)

Preface: NVIDIA DGX-1 is an integrated deep learning workstation with a large computing capacity, which can be used to run demanding deep learning workloads. It provides GPU computing power of 1 PetaFLOPS (1 quadrillion floating-point operations per second).

Background: Historically, both vendors and attackers have overlooked (pre)EFI boot process (in)security; pre-EFI Initialization (PEI) boot stage opens many doors and offers flexibility to attackers.
The Pre-EFI Initialization (PEI) phase provides a standardized method of loading and invoking specific initial configuration routines for the processor, chipset, and system board. The PEI phase occurs after the Security (SEC) phase. The primary purpose of code operating in this phase is to initialize enough of the system to allow instantiation of the Driver Execution Environment (DXE) phase.
The Driver Execution Environment (DXE) phase is where most of the system initialization is performed. Pre-EFI Initialization (PEI), the phase prior to DXE, is responsible for initializing permanent memory in the platform so that the DXE phase can be loaded and executed.

Note: Security (SEC) and Pre-EFI (PEI) phases – both are controlled by the firmware vendor.
SEC – Init CPU, clear caches, load BIOS ROM
PEI – Initialize chipset, RAM, devices, Secure Boot (Record Secure Boot in PCR 7)

Vulnerability details: NVIDIA DGX-1 SBIOS contains a vulnerability in the Uncore PEI module, where authentication of the code executed by SSA is missing, which may lead to arbitrary code execution, denial of service, escalation of privileges, information disclosure, data tampering, and SecureBoot bypass.

Official announcement: For details, please refer to the link – https://nvidia.custhelp.com/app/answers/detail/a_id/5458

Do you know what AI is thinking? Sooner or later it will become smarter than ordinary people (8th May 2023).

Preface: The 2012 Mayan doomsday prophecy was perhaps a joke.
However, weather havoc is becoming worse and less predictable, and sudden weather changes are becoming more destructive. As a matter of fact, scientists argue that these extreme weather changes are related to the current state of the environment.
What’s more, the prophecy does not record how nature would change immediately. Nor does it say that everything would be over on the specified date.

Background: AI inventions, similar to the advent of the atomic bomb, will completely change the world, said Warren Buffett.
Two years ago, some workshop speakers said not to worry, because AI would not affect your job. In fact, artificial intelligence will replace low-level labor in the next few years, although it will not replace 100% of it immediately.
Buffett said that Microsoft founder Bill Gates once showed him the latest version of ChatGPT, in which the program can check all legal opinions in a very short time, which impressed him. He believes the power of AI technology should be cause for concern.

Common Open Source AI Software: Before AI learns to use its intelligence to protect itself without relying on people, we should know what it is and gain a basic understanding of it.

Acumos AI: Based on Linux, to help integrate other frameworks and develop cloud-based AI apps.
ClearML: ClearML announced a free hosted plan to give data scientists the freedom to manage AI/ML experiments and orchestrate workloads without investing in additional resources.
H2O.ai: Integration with Hadoop and Spark for big data-based AI modeling. Library of ML algorithms including supervised and unsupervised learning.
Mycroft.ai: Mycroft powers various elements of the voice stack using open source AI technology. There is a large community of users, developers, and translators, to constantly improve the AI algorithms.
OpenCV: Proven applications across a variety of use cases, including facial recognition, human-computer interactions, object detection, motion tracking, and more. ML library containing algorithms for decision tree learning, k-nearest neighbor algorithm, artificial neural networks, random forest, and deep neural networks (DNN), among others.
OpenNN: OpenNN is an open source AI software library for implementing neural networks and ML.
PyTorch: A production-ready environment powered by TorchServe for quickly deploying models. A distributed backend architecture to enable distributed training and performance optimization.
Rasa (Open Source): Natural language understanding to convert messages into structured data and analyze intent.
TensorFlow: Support for multiple languages, including JavaScript, which is relatively rare in the open source AI space.
Tesseract OCR: Tesseract is an OCR engine originally developed by Hewlett Packard as a proprietary technology in the 1980s. It launched as open source AI software with sponsorship from Google in 2006. Its primary use is unstructured data processing and extracting text from images, executed entirely from a command-line interface.

Former CNBC article for reference: https://www.cnbc.com/2017/11/06/stephen-hawking-ai-could-be-worst-event-in-civilization.html

Your first quantum programming (24th Mar 2023)

Preface: How much does a 1 qubit quantum computer cost? A commercial quantum computer such as the 128-qubit D-Wave One cost about $10,000,000. D-Wave systems use a process called quantum annealing to search for solutions to a problem.

Background: The Microsoft Quantum Development Kit offers durable quantum application development on hardware-accelerated compute resources. Program your quantum algorithms and formulate optimization solutions to cope with your AI development. Perhaps this is a prelude before the development of a scalable quantum computer in the future. Quantum computers have great potential to process the large datasets often used in AI experiments. By using quantum computing techniques to analyze data sets faster and more accurately than ever before, AI researchers have been able to make significant advances in fields such as machine learning.
Because quantum computers are fundamentally different from classical computers, conventional techniques used to communicate electronic information do not directly translate to quantum devices.

Suggestion:
Learn quantum programming (the Q# programming language) with the Microsoft Quantum Development Kit. Programs can be run on a simulator, and the kit provides an API for quantum computing simulation using the .NET ecosystem and Python.

The Quantum Development Kit (SDK) contains the following basic components:

  • The Q# programming language (pronounced like Q-Sharp)
    Remark: The Q# development tools are based on the Microsoft .NET ecosystem. Therefore, you need to install the correct version of the .NET Software Development Kit (the so-called .NET SDK).
  • API for quantum computing simulation using the .NET ecosystem and/or Python
  • Tools to help you develop and simulate your quantum programs using command-line tools, Visual Studio Code, or Microsoft Visual Studio
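
To give a feel for what a simulator does, here is a minimal sketch in plain Python. It does not use the Quantum Development Kit or any of its APIs; the function names hadamard and probabilities are hypothetical. A single qubit is represented by two amplitudes, and the Hadamard gate turns the |0> state into an equal superposition, which is the first operation most introductory quantum programs apply.

```python
import math


def hadamard(state):
    """Apply the Hadamard gate to a single-qubit state (amp0, amp1)."""
    amp0, amp1 = state
    scale = 1 / math.sqrt(2)
    return (scale * (amp0 + amp1), scale * (amp0 - amp1))


def probabilities(state):
    """Measurement probabilities are the squared magnitudes of the amplitudes."""
    return tuple(abs(amplitude) ** 2 for amplitude in state)


zero = (1.0, 0.0)                  # the |0> basis state
superposition = hadamard(zero)     # equal superposition of |0> and |1>
restored = hadamard(superposition) # applying H twice returns to |0>
```

Measuring the superposition state yields 0 or 1 with equal probability, while the restored state yields 0 with certainty (up to floating-point rounding).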

Please refer to this link for details – https://azure.microsoft.com/en-us/resources/development-kit/quantum-computing/#overview