Preface: Expat is free software. You may copy, distribute, and modify it under the terms of the License contained in the file COPYING distributed with this package. This license is the same as the MIT/X Consortium license.
Background: Expat is an open-source XML parser written in C, and it can be used from several programming languages, such as Python, PHP, and Perl. Four functions are used most frequently:
- XML_ParserCreate – To create a new parser object.
- XML_SetElementHandler – To define handlers for start and end tags.
- XML_SetCharacterDataHandler – To define the handler for text.
- XML_Parse – To pass a buffer of document data to the parser.
After the web server receives the XML data, it passes the data to the XML parser. To use the Expat library, programs first register handler functions with Expat. When Expat parses an XML document, it calls the registered handlers as it finds relevant tokens in the input stream. These tokens and their associated handler calls are called events. Typically, programs register handler functions for XML element start or stop events and character events. Expat also provides facilities for more sophisticated event handling, such as XML namespace declarations, processing instructions, and DTD events.
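The register-then-parse flow described above can be sketched with Python's standard `xml.parsers.expat` binding. This is a minimal illustration, not taken from the Expat documentation: the `events` list, the handler names, and the sample document are hypothetical, and the comments map each Python call to the C function it corresponds to.

```python
import xml.parsers.expat

events = []  # hypothetical list, used only to record which callbacks fire


def on_start(name, attrs):
    # Called for every start tag; attrs is a dict of attribute names to values.
    events.append(("start", name, dict(attrs)))


def on_end(name):
    # Called for every end tag.
    events.append(("end", name))


def on_text(data):
    # Called for character data; whitespace-only text is ignored here.
    if data.strip():
        events.append(("text", data))


parser = xml.parsers.expat.ParserCreate()   # corresponds to XML_ParserCreate
parser.buffer_text = True                   # deliver contiguous text in one call
parser.StartElementHandler = on_start       # XML_SetElementHandler (start tag)
parser.EndElementHandler = on_end           # XML_SetElementHandler (end tag)
parser.CharacterDataHandler = on_text       # XML_SetCharacterDataHandler

# Hypothetical sample document, passed as one final chunk.
document = b'<order id="7"><item>book</item></order>'
parser.Parse(document, True)                # corresponds to XML_Parse

for event in events:
    print(event)
```

The handlers fire in document order, so a program sees each element open, its text, and then its close without ever holding the whole document in memory.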
Vulnerability details: CVE-2022-23852 is a signed integer overflow (undefined behavior) in the function XML_GetBuffer, which XML_Parse also calls internally. The overflow is reachable when XML_CONTEXT_BYTES is defined to a value greater than 0, which is both common and the default. The impact is denial of service or more.
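The general defence against this class of bug is to test whether a sum would exceed the maximum representable value before performing the addition. The sketch below illustrates that pattern only; it is not the actual Expat patch. The helper `checked_add`, the example sizes, and the assumption of a 32-bit `int` are all hypothetical.

```python
INT_MAX = 2**31 - 1  # assumption: C int is 32 bits wide


def checked_add(a, b):
    """Return a + b, or None if the sum would not fit in a 32-bit signed int.

    Both operands are expected to be non-negative sizes. The comparison is
    arranged so that no intermediate value can exceed INT_MAX, which is how
    the same check has to be written in C.
    """
    if a < 0 or b < 0:
        raise ValueError("sizes must be non-negative")
    if a > INT_MAX - b:
        return None  # the addition would overflow, so reject the request
    return a + b


# Hypothetical sizes: a requested buffer length plus bytes kept for context.
print(checked_add(4096, 1024))
print(checked_add(INT_MAX, 1024))
```

Python integers do not overflow, so the check is purely illustrative here; in C the same comparison must run before the addition, because the overflowing addition itself is the undefined operation.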
In C, unsigned integer arithmetic wraps around, but signed integer overflow is undefined behavior, and undefined behavior is especially problematic. According to the C99 standard, undefined behavior is “behavior, upon use of a non-portable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.”
Ref: On most current platforms, a signed integer is a 32-bit datum that encodes an integer in the range [-2147483648 to 2147483647]. An unsigned integer is a 32-bit datum that encodes a nonnegative integer in the range [0 to 4294967295]. The signed integer is represented in two's complement notation.
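These ranges, and the way a 32-bit two's complement pattern is reinterpreted, can be checked with Python's `ctypes` fixed-width types, which perform no overflow checking. This is only an illustration of the bit-level representation: the variable names are hypothetical, and a C compiler is not required to produce this wraparound on signed overflow, since that case is undefined.

```python
import ctypes

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1   # -2147483648 and 2147483647
UINT32_MAX = 2**32 - 1                     # 4294967295

# One past the largest signed value has the bit pattern of the smallest one.
wrapped = ctypes.c_int32(INT32_MAX + 1).value

# The all-ones pattern is -1 when read as signed, UINT32_MAX when unsigned.
as_unsigned = ctypes.c_uint32(-1).value

print(wrapped == INT32_MIN)
print(as_unsigned == UINT32_MAX)
```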
Official article: https://github.com/libexpat/libexpat/pull/550