Report

CVE-2022-40303: Integer overflow in libxml2 xmlSAX2Text → heap buffer overflow on large XML text nodes

acf009a9-0fb2-4602-adf3-5d46e8677a27

In libxml2 v2.9.14 (before 2.10.3), the static function xmlSAX2Text in SAX2.c accumulates XML character data into the DOM tree. Two fields of the parser context, ctxt->nodelen and ctxt->nodemem, are declared as int (parser.h lines 255-256). When XML_PARSE_HUGE is enabled (which lifts the 10MB text-node limit), a very large text node (>~2GB) causes:

1. SAX2.c line 2593: size = ctxt->nodemem + len; Both operands are int, so the addition is performed in signed int arithmetic and overflows before the result is converted to size_t. Signed overflow is undefined behavior in C; in practice the wrapped result is sign-extended to a huge size_t or otherwise ends up as a wrong value.
2. SAX2.c line 2600: ctxt->nodemem = size; This truncates the size_t back to int, leaving a tiny or negative value as the tracked buffer size.
3. SAX2.c line 2603: memcpy(&lastChild->content[ctxt->nodelen], ch, len) writes past the under-allocated buffer => heap overflow.

The guard check at lines 2584-2588 compares against SIZE_T_MAX bounds, which does NOT catch int-range overflow on 64-bit systems: INT_MAX ≈ 2.1e9 is far smaller than SIZE_T_MAX ≈ 1.8e19, so sums that no longer fit in an int still pass the check.
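
The gap between the two ranges can be sketched in a few lines of C. This is an illustration only, not the libxml2 source and not the upstream patch: the function names size_t_guard_accepts, int_guard_accepts and required_size are hypothetical, and the size_t guard is modeled on the description above rather than copied from SAX2.c. SIZE_MAX from <stdint.h> stands in for SIZE_T_MAX.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a guard that only checks size_t-range bounds,
 * as described in the report. Returns 1 if the append is accepted. */
int size_t_guard_accepts(int nodelen, int nodemem, int len)
{
    if ((size_t)nodelen > SIZE_MAX - (size_t)len)
        return 0;
    if ((size_t)nodemem + (size_t)len > SIZE_MAX / 2)
        return 0;
    return 1;
}

/* Hypothetical int-range guard: rejects any append whose resulting
 * length or capacity would no longer fit in an int. The subtraction
 * form avoids performing the overflowing addition itself. */
int int_guard_accepts(int nodelen, int nodemem, int len)
{
    if (len < 0 || nodelen < 0 || nodemem < 0)
        return 0;
    if (nodelen > INT_MAX - len)
        return 0;
    if (nodemem > INT_MAX - len)
        return 0;
    return 1;
}

/* The mathematically correct buffer size, computed in a wider type so
 * that no signed int overflow occurs. */
long long required_size(int nodemem, int len)
{
    return (long long)nodemem + (long long)len;
}
```

With nodemem = INT_MAX - 10 and len = 100, required_size returns a value above INT_MAX, so it cannot be stored back into an int field. On a 64-bit build the size_t-style guard still accepts these inputs, while the int-range guard rejects them.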