Report

CVE-2022-40304: libxml2 dict corruption via entity reference cycle (content[0]=0 on dict-owned pointer)

dce1c147-1cf6-4b9e-880d-8309587c3744

In libxml2 v2.9.14, parsing XML that contains an entity reference cycle in which an entity's content is shorter than 5 bytes corrupts the document's shared string dictionary (xmlDict). This is CVE-2022-40304.

TWO-PART ROOT CAUSE:

  1. entities.c:xmlCreateEntity: if entity content length < 5 AND a dict exists, the content is stored as a pointer INTO the dict, obtained from xmlDictLookup (line ~187). That pointer refers to shared dict storage and must be treated as immutable.
  2. parser.c:xmlParserEntityCheck (line 167): when xmlStringDecodeEntities returns NULL or an entity loop is detected, the code writes ent->content[0] = 0 to "poison" the entity. It does not check whether the dict owns the pointer.

If ent->content is dict-owned (length < 5), this write corrupts the dict entry in place. Every subsequent xmlDictLookup call that hits that slot returns a truncated/broken string, which affects node names, attribute values, and namespace URIs throughout the document.
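
ILLUSTRATION (toy model, not libxml2 code):

The aliasing behind the two-part root cause can be sketched without libxml2. The block below is a minimal model under stated assumptions: ToyDict, toy_intern and toy_demo_corruption are hypothetical names, and the fixed-size table only stands in for "one shared copy per distinct string". It does not reproduce xmlDict's hashing or the parser's control flow.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy interning table. This is NOT libxml2's xmlDict, only a model of
 * "one shared copy per distinct string". All names are hypothetical. */
#define TOY_SLOTS  8
#define TOY_MAXLEN 16

typedef struct {
    char slots[TOY_SLOTS][TOY_MAXLEN];
    int  used;
} ToyDict;

/* Return the shared copy of s, adding it to the table on first use. */
static char *toy_intern(ToyDict *d, const char *s) {
    for (int i = 0; i < d->used; i++)
        if (strcmp(d->slots[i], s) == 0)
            return d->slots[i];
    assert(d->used < TOY_SLOTS && strlen(s) < TOY_MAXLEN);
    strcpy(d->slots[d->used], s);
    return d->slots[d->used++];
}

/* Returns 1 if clearing the entity also damaged an unrelated holder. */
static int toy_demo_corruption(void) {
    ToyDict d = { .used = 0 };

    /* Some other part of the document interns the string first. */
    const char *other = toy_intern(&d, "ab");

    /* Part 1: short entity content (< 5 bytes) is stored via the dict,
     * so the entity receives the very same pointer. */
    char *ent_content = toy_intern(&d, "ab");

    /* Part 2: the loop-handling path clears the entity in place,
     * without asking who owns the storage. */
    ent_content[0] = 0;

    printf("aliased: %d, other holder now reads \"%s\"\n",
           other == ent_content, other);
    return other == ent_content && other[0] == '\0';
}
```

Calling toy_demo_corruption() from a main() shows that the second holder of the string now sees an empty string even though nothing wrote through its own pointer. That is the effect described above, reduced to a single shared slot.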

SAME PATTERN at 4 additional sites:

  • parser.c:2727 (xmlStringLenDecodeEntities, entity ref)
  • parser.c:2786 (xmlStringLenDecodeEntities, PE ref)
  • parser.c:4066 (attribute value expansion)
  • parser.c:7273 (main entity expansion)
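
POSSIBLE GUARD (sketch, not the upstream patch):

Because the same write appears at several sites, one option is to route all of them through a single helper that checks ownership first. The block below sketches that idea on a toy table. MiniDict, MiniEntity, mini_owns and mini_poison_entity are hypothetical names. mini_owns plays the role that an ownership query such as libxml2's xmlDictOwns() could play in real code. This is an assumption-laden illustration only; upstream's fix is understood to take a different route, namely no longer storing entity content in the dict at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins; none of these names exist in libxml2. */
#define MINI_SLOTS  8
#define MINI_MAXLEN 16

typedef struct {
    char slots[MINI_SLOTS][MINI_MAXLEN];
    int  used;
} MiniDict;

typedef struct {
    char *content;
    int   content_is_private;  /* 1 if content was malloc'ed for this entity */
} MiniEntity;

/* Return the shared copy of s, adding it to the table on first use. */
static char *mini_intern(MiniDict *d, const char *s) {
    for (int i = 0; i < d->used; i++)
        if (strcmp(d->slots[i], s) == 0)
            return d->slots[i];
    assert(d->used < MINI_SLOTS && strlen(s) < MINI_MAXLEN);
    strcpy(d->slots[d->used], s);
    return d->slots[d->used++];
}

/* Ownership query: 1 if p is one of the dict's interned strings. */
static int mini_owns(const MiniDict *d, const char *p) {
    for (int i = 0; i < d->used; i++)
        if (p == d->slots[i])
            return 1;
    return 0;
}

/* Single poisoning helper meant to replace each raw content[0] = 0 write.
 * Returns 0 on success, -1 on allocation failure. */
static int mini_poison_entity(const MiniDict *d, MiniEntity *ent) {
    if (ent->content == NULL)
        return 0;
    if (mini_owns(d, ent->content)) {
        /* Shared storage: swap in a private empty string instead of writing. */
        char *empty = calloc(1, 1);
        if (empty == NULL)
            return -1;
        ent->content = empty;
        ent->content_is_private = 1;
    } else {
        /* Private storage: the in-place write is safe. */
        ent->content[0] = 0;
    }
    return 0;
}

/* Returns 1 if the entity ends up empty while the dict string is intact. */
static int mini_demo_guarded(void) {
    MiniDict d = { .used = 0 };
    const char *other = mini_intern(&d, "ab");
    MiniEntity ent = { mini_intern(&d, "ab"), 0 };

    int rc = mini_poison_entity(&d, &ent);
    int ok = rc == 0 && strcmp(other, "ab") == 0 &&
             ent.content[0] == '\0' && ent.content != other;

    if (ent.content_is_private)
        free(ent.content);
    return ok;
}
```

The design point is that the ownership decision lives in one place, so a site cannot forget it. The cost is an extra allocation on the error path and a flag to record that the entity now owns its content.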