In libxml2 2.9.14 (and other releases before 2.10.3), a logic bug corrupts the parser's string hash table (the dict) when it parses XML containing an entity reference cycle in which the entity's content is shorter than 5 bytes.
ROOT CAUSE (entities.c:185-191):
`xmlCreateEntity` stores short entity content (< 5 bytes) as a dict-owned pointer returned by `xmlDictLookup`, casting it from `const xmlChar *` to `xmlChar *`. As a result, `ent->content` aliases the dict's internal string storage: memory that the dict treats as immutable and indexes by hash.
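The aliasing can be modelled without libxml2. The sketch below is a toy stand-in, not the library's code: `ToyDict`, `ToyEntity`, and the `toy_*` functions are hypothetical names, and the pool is a fixed array rather than a real hash table. Only the 5-byte threshold and the cast away from `const` are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POOL_SLOTS  8
#define POOL_STRLEN 8

/* Toy stand-in for the dict: a small pool of interned strings. */
typedef struct {
    char strings[POOL_SLOTS][POOL_STRLEN];
    int used;
} ToyDict;

/* Toy stand-in for an entity: note the mutable pointer type. */
typedef struct {
    char *content;
} ToyEntity;

/* Return the pool's copy of s, adding it if absent (role of xmlDictLookup). */
const char *toy_dict_lookup(ToyDict *d, const char *s) {
    assert(strlen(s) < POOL_STRLEN);
    for (int i = 0; i < d->used; i++)
        if (strcmp(d->strings[i], s) == 0)
            return d->strings[i];
    if (d->used == POOL_SLOTS)
        return NULL;                      /* pool full */
    strcpy(d->strings[d->used], s);
    return d->strings[d->used++];
}

/* Does p point into the pool? (role of xmlDictOwns) */
int toy_dict_owns(const ToyDict *d, const char *p) {
    uintptr_t lo = (uintptr_t)&d->strings[0][0];
    uintptr_t x = (uintptr_t)p;
    return x >= lo && x < lo + sizeof d->strings;
}

/* Short content is interned and the const qualifier is cast away;
 * longer content gets a private heap copy. */
ToyEntity toy_create_entity(ToyDict *d, const char *content) {
    ToyEntity ent;
    size_t len = strlen(content);
    if (len < 5) {
        ent.content = (char *)toy_dict_lookup(d, content);
    } else {
        ent.content = malloc(len + 1);
        if (ent.content != NULL)
            memcpy(ent.content, content, len + 1);
    }
    return ent;
}
```

In this model an entity created with the content "&a;" ends up with `content` pointing into `d->strings`, so any write through `ent.content` is a write into the pool itself.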
CORRUPTION (parser.c:154-179, xmlParserEntityCheck):
When an entity loop is detected during content expansion, the code executes `ent->content[0] = 0` (line 167) to mark the entity as failed. If `ent->content` is dict-owned, this write corrupts the dict's internal string buffer: the entry still stores the old hash of, e.g., "&a;", but the string now starts with '\0', so the stored hash and the stored string no longer agree and the dict is inconsistent for subsequent lookups.
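One visible consequence can be reproduced in a toy model. The sketch below assumes a dict that stores each string's hash at insert time and matches on hash plus string comparison; the open-addressing table, the FNV-1a hash, and all `toy_*` names are illustrative choices, not libxml2 internals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOTS  8
#define MAXLEN 8

typedef struct {
    char name[MAXLEN];
    uint32_t hash;   /* computed once, when the string is inserted */
    int in_use;
} ToyEntry;

typedef struct {
    ToyEntry e[SLOTS];
    int count;
} ToyDict;

/* FNV-1a, used here only as a simple example hash. */
uint32_t toy_hash(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Find s, or insert it; linear probing starting at hash % SLOTS. */
const char *toy_lookup(ToyDict *d, const char *s) {
    assert(strlen(s) < MAXLEN);
    uint32_t h = toy_hash(s);
    for (uint32_t i = 0; i < SLOTS; i++) {
        ToyEntry *e = &d->e[(h + i) % SLOTS];
        if (!e->in_use) {
            strcpy(e->name, s);
            e->hash = h;
            e->in_use = 1;
            d->count++;
            return e->name;
        }
        if (e->hash == h && strcmp(e->name, s) == 0)
            return e->name;
    }
    return NULL;  /* table full */
}

/* Intern "&a;", overwrite its first byte the way the loop handler does,
 * then look the same string up again. Returns the entry count. */
int toy_demo_corruption(void) {
    ToyDict d;
    memset(&d, 0, sizeof d);
    char *content = (char *)toy_lookup(&d, "&a;");  /* const cast away */
    content[0] = 0;                 /* the "mark entity as failed" write */
    const char *again = toy_lookup(&d, "&a;");
    assert(again != content);       /* the original entry no longer matches */
    return d.count;
}
```

After the write, the first entry keeps the hash of "&a;" but holds an empty string, so the second lookup cannot match it and inserts a second entry for the same key. The table is left with a stale entry that can never be found under its original name.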
The same bug exists at four other locations in parser.c (lines 2727, 2786, 4066, and 7273): each writes `ent->content[0] = 0` without first checking `xmlDictOwns()`.
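A guard of the kind these call sites lack might look like the following sketch. It is an assumption for illustration, not the upstream fix: `toy_mark_entity_failed`, the `failed` flag, and the toy types are hypothetical, and `toy_dict_owns` merely plays the role that `xmlDictOwns()` would play in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy dict: a single buffer standing in for interned string storage. */
typedef struct {
    char pool[64];
} ToyDict;

typedef struct {
    char *content;
    int failed;   /* hypothetical flag, not a field of the real entity struct */
} ToyEntity;

/* Does p point into the dict's storage? (role of xmlDictOwns) */
int toy_dict_owns(const ToyDict *d, const char *p) {
    uintptr_t lo = (uintptr_t)d->pool;
    uintptr_t x = (uintptr_t)p;
    return x >= lo && x < lo + sizeof d->pool;
}

/* Mark an entity as failed without ever writing into dict storage. */
void toy_mark_entity_failed(const ToyDict *d, ToyEntity *ent) {
    ent->failed = 1;
    if (ent->content != NULL && !toy_dict_owns(d, ent->content))
        ent->content[0] = 0;   /* private buffer: truncating is safe */
    /* Dict-owned content is left byte-for-byte intact. */
}
```

Routing every write site through one helper like this would keep the ownership check in a single place instead of repeating it at each location.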
TRIGGER: `<!ENTITY a "&a;">`. The content "&a;" is 3 bytes (< 5), so it is stored in the dict and then corrupted when the loop is detected.