Report
Knowledge graph Domain node fragmentation from concurrent extraction race conditions
2fc1e333-dcde-4065-96dc-03e4a2ec905f
Domain nodes in a Neo4j knowledge graph were fragmenting into duplicate islands, causing traversal to miss related content. The graph contained 9 duplicate Domain node pairs (Rate Limiting, MCP, Search, RLS, Configuration Management, Real-Time Systems, Performance Optimization, Vector Search, Multi-Tenancy), plus orphan Domain nodes that were connected only to Answer nodes and therefore disconnected from the Problem→Solution semantic backbone.
Two root causes:
- Race condition: `MERGE (n:Domain {normalizedLabel: $normalized})` ran without a unique constraint on `normalizedLabel`. Concurrent extraction jobs both see "no match" and both CREATE, producing duplicates with identical labels but different UUIDs.
- Description word-order variation: LLM extraction produces "Model Context Protocol (MCP)" in one run and "MCP (Model Context Protocol)" in another. These normalize to different strings, bypassing the MERGE dedup entirely.
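
Both failure modes can be reproduced in a few lines of plain Python. This is a minimal sketch, not the extraction pipeline's code: `naive_normalize`, `order_insensitive_key`, and `DomainStore` are hypothetical names, the real normalizer is not shown in this report, and the in-memory store only stands in for Neo4j. The interleaving is scripted by hand so the outcome is deterministic.

```python
import re
import uuid


def naive_normalize(label: str) -> str:
    """Assumed normalizer: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", label.lower())
    return " ".join(cleaned.split())


def order_insensitive_key(label: str) -> str:
    """Hypothetical alternative: sort tokens so word order no longer matters."""
    return " ".join(sorted(naive_normalize(label).split()))


class DomainStore:
    """In-memory stand-in for the Domain nodes (not a Neo4j client)."""

    def __init__(self) -> None:
        self.nodes: list[tuple[str, str]] = []  # (uuid, normalized_label)

    def find(self, normalized: str) -> list[tuple[str, str]]:
        return [node for node in self.nodes if node[1] == normalized]

    def create(self, normalized: str) -> tuple[str, str]:
        node = (str(uuid.uuid4()), normalized)
        self.nodes.append(node)
        return node


# Cause 1: match-then-create with no uniqueness guard.
store = DomainStore()
key = naive_normalize("Rate Limiting")

# Both jobs run the "match" half before either runs the "create" half.
job_a_found = bool(store.find(key))
job_b_found = bool(store.find(key))
if not job_a_found:
    store.create(key)
if not job_b_found:
    store.create(key)

race_duplicates = store.find(key)  # two nodes, same label, different UUIDs

# Cause 2: word-order variation produces different normalized strings.
first = "Model Context Protocol (MCP)"
second = "MCP (Model Context Protocol)"

naive_keys_match = naive_normalize(first) == naive_normalize(second)
sorted_keys_match = order_insensitive_key(first) == order_insensitive_key(second)
```

Here `race_duplicates` holds two nodes with the same normalized label, `naive_keys_match` is `False`, and `sorted_keys_match` is `True`. The sorted-token key is only one illustrative way to make the two MCP spellings collide; it would not catch variants that add or drop words.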