Hub nodes as graph gravity wells — Language/Package/OS fan-out through traversal
eee9ef72-53cb-4fc2-85ab-1f70491d8982
Context nodes in a knowledge graph (Language, Package, OperatingSystem, Paradigm, DataStructure) accumulate high in-degree as every extracted Problem/Solution/RootCause points at them via OCCURS_IN, WRITTEN_IN, or PERTAIN_TO. A single typescript Language node can have 400+ incident edges. Any graph traversal (burst, explore, PageRank) that walks THROUGH these hubs explodes: a walk from one Problem expands across every unrelated Problem in the ecosystem because they all share the same Context anchor. The previous workaround was to exclude Context nodes from traversal entirely, which lost them as legitimate grounding/navigation anchors. You can't tell an agent "this Problem occurs in Rust" because Rust isn't reachable without also returning 400 unrelated Rust problems.
A traversal seeded with seed_id: "typescript" returned zero results. The BURST_BOTH_FILTER in taxonomy.ts excluded all structural edges (OCCURS_IN, WRITTEN_IN, OCCURS_ON), an earlier attempt to prevent hub fan-out. With those edges excluded, APOC path expansion had nothing to traverse from a Context seed. Meanwhile, a walk from a Problem node that did include those edges would blow up through the hub.
Researched APOC labelFilter syntax (Neo4j docs + APOC reference): the /Label prefix marks a termination label, meaning "valid endpoint, don't expand through nodes with this label." This is exactly the asymmetry we wanted.
Also checked the existing EDGE_WEIGHT / pathConductance logic in pagerank.ts. Taxonomic edges (PERTAIN_TO, RELATES_TO, IMPLEMENTATION_OF) were already demoted to 0.3-0.5, but OCCURS_IN was still at 1.0 and WRITTEN_IN at 0.6, so paths through Context nodes carried as much signal as causal edges, or close to it.
1. Conductance suppression (pagerank.ts EDGE_WEIGHT):
- OCCURS_IN: 1.0 → 0.3
- OCCURS_ON: 1.0 → 0.3
- WRITTEN_IN: 0.6 → 0.2
Paths through hubs now attenuate 3-5× per hop. Even if a walk crosses a Context node, signal drops fast enough that those results rank low in discoveryScore.
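To make the attenuation concrete, here is a minimal sketch of how a path's conductance could be computed from per-edge weights. The three structural weights are the ones listed above. The CAUSES edge name, the product rule, and the default weight of 1.0 for unlisted edges are assumptions for illustration, not the actual pagerank.ts logic.

```typescript
// Structural weights are taken from the note; everything else is illustrative.
const EDGE_WEIGHT: Record<string, number> = {
  CAUSES: 1.0,     // hypothetical causal edge, included only for contrast
  OCCURS_IN: 0.3,  // was 1.0
  OCCURS_ON: 0.3,  // was 1.0
  WRITTEN_IN: 0.2, // was 0.6
};

// Assumption: conductance of a path is the product of its edge weights,
// and edge types missing from the table default to 1.0.
function pathConductance(
  edgeTypes: string[],
  weights: Record<string, number> = EDGE_WEIGHT,
): number {
  return edgeTypes.reduce((acc, t) => acc * (weights[t] ?? 1.0), 1.0);
}

// Problem -OCCURS_IN-> Language <-OCCURS_IN- unrelated Problem
const throughHub = pathConductance(["OCCURS_IN", "OCCURS_IN"]);

// Same two-hop length, but over causal edges only
const causalOnly = pathConductance(["CAUSES", "CAUSES"]);

console.log(throughHub.toFixed(2), causalOnly.toFixed(2)); // 0.09 1.00
```

Under these assumptions a two-hop detour through a hub keeps about 9% of the signal that a two-hop causal path keeps, which is why hub-crossing results fall toward the bottom of the ranking.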
2. Re-include structural edges in BURST_BOTH_FILTER:
const BURST_BOTH_FILTER = edgeFilterForFamilies(['causal', 'resolution']) + '|INSTANCE_OF|IMPLEMENTS|OCCURS_IN|OCCURS_ON|WRITTEN_IN'
Without this, APOC has no edges to traverse from a Context seed. Context nodes have to be reachable.
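The filter above is a pipe-separated list of relationship types, which is the form APOC's relationshipFilter accepts. The sketch below shows one way such a string could be assembled. The body of edgeFilterForFamilies and the family-to-edge mapping (CAUSES, TRIGGERS, SOLVES, MITIGATES) are placeholders invented for illustration, since the real contents of taxonomy.ts are not shown here.

```typescript
// Placeholder family table: the real edge names live in taxonomy.ts.
const EDGE_FAMILIES: Record<string, string[]> = {
  causal: ["CAUSES", "TRIGGERS"],
  resolution: ["SOLVES", "MITIGATES"],
};

// Hypothetical stand-in for the real helper: joins every edge type in the
// requested families with "|".
function edgeFilterForFamilies(families: string[]): string {
  const types: string[] = [];
  for (const family of families) {
    for (const t of EDGE_FAMILIES[family] ?? []) {
      types.push(t);
    }
  }
  return types.join("|");
}

// Structural edges are appended so Context nodes stay reachable.
const STRUCTURAL_EDGES = ["OCCURS_IN", "OCCURS_ON", "WRITTEN_IN"];

const BURST_BOTH_FILTER =
  edgeFilterForFamilies(["causal", "resolution"]) +
  "|INSTANCE_OF|IMPLEMENTS|" +
  STRUCTURAL_EDGES.join("|");

// Guard against the regression described above: every structural edge
// must appear in the filter as a whole token.
const filterTokens = BURST_BOTH_FILTER.split("|");
const missingStructural = STRUCTURAL_EDGES.filter(
  (e) => filterTokens.indexOf(e) === -1,
);
console.log(missingStructural.length === 0 ? "ok" : missingStructural);
```

A check like missingStructural could run as a unit test, so a later change to the filter cannot silently make Context seeds unreachable again.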
3. APOC labelFilter hub-as-terminator:
CALL apoc.path.expandConfig(seed, {
  relationshipFilter: $edgeFilter,
  labelFilter: '/Language|/Package|/OperatingSystem|/Paradigm|/DataStructure|/Context',
  ...
}) YIELD path
The / prefix tells APOC: "valid endpoint, don't traverse THROUGH this label." Walks land on Context nodes as grounding but can't fan out through them.
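Hard-coding the label list in the query string makes it easy to forget a label when a new Context type is added. A minimal sketch of deriving the filter from one list follows. CONTEXT_LABELS and terminatorLabelFilter are hypothetical names, and the result would be passed to the query as a parameter rather than inlined.

```typescript
// Hypothetical single source of truth for hub-like labels.
const CONTEXT_LABELS = [
  "Language",
  "Package",
  "OperatingSystem",
  "Paradigm",
  "DataStructure",
  "Context",
];

// Prefix each label with "/" (APOC termination filter) and join with "|".
function terminatorLabelFilter(labels: string[]): string {
  return labels.map((label) => "/" + label).join("|");
}

const labelFilter = terminatorLabelFilter(CONTEXT_LABELS);
console.log(labelFilter);
// /Language|/Package|/OperatingSystem|/Paradigm|/DataStructure|/Context
```

One behavior worth checking against the APOC documentation for the version in use: when a termination filter is present, the returned paths may be limited to those that end on a terminator label, so intermediate Problem or Solution nodes might need to be read from the nodes along each path rather than from path endpoints.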
Together: Context nodes are reachable (#2 allows expansion to them), they're terminal (#3 stops expansion through them), and any residual through-traversal attenuates fast (#1). The asymmetry "reachable but not traversable" is what you actually want for hub-like anchor nodes in a reasoning graph.
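The "reachable but not traversable" behavior can be illustrated with a toy in-memory walk. This is a sketch of the idea only, not of APOC's implementation: the graph, the expand function, and the choice to exempt the seed from the terminator check (so a walk can start from a hub) are all assumptions made for the example.

```typescript
type GraphNode = { id: string; label: string };
type Edge = { from: string; to: string };

const nodes: GraphNode[] = [
  { id: "p1", label: "Problem" },
  { id: "s1", label: "Solution" },
  { id: "rust", label: "Language" },
  { id: "p2", label: "Problem" }, // unrelated: shares only the hub with p1
  { id: "p3", label: "Problem" }, // unrelated: shares only the hub with p1
];

const edges: Edge[] = [
  { from: "s1", to: "p1" },
  { from: "p1", to: "rust" },
  { from: "p2", to: "rust" },
  { from: "p3", to: "rust" },
];

// Breadth-first expansion over undirected edges. Nodes whose label is in
// `terminators` are kept as results but never expanded, unless they are the seed.
function expand(seedId: string, terminators: Set<string>, maxDepth: number): string[] {
  const labelOf = new Map(nodes.map((n) => [n.id, n.label] as [string, string]));
  const visited = new Set<string>([seedId]);
  let frontier = [seedId];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      if (id !== seedId && terminators.has(labelOf.get(id) ?? "")) continue;
      for (const e of edges) {
        const other = e.from === id ? e.to : e.to === id ? e.from : undefined;
        if (other !== undefined && !visited.has(other)) {
          visited.add(other);
          next.push(other);
        }
      }
    }
    frontier = next;
  }
  visited.delete(seedId);
  return Array.from(visited).sort();
}

const hubLabels = new Set(["Language"]);
const withTerminator = expand("p1", hubLabels, 3);        // ["rust", "s1"]
const withoutTerminator = expand("p1", new Set<string>(), 3); // ["p2", "p3", "rust", "s1"]
const fromHubSeed = expand("rust", hubLabels, 3);         // ["p1", "p2", "p3", "s1"]
console.log(withTerminator, withoutTerminator, fromHubSeed);
```

With the terminator in place, a walk from p1 still lands on rust as grounding but never reaches p2 or p3, while a walk seeded at rust can still fan out to its own problems.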