Silhouette-only k-means split refuses big, soupy clusters; forced-k fallback with cohesion-improvement gate splits the residue
315b6bd1-b12c-4c9d-b5ed-976929c8c2e6
After Leiden community detection on a 10k+ node knowledge graph, the k-means rebalance step, which selects k by silhouette (k = 2..6, accept if mean silhouette ≥ 0.03), leaves residue mega-clusters of 200-300+ nodes intact. The members aren't pure noise: they're loosely themed (e.g., a 311-node pnpm-tooling cluster, a 286-node Sentry-instrumentation cluster). But silhouette refuses to split them because no clean k-way structure exists at small k: the embeddings are diffusely scattered, and no crisp boundary separates the sub-themes. Mean cohesion within these residue clusters stays around 0.30 (well below the 0.45 "soup" threshold). Bumping splitMaxK alone HURTS: silhouette evaluates more candidates and refuses more often.
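
The forced-k fallback named in the title can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation. It assumes cohesion means the mean cosine similarity of members to their cluster centroid (the note does not define it), and the names forced_split, min_gain and min_size, as well as the 0.05 default gain, are hypothetical. The idea is to skip silhouette entirely: run k-means at a fixed k, then accept the split only if size-weighted child cohesion beats the parent's cohesion by a margin.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0  # leave zero vectors unchanged
    return [x / n for x in v]


def centroid(unit):
    return normalize([sum(col) / len(unit) for col in zip(*unit)])


def cohesion(vectors):
    """ASSUMED definition: mean cosine similarity of members to their centroid."""
    unit = [normalize(v) for v in vectors]
    c = centroid(unit)
    return sum(dot(u, c) for u in unit) / len(unit)


def kmeans_labels(vectors, k, iters=25):
    """Small spherical k-means with deterministic farthest-point seeding."""
    unit = [normalize(v) for v in vectors]
    centers = [unit[0]]
    while len(centers) < k:
        # Next seed: the point least similar to every center chosen so far.
        centers.append(min(unit, key=lambda u: max(dot(u, c) for c in centers)))
    labels = [0] * len(unit)
    for _ in range(iters):
        labels = [max(range(k), key=lambda c: dot(u, centers[c])) for u in unit]
        for c in range(k):
            members = [u for u, lab in zip(unit, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = centroid(members)
    return labels


def forced_split(vectors, k, min_gain=0.05, min_size=2):
    """Split at a fixed k; keep the split only if cohesion improves enough.

    min_gain and min_size are hypothetical knobs, not values from the note.
    Returns the list of child groups, or None if the gate rejects the split.
    """
    parent = cohesion(vectors)
    labels = kmeans_labels(vectors, k)
    groups = [[v for v, lab in zip(vectors, labels) if lab == c] for c in range(k)]
    groups = [g for g in groups if g]
    if len(groups) < 2 or any(len(g) < min_size for g in groups):
        return None
    # Size-weighted mean so one tiny tight child cannot carry the gate alone.
    child = sum(cohesion(g) * len(g) for g in groups) / len(vectors)
    return groups if child - parent >= min_gain else None
```

In this sketch, a caller would try forced_split(members, k) only on clusters the silhouette pass left intact, and keep the parent whenever it returns None. The gate compares the split against the parent rather than against an absolute silhouette floor, which is why it can accept a split of diffuse embeddings that silhouette rejects.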