Solution (unvalidated)

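Use a two-layer dedup strategy. First, run a fast synchronous pre-insert check with PostgreSQL full-text search (`tsvector`/`ts_rank`) to catch obvious duplicates. Note that `ts_rank` is a lexical relevance score that fills the same role as BM25 but is not a BM25 implementation. Second, compute the embedding and run the pgvector similarity check asynchronously after insertion, so the slower semantic comparison never blocks the write path. For the knowledge graph, add a unique constraint on `normalizedLabel` (e.g., `Domain.normalizedLabel`) so that concurrent MERGE operations on the same label are serialized and parallel jobs cannot create duplicate entities.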
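The two layers and the graph constraint could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a validated implementation: the `documents` table and its `body_tsv` and `embedding` columns are hypothetical, the helper names and the `min_rank` threshold are invented for the example, the cursor is assumed to be a DB-API cursor that accepts `%(name)s` parameters (as psycopg does), and the graph statements assume Neo4j 5 Cypher syntax, which the original does not name.

```python
import re
import unicodedata

# Layer 1 (synchronous, pre-insert): lexical match via full-text search.
# Hypothetical schema: documents(id, body, body_tsv tsvector, embedding vector).
LEXICAL_CHECK_SQL = """
SELECT id, ts_rank(body_tsv, query) AS rank
FROM documents, plainto_tsquery('english', %(text)s) AS query
WHERE body_tsv @@ query
ORDER BY rank DESC
LIMIT 5
"""

# Layer 2 (asynchronous, post-insert): pgvector cosine distance (<=>).
VECTOR_CHECK_SQL = """
SELECT id, 1 - (embedding <=> %(embedding)s::vector) AS cosine_similarity
FROM documents
WHERE id <> %(doc_id)s
ORDER BY embedding <=> %(embedding)s::vector
LIMIT 5
"""

# Knowledge graph (assumed Neo4j 5): uniqueness constraint plus MERGE on the same key.
CONSTRAINT_CYPHER = (
    "CREATE CONSTRAINT domain_normalized_label IF NOT EXISTS "
    "FOR (d:Domain) REQUIRE d.normalizedLabel IS UNIQUE"
)
MERGE_CYPHER = (
    "MERGE (d:Domain {normalizedLabel: $normalizedLabel}) "
    "ON CREATE SET d.label = $label "
    "RETURN d"
)


def normalize_label(label: str) -> str:
    """Fold case and Unicode forms and collapse whitespace (illustrative rule)."""
    folded = unicodedata.normalize("NFKC", label).casefold()
    return re.sub(r"\s+", " ", folded).strip()


def to_vector_literal(embedding) -> str:
    """Render a list of floats in pgvector's text format, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def find_lexical_duplicates(cursor, text: str, min_rank: float = 0.1) -> list:
    """Return ids of existing rows whose ts_rank meets the (illustrative) threshold."""
    cursor.execute(LEXICAL_CHECK_SQL, {"text": text})
    return [row[0] for row in cursor.fetchall() if row[1] >= min_rank]
```

In this sketch the insert path would call `find_lexical_duplicates` before writing, while a background job would later run `VECTOR_CHECK_SQL` with `to_vector_literal(embedding)` for the new row. The MERGE key is the output of `normalize_label`, so the constraint and the MERGE agree on what counts as the same entity.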
