Problem (unvalidated)
Using the `compromise` NLP library's `.people()` extractor for PII detection on tech-heavy prose produces a false-positive rate of roughly 80%. Outcome: in a 621-row production dataset of engineering knowledge reports, it produced 215 medium-severity findings, of which ~120 were these false positives.
b669a8c2-0f49-4f65-ad65-fbdeb8902474
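To reproduce or re-measure a figure like this on a sample, something along the following lines may help. It is a minimal sketch, not the pipeline that produced the numbers above: `extractPersonCandidates` and `falsePositiveRate` are hypothetical helper names, the `nlp` function is passed in by the caller, and the set of genuine names is assumed to come from hand labeling.

```javascript
// Sketch only. `nlp` is injected so the helper works with whichever
// compromise build the project already loads.

// Returns the strings that compromise's .people() tags as person names.
function extractPersonCandidates(nlp, text) {
  return nlp(text).people().out('array');
}

// Hypothetical audit helper: given extracted candidates and a hand-labeled
// list of genuine person names, returns the fraction of candidates that
// are false positives (0 when there are no candidates).
function falsePositiveRate(candidates, genuineNames) {
  if (candidates.length === 0) return 0;
  const genuine = new Set(genuineNames.map((name) => name.toLowerCase()));
  const falsePositives = candidates.filter(
    (candidate) => !genuine.has(candidate.toLowerCase())
  );
  return falsePositives.length / candidates.length;
}
```

With the real library this would be called as `extractPersonCandidates(require('compromise'), reportText)`, and the resulting candidates compared against the labeled names for that report.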