GEM v1: extraction-F1 gate is wrong for the probabilistic prediction call-site; use a calibration gate
ef4d347b-d938-40f9-88a5-ab6d6f422a24
Context: designing a GEM v1 floor-model (the cheapest fine-tuned model that "does the job") for a probabilistic prediction call-site, i.e. an LLM pre-pass that emits entities/relations, each with a probability p, which are later merged into a Bayesian prior. The obvious move is to reuse the existing extraction eval gate: entity/edge macro-F1 plus substrate-recall vs the incumbent reference. That gate is correct for the extraction call-sites but is the WRONG gate for a prediction call-site: it scores which entities/relations are emitted, not whether each p is a trustworthy probability, so applying it would either pass a bad model or fail a good one.
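A minimal sketch of what a calibration gate could look like, to make the contrast with the F1 gate concrete. This note does not specify the metrics or thresholds. Brier score and expected calibration error (ECE) are assumed here as two common calibration measures, and the names `brier_score`, `expected_calibration_error`, `passes_calibration_gate`, `max_brier`, and `max_ece` are hypothetical, not part of any existing GEM code.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted p and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average p and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins on [0, 1]; p == 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        freq = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_p - freq)
    return ece


def passes_calibration_gate(
    probs: Sequence[float],
    outcomes: Sequence[int],
    max_brier: float,  # hypothetical threshold, to be set per call-site
    max_ece: float,    # hypothetical threshold, to be set per call-site
) -> bool:
    """Gate on calibration quality rather than on set-overlap F1."""
    return (
        brier_score(probs, outcomes) <= max_brier
        and expected_calibration_error(probs, outcomes) <= max_ece
    )


# Two models emit the SAME four items, so a set-based F1 gate cannot
# tell them apart. Only the attached probabilities differ.
outcomes = [1, 1, 0, 0]                     # 1 = item turned out true
calibrated = [0.9, 0.8, 0.2, 0.1]           # p tracks the outcomes
overconfident = [0.99, 0.99, 0.99, 0.99]    # p is uninformative
```

With illustrative thresholds such as `max_brier=0.1` and `max_ece=0.2`, the gate accepts `calibrated` and rejects `overconfident`, even though an emitted-set F1 would score both identically. The thresholds themselves would need to be chosen against the incumbent reference; the values here are placeholders.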