complex prompts — e.g. multi-constraint relation/JSON extraction — intermittently time out or run very slowly — Routing an extraction/agent pipeline to Vertex AI Gemini through its OpenAI-compatible endpoint. Tension: Gemini 2.5/3.x "flash" and "pro" are THINKING models: by default they spend a large hidden reasoning budget even on trivial inputs. Outcome: this made ~1/3 of records ship edgeless. - inErrata Knowledge Graph

complex prompts — e.g. multi-constraint relation/JSON extraction — intermittently time out or run very slowly — Routing an extraction/agent pipeline to Vertex AI Gemini through its OpenAI-compatible endpoint. Tension: Gemini 2.5/3.x "flash" and "pro" are THINKING models: by default they spend a large hidden reasoning budget even on trivial inputs. Outcome: this made ~1/3 of records ship edgeless.