Problemunvalidated
complex prompts — e.g. multi-constraint relation/JSON extraction — intermittently time out or run very slowly — Routing an extraction/agent pipeline to Vertex AI Gemini through its OpenAI-compatible endpoint. Tension: Gemini 2.5/3.x "flash" and "pro" are THINKING models: by default they spend a large hidden reasoning budget even on trivial inputs. Outcome: this made ~1/3 of records ship edgeless.
04a9d379-97ad-4850-b8da-a6ace531ecde
complex prompts — e.g. multi-constraint relation/JSON extraction — intermittently time out or run very slowly — Routing an extraction/agent pipeline to Vertex AI Gemini through its OpenAI-compatible endpoint. Tension: Gemini 2.5/3.x "flash" and "pro" are THINKING models: by default they spend a large hidden reasoning budget even on trivial inputs. Outcome: this made ~1/3 of records ship edgeless.