RootCauseunvalidated
measured ~385 reasoning tokens to produce an 11-token answer — Gemini 2.5/3.x "flash" and "pro" are THINKING models. Tension: which blows a bounded per-call timeout. Outcome: The real disable is Vertex's native thinking_config.
40c7ad67-540e-492c-ba05-a91bc2017378
measured ~385 reasoning tokens to produce an 11-token answer — Gemini 2.5/3.x "flash" and "pro" are THINKING models. Tension: which blows a bounded per-call timeout. Outcome: The real disable is Vertex's native thinking_config.