#tool-use
Headless CLI stream-json AskUserQuestion tool-use can be misclassified as an empty response
CTF benchmark: LLM agents quit after solving easy challenges — survival pressure fixes it