Pattern
Chunk Resampling and State Reuse
streaming-audio-chunk-state-mismatch
Whisper transcription stalls or lags for two reasons: incoming websocket audio bytes are never converted to the 16 kHz NumPy format the model expects, and the full accumulated buffer is retranscribed on every chunk, so the work per chunk grows with the length of the stream. Separately, repeated state setup (such as creating new boto3 sessions) adds avoidable latency.
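The sketch below illustrates one way to address both points. It is not taken from the original code, and it rests on assumptions: the websocket is assumed to deliver mono, little-endian 16-bit PCM at a known source rate, and the names `pcm16_to_float32`, `ChunkWindow`, and `get_boto3_session` are hypothetical. Resampling uses plain linear interpolation via `np.interp`, which is simple but not band-limited; a dedicated resampler would likely give cleaner audio.

```python
import functools

import numpy as np

TARGET_RATE = 16_000  # sample rate Whisper expects


def pcm16_to_float32(chunk: bytes, source_rate: int) -> np.ndarray:
    """Decode mono little-endian 16-bit PCM (assumed format) to 16 kHz float32."""
    samples = np.frombuffer(chunk, dtype="<i2").astype(np.float32) / 32768.0
    if source_rate == TARGET_RATE or samples.size == 0:
        return samples
    n_out = int(round(samples.size * TARGET_RATE / source_rate))
    # Linear interpolation between source sample times (assumption: good
    # enough for a sketch; it does not low-pass filter before downsampling).
    src_t = np.arange(samples.size) / source_rate
    dst_t = np.arange(n_out) / TARGET_RATE
    return np.interp(dst_t, src_t, samples).astype(np.float32)


class ChunkWindow:
    """Holds only audio that has not been transcribed yet (hypothetical helper)."""

    def __init__(self, source_rate: int, min_samples: int = TARGET_RATE):
        self.source_rate = source_rate
        self.min_samples = min_samples  # emit a window once this much is pending
        self._tail = b""                # odd byte carried over between chunks
        self._pending = []              # converted arrays awaiting transcription
        self._pending_len = 0

    def push(self, chunk: bytes) -> None:
        """Convert one websocket chunk and queue it."""
        data = self._tail + chunk
        usable = len(data) - (len(data) % 2)  # 16-bit samples need 2 bytes each
        self._tail = data[usable:]
        audio = pcm16_to_float32(data[:usable], self.source_rate)
        self._pending.append(audio)
        self._pending_len += audio.size

    def pop_ready(self):
        """Return the pending audio once enough has arrived, then clear it."""
        if self._pending_len < self.min_samples:
            return None
        window = np.concatenate(self._pending)
        self._pending, self._pending_len = [], 0
        return window


@functools.lru_cache(maxsize=None)
def get_boto3_session():
    """Create the boto3 session once and reuse it on later calls."""
    import boto3  # imported lazily so the audio helpers run without boto3

    return boto3.Session()
```

In a websocket handler, each received message would go through `push`, and whenever `pop_ready` returns an array, only that array is passed to the model (for example `model.transcribe(window)`, since Whisper accepts a 16 kHz float32 NumPy array). Transcribing only new windows trades some cross-window context for bounded work per chunk; if that context matters, keeping a short overlap between windows is a possible refinement.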