What is Streaming?
Streaming delivers responses incrementally as they’re generated, instead of waiting for the complete response. This provides:
- Better UX: Users see text appearing in real time
- Lower perceived latency: The first token arrives quickly
- No timeouts: Streaming bypasses the 60-second timeout that applies to non-streaming requests
Enabling Streaming
To enable streaming, pass stream=True to invoke().
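As a rough sketch of the call pattern only, the snippet below uses a hypothetical stand-in class, `FakeAgent`, whose `invoke()` returns an iterator of text chunks when `stream=True`. The class name, the prompt argument, and the chunk type are all assumptions made for illustration; the real client may differ.

```python
from typing import Iterator, Union


class FakeAgent:
    """Hypothetical stand-in for the real client; not the actual API."""

    def invoke(self, prompt: str, stream: bool = False) -> Union[str, Iterator[str]]:
        words = f"Echo: {prompt}".split(" ")
        # Re-attach the separating spaces so the chunks join back cleanly.
        chunks = [w if i == 0 else " " + w for i, w in enumerate(words)]
        if stream:
            # Streaming: hand back an iterator that yields one chunk at a time.
            return iter(chunks)
        # Non-streaming: return the whole response at once.
        return "".join(chunks)


agent = FakeAgent()

# With stream=True, iterate over the chunks and display each one immediately.
for chunk in agent.invoke("hello world", stream=True):
    print(chunk, end="", flush=True)
print()
```

The only difference on the caller's side is that the result is iterated rather than read as a single value.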
Stream Format
Streaming uses Server-Sent Events (SSE). Each event contains a JSON chunk. A [DONE] message indicates that the stream is complete.
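To make the format concrete, here is a minimal sketch of parsing such a stream by hand. It assumes standard SSE framing (`data:` lines separated by blank lines) and that the terminator arrives as `data: [DONE]`. The `content` field inside each JSON chunk is a hypothetical name, since the chunk schema is not specified here, and events whose data spans several lines are not handled.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until [DONE] is seen."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and non-data fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # End-of-stream sentinel; stop reading.
        yield json.loads(payload)


# Example wire data; the "content" key is an assumed field name.
sample = [
    'data: {"content": "Hel"}',
    "",
    'data: {"content": "lo"}',
    "",
    "data: [DONE]",
    "",
]

for chunk in parse_sse(sample):
    print(chunk["content"], end="")
print()
```

Checking for the sentinel before calling `json.loads` matters here, because `[DONE]` is not valid JSON.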
Collecting Full Response
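One way to assemble the complete response is to accumulate the chunks while iterating. The sketch below assumes the stream is any iterable of text chunks; `collect_stream` and `fake_stream` are hypothetical helpers, not part of the API.

```python
from typing import Iterable, Iterator


def collect_stream(chunks: Iterable[str]) -> str:
    """Consume a stream of text chunks and return the joined response."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)  # A UI could also render each chunk here.
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a streaming invoke() result."""
    yield from ["Streaming ", "is ", "incremental."]


full_text = collect_stream(fake_stream())
print(full_text)
```

Appending to a list and joining once at the end avoids repeated string copies on long streams.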
To get the complete response from a stream, concatenate the chunks as they arrive.
Use Cases
ChatGPT-like Interface
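Build real-time chat experiences that display text as it is generated.
Long-Running Tasks
Avoid timeouts for lengthy operations that would exceed the 60-second non-streaming limit.
Progress Updates
Agents can stream progress updates while a task is running.
Streaming vs Non-Streaming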
| Feature | Non-Streaming | Streaming |
|---|---|---|
| Response delivery | All at once | Incremental |
| Timeout | 60 seconds | No timeout |
| Idempotency | Yes (cached) | No caching |
| Use case | Quick operations | Long tasks, real-time UI |