Streaming
ShuYou allows any model to return generated results incrementally in a streaming fashion, rather than returning the full response at once. Streaming output lets users see the first token from the model immediately, reducing wait time. This can significantly improve user experience, especially for real-time conversations and long-form generation. You can enable streaming output by setting thestream parameter to true in your request. Below are two example approaches:
Method 1: Use the OpenAI-compatible API (Recommended)
Python
TypeScript
Method 2: Call the ShuYou API Directly
Python (httpx)
TypeScript (fetch)
Shell (cURL)