Model Routing
Model routing is an intelligent feature of ShuYou that helps you automatically select the most suitable model from a wide range of large language models. The system intelligently balances performance and cost based on the request content, task characteristics, and your preference settings.Why Model Routing
In real-world applications, different tasks have different model requirements:- Simple conversations: using a high-performance model may be wasteful
- Complex reasoning: a budget model may not meet quality requirements
- Production environments: you must balance quality, cost, and speed
- Model selection is hard: dozens of models on the market make manual selection time-consuming
Model List
Click any model card to open its details page and view model-specific information across different providers, including performance comparisons, price comparisons, and parameter differences. For details, see the Provider Routing documentation.
Core Benefits
| Benefit | Description |
|---|---|
| Intelligent decisions | Automatically analyzes request content and task characteristics to select the most suitable model |
| Cost optimization | Prioritizes better cost-performance models while ensuring quality |
| Flexible configuration | Supports custom model pools and preference strategies for different business scenarios |
| Transparent and controllable | Returns the actual model used for easy monitoring and optimization |
| Continuous optimization | Continuously improves routing strategies based on historical data |
Quick Start
Basic Usage
Model routing is easy to use—simply set themodel parameter to ShuYou/auto and specify the candidate model pool via model_routing_config. If you do not specify model_routing_config.available_models, the system will use the platform’s full model pool.
Models on the ShuYou platform have unique slugs. You can get a model’s slug from the Models list page:
Or from the model detail page for a specific model:

Or from the model detail page for a specific model:

cURL Request
OpenAI Python SDK
The
model field in the response returns the model selected by intelligent routing, making it easy to monitor and analyze routing behavior.How It Works
ShuYou/auto model
ShuYou/auto is a special model identifier in ShuYou. When you specify this model, the system enables intelligent routing.
Routing decision process:
- Request analysis: parse prompt content, context length, task type, and other features
- Model evaluation: score each model in the candidate pool
- Aggregated decision: balance performance, price, and availability according to the
preferencestrategy - Model selection: choose the optimal model and forward the request
- Result return: annotate the actual model used in the response
Factors considered in routing decisions
- Task complexity: simple conversation vs. complex reasoning
- Context length: short dialogue vs. long document analysis
- Model performance: accuracy, response speed, creativity
- Model pricing: input/output token unit price
- Model availability: real-time load, regional restrictions
- User preference: performance / balanced / price
Configuration Parameters
model_routing_config object
Configure intelligent routing behavior via the model_routing_config parameter:
| Parameter | Type | Required | Description |
|---|---|---|---|
available_models | string[] | Yes | Candidate model list for routing |
preference | string | No | Routing preference strategy, default balanced |
available_models - Candidate model pool
Specify the list of models that intelligent routing can choose from. We recommend including 3–5 models across different performance and price tiers.
preference - Routing preference strategy
Specify the priority strategy used in routing decisions:
balanced - Balanced mode (default)
Seeks the optimal balance between performance and cost; suitable for most application scenarios.
Characteristics:
- Prioritizes budget models for simple tasks
- Automatically upgrades to high-performance models for complex tasks
- Balances quality and cost
- General-purpose apps in production environments
- Mixed scenarios such as conversational assistants and content generation
- Situations where you must control cost without sacrificing quality
performance - Performance-first mode
Prioritizes the highest-performing models; suitable for scenarios with very high output quality requirements.
Characteristics:
- Tends to choose top flagship models
- Ensures the highest answer quality and accuracy
- Relatively higher cost
- Critical business decision support
- Professional content creation (legal, medical, finance, etc.)
- Complex code generation and debugging
- Academic research and data analysis
price - Price-first mode
Prioritizes models with the best cost-effectiveness; suitable for large-scale, cost-sensitive applications.
Characteristics:
- Prefers the cheapest models
- Only upgrades to more expensive models when necessary
- Maximizes cost efficiency
- High-concurrency simple conversation applications
- Internal tools and test environments
- Education and learning scenarios
- Budget-limited startup projects
Preference strategy comparison
| Strategy | Performance | Cost | Suitable scenarios |
|---|---|---|---|
balanced | ⭐⭐⭐⭐ | ⭐⭐⭐ | Production, general apps |
performance | ⭐⭐⭐⭐⭐ | ⭐⭐ | Critical business, professional content |
price | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | High concurrency, cost-sensitive |
Best Practices
1. Configure the candidate model pool appropriately
Follow these principles when choosing candidate models: Recommended:- Include 3–5 models across different tiers
- Mix flagship, mid-tier, and budget models
- Consider model strengths (creativity, reasoning, speed, etc.)
- Ensure all models have the necessary API keys configured
- Only choosing models from the same tier (loses routing advantages)
- Including too many models (increases decision complexity)
FAQ
Q: How much latency does intelligent routing add?
A: Routing decisions typically complete within 50–100 ms, with negligible impact for most applications. The actual request response time mainly depends on the selected model’s processing speed.Q: How many models should the candidate pool include?
A: We recommend 3–5 models. Too few cannot fully leverage routing advantages; too many increase decision complexity with diminishing returns.Q: What factors does intelligent routing consider?
A: The routing system considers multiple factors:- Prompt content and length
- Task type (conversation, creation, reasoning, etc.)
- Model performance metrics (accuracy, speed)
- Model pricing
- Current load and availability
- Your
preferencesetting
Q: Can I view detailed routing decision logs?
A: The response returns the actual model used (response.model). You can also view call logs in the ShuYou user console to see the routing details for each request.