Web Search

This document explains how to use the Web Search feature on the ShuYou platform. ShuYou supports invoking Web Search via multiple compatible protocols, including Chat Completions, Messages, Responses, and Vertex AI.

Overview

Web Search allows an AI model to access real-time web information while generating an answer, enabling more accurate and up-to-date responses. This feature is particularly useful for:
  • Querying breaking news and current events
  • Getting the latest product information and pricing
  • Looking up dynamic data such as weather and stock quotes
  • Accessing the latest technical documentation and resources

Supported Protocols

| Protocol | Endpoint | Web Search Parameter |
| --- | --- | --- |
| Chat Completions (OpenAI-compatible) | /v1/chat/completions | web_search_options |
| Messages (Anthropic-compatible) | /v1/messages | web_search_20250305 within tools |
| Responses (OpenAI Responses) | /v1/responses | web_search family within tools |
| Vertex AI (Google-compatible) | /api/vertex-ai/v1/... | googleSearch within tools |

1. Chat Completions API

The Chat Completions API enables Web Search via the web_search_options parameter.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| web_search_options | object | No | Web search configuration |
| web_search_options.search_context_size | string | No | Search context size: low / medium / high |
| web_search_options.user_location | object | No | User location info for localized search results |
| web_search_options.user_location.type | string | Yes | Location type, fixed as approximate |
| web_search_options.user_location.city | string | No | City name |
| web_search_options.user_location.country | string | No | Country code (2-letter ISO, e.g. CN, US) |
| web_search_options.user_location.region | string | No | Region/province |
| web_search_options.user_location.timezone | string | No | Timezone (IANA format, e.g. Asia/Shanghai) |
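
The fields in the table above can be assembled programmatically before sending a request. The following is a minimal sketch, not part of any ShuYou SDK: the helper name build_web_search_options is hypothetical, and it only enforces what the table states (the three context sizes, and type fixed as approximate whenever user_location is present).

```python
from typing import Any, Optional

ALLOWED_CONTEXT_SIZES = ("low", "medium", "high")


def build_web_search_options(
    search_context_size: Optional[str] = None,
    city: Optional[str] = None,
    country: Optional[str] = None,
    region: Optional[str] = None,
    timezone: Optional[str] = None,
) -> dict[str, Any]:
    """Build a web_search_options dict (hypothetical helper, not a ShuYou API)."""
    options: dict[str, Any] = {}

    if search_context_size is not None:
        if search_context_size not in ALLOWED_CONTEXT_SIZES:
            raise ValueError(
                f"search_context_size must be one of {ALLOWED_CONTEXT_SIZES}"
            )
        options["search_context_size"] = search_context_size

    # Keep only the location fields the caller actually supplied.
    supplied = {
        "city": city,
        "country": country,
        "region": region,
        "timezone": timezone,
    }
    location = {key: value for key, value in supplied.items() if value is not None}
    if location:
        # "type" is required whenever user_location is sent.
        options["user_location"] = {"type": "approximate", **location}

    return options
```

The returned dict can be passed as web_search_options in the request body, or through extra_body when using the OpenAI Python SDK.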

Example

cURL

curl -X POST "https://api.shuyou.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-5.2",
    "messages": [
      {
        "role": "user",
        "content": "How is the weather in Beijing today?"
      }
    ],
    "web_search_options": {
      "search_context_size": "medium",
      "user_location": {
        "type": "approximate",
        "city": "Beijing",
        "country": "CN",
        "region": "Beijing",
        "timezone": "Asia/Shanghai"
      }
    }
  }'

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.shuyou.ai/v1",
});

async function chatWithWebSearch() {
  const response = await client.chat.completions.create({
    model: "openai/gpt-5.2",
    messages: [
      {
        role: "user",
        content: "How is the weather in Beijing today?",
      },
    ],
    // @ts-ignore - web_search_options is a ShuYou extension parameter
    web_search_options: {
      search_context_size: "medium",
      user_location: {
        type: "approximate",
        city: "Beijing",
        country: "CN",
        region: "Beijing",
        timezone: "Asia/Shanghai",
      },
    },
  });

  console.log(response.choices[0].message.content);

  // Check whether there are URL citations
  const annotations = response.choices[0].message.annotations;
  if (annotations) {
    console.log("\nCitations:");
    annotations.forEach((annotation: any) => {
      if (annotation.type === "url_citation") {
        console.log(
          `- ${annotation.url_citation.title}: ${annotation.url_citation.url}`,
        );
      }
    });
  }
}

chatWithWebSearch();

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.shuyou.ai/v1"
)

response = client.chat.completions.create(
    model="openai/gpt-5.2",
    messages=[
        {
            "role": "user",
            "content": "How is the weather in Beijing today?"
        }
    ],
    extra_body={
        "web_search_options": {
            "search_context_size": "medium",
            "user_location": {
                "type": "approximate",
                "city": "Beijing",
                "country": "CN",
                "region": "Beijing",
                "timezone": "Asia/Shanghai"
            }
        }
    }
)

print(response.choices[0].message.content)

# Check whether there are URL citations
annotations = getattr(response.choices[0].message, "annotations", None)
if annotations:
    print("\nCitations:")
    for annotation in annotations:
        # The SDK parses annotations into objects, so use attribute access
        # rather than dict-style .get()
        if annotation.type == "url_citation":
            citation = annotation.url_citation
            print(f"- {citation.title}: {citation.url}")

2. Messages API (Anthropic-compatible)

The Messages API enables Web Search using the web_search_20250305 type within the tools parameter.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tools[].type | string | Yes | Tool type, fixed as web_search_20250305 |
| tools[].name | string | Yes | Tool name, fixed as web_search |
| tools[].allowed_domains | array | No | Allowlist of domains to search |
| tools[].blocked_domains | array | No | Blocklist of domains to exclude |
| tools[].max_uses | number | No | Max number of searches in a single request |
| tools[].user_location | object | No | User location info |
| tools[].user_location.type | string | Yes | Location type, fixed as approximate |
| tools[].user_location.city | string | No | City name |
| tools[].user_location.country | string | No | Country code (ISO 3166-1 alpha-2) |
| tools[].user_location.region | string | No | Region |
| tools[].user_location.timezone | string | No | Timezone (IANA format) |

Example

cURL

curl -X POST "https://api.shuyou.ai/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "max_tokens": 4096,
    "messages": [
      {
        "role": "user",
        "content": "Please search for recent AI news"
      }
    ],
    "tools": [
      {
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": 3,
        "user_location": {
          "type": "approximate",
          "country": "CN",
          "timezone": "Asia/Shanghai"
        }
      }
    ]
  }'

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.shuyou.ai",
});

async function messageWithWebSearch() {
  const response = await client.messages.create({
    model: "anthropic/claude-sonnet-4.5",
    max_tokens: 4096,
    messages: [
      {
        role: "user",
        content: "Please search for recent AI news",
      },
    ],
    tools: [
      {
        type: "web_search_20250305",
        name: "web_search",
        max_uses: 3,
        user_location: {
          type: "approximate",
          country: "CN",
          timezone: "Asia/Shanghai",
        },
      } as any,
    ],
  });

  // Process response content
  for (const block of response.content) {
    if (block.type === "text") {
      console.log(block.text);
    } else if (block.type === "web_search_tool_result") {
      console.log("\nSearch results:");
      if (Array.isArray(block.content)) {
        block.content.forEach((result: any) => {
          console.log(`- ${result.title}: ${result.url}`);
        });
      }
    }
  }

  // View Web Search usage stats
  if (response.usage?.server_tool_use) {
    console.log(
      `\nWeb Search request count: ${response.usage.server_tool_use.web_search_requests}`,
    );
  }
}

messageWithWebSearch();

Python

import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api.shuyou.ai"
)

response = client.messages.create(
    model="anthropic/claude-sonnet-4.5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Please search for recent AI news"
        }
    ],
    tools=[
        {
            "type": "web_search_20250305",
            "name": "web_search",
            "max_uses": 3,
            "user_location": {
                "type": "approximate",
                "country": "CN",
                "timezone": "Asia/Shanghai"
            }
        }
    ]
)

# Process response content
for block in response.content:
    if block.type == "text":
        print(block.text)
    elif block.type == "web_search_tool_result":
        print("\nSearch results:")
        # content is a list of result objects on success (an error object otherwise);
        # the SDK returns objects, so use attribute access rather than .get()
        if isinstance(block.content, list):
            for result in block.content:
                print(f"- {result.title}: {result.url}")

# View Web Search usage stats
server_tool_use = getattr(response.usage, "server_tool_use", None)
if server_tool_use:
    print(f"\nWeb Search request count: {server_tool_use.web_search_requests}")

3. Responses API (OpenAI Responses)

The Responses API enables Web Search using the web_search family of types within the tools parameter.

Supported Web Search Types

| Type | Description |
| --- | --- |
| web_search | Web search (generally available) |
| web_search_2025_08_26 | Web search 2025 version |
| web_search_preview | Web search preview |
| web_search_preview_2025_03_11 | Web search preview 2025 version |

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tools[].type | string | Yes | Web Search type |
| tools[].search_context_size | string | No | Search context size: low / medium / high |
| tools[].filters | object | No | Search filters (only for web_search type) |
| tools[].filters.allowed_domains | array | No | Allowlist of domains |
| tools[].user_location | object | No | User location info |
| tools[].user_location.type | string | Yes | Location type, fixed as approximate |
| tools[].user_location.city | string | No | City name |
| tools[].user_location.country | string | No | Country code (2-letter ISO) |
| tools[].user_location.region | string | No | Region/state code |
| tools[].user_location.timezone | string | No | Timezone (IANA format) |

Example

cURL (non-streaming)

curl -X POST "https://api.shuyou.ai/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-5.2",
    "input": "What is the latest iPhone model this year? What new features does it have?",
    "tools": [
      {
        "type": "web_search",
        "search_context_size": "high",
        "user_location": {
          "type": "approximate",
          "country": "CN",
          "timezone": "Asia/Shanghai"
        }
      }
    ]
  }'

cURL (streaming)

curl -X POST "https://api.shuyou.ai/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-5.2",
    "input": "What is the most important tech news today?",
    "stream": true,
    "tools": [
      {
        "type": "web_search_preview",
        "search_context_size": "medium"
      }
    ]
  }'

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.shuyou.ai/v1",
});

async function responsesWithWebSearch() {
  // Non-streaming request
  const response = await client.responses.create({
    model: "openai/gpt-5.2",
    input:
      "What is the latest iPhone model this year? What new features does it have?",
    tools: [
      {
        type: "web_search",
        search_context_size: "high",
        user_location: {
          type: "approximate",
          country: "CN",
          timezone: "Asia/Shanghai",
        },
      },
    ],
  } as any);

  // Process output
  for (const item of response.output) {
    if (item.type === "message") {
      for (const content of item.content) {
        if (content.type === "output_text") {
          console.log(content.text);

          // Print citations
          if (content.annotations) {
            console.log("\nCitations:");
            content.annotations.forEach((annotation: any) => {
              if (annotation.type === "url_citation") {
                console.log(
                  `- ${annotation.url_citation.title}: ${annotation.url_citation.url}`,
                );
              }
            });
          }
        }
      }
    } else if (item.type === "web_search_call") {
      console.log(`\nWeb Search status: ${item.status}`);
    }
  }
}

// Streaming request
async function responsesWithWebSearchStream() {
  const stream = await client.responses.create({
    model: "openai/gpt-5.2",
    input: "What is the most important tech news today?",
    stream: true,
    tools: [
      {
        type: "web_search_preview",
        search_context_size: "medium",
      },
    ],
  } as any);

  for await (const event of stream) {
    if (event.type === "response.web_search_call.in_progress") {
      console.log("🔍 Web Search call started");
    } else if (event.type === "response.web_search_call.searching") {
      console.log("🔎 Searching...");
    } else if (event.type === "response.web_search_call.completed") {
      console.log("✅ Search completed");
    } else if (event.type === "response.output_text.delta") {
      process.stdout.write(event.delta);
    }
  }
}

responsesWithWebSearch();

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.shuyou.ai/v1"
)

# Non-streaming request
response = client.responses.create(
    model="openai/gpt-5.2",
    input="What is the latest iPhone model this year? What new features does it have?",
    tools=[
        {
            "type": "web_search",
            "search_context_size": "high",
            "user_location": {
                "type": "approximate",
                "country": "CN",
                "timezone": "Asia/Shanghai"
            }
        }
    ]
)

# Process output
for item in response.output:
    if item.type == "message":
        for content in item.content:
            if content.type == "output_text":
                print(content.text)

                # Print citations
                if hasattr(content, 'annotations') and content.annotations:
                    print("\nCitations:")
                    for annotation in content.annotations:
                        if annotation.type == "url_citation":
                            print(f"- {annotation.url_citation.title}: {annotation.url_citation.url}")
    elif item.type == "web_search_call":
        print(f"\nWeb Search status: {item.status}")


# Streaming request
def responses_with_web_search_stream():
    stream = client.responses.create(
        model="openai/gpt-5.2",
        input="What is the most important tech news today?",
        stream=True,
        tools=[
            {
                "type": "web_search_preview",
                "search_context_size": "medium"
            }
        ]
    )

    for event in stream:
        if event.type == "response.web_search_call.in_progress":
            print("🔍 Web Search call started")
        elif event.type == "response.web_search_call.searching":
            print("🔎 Searching...")
        elif event.type == "response.web_search_call.completed":
            print("✅ Search completed")
        elif event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)

responses_with_web_search_stream()

4. Vertex AI API (Google-compatible)

The Vertex AI API enables Google Search Grounding via googleSearch in the tools parameter.

Parameters

In Vertex AI, Web Search is enabled via the googleSearch tool, and source information is returned in groundingMetadata in the response.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| tools[].googleSearch | object | Yes | Google Search configuration (an empty object enables it) |

Grounding Information in the Response

| Field | Type | Description |
| --- | --- | --- |
| groundingMetadata.webSearchQueries | array | Executed search queries |
| groundingMetadata.groundingChunks | array | Evidence chunks |
| groundingMetadata.groundingChunks[].web.uri | string | Source URL |
| groundingMetadata.groundingChunks[].web.title | string | Source title |
| groundingMetadata.groundingChunks[].web.domain | string | Source domain |
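
When working with the raw JSON body rather than an SDK object, the grounding fields listed above can be read with plain dictionary access. This is a minimal sketch under that assumption; GroundingInfo and read_grounding are hypothetical names, and only the first candidate is inspected.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class GroundingInfo:
    """Search queries and web sources pulled from groundingMetadata."""

    queries: list[str] = field(default_factory=list)
    sources: list[dict[str, str]] = field(default_factory=list)


def read_grounding(response: dict[str, Any]) -> GroundingInfo:
    """Read grounding data from a raw Vertex AI-style JSON response (hypothetical helper)."""
    info = GroundingInfo()
    candidates = response.get("candidates") or []
    if not candidates:
        return info

    # groundingMetadata is absent when the model answered without searching.
    metadata = candidates[0].get("groundingMetadata") or {}
    info.queries = list(metadata.get("webSearchQueries") or [])

    for chunk in metadata.get("groundingChunks") or []:
        web = chunk.get("web")
        if web:  # skip chunks that carry no web source
            info.sources.append(
                {key: web.get(key, "") for key in ("title", "uri", "domain")}
            )
    return info
```

An empty GroundingInfo (no queries, no sources) indicates that the response carried no grounding data.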

Example

TypeScript

import { GoogleGenAI } from "@google/genai";

// Use the ShuYou proxy
const client = new GoogleGenAI({
  apiKey: "YOUR_API_KEY",
  vertexai: true,
  httpOptions: {
    baseUrl: "https://api.shuyou.ai",
    apiVersion: "v1",
  },
});

async function generateWithGoogleSearch() {
  const response = await client.models.generateContent({
    model: "google/gemini-2.0-flash",
    contents: "Please tell me today's top tech news headlines",
    config: {
      tools: [{ googleSearch: {} }],
      temperature: 0.7,
      maxOutputTokens: 2048,
    },
  });

  // Get generated text
  console.log("Answer:", response.text);

  // Get Grounding info
  const groundingMetadata = response.candidates?.[0]?.groundingMetadata;
  if (groundingMetadata) {
    console.log("\nSearch queries:", groundingMetadata.webSearchQueries);

    if (groundingMetadata.groundingChunks) {
      console.log("\nCitations:");
      groundingMetadata.groundingChunks.forEach((chunk: any) => {
        if (chunk.web) {
          console.log(`- ${chunk.web.title}: ${chunk.web.uri}`);
        }
      });
    }
  }
}

// Streaming request
async function generateWithGoogleSearchStream() {
  const response = await client.models.generateContentStream({
    model: "google/gemini-2.0-flash",
    contents: "What are the recent major developments in AI?",
    config: {
      tools: [{ googleSearch: {} }],
    },
  });

  console.log("Answer:");
  for await (const chunk of response) {
    if (chunk.text) {
      process.stdout.write(chunk.text);
    }

    // The final chunk may include groundingMetadata
    const groundingMetadata = chunk.candidates?.[0]?.groundingMetadata;
    if (groundingMetadata?.groundingChunks) {
      console.log("\n\nCitations:");
      groundingMetadata.groundingChunks.forEach((c: any) => {
        if (c.web) {
          console.log(`- ${c.web.title}: ${c.web.uri}`);
        }
      });
    }
  }
}

generateWithGoogleSearch();

Python

from google import genai
from google.genai import types

# Configure to use the ShuYou proxy
client = genai.Client(
    api_key="YOUR_API_KEY",
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://api.shuyou.ai'
    ),
)

# Non-streaming request
def generate_with_google_search():
    response = client.models.generate_content(
        model="google/gemini-2.0-flash",
        contents="Please tell me today's top tech news headlines",
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())],
            temperature=0.7,
            max_output_tokens=2048
        )
    )

    # Get generated text
    print("Answer:", response.text)

    # Get Grounding info
    if response.candidates and response.candidates[0].grounding_metadata:
        metadata = response.candidates[0].grounding_metadata

        if metadata.web_search_queries:
            print("\nSearch queries:", metadata.web_search_queries)

        if metadata.grounding_chunks:
            print("\nCitations:")
            for chunk in metadata.grounding_chunks:
                if chunk.web:
                    print(f"- {chunk.web.title}: {chunk.web.uri}")

# Streaming request
def generate_with_google_search_stream():
    response = client.models.generate_content_stream(
        model="google/gemini-2.0-flash",
        contents="What are the recent major developments in AI?",
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())]
        )
    )

    print("Answer:")
    for chunk in response:
        if chunk.text:
            print(chunk.text, end="", flush=True)

        # The final chunk may include grounding_metadata
        if chunk.candidates and chunk.candidates[0].grounding_metadata:
            metadata = chunk.candidates[0].grounding_metadata
            if metadata.grounding_chunks:
                print("\n\nCitations:")
                for c in metadata.grounding_chunks:
                    if c.web:
                        print(f"- {c.web.title}: {c.web.uri}")

generate_with_google_search()

Response Format Comparison

Chat Completions Response

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Based on the search results...",
        "annotations": [
          {
            "type": "url_citation",
            "url_citation": {
              "title": "Source Title",
              "url": "https://example.com/article",
              "start_index": 0,
              "end_index": 0
            }
          }
        ]
      }
    }
  ]
}

Messages Response

{
  "content": [
    {
      "type": "text",
      "text": "Based on the search results..."
    },
    {
      "type": "web_search_tool_result",
      "tool_use_id": "...",
      "content": [
        {
          "type": "web_search_result",
          "title": "Source Title",
          "url": "https://example.com/article"
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 100,
    "output_tokens": 200,
    "server_tool_use": {
      "web_search_requests": 2
    }
  }
}

Responses Response

{
  "output": [
    {
      "type": "web_search_call",
      "id": "ws_...",
      "status": "completed"
    },
    {
      "type": "message",
      "content": [
        {
          "type": "output_text",
          "text": "Based on the search results...",
          "annotations": [
            {
              "type": "url_citation",
              "url_citation": {
                "title": "Source Title",
                "url": "https://example.com/article"
              }
            }
          ]
        }
      ]
    }
  ]
}

Vertex AI Response

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Based on the search results..."
          }
        ]
      },
      "groundingMetadata": {
        "webSearchQueries": ["Tech news today"],
        "groundingChunks": [
          {
            "web": {
              "uri": "https://example.com/article",
              "title": "Source Title",
              "domain": "example.com"
            }
          }
        ]
      }
    }
  ]
}
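
Because each protocol reports sources in a different place, client code that supports more than one of them may want a single normalized view. The sketch below works on raw JSON dicts shaped like the four examples above and returns (title, url) pairs. The function name and the protocol labels ("chat_completions", "messages", "responses", "vertex_ai") are hypothetical, chosen only for this illustration.

```python
from typing import Any


def extract_citations(protocol: str, response: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (title, url) pairs from a raw JSON response (hypothetical helper)."""
    citations: list[tuple[str, str]] = []

    def add_url_citation(annotation: dict[str, Any]) -> None:
        if annotation.get("type") != "url_citation":
            return
        # Nested shape as in the examples above; fall back to flat
        # title/url fields in case the annotation is not nested.
        detail = annotation.get("url_citation") or annotation
        citations.append((detail.get("title", ""), detail.get("url", "")))

    if protocol == "chat_completions":
        for choice in response.get("choices", []):
            for annotation in choice.get("message", {}).get("annotations") or []:
                add_url_citation(annotation)
    elif protocol == "messages":
        for block in response.get("content", []):
            results = block.get("content")
            # On failure, content may be an error object instead of a list.
            if block.get("type") == "web_search_tool_result" and isinstance(results, list):
                for result in results:
                    citations.append((result.get("title", ""), result.get("url", "")))
    elif protocol == "responses":
        for item in response.get("output", []):
            if item.get("type") != "message":
                continue
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    for annotation in part.get("annotations") or []:
                        add_url_citation(annotation)
    elif protocol == "vertex_ai":
        for candidate in response.get("candidates", []):
            metadata = candidate.get("groundingMetadata") or {}
            for chunk in metadata.get("groundingChunks") or []:
                web = chunk.get("web")
                if web:
                    citations.append((web.get("title", ""), web.get("uri", "")))
    else:
        raise ValueError(f"unknown protocol: {protocol}")

    return citations
```

Keeping the per-protocol branches in one function makes it easier to render citations the same way regardless of which endpoint produced the answer.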

Streaming Events (Responses API)

When using streaming mode with the Responses API, you may receive the following Web Search-related events:

| Event Type | Description |
| --- | --- |
| response.web_search_call.in_progress | Web Search call started |
| response.web_search_call.searching | Search in progress |
| response.web_search_call.completed | Search completed |
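
The events above can be separated from the text deltas while a stream is consumed. The following is a minimal sketch that assumes each event has already been decoded into a dict with a type field (and a delta field for response.output_text.delta, as in the streaming examples earlier). The function name and the short status labels are hypothetical.

```python
from typing import Any, Iterable

# Short labels for the three Web Search events (labels are illustrative).
SEARCH_EVENT_LABELS = {
    "response.web_search_call.in_progress": "started",
    "response.web_search_call.searching": "searching",
    "response.web_search_call.completed": "completed",
}


def summarize_stream(events: Iterable[dict[str, Any]]) -> tuple[list[str], str]:
    """Collect search status changes and the concatenated answer text."""
    statuses: list[str] = []
    text_parts: list[str] = []

    for event in events:
        event_type = event.get("type")
        if event_type in SEARCH_EVENT_LABELS:
            statuses.append(SEARCH_EVENT_LABELS[event_type])
        elif event_type == "response.output_text.delta":
            text_parts.append(event.get("delta", ""))
        # Any other event type is ignored in this sketch.

    return statuses, "".join(text_parts)
```

In an interactive client, the status list would typically drive a progress indicator while the text is rendered incrementally.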

Best Practices

1. Choose the Right Search Context Size

  • low: Suitable for simple queries; faster responses and lower cost
  • medium: Balanced choice for most scenarios
  • high: Suitable for complex questions that require deeper research

2. Provide User Location Information

To get more relevant localized results, provide user location information:
{
  "user_location": {
    "type": "approximate",
    "city": "Shanghai",
    "country": "CN",
    "timezone": "Asia/Shanghai"
  }
}

3. Use Domain Filtering Appropriately

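In the Messages API, use allowed_domains or blocked_domains to control the search scope. Set only one of the two in a request; Anthropic's web search tool does not accept both together.
To restrict the search to specific domains:
{
  "type": "web_search_20250305",
  "name": "web_search",
  "allowed_domains": ["wikipedia.org", "github.com"]
}
To exclude specific domains instead:
{
  "type": "web_search_20250305",
  "name": "web_search",
  "blocked_domains": ["spam-site.com"]
}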
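
A small builder can keep Messages API tool definitions consistent. This is a sketch only: build_messages_web_search_tool is a hypothetical name, and the rule that allowed_domains and blocked_domains are mutually exclusive comes from Anthropic's web search tool documentation and is assumed here to apply unchanged through ShuYou.

```python
from typing import Any, Optional, Sequence


def build_messages_web_search_tool(
    allowed_domains: Optional[Sequence[str]] = None,
    blocked_domains: Optional[Sequence[str]] = None,
    max_uses: Optional[int] = None,
) -> dict[str, Any]:
    """Build a web_search_20250305 tool entry (hypothetical helper)."""
    # Assumption: the upstream API rejects a tool that sets both lists.
    if allowed_domains and blocked_domains:
        raise ValueError("set allowed_domains or blocked_domains, not both")
    if max_uses is not None and max_uses < 1:
        raise ValueError("max_uses must be a positive integer")

    tool: dict[str, Any] = {"type": "web_search_20250305", "name": "web_search"}
    if allowed_domains:
        tool["allowed_domains"] = list(allowed_domains)
    if blocked_domains:
        tool["blocked_domains"] = list(blocked_domains)
    if max_uses is not None:
        tool["max_uses"] = max_uses
    return tool
```

The result can be placed directly in the tools list of a Messages request.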

4. Limit the Number of Searches

In the Messages API, use max_uses to control the maximum number of searches per request to manage cost:
{
  "type": "web_search_20250305",
  "name": "web_search",
  "max_uses": 3
}

5. Handle Citation Information

Always check and display citation information in responses to help users verify the reliability of the sources.

Notes

  1. Billing: Web Search incurs additional charges; see the pricing documentation for details.
  2. Latency: Enabling Web Search increases response latency because a real-time search must be performed.
  3. Availability: Not all models support Web Search; confirm support for your target model.
  4. Result Accuracy: Web Search results come from the live web; accuracy depends on the search engine and source websites.

FAQ

Q: How can I tell whether a request actually used Web Search?

A: You can determine this in the following ways:
  • Chat Completions: Check for url_citation in message.annotations
  • Messages: Check usage.server_tool_use.web_search_requests
  • Responses: Look for web_search_call items in output
  • Vertex AI: Check whether groundingMetadata exists
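
The four checks listed above can be expressed as one function over the raw JSON response. This is a minimal sketch; the function name and the protocol labels are hypothetical, and the field names are the ones shown in the response examples in this document.

```python
from typing import Any


def web_search_was_used(protocol: str, response: dict[str, Any]) -> bool:
    """Apply the per-protocol check for Web Search usage (hypothetical helper)."""
    if protocol == "chat_completions":
        # Look for url_citation entries in message.annotations.
        return any(
            annotation.get("type") == "url_citation"
            for choice in response.get("choices", [])
            for annotation in choice.get("message", {}).get("annotations") or []
        )
    if protocol == "messages":
        # A non-zero web_search_requests counter means a search ran.
        usage = response.get("usage") or {}
        server_tool_use = usage.get("server_tool_use") or {}
        return server_tool_use.get("web_search_requests", 0) > 0
    if protocol == "responses":
        # Look for web_search_call items in output.
        return any(
            item.get("type") == "web_search_call"
            for item in response.get("output", [])
        )
    if protocol == "vertex_ai":
        # Look for a non-empty groundingMetadata on any candidate.
        return any(
            bool(candidate.get("groundingMetadata"))
            for candidate in response.get("candidates", [])
        )
    raise ValueError(f"unknown protocol: {protocol}")
```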

Q: Why are there sometimes no search results returned?

A: Possible reasons include:
  1. The question does not require real-time information; the model decides not to search
  2. Search results are not relevant to the question and are filtered out by the model
  3. Network issues cause the search to fail

Q: How can I optimize search performance?

A: Recommendations:
  1. Ask clear, specific questions
  2. Use an appropriate search context size
  3. Provide user location information to get localized results
  4. Use domain filtering in the Messages API to focus the search scope