SiliconFlow

Create a chat request (Anthropic)

Creates a model response for the given chat conversation.

POST /messages

Authorization: Bearer <token> (required)

Use the following format for authentication: Bearer <token>

In: header

model (string, required)

The name of the model to use. To maintain service quality, the set of models offered by this service changes periodically, including but not limited to models going online or offline and adjustments to model service capabilities. Where feasible, such changes are communicated through announcements or message pushes. For the complete list of available models, see Models.

Example: "Pro/zai-org/GLM-4.7"

messages (array<object>, required)

A list of messages comprising the conversation so far.

system (string | array<Text>)

System prompt. A system prompt provides context and instructions to the LLM, such as a particular goal or role.

stop_sequences (Stop Sequences)

Custom text sequences that will cause the model to stop generating.

Our models will normally stop when they have naturally completed their turn, which will result in a response stop_reason of "end_turn".

If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.
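The behaviour described above can be sketched as follows. The helper names and the sample response are hypothetical, and passing stop_sequences as a list of strings is an assumption based on the Anthropic Messages format rather than something this page states.

```python
from typing import List, Optional


def payload_with_stops(model: str, user_text: str, stops: List[str],
                       max_tokens: int = 1024) -> dict:
    """Build a request body that asks the model to halt on any of `stops`."""
    return {
        "model": model,
        "max_tokens": max_tokens,  # illustrative cap, not a recommended value
        "messages": [{"role": "user", "content": user_text}],
        "stop_sequences": stops,  # assumed to be a list of strings
    }


def matched_stop(response: dict) -> Optional[str]:
    """Return the custom stop sequence that ended generation, or None."""
    return response.get("stop_sequence")


# Hand-written response fragment, for illustration only.
sample = {"stop_reason": "stop_sequence", "stop_sequence": "END"}
print(matched_stop(sample))
```

Reading the stop_sequence field directly keeps the check independent of how stop_reason is reported.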

stream (boolean)

If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].

Value in: false | true
Example: true
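A client consuming the stream has to pick out the data: lines and stop at the [DONE] sentinel. The sketch below shows one way to do that on already-decoded text lines. The function name and the sample events are made up for illustration; the real event payloads are not specified on this page.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line until `data: [DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields such as "event:"
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel described above
        yield json.loads(data)


# Illustrative lines only; real payloads will carry more fields.
sample_stream = [
    "event: message_start",
    'data: {"type": "message_start"}',
    "",
    'data: {"type": "example_event"}',
    "",
    "data: [DONE]",
    'data: {"type": "never_reached"}',
]

for event in iter_sse_data(sample_stream):
    print(event["type"])
```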
max_tokens (integer, required)

The maximum number of tokens to generate before stopping.

Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.

Different models have different maximum values for this parameter. See models for details.
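Putting the required pieces together (the Bearer header plus model, messages, and max_tokens), a request could be assembled roughly as below. This is a sketch using only the standard library; build_request is a hypothetical helper, the URL is the one used in the request samples on this page, and 1024 is an arbitrary cap.

```python
import json
import urllib.request

# Endpoint as it appears in the request samples on this page.
API_URL = "https://api.siliconflow.cn/v1/messages"


def build_request(api_key: str, model: str, user_text: str,
                  max_tokens: int = 1024) -> urllib.request.Request:
    """Assemble a POST request carrying the three required body fields."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,  # arbitrary illustrative cap
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("YOUR_API_KEY", "Pro/zai-org/GLM-4.7", "Hello")
print(req.get_method(), req.full_url)
```

The request is only constructed here; passing it to urllib.request.urlopen would actually send it.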

temperature (number)

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Format: float
Range: value <= 2
Example: 0.7

top_p (number)

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.

Format: float
Range: 0.1 <= value <= 1
Example: 0.7

top_k (number)

Format: float
Range: value <= 50
Example: 50
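The documented ranges for temperature, top_p, and top_k can be checked on the client before a request is sent. The sketch below is one hypothetical way to do so; the function name and messages are not part of the API, and no lower bound is enforced for top_k because none is documented here.

```python
from typing import List, Optional


def sampling_problems(temperature: Optional[float] = None,
                      top_p: Optional[float] = None,
                      top_k: Optional[float] = None) -> List[str]:
    """Return a list of range violations; an empty list means all values fit."""
    problems = []
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    if top_p is not None and not 0.1 <= top_p <= 1:
        problems.append("top_p must be between 0.1 and 1")
    if top_k is not None and top_k > 50:
        problems.append("top_k must be at most 50")
    return problems


print(sampling_problems(temperature=0.7, top_p=0.7, top_k=50))
print(sampling_problems(temperature=2.5))
```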
tools (array<object>)

Each tool definition includes:

  • name: Name of the tool.

  • description: Optional, but strongly-recommended description of the tool.

  • input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.
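A tool definition with the three fields listed above might be built like this. make_tool is a hypothetical convenience function; the get_weather values mirror the request samples on this page, and the consistency check on required is an extra safeguard rather than an API rule.

```python
from typing import Dict, List


def make_tool(name: str, description: str,
              properties: Dict[str, dict], required: List[str]) -> dict:
    """Build one entry for the `tools` array with an object-typed input schema."""
    missing = [key for key in required if key not in properties]
    if missing:
        raise ValueError(f"required keys not declared in properties: {missing}")
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


weather_tool = make_tool(
    "get_weather",
    "Get the current weather in a given location",
    {"location": {"type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"}},
    ["location"],
)
print(sorted(weather_tool))
```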

tool_choice (Auto | Tool | None)

How the model should use the provided tools. The model can use a specific tool, any available tool, decide by itself, or not use tools at all.

Auto: The model will automatically decide whether to use tools.

  • disable_parallel_tool_use (Disable Parallel Tool Use): Whether to disable parallel tool use. Defaults to false. If set to true, the model will output at most one tool use.

  • type (Type, required): Value in: "auto"

Tool: The model will use the specified tool with tool_choice.name.

  • disable_parallel_tool_use (Disable Parallel Tool Use): Whether to disable parallel tool use. Defaults to false. If set to true, the model will output exactly one tool use.

  • name (Name, required): The name of the tool to use.

  • type (Type, required): Value in: "tool"

None: The model will not be allowed to use tools.

  • type (Type, required): Value in: "none"
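The three tool_choice shapes (auto, tool, none) differ only in which keys they carry, so a small constructor can keep them consistent. The function below is a hypothetical sketch, not part of any SDK.

```python
from typing import Optional


def tool_choice(mode: str, name: Optional[str] = None,
                disable_parallel: bool = False) -> dict:
    """Build a tool_choice object for mode "auto", "tool", or "none"."""
    if mode == "none":
        return {"type": "none"}  # the none variant documents no other fields
    if mode == "auto":
        choice = {"type": "auto"}
    elif mode == "tool":
        if not name:
            raise ValueError('mode "tool" requires a tool name')
        choice = {"type": "tool", "name": name}
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    if disable_parallel:
        choice["disable_parallel_tool_use"] = True
    return choice


print(tool_choice("tool", name="get_weather", disable_parallel=True))
```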

Response Body

The response from the model. The response header contains the x-siliconcloud-trace-id field, which serves as a unique identifier for tracing requests, facilitating log queries and issue troubleshooting.
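When reporting a problem it helps to log the trace ID alongside the request. A minimal sketch of reading it from a header mapping follows; trace_id is a hypothetical helper and the sample value is invented. The lookup is case-insensitive because HTTP header names are.

```python
from typing import Mapping, Optional

TRACE_HEADER = "x-siliconcloud-trace-id"


def trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the trace ID from response headers, ignoring header-name case."""
    for key, value in headers.items():
        if key.lower() == TRACE_HEADER:
            return value
    return None


# Invented header values, for illustration only.
print(trace_id({"Content-Type": "application/json",
                "X-Siliconcloud-Trace-Id": "example-trace-id"}))
```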

id (string)

type (Type)

Object type.

For Messages, this is always "message".

Default: "message"
Value in: "message"

role (Role)

Conversational role of the generated message.

This will always be "assistant".

Default: "assistant"
Value in: "assistant"

content (Content)

Content generated by the model.

This is an array of content blocks, each of which has a type that determines its shape.

Example:

[{"type": "text", "text": "Hi"}]

If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.

For example, if the input messages were:

[
  {"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
  {"role": "assistant", "content": "The best answer is ("}
]

Then the response content might be:

[{"type": "text", "text": "B)"}]
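Because the response continues from the prefilled assistant turn, the client has to join the two pieces itself. The sketch below replays the example above with a hand-written response; joined_text is a hypothetical helper.

```python
def joined_text(content: list) -> str:
    """Concatenate the text of all "text" blocks, skipping other block types."""
    return "".join(block["text"] for block in content if block.get("type") == "text")


prefill = "The best answer is ("
messages = [
    {"role": "user",
     "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
    {"role": "assistant", "content": prefill},  # the model continues from here
]

# Response content copied from the example above.
response_content = [{"type": "text", "text": "B)"}]

print(prefill + joined_text(response_content))  # The best answer is (B)
```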

model (Model)

The model that handled the request.

stop_reason (Stop Reason)

The reason the model stopped generating.

This may be one of the following values:

  • "end_turn": the model reached a natural stopping point
  • "stop_sequence": one of your provided custom stop_sequences was generated
  • "max_tokens": we exceeded the requested max_tokens or the model's maximum
  • "tool_use": the model invoked one or more tools
  • "refusal": when streaming classifiers intervene to handle potential policy violations

In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.
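Client code usually branches on stop_reason to decide what to do next. The sketch below is one hypothetical way to organise that. The name field on tool_use blocks is an assumption taken from the Anthropic Messages format, and the returned strings are placeholders.

```python
def next_step(response: dict) -> str:
    """Map a response to a short description of what the caller should do next."""
    reason = response.get("stop_reason")
    if reason is None:
        return "still streaming"  # null only in the message_start event
    if reason == "tool_use":
        names = [block.get("name", "?")  # "name" field assumed, see above
                 for block in response.get("content", [])
                 if block.get("type") == "tool_use"]
        return "run tools: " + ", ".join(names)
    if reason == "max_tokens":
        return "output truncated"
    if reason == "refusal":
        return "request refused"
    if response.get("stop_sequence") is not None:
        return "stopped on custom sequence"
    return "complete"


print(next_step({"stop_reason": "end_turn", "stop_sequence": None}))
```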

stop_sequence (Stop Sequence)

Which custom stop sequence was generated, if any.

This value will be a non-null string if one of your custom stop sequences was generated.

usage (Usage)


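# Basic streaming request. max_tokens is required; 1024 is an illustrative value.
# The system prompt is passed in the top-level "system" field.
curl --request POST \
  --url https://api.siliconflow.cn/v1/messages \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "Pro/zai-org/GLM-4.7",
    "max_tokens": 1024,
    "system": "You are a helpful assistant",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself"}
    ],
    "stream": true
  }'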
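# Basic streaming request with the requests library.
import requests

url = "https://api.siliconflow.cn/v1/messages"
payload = {
    "model": "Pro/zai-org/GLM-4.7",
    "max_tokens": 1024,  # required; 1024 is an illustrative value
    "messages": [
        {
            "role": "user",
            "content": "What opportunities and challenges will the Chinese large model industry face in 2025?"
        }
    ],
    "stream": True
}
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# stream=True lets requests hand over the Server-Sent Events as they arrive.
response = requests.post(url, json=payload, headers=headers, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode("utf-8"))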
// Basic streaming request with fetch.
fetch('https://api.siliconflow.cn/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'Pro/zai-org/GLM-4.7',
    max_tokens: 1024, // required; 1024 is an illustrative value
    system: 'You are a helpful assistant',
    messages: [
      {role: 'user', content: 'Hello, please introduce yourself'}
    ],
    stream: true
  })
})
// With stream: true the body is a Server-Sent Events stream, not a single JSON document.
.then(response => response.text())
.then(data => console.log(data))
.catch(error => console.error('Error:', error));
# Tool-use request. max_tokens is required; 1024 is an illustrative value.
curl --location 'https://api.siliconflow.cn/v1/messages' \
--header 'x-api-key: YOUR_API_KEY' \
--header 'content-type: application/json' \
--data '{
  "model": "Pro/zai-org/GLM-4.7",
  "max_tokens": 1024,
  "tools": [
    {
      "name": "get_weather",
      "description": "Get the current weather in a given location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      }
    }
  ],
  "tool_choice": {"type": "any"},
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in San Francisco?"
    }
  ]
}'
# Tool-use request with the requests library.
import requests

url = "https://api.siliconflow.cn/v1/messages"
payload = {
    "model": "Pro/zai-org/GLM-4.7",
    "max_tokens": 1024,  # required; 1024 is an illustrative value
    "tools": [
        {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    ],
    "tool_choice": {"type": "any"},
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in San Francisco?"
        }
    ]
}
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.text)
// Tool-use request with fetch.
const url = 'https://api.siliconflow.cn/v1/messages';
const apiKey = 'YOUR_API_KEY';
const requestData = {
  model: "Pro/zai-org/GLM-4.7",
  max_tokens: 1024, // required; 1024 is an illustrative value
  tools: [
    {
      name: "get_weather",
      description: "Get the current weather in a given location",
      input_schema: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA"
          }
        },
        required: ["location"]
      }
    }
  ],
  tool_choice: { type: "any" },
  messages: [
    {
      role: "user",
      content: "What is the weather like in San Francisco?"
    }
  ]
};
fetch(url, {
  method: 'POST',
  headers: {
    'x-api-key': apiKey,
    'content-type': 'application/json'
  },
  body: JSON.stringify(requestData)
})
  .then(response => {
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    return response.json();
  })
  .then(data => {
    console.log('Response:', data);
  })
  .catch(error => {
    console.error('Error:', error);
  });

{
  "content": [
    {
      "type": "thinking",
      "thinking": "...",
      "signature": "tvshsltrjs"
    },
    {
      "text": "Hello! I'm GLM, trained by Z.ai. How can I assist you today? Whether you have questions or just want to chat, I'm happy to help.",
      "type": "text"
    }
  ],
  "id": "msg_T15jjp718fACotrwiLp3KwVu",
  "model": "Pro/zai-org/GLM-4.7",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 6,
    "output_tokens": 215
  }
}
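A response like the one above mixes thinking and text blocks, so display code typically filters by block type and reads token counts from usage. The sketch below works on an abridged copy of that response (the text field is shortened); the helper names are hypothetical.

```python
import json

# Abridged copy of the sample response above.
raw = """
{
  "content": [
    {"type": "thinking", "thinking": "...", "signature": "tvshsltrjs"},
    {"type": "text", "text": "Hello! I'm GLM, trained by Z.ai."}
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 6, "output_tokens": 215}
}
"""


def blocks_of_type(response: dict, block_type: str) -> list:
    """Return the content blocks whose "type" equals block_type."""
    return [b for b in response.get("content", []) if b.get("type") == block_type]


def total_tokens(response: dict) -> int:
    """Sum input and output tokens reported in the usage object."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


response = json.loads(raw)
print(blocks_of_type(response, "text")[0]["text"])
print(total_tokens(response))  # 221
```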

{
  "code": 20012,
  "message": "string",
  "data": "string"
}
"Invalid token"
"Forbidden"
"404 page not found"
{
  "message": "Request was rejected due to rate limiting. If you want more, please contact contact@siliconflow.cn. Details:TPM limit reached.",
  "data": "string"
}
{
  "code": 50505,
  "message": "Model service overloaded. Please try again later.",
  "data": "string"
}
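The rate-limit and overload errors above are transient, while errors such as "Invalid token" are not, so a client may want to retry only the former. The sketch below is a heuristic built on the sample bodies shown here: the helper names are hypothetical, matching on the message text is fragile, and the delay schedule is an arbitrary choice.

```python
import json
from typing import List

RETRYABLE_CODES = {50505}  # "Model service overloaded" in the sample above


def is_retryable(body_text: str) -> bool:
    """Guess from an error body whether retrying later could succeed."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(body, dict):
        return False  # plain-string bodies such as "Invalid token"
    if body.get("code") in RETRYABLE_CODES:
        return True
    return "rate limiting" in str(body.get("message", ""))


def backoff_delays(attempts: int, base: float = 1.0) -> List[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** i for i in range(attempts)]


print(is_retryable('{"code": 50505, "message": "Model service overloaded. Please try again later.", "data": "string"}'))
print(backoff_delays(3))
```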
"string"