Chat Completions
Overview
The chat completions endpoint is the primary way to send conversational inputs to MultiRoute and receive model-generated responses. You can use it for:
- Single-turn prompts or multi-turn conversations.
- Tooling and agent-style interactions (when supported by the selected model).
- Both streaming and non-streaming responses.
This endpoint uses a request and response schema compatible with common chat completion APIs, so existing clients can typically be adapted with minimal changes.
Endpoint
- Method: POST
- Path: /v1/chat/completions
- Base URL: https://api.multiroute.ai/v1
Full URL: https://api.multiroute.ai/v1/chat/completions
Authentication is required via the Authorization: Bearer <your-api-key> header. See Authentication for details.
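Since the same two headers are needed on every request, it can be convenient to build them once; a minimal sketch (the `auth_headers` helper is illustrative, not part of any SDK):

```python
import os
from typing import Optional

def auth_headers(api_key: Optional[str] = None) -> dict:
    """Build the headers required by the chat completions endpoint."""
    # Fall back to the MULTIROUTE_API_KEY environment variable.
    key = api_key or os.environ.get("MULTIROUTE_API_KEY")
    if not key:
        raise RuntimeError("No API key given and MULTIROUTE_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```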
Request body
The request body is JSON with at least a model and a messages array, plus optional generation parameters.
Core fields
- model (string, required): The model identifier to use (for example, "gpt-4.1" or a MultiRoute-specific routing alias such as "multiroute-latest").
- messages (array, required): Ordered list of message objects representing the conversation. Each message has:
  - role (string): "system", "user", "assistant", or "tool" (depending on capabilities).
  - content (string or structured content): The message text or content payload.
- temperature (number, optional): Sampling temperature in the range [0, 2]. Higher values produce more random outputs; the default may vary by model.
- max_tokens (integer, optional): Maximum number of tokens to generate in the completion.
- stream (boolean, optional): If true, the response is returned as a stream of JSON lines (Server-Sent Events style) rather than a single JSON object.
Additional fields such as top_p, tools, tool_choice, and response_format are also available; see the API schema for the exact shape.
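The fields above can be assembled into a request body programmatically, with a light sanity check on the temperature range; a sketch (the builder function is ours, not part of an official SDK):

```python
def build_request_body(model, messages, temperature=None,
                       max_tokens=None, stream=False, **extra):
    """Assemble a chat completions request body from the core fields."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in the range [0, 2]")
    body = {"model": model, "messages": messages, "stream": stream}
    # Optional generation parameters are included only when set.
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    body.update(extra)  # e.g. top_p, tools, response_format
    return body
```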
Example request body
{
  "model": "multiroute-chat-latest",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Give me three ideas for a weekend project." }
  ],
  "temperature": 0.7,
  "max_tokens": 256,
  "stream": false
}
Response body
For non-streaming requests ("stream": false or omitted), the endpoint returns a single JSON object with the following structure:
- id (string): Unique identifier for the completion.
- object (string): Object type, e.g. "chat.completion".
- created (integer): Unix timestamp (seconds).
- model (string): The model that produced the response. This may be a resolved provider-specific model ID, even if you requested a routing alias.
- choices (array): List of completion choices. Each choice includes:
  - index (integer): Position of this choice in the list.
  - message (object): The assistant message generated.
    - role (string): Typically "assistant".
    - content (string or structured content): The generated text/content.
  - finish_reason (string or null): Why the generation stopped, e.g. "stop", "length", or other model-specific reasons.
- usage (object, optional): Token usage summary.
  - prompt_tokens (integer)
  - completion_tokens (integer)
  - total_tokens (integer)
Example response (non-streaming)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "multiroute-chat-latest",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here are three ideas for a weekend project..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 128,
    "total_tokens": 152
  }
}
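Because finish_reason distinguishes a natural stop from a max_tokens cutoff, callers may want to check it before assuming the output is complete; a small sketch (the helper name is ours):

```python
def extract_completion(response: dict) -> tuple:
    """Return (content, truncated) from a non-streaming response."""
    choice = response["choices"][0]
    content = choice["message"]["content"]
    # "length" means generation stopped because it hit max_tokens.
    truncated = choice.get("finish_reason") == "length"
    return content, truncated
```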
For streaming responses ("stream": true), the endpoint sends a sequence of JSON fragments (typically as data: {...} lines) that can be assembled into a final message. The exact streaming protocol matches common chat completion streaming conventions.
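The data: lines can be assembled client-side into the final message; a sketch assuming each chunk follows the common chat-completion delta shape (choices[0].delta.content) and a [DONE] sentinel, both of which may vary by provider:

```python
import json

def assemble_stream(lines):
    """Concatenate content deltas from SSE-style 'data: {...}' lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # common end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)
```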
Examples
Non-streaming
Python
import os

import requests

API_KEY = os.environ.get("MULTIROUTE_API_KEY")

def run_chat_completion():
    url = "https://api.multiroute.ai/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    json_body = {
        "model": "multiroute-chat-latest",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "List three ways to test an API."},
        ],
        "temperature": 0.5,
        "max_tokens": 128,
    }
    resp = requests.post(url, headers=headers, json=json_body, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    print(data["choices"][0]["message"]["content"])

if __name__ == "__main__":
    run_chat_completion()
TypeScript
const apiKey = process.env.MULTIROUTE_API_KEY!;

async function runChatCompletion() {
  const response = await fetch("https://api.multiroute.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "multiroute-chat-latest",
      messages: [
        { role: "system", content: "You are a concise assistant." },
        { role: "user", content: "Summarize the benefits of MultiRoute in two sentences." }
      ],
      temperature: 0.3,
      max_tokens: 128,
    }),
  });

  if (!response.ok) {
    const errorBody = await response.text();
    throw new Error(`Request failed: ${response.status} ${errorBody}`);
  }

  const data = await response.json();
  console.log(data.choices[0]?.message?.content);
}

runChatCompletion().catch(console.error);
cURL
curl https://api.multiroute.ai/v1/chat/completions \
  -H "Authorization: Bearer $MULTIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "multiroute-chat-latest",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Tell me a short joke about APIs." }
    ],
    "temperature": 0.4,
    "max_tokens": 64
  }'
cURL (streaming)
curl -N https://api.multiroute.ai/v1/chat/completions \
  -H "Authorization: Bearer $MULTIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "multiroute-chat-latest",
    "messages": [
      { "role": "user", "content": "Stream the response to this message." }
    ],
    "stream": true
  }'