POST /models/v2/openai/v1/chat/completions
Chat Completions
curl --request POST \
  --url https://api.bytez.com/models/v2/openai/v1/chat/completions \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "stream": false
}'
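The curl call above can also be made from Python. The sketch below assembles the same request with the standard library only; the helper name `build_chat_request` and the example conversation are illustrative, while the URL, headers, and field names come from this reference.

```python
import json

API_URL = "https://api.bytez.com/models/v2/openai/v1/chat/completions"

def build_chat_request(api_key, model, messages,
                       max_tokens=256, temperature=0.7, stream=False):
    """Assemble headers and JSON body for the chat completions endpoint.

    Defaults mirror the documented ones (max_tokens=256,
    temperature=0.7, stream=false).
    """
    headers = {
        "Authorization": api_key,           # token for authentication
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    })
    return headers, body

headers, body = build_chat_request(
    "YOUR_BYTEZ_KEY",                       # placeholder, not a real key
    "Qwen/Qwen3-1.7B",
    [{"role": "system", "content": "You are a helpful assistant."},
     {"role": "user", "content": "Hello!"}],
)
# Send with any HTTP client, e.g.:
#   urllib.request.urlopen(urllib.request.Request(API_URL, body.encode(), headers))
```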
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>"
      },
      "finish_reason": "<string>"
    }
  ]
}

Headers

Authorization
string
required

Token for authentication

provider-key
string

Optional provider API key for running requests against closed-source models. Required for the anthropic and cohere providers; for all other providers, supplying it removes rate limits.

Body

application/json
model
string
required

The ID of the model to run (e.g., Qwen/Qwen3-1.7B, openai/gpt-4)

messages
object[]
required

Conversation messages (OpenAI chat format)
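Messages use the standard OpenAI chat shape: an ordered list of role/content objects, where the role is typically system, user, or assistant. A minimal multi-turn example (the conversation content itself is illustrative):

```python
# Each entry is a {"role": ..., "content": ...} object; earlier turns
# give the model conversational context for the final user turn.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Italy?"},  # the turn to answer
]
```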

max_tokens
integer
default:256

Maximum number of tokens to generate

temperature
number
default:0.7

Sampling temperature

stream
boolean
default:false

Whether to stream responses

Response

Successful model completion

id
string

Unique ID for this completion

object
string

Type of returned object (usually chat.completion)

created
integer

Unix timestamp of completion

choices
object[]

Generated completions
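Given the response shape documented above, the generated text lives at choices[0].message.content. A small parsing sketch using a stand-in response (the field values here are made up; the structure matches the reference):

```python
import json

# Stand-in for a real API response with the documented shape.
raw = json.dumps({
    "id": "cmpl-123",
    "object": "chat.completion",
    "created": 1700000000,
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hello there!"},
         "finish_reason": "stop"}
    ],
})

response = json.loads(raw)
choice = response["choices"][0]
text = choice["message"]["content"]    # the generated completion text
done_reason = choice["finish_reason"]  # e.g. "stop" when generation finished
```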