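Responses API (Recommended)

Creates a model response. Accepts text and image input and produces text or JSON output. Supports function calling (tool calling), streaming responses, and multi-turn conversations.

The Responses API is recommended for new projects. It is OpenAI's next-generation API and offers the following advantages over Chat Completions:

Native Prompt Caching: `instructions` is kept separate from `input`, so the system instructions automatically serve as the cache prefix. The unchanged prefix in a multi-turn conversation gets a higher cache hit rate, which can save up to 50% of input token cost and also reduces latency
Structured item model: clearer input and output formats, with native support for tool-calling flows
Richer streaming events: fine-grained SSE event types that make real-time UI rendering easier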

Endpoint


POST https://api.ofox.io/v1/responses

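Request parameters

Parameter	Type	Required	Description
`model`	string	Yes	Model identifier, e.g. `openai/gpt-5.4-mini`
`input`	string | array	Yes	Input content, either a plain text string or a structured message array
`instructions`	string	No	System instructions (separate from `input`; automatically benefits from Prompt Caching)
`stream`	boolean	No	Whether to enable SSE streaming; default `false`
`max_output_tokens`	number	No	Maximum number of tokens to generate
`temperature`	number	No	Sampling temperature, 0 to 2; default 1
`top_p`	number	No	Nucleus sampling parameter
`tools`	array	No	Definitions of the available tools (function calling)
`tool_choice`	string | object	No	Tool selection strategy: `auto`, `none`, or a specific tool
`truncation`	string	No	Truncation strategy: `auto` truncates automatically; `disabled` returns an error when the limit is exceeded (default)
`text`	object	No	Text generation format configuration
`store`	boolean	No	Whether to store the response (default `true`)
`metadata`	object	No	Custom metadata key-value pairs
`provider`	object	No	OfoxAI extension: routing and fallback configuration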
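The parameter table above can be turned into a small request-body builder. The following is a minimal sketch, not part of any SDK: `build_request` is a hypothetical helper, and the only rules it enforces are the ones stated in the table (`model` and `input` are required, `temperature` lies between 0 and 2).

```python
from typing import Any, Optional, Union


def build_request(
    model: str,
    input: Union[str, list],
    instructions: Optional[str] = None,
    **options: Any,
) -> dict:
    """Assemble a JSON body for POST /v1/responses (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if isinstance(input, str):
        if not input:
            raise ValueError("input must not be empty")
    elif not isinstance(input, list):
        raise TypeError("input must be a string or a list of items")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body = {"model": model, "input": input}
    if instructions is not None:
        body["instructions"] = instructions
    # Leave unset options out of the body so the documented defaults apply.
    body.update({key: value for key, value in options.items() if value is not None})
    return body


request_body = build_request(
    "openai/gpt-5.4-mini",
    "Explain what an API gateway is",
    instructions="You are a helpful technical assistant.",
    max_output_tokens=1024,
    temperature=0.7,
)
```

Omitting unset fields rather than sending explicit nulls keeps the request small and leaves defaults such as `stream: false` to the server.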

Input format

`input` supports two formats:

1. Simple string: pass the text directly


{
  "input": "你好，請介紹一下自己"
}

2. Structured message array: for multi-turn conversations and multimodal input


interface InputItem {
  type: 'message'
  role: 'user' | 'assistant'
  content: ContentPart[]
  id?: string               // required for assistant messages
  status?: 'completed'      // required for assistant messages
}
 
type ContentPart =
  | { type: 'input_text'; text: string }           // user text input
  | { type: 'input_image'; image_url: string }     // image input
  | { type: 'output_text'; text: string; annotations?: any[] }  // assistant text output

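When a multi-turn conversation includes messages with the assistant role, the `id` and `status` fields are required. The Responses API is stateless by design, so every request must carry the full conversation history.

Request examples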
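Because every request must resend the whole history, it can help to build input items with small constructors. This is a sketch under the `InputItem` shape shown above; `user_message` and `assistant_message` are hypothetical helper names, not SDK functions.

```python
from typing import Optional


def user_message(text: str, image_url: Optional[str] = None) -> dict:
    """Build a user item with text and, optionally, one image."""
    content = [{"type": "input_text", "text": text}]
    if image_url is not None:
        content.append({"type": "input_image", "image_url": image_url})
    return {"type": "message", "role": "user", "content": content}


def assistant_message(text: str, message_id: str) -> dict:
    """Build an assistant item; id and status are required for this role."""
    return {
        "type": "message",
        "role": "assistant",
        "id": message_id,
        "status": "completed",
        "content": [{"type": "output_text", "text": text, "annotations": []}],
    }


# Full history for a third turn: the earlier turns are replayed verbatim.
history = [
    user_message("What is the capital of France?"),
    assistant_message("The capital of France is Paris.", "msg_abc123"),
    user_message("How many people live there?"),
]
```

The resulting `history` list is what would be passed as `input` on the next request.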

cURL

Terminal


curl https://api.ofox.io/v1/responses \
  -H "Authorization: Bearer $OFOX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.4-mini",
    "input": "解釋什麼是 API Gateway",
    "instructions": "你是一個有幫助的技術助手，用中文回答。",
    "max_output_tokens": 1024
  }'

Python

responses.py


from openai import OpenAI
 
client = OpenAI(
    base_url="https://api.ofox.io/v1",
    api_key="<your OFOXAI_API_KEY>"
)
 
response = client.responses.create(
    model="openai/gpt-5.4-mini",
    input="解釋什麼是 API Gateway",
    instructions="你是一個有幫助的技術助手，用中文回答。",
    max_output_tokens=1024
)
 
print(response.output_text)

TypeScript

responses.ts


import OpenAI from 'openai'
 
const client = new OpenAI({
  baseURL: 'https://api.ofox.io/v1',
  apiKey: '<your OFOXAI_API_KEY>'
})
 
const response = await client.responses.create({
  model: 'openai/gpt-5.4-mini',
  input: '解釋什麼是 API Gateway',
  instructions: '你是一個有幫助的技術助手，用中文回答。',
  max_output_tokens: 1024
})
 
console.log(response.output_text)

Response format


{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1703123456,
  "model": "openai/gpt-5.4-mini",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_def456",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "API Gateway（API 閘道）是一個...",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150,
    "total_tokens": 175
  }
}

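Response fields

Field	Type	Description
`id`	string	Unique response identifier, prefixed with `resp_`
`object`	string	Always `"response"`
`created_at`	number	Creation timestamp (Unix seconds)
`model`	string	ID of the model actually used
`status`	string	Response status: `completed`, `failed`, `in_progress`, or `cancelled`
`output`	array	Array of output items, containing messages and tool calls
`usage`	object	Token usage statistics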
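When working with the raw JSON rather than an SDK object, the text has to be collected from the `output` array by hand. The sketch below walks the structure described in the table; `extract_output_text` is a hypothetical helper, and the sample payload is a trimmed stand-in for a real response.

```python
import json


def extract_output_text(response: dict) -> str:
    """Concatenate every output_text part found in message items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip function_call and other non-message items
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


raw = """
{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_def456",
      "status": "completed",
      "role": "assistant",
      "content": [
        {"type": "output_text", "text": "An API gateway is ...", "annotations": []}
      ]
    }
  ],
  "usage": {"input_tokens": 25, "output_tokens": 150, "total_tokens": 175}
}
"""

response = json.loads(raw)
text = extract_output_text(response)
total_tokens = response["usage"]["total_tokens"]
```

Checking `status` before reading `output` is advisable, since a `failed` or `in_progress` response may carry no message items.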

Structured message input

Use a structured message array for multi-turn conversations:

Python

multi_turn.py


# Reuses the `client` created in responses.py above.
response = client.responses.create(
    model="openai/gpt-5.4-mini",
    input=[
        {
            "type": "message",
            "role": "user",
            "content": [
                {"type": "input_text", "text": "法國的首都是哪裡？"}
            ]
        },
        {
            "type": "message",
            "role": "assistant",
            "id": "msg_abc123",
            "status": "completed",
            "content": [
                {"type": "output_text", "text": "法國的首都是巴黎。", "annotations": []}
            ]
        },
        {
            "type": "message",
            "role": "user",
            "content": [
                {"type": "input_text", "text": "那裡有多少人口？"}
            ]
        }
    ]
)
 
print(response.output_text)

Streaming responses

Set `stream: true` (`stream=True` in the Python SDK) to enable SSE streaming:

Python

stream.py


# Reuses the `client` created in responses.py above.
stream = client.responses.create(
    model="openai/gpt-5.4-mini",
    input="講一個關於程式設計的笑話",
    stream=True
)
 
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)

Streaming event types

A streaming response sends the following events over SSE:


data: {"type":"response.created","response":{"id":"resp_abc123","object":"response","status":"in_progress"}}

data: {"type":"response.output_item.added","output_index":0,"item":{"type":"message","id":"msg_def456","role":"assistant","status":"in_progress","content":[]}}

data: {"type":"response.content_part.added","output_index":0,"content_index":0,"part":{"type":"output_text","text":""}}

data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"你"}

data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"好"}

data: {"type":"response.output_item.done","output_index":0,"item":{"type":"message","id":"msg_def456","role":"assistant","status":"completed","content":[{"type":"output_text","text":"你好..."}]}}

data: {"type":"response.completed","response":{"id":"resp_abc123","object":"response","status":"completed","usage":{"input_tokens":12,"output_tokens":45,"total_tokens":57}}}

data: [DONE]

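Event type	Description
`response.created`	The response object was created
`response.output_item.added`	A new output item was added
`response.content_part.added`	A new content part was added
`response.output_text.delta`	Text delta (emitted token by token)
`response.output_item.done`	An output item is complete
`response.completed`	The whole response is complete
`response.function_call_arguments.delta`	Function call arguments delta
`response.function_call_arguments.done`	Function call arguments are complete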
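For clients that read the SSE stream without an SDK, the events above can be decoded line by line. The following is a minimal sketch that assumes each event arrives as a single `data:` line and that the stream ends with `data: [DONE]`, as in the sample above; `iter_sse_events` and `collect_text` are hypothetical names.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON events from SSE 'data:' lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are ignored
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate text deltas and pick up usage from the final event."""
    chunks = []
    usage = None
    for event in iter_sse_events(lines):
        if event["type"] == "response.output_text.delta":
            chunks.append(event["delta"])
        elif event["type"] == "response.completed":
            usage = event["response"].get("usage")
    return "".join(chunks), usage


sample_stream = [
    'data: {"type":"response.created","response":{"id":"resp_abc123","status":"in_progress"}}',
    "",
    'data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"Hel"}',
    "",
    'data: {"type":"response.output_text.delta","output_index":0,"content_index":0,"delta":"lo"}',
    "",
    'data: {"type":"response.completed","response":{"id":"resp_abc123","status":"completed",'
    '"usage":{"input_tokens":12,"output_tokens":45,"total_tokens":57}}}',
    "",
    "data: [DONE]",
]

streamed_text, streamed_usage = collect_text(sample_stream)
```

In a UI, each delta would be rendered as it arrives instead of being joined at the end; the accumulation here only keeps the example easy to check.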

Function Calling

The Responses API natively supports tool calling:

Python

tools.py


# Reuses the `client` created in responses.py above.
response = client.responses.create(
    model="openai/gpt-5.4-mini",
    input="北京今天天氣怎麼樣？",
    tools=[
        {
            "type": "function",
            "name": "get_weather",
            "description": "取得指定城市的目前天氣",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "城市名稱，如 北京"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    ],
    tool_choice="auto"
)
 
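# Handle tool calls: each function_call item carries the tool name and JSON-encoded arguments
for item in response.output:
    if item.type == "function_call":
        print(f"Function called: {item.name}")
        print(f"Arguments: {item.arguments}")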

Tool call response format

When the model calls a tool, `output` contains an item of type `function_call`:


{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "type": "function_call",
      "id": "fc_abc123",
      "call_id": "call_xyz789",
      "name": "get_weather",
      "arguments": "{\"location\":\"北京\",\"unit\":\"celsius\"}"
    }
  ],
  "usage": {
    "input_tokens": 45,
    "output_tokens": 25,
    "total_tokens": 70
  }
}
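Given `function_call` items like the one above, the client is responsible for running the tool and packaging the result. The sketch below shows one way to do that with a name-to-function registry. `TOOL_REGISTRY`, `run_tool_calls`, and the canned `get_weather` implementation are assumptions for illustration. The result items here carry only `call_id` and `output`; the example in this document also sets an `id` on them, which this sketch leaves out.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Stand-in tool that returns canned data instead of calling a weather service."""
    return {"location": location, "temperature": 22, "unit": unit, "condition": "sunny"}


# Maps the tool name declared in `tools` to a local Python callable.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(output_items: list) -> list:
    """Execute every function_call item and return function_call_output items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue
        handler = TOOL_REGISTRY.get(item["name"])
        if handler is None:
            payload = {"error": f"unknown tool: {item['name']}"}
        else:
            # `arguments` is a JSON string, so it must be decoded first.
            payload = handler(**json.loads(item["arguments"]))
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],  # ties the result to its call
            "output": json.dumps(payload, ensure_ascii=False),
        })
    return results


sample_output = [{
    "type": "function_call",
    "id": "fc_abc123",
    "call_id": "call_xyz789",
    "name": "get_weather",
    "arguments": '{"location": "Beijing", "unit": "celsius"}',
}]

tool_outputs = run_tool_calls(sample_output)
```

Returning an error payload for unknown tools, rather than raising, lets the model see the failure and recover in the next turn.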

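Submitting tool results

To return the tool's result to the model, include the full call chain in `input`:


# Second request: submit the tool result (reuses the `client` created in responses.py above)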
response = client.responses.create(
    model="openai/gpt-5.4-mini",
    input=[
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": "北京今天天氣怎麼樣？"}]
        },
        {
            "type": "function_call",
            "id": "fc_abc123",
            "call_id": "call_xyz789",
            "name": "get_weather",
            "arguments": "{\"location\":\"北京\",\"unit\":\"celsius\"}"
        },
        {
            "type": "function_call_output",
            "id": "fco_abc123",
            "call_id": "call_xyz789",
            "output": "{\"temperature\":\"22°C\",\"condition\":\"晴\"}"
        }
    ]
)
 
print(response.output_text)
# => "北京今天天氣晴朗，氣溫 22°C，非常適合戶外活動。"
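The call chain for the second request can be assembled mechanically from the first exchange. This is a sketch only: `build_followup_input` is a hypothetical helper, and the check that every result matches a known `call_id` is an added safeguard rather than something the API is documented to require.

```python
from typing import Union


def build_followup_input(
    previous_input: Union[str, list],
    response_output: list,
    tool_outputs: list,
) -> list:
    """Return the input array for the second request (hypothetical helper)."""
    if isinstance(previous_input, str):
        # Promote a plain string to a structured user message.
        previous_input = [{
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": previous_input}],
        }]
    calls = [item for item in response_output if item.get("type") == "function_call"]
    known_call_ids = {call["call_id"] for call in calls}
    for result in tool_outputs:
        if result["call_id"] not in known_call_ids:
            raise ValueError(f"no function_call with call_id {result['call_id']}")
    # Order matters: original input, then the model's calls, then their results.
    return [*previous_input, *calls, *tool_outputs]


first_output = [{
    "type": "function_call",
    "id": "fc_abc123",
    "call_id": "call_xyz789",
    "name": "get_weather",
    "arguments": '{"location": "Beijing", "unit": "celsius"}',
}]
results = [{
    "type": "function_call_output",
    "call_id": "call_xyz789",
    "output": '{"temperature": "22°C", "condition": "sunny"}',
}]

followup_input = build_followup_input("What is the weather in Beijing today?", first_output, results)
```

`followup_input` would then be passed as `input` in the second request.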

Tool Choice options

Value	Description
`"auto"`	The model decides whether to call a tool (default)
`"none"`	Tool calls are disabled
`{"type": "function", "name": "tool_name"}`	Forces a call to the specified tool

Comparison with Chat Completions

Feature	Chat Completions	Responses API
Endpoint	`/v1/chat/completions`	`/v1/responses`
Input format	`messages` array	`input` string or structured item array
System instructions	`role: "system"` message	`instructions` parameter (cached independently)
Prompt Caching	System instructions are mixed into `messages`, so the cache prefix is unstable	`instructions` is passed separately and cached automatically, giving a higher hit rate
Output format	`choices[0].message.content`	`output[0].content[0].text` or `output_text`
Tool calls	`tool_calls` inside the message	Separate `function_call` output items
Tool results	`role: "tool"` message	`function_call_output` input item
Streaming events	`chat.completion.chunk`	Structured event types (`response.*`)
Token fields	`prompt_tokens` / `completion_tokens`	`input_tokens` / `output_tokens`
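The mapping in the table can be sketched as a converter from a Chat Completions `messages` array to Responses API arguments. `chat_to_responses` is a hypothetical helper that handles plain-text system, user, and assistant messages only. Chat Completions messages carry no item `id`, so the sketch generates placeholder ids of the form `msg_<index>`; whether the gateway accepts arbitrary ids is an assumption to verify.

```python
def chat_to_responses(messages: list) -> dict:
    """Split chat messages into `instructions` and a structured `input` array."""
    system_parts = []
    items = []
    for index, message in enumerate(messages):
        role, content = message["role"], message["content"]
        if role == "system":
            # System messages move out of the history into `instructions`.
            system_parts.append(content)
        elif role == "user":
            items.append({
                "type": "message",
                "role": "user",
                "content": [{"type": "input_text", "text": content}],
            })
        elif role == "assistant":
            items.append({
                "type": "message",
                "role": "assistant",
                "id": f"msg_{index}",  # placeholder id (assumption)
                "status": "completed",
                "content": [{"type": "output_text", "text": content, "annotations": []}],
            })
        else:
            raise NotImplementedError(f"role {role!r} is not handled in this sketch")
    return {"instructions": "\n\n".join(system_parts) or None, "input": items}


converted = chat_to_responses([
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Italy?"},
])
```

Tool messages (`role: "tool"`) would map to `function_call_output` items, which requires the matching `call_id` and is left out here for brevity.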

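Both APIs are production-ready. If you already have a Chat Completions integration, there is no need to migrate. The Responses API is recommended for new projects, especially those with complex tool-calling flows or high-frequency calls that can take full advantage of caching to reduce cost. See the function calling guide for details.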