Gemini

Google: Gemini 2.5 Flash Lite

Chat
google/gemini-2.5-flash-lite

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, [thinking] (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the Reasoning API parameter to selectively trade off cost for intelligence.

上下文窗口
1M
最大输出 Token
66K
发布日期
2025-07-22
能力
视觉函数调用提示缓存PDF 输入
可用供应商
GoogleCloudVertex
支持的协议
OpenAIopenaiGeminigemini

供应商

GoogleCloudVertex
输入 Token
$0.1/M
输出 Token
$0.4/M
缓存读取
$0.025/M
缓存写入
$1/M
音频输入
$0.3/M
缓存音频
$0.3/M
网络搜索
$0.035/R
接入协议
OpenAIopenai/v1/chat/completions
Geminigemini

代码示例

from google import genai
client = genai.Client(
api_key="YOUR_OFOX_API_KEY",
http_options={"api_version": "v1beta", "base_url": "https://api.ofox.io/gemini"},
)
response = client.models.generate_content(
model="google/gemini-2.5-flash-lite",
contents="Hello!",
)
print(response.text)

运行状态

常见问题

Google: Gemini 2.5 Flash Lite 在 Ofox.ai 上的价格为输入 $0.1/M/百万 Token,输出 $0.4/M/百万 Token。按量计费,无月费。