Google: Gemini 3.5 Flash
Chatgoogle/gemini-3.5-flashGemini 3.5 Flash is Google's efficient multimodal model delivering near-Pro level coding and reasoning at Flash-tier cost and speed. Optimized for coding tasks and parallel agent execution with configurable thinking levels. Released May 20, 2026.
1M context window
66K max output tokens
Released: 2026-05-20
Supported Protocols:openaigemini
Available Providers:Vertex
Capabilities:VisionFunction CallingReasoningPrompt CachingWeb SearchAudio InputVideo InputPDF Input
Providers
Vertex
Input Tokens
$1.5/M
Output Tokens
$9/M
Cache Read
$0.15/M
Cache Write
$0.83/M
Audio Input
$3/M
Cached Audio
$0.3/M
Web Search
$0.014/R
Protocols
openai
/v1/chat/completionsgemini
Code Examples
from google import genaiclient = genai.Client(api_key="YOUR_OFOX_API_KEY",http_options={"api_version": "v1beta", "url": "https://api.ofox.io/gemini"},)response = client.models.generate_content(model="google/gemini-3.5-flash",contents="Hello!",)print(response.text)
Related Models
Frequently Asked Questions
Google: Gemini 3.5 Flash on Ofox.ai costs $1.5/M per million input tokens and $9/M per million output tokens. Pay-as-you-go, no monthly fees.
Google: Gemini 3.5 Flash supports a context window of 1M tokens with max output of 66K tokens, allowing you to process large documents and maintain long conversations.
Simply set your base URL to https://api.ofox.ai/v1 and use your Ofox API key. The API is OpenAI-compatible — just change the base URL and API key in your existing code.
Google: Gemini 3.5 Flash supports the following capabilities: Vision, Function Calling, Reasoning, Prompt Caching, Web Search, Audio Input, Video Input, PDF Input. Access all features through the Ofox.ai unified API.