Changelog
Every step of OfoxAI — new models, new features, new experience. Updated weekly.
v1.1.0-20260428
💰 Budget Controls — Across Team, Member, and API Key
Turn “how much we spend” from a verbal agreement into a system-enforced limit. A single organization can now set spend caps along three dimensions × three time windows:
| Dimension | Use case |
|---|---|
| Team (Organization) | Company- or project-wide budget |
| Member (User) | Per-employee monthly quota |
| API Key | Independent budget for a specific app or service |
Each dimension supports daily / monthly / lifetime caps independently. Requests that would exceed any cap are rejected automatically.
Progress bars surface three warning levels:
- 🟢 40% — usage is healthy
- 🟡 80% — approaching the limit
- 🔴 110% — exceeded (a buffer prevents bursty traffic from instantly tripping the cap)
Hierarchy is validated for you: API Key cap ≤ Member cap ≤ Team cap. The UI shows the parent quota in real time so you can’t accidentally misconfigure.
Entry point: Settings → Quotas
⏱️ Team-Level RPM Quota
Introducing team-level rate limits (RPM) to stop multiple API keys from collectively blowing past your upstream provider’s limits.
- RPM is aggregated across the entire team, not measured per key
- Default is 100 RPM — contact support@ofoxai.com for higher limits
- Excess requests get an automatic
429 Too Many Requests
Useful for: bursty CI/CD traffic, runaway batch jobs, and unifying limits across collaborative teams.
🪙 Balance OpenAPI
A new endpoint, GET /v1/user/balance, returns the account’s available balance, lifetime credits, and lifetime spend — using any OfoxAI API key.
curl https://api.ofox.ai/v1/user/balance \
-H "Authorization: Bearer $OFOX_API_KEY"The response shape is compatible with third-party tools like cc-switch , so you can plug OfoxAI in as a balance provider directly.
🧰 cc-switch Integration
OfoxAI now works natively with cc-switch — switch to OfoxAI inside cc-switch and you’ll see your live balance, no extra glue code required.

Set it up in four steps:
- Open the usage-query config — click the 📊 icon in the top-right of the OfoxAI provider card
- Enable usage queries — flip the toggle on
- Paste your API key — any user-level OfoxAI API key works (create one in the Dashboard )
- Endpoint — choose “Generic Template” and set the URL to
https://api.ofox.ai/v1
Save, and the provider card immediately shows live status like Remaining: 64.77 USD.
Full walkthrough: cc-switch Integration Guide.
New Models · Apr 24, 2026
🤖 New Models
- GPT-5.5 (OpenAI) — A new flagship for complex, professional workloads. 1M+ token context (922K input / 128K output), with end-to-end gains in reasoning reliability and token efficiency over GPT-5.4
- DeepSeek V4 Pro (DeepSeek) — A 1.6T-parameter MoE flagship with 49B active params and 1M token context, optimized for advanced reasoning, code, and long-running agent workflows
- DeepSeek V4 Flash (DeepSeek) — A 284B-parameter / 13B-active MoE accelerator with 1M token context, built for high throughput and low latency at an aggressive price point
New Models · Apr 21, 2026
🤖 New Models
- Kimi K2.6 (Moonshot AI) — Moonshot’s smartest Kimi yet, with across-the-board upgrades to code, reasoning, and visual understanding
- GPT Image 2 (OpenAI) — Next-generation image model with richer, more accurate detail
New Models · Apr 16, 2026
🤖 New Models
- Claude Opus 4.7 (Anthropic) — Anthropic’s new flagship — another step up in reasoning and writing quality
Campaign · Apr 15, 2026
🎁 GPT April Rebate — Up to $250 Back
- Window — Apr 15 – Apr 25, 11 days only
- Rebate — Flat 25% back across the GPT lineup, six tiers, up to $250
- Redemption — Credits never expire; redeem in one click after the campaign ends
- Teams — Member spend is pooled automatically to unlock higher tiers
Campaign page: GPT April Rebate .
v1.0.55-20260407
🎁 Gift Card System
Enter a gift card code on the Wallet page — balance credits instantly. The most elegant way to give someone AI as a gift.
- Privacy by default — Transaction records show only the last four digits of the card
- Safe by design — Multi-layer anti-abuse protection with end-to-end encryption
🔍 Model Verify Tool
First, let’s set the record straight: OfoxAI is not a reseller gateway.
- Entity — Operated by NICE TALK PTE. LTD. (a global LLM platform)
- Licensing — Official authorization from model providers
- Compute — Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, Volcano Engine — direct from the cloud providers
- Routing — Edge CDN straight to each provider, no repackaging, no model swapping
So users can verify model authenticity on any LLM gateway, we’ve released a free tool. Point it at any API base + key, and it tells you whether the model has been substituted.
Tool: Model Verify . Works on any platform, not just OfoxAI.
v1.0.54-20260403
💳 Payments and Top-Ups, Upgraded
- Airwallex, alongside Stripe — more choice for international payments
- USD, CNY, or SGD — settle in the currency you already think in
- Top-up cap raised to $10,000 — headroom for larger customers
- $3 first-top-up bonus via partner referral — users referred by a partner get $3 credit on their first top-up, automatically
🏢 Enterprise Page — Spend More, Save More
Automatic rebates when your monthly spend hits a threshold. No application. No sales call. Credit lands on the first of next month.
| Tier | Monthly Spend | Rebate |
|---|---|---|
| Bronze | $1,000+ | 3% |
| Silver | $5,000+ | 4% |
| Gold | $10,000+ | 5% |
| Platinum | $20,000+ | 7% |
Stacks with these enterprise capabilities:
- 0% platform fee — pay the model provider’s list price
- Global edge routing — Tokyo / Singapore / Frankfurt POPs
- 99.99% availability SLA — multi-region redundancy with auto-failover
- Zero content retention — prompts and responses are not logged, not used for training
See: Enterprise .
🤖 New Models
- GLM-5V-Turbo (Zhipu) — Turbo-accelerated variant of GLM’s multimodal line
- Qwen3.6 Plus (Alibaba Bailian) — Latest Plus tier of Qwen3.6
v1.0.47-20260327
🏷️ One Model, Many Names
Short names, legacy IDs — call a model however you want. Migration becomes a no-op. The router normalizes aliases automatically.
A few examples:
| Canonical ID | Aliases |
|---|---|
anthropic/claude-opus-4.7 | claude-opus-4.7 · claude-opus-4-7 · claude-opus-4-7-20260416 |
anthropic/claude-sonnet-4.6 | claude-sonnet-4.6 · claude-sonnet-4-6 · claude-sonnet-4-6-20260217 |
openai/gpt-5.4-pro | gpt-5.4-pro |
openai/gpt-5.4 | gpt-5.4 |
moonshotai/kimi-k2.6 | kimi-k2.6 |
z-ai/glm-5.1 | glm-5.1 |
Fetch the full alias list via GET https://api.ofox.ai/v1/models — every model carries its aliases array in the response.
🖼️ Per-Image Billing
The Images API now bills per generated image, with transparent pricing. Standard sizes map to each provider’s native dimensions automatically — no client-side changes required.
📊 Image Usage, Fully Visible
Image generation is now a first-class dimension on the dashboard, usage, cost, and rankings pages. Monthly image spend is visible at a glance.
🤖 New Models
- GLM 5.1 (Zhipu) — Next-generation GLM with across-the-board capability upgrades
🔗 Shorter Invitation Links
Invitation links shortened to /x/your-code. Easier to remember, easier to share.
v1.0.39-20260320
🔄 Model Fallback — Automatic on Upstream Errors
When the primary model returns a 4xx or 5xx, the gateway automatically tries up to three fallback models. Works across OpenAI, Anthropic, and Gemini. Zero client-side changes. See the Fallback docs.
⚔️ OfoxAI vs OpenRouter, Side by Side
OpenRouter charges 5.5% per top-up. We don’t. Same 100+ models, and you keep 10%+ more once you pass $1,000/month in spend. Full breakdown: OfoxAI vs OpenRouter .
🤖 New Models
- GLM-5-Turbo (Zhipu) — Turbo-accelerated variant of GLM-5
- GPT-5.4 Mini / Nano (OpenAI) — Lightweight pair of GPT-5.4, dramatically lower cost per call
- MiniMax M2.7 / M2.7 Highspeed — MiniMax’s next-gen M2.7; Highspeed is tuned for low latency
v1.0.36-20260313
🎊 March Claude Rebate
A clean 20% rebate across every tier. Copy the coupon OFOXAI2603 with one click from the campaign modal.
| Top-Up | Rebate | You Get |
|---|---|---|
| $20 | $4 | $24 |
| $50 | $10 | $60 |
| $100 | $20 | $120 |
| $200 | $40 | $240 |
| $500 | $100 | $600 |
Campaign page: Claude Spring, Round 2 .
🤖 New Models
- GPT-5.4 / GPT-5.4 Pro (OpenAI) — New flagship pair; Pro offers a higher reasoning ceiling
- Gemini Embedding 2 Preview (Google) — Google’s next-generation multimodal embedding
🖼️ Embeddings, Every Modality
Gemini Embedding now handles text, image, audio, and video across all four modalities. Direct integrations with Qwen and Volcengine multimodal embeddings ship simultaneously.
⚡ Usage Data, Fresh by the Hour
Usage aggregation moved from daily to hourly. Spend shows up on the dashboard shortly after it happens.
💰 Clearer Coupons
Every order now shows discount and gift amounts at a glance.
v1.0.32-20260303
🎉 March Claude Campaign Goes Live
The dashboard gains a campaign banner and a live spend-progress bar. Coupon errors are now localized in English and Chinese. Campaign page: Claude Spring, Round 1 .
🤖 New Models
- GPT-5.3 Chat (OpenAI) — Conversation-tuned variant of GPT-5.3
- Gemini 3.1 Flash Lite Preview (Google) — Gemini 3.1’s lightweight preview
- Nano Banana 2 (Google) — Gemini 3.1 Flash Image Preview — next-generation image generation
🏷️ Navigation Refresh
- “My Billing” → “My Wallet” — a closer match to how users think about the page
- “Models” → “Model Plaza” — framed as a catalog to browse
- Blog link added to the header
v1.0.30-20260226
🔒 One-Click GitHub Login
A new GitHub OAuth option on the sign-in page. The system remembers your last login method for next time. Settings supports binding, unbinding, and GitHub profile sync.
🤖 New Models
- The full Qwen3.5 family, all at once (Alibaba Bailian) — Flash / 27B / 35B A3B / 122B A10B / 397B A17B
- GPT-5.3 Codex (OpenAI) — GPT-5.3 purpose-built for code
- Gemini 3.1 Pro Preview (Google) — Gemini 3.1 Pro preview release
- Qwen3 Coder Next (Alibaba Bailian) — Qwen’s new code-specialized model
📱 Mobile-Responsive Console
Users, Organizations, and Orders modules are now fully mobile-responsive. Collapsible sidebar, smart column hiding, and a touch-friendly experience on small screens.
v1.0.27-20260217
📊 Your Analytics Dashboard
Three interactive charts for Usage, Cost, and Requests. See monthly trends, rank your models, and combine filters across Provider, Model, User, API Key, and time range. Which model is doing the heavy lifting? Now it’s obvious.
🤖 New Models
- Claude Sonnet 4.6 (Anthropic) — Sonnet’s latest, the pragmatic value pick
- Qwen3.5 Plus (Alibaba Bailian) — Qwen3.5 Plus tier live
- Doubao Seed 2.0, all four tiers (Volcengine) — Code / Lite / Mini / Pro — the full Seed 2.0 lineup online
🌐 Aligned with OpenAI
chat/completions without stream now defaults to non-streaming — exactly like OpenAI. Your code? Unchanged.
v1.0.24-20260212
🤖 New Models
- GLM-5 (Zhipu) — GLM’s new-generation flagship
- MiniMax M2.5 / M2.5 Lightning — MiniMax pair; Lightning tuned for low latency
🎊 First-Login Welcome
On first login, the welcome modal presents all three API endpoints — OpenAI, Anthropic, Gemini — with one-click copy. Paired with a burst of confetti, because first impressions matter to developers too.
🧠 Provider Affinity Cache
When the same user switches between different models, the gateway prefers the same underlying provider. Prompt cache hit rate climbs, responses get faster, costs come down.
🎟️ Angel Referral Program
Full referral system shipped: card-based UI, one-click join dialog, and usage-history table. Two-way rewards for both inviter and invitee, plus one-click personal invite poster generation.
v1.0.20-20260206
🤖 New Models
- Claude Opus 4.6 (Anthropic) — Anthropic’s new flagship, raising the bar on reasoning and writing once more
🌍 English / Chinese Parity
Over 1,100 translation keys shipped. Full English / Chinese parity across the platform. Language preference is remembered via cookie.
🔍 Web Search Billing
Web Search tool calls across OpenAI, Anthropic, and Gemini are now accurately billed, per invocation.
📊 Dashboard Refresh
- Personalized greeting by username, instead of a generic “Hi”
- Weekly usage stats replace the single-day view
- API Key display, three modes: none, masked, or full
💵 Clearer Pricing Display
$0.6000 automatically drops trailing zeros, showing as $0.6. Low-balance error messages are now in dollar format — easier to read, no mental math.
📚 Documentation Site Launched
- Full OpenAI / Anthropic / Gemini protocol references
- Integration guides for 10+ tools — Claude Code, Codex, Gemini CLI, Zed, Cline, Cherry Studio, OpenClaw, OpenCode, and more — with end-to-end setup coverage
v1.0.1 ~ v1.0.9 · Jan 20 – Feb 1, 2026 — Two Weeks of Laying the Foundation
We didn’t take a breath after launch. Every release in these two weeks made the platform more stable, more precise, and easier to plug in.
💻 Claude Code, First-Class
We build with Claude Code ourselves. On Jan 21, the gateway shipped full Claude Code compatibility — point the API base at OfoxAI, swap the sk-*** key, and every Claude model just works.
🧠 Thinking Blocks
Thinking blocks — the model’s reasoning chain — now flow through end-to-end for Claude and Gemini. You see how the model thinks, not just the answer.
🌐 Native Gemini Protocol
Beyond OpenAI compatibility — Gemini’s native generateContent API is live. Google’s official SDK connects directly, with no translation loss.
💵 Multi-Currency Stripe
CNY, SGD, and more — in addition to USD. Exchange-rate snapshots are stored per order. Asia-Pacific users can now pay in their local currency.
🎯 Billing Precision to 6 Decimals
NanoDollar-level precision. Even an API call that costs a fraction of a cent is recorded and billed accurately. No rounding away large customer savings. No shortchanging small ones.
v1.0.0 · Jan 16, 2026 — The Gateway Goes Live
“From today on, one hundred models. One key.”
This is the day the OfoxAI platform opened to the public.
🚀 Day-One Capabilities
- Three protocols, one surface — OpenAI, Anthropic, and Gemini, all natively compatible. Zero code changes
- 100+ models — Claude , GPT , Gemini , DeepSeek , Qwen , and more — unified behind a single key. Full catalog: Model Plaza
- Smart routing — Provider × Model level routing chooses the fastest, steadiest path automatically. See Provider Routing
- API keys, self-serve — Create, rotate, and observe usage from the Dashboard
- Pay-as-you-go — The model provider’s list price. Zero platform fee. See Pricing
- Stripe checkout — Credit-card top-ups, balance tracked in real time
- Global edge — Tokyo, Singapore, and Frankfurt points of presence
🌐 The Infrastructure Underneath
Not a reseller gateway. A platform. Requests flow through edge CDN straight to Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine.
Day 1 · Dec 27, 2025 — How It Began
“Give developers the simplest way to reach the smartest models in the world.”
🦊 The First Line of Code
Late December 2025, a single small commit laid down the first line of OfoxAI’s code:
feat: initialize ofox-studio monorepo⚡ The Moment We Knew
Three days later, we got two things working at the same time: Claude on AWS Bedrock, and GPT on Azure — two hyperscalers, two top-tier models, directly connected, no reseller in the middle.
When both responses landed in the terminal at the same moment, we knew: this is going to work.
This wasn’t a demo-grade adapter. This was real multi-cloud direct connectivity. Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine followed, one after another. “Not a reseller gateway, a platform” — that principle was set in stone from Day 3.
🌱 The Starting Point
commit 0001
One line of code, one direction. Make the world’s smartest intelligence accessible to anyone.
Engines, ignite.