Skip to Content
ChangelogChangelog

Changelog

Every step of OfoxAI — new models, new features, new experience. Updated weekly.


v1.1.0-20260428

💰 Budget Controls — Across Team, Member, and API Key

Turn “how much we spend” from a verbal agreement into a system-enforced limit. A single organization can now set spend caps along three dimensions × three time windows:

DimensionUse case
Team (Organization)Company- or project-wide budget
Member (User)Per-employee monthly quota
API KeyIndependent budget for a specific app or service

Each dimension supports daily / monthly / lifetime caps independently. Requests that would exceed any cap are rejected automatically.

Progress bars surface three warning levels:

  • 🟢 40% — usage is healthy
  • 🟡 80% — approaching the limit
  • 🔴 110% — exceeded (a buffer prevents bursty traffic from instantly tripping the cap)

Hierarchy is validated for you: API Key cap ≤ Member cap ≤ Team cap. The UI shows the parent quota in real time so you can’t accidentally misconfigure.

Entry point: Settings → Quotas 

⏱️ Team-Level RPM Quota

Introducing team-level rate limits (RPM) to stop multiple API keys from collectively blowing past your upstream provider’s limits.

  • RPM is aggregated across the entire team, not measured per key
  • Default is 100 RPM — contact support@ofoxai.com for higher limits
  • Excess requests get an automatic 429 Too Many Requests

Useful for: bursty CI/CD traffic, runaway batch jobs, and unifying limits across collaborative teams.

🪙 Balance OpenAPI

A new endpoint, GET /v1/user/balance, returns the account’s available balance, lifetime credits, and lifetime spend — using any OfoxAI API key.

Terminal
curl https://api.ofox.ai/v1/user/balance \ -H "Authorization: Bearer $OFOX_API_KEY"

The response shape is compatible with third-party tools like cc-switch , so you can plug OfoxAI in as a balance provider directly.

🧰 cc-switch Integration

OfoxAI now works natively with cc-switch  — switch to OfoxAI inside cc-switch and you’ll see your live balance, no extra glue code required.

cc-switch configuration for OfoxAI balance lookup

Set it up in four steps:

  1. Open the usage-query config — click the 📊 icon in the top-right of the OfoxAI provider card
  2. Enable usage queries — flip the toggle on
  3. Paste your API key — any user-level OfoxAI API key works (create one in the Dashboard )
  4. Endpoint — choose “Generic Template” and set the URL to https://api.ofox.ai/v1

Save, and the provider card immediately shows live status like Remaining: 64.77 USD.

Full walkthrough: cc-switch Integration Guide.


New Models · Apr 24, 2026

🤖 New Models

  • GPT-5.5 (OpenAI) — A new flagship for complex, professional workloads. 1M+ token context (922K input / 128K output), with end-to-end gains in reasoning reliability and token efficiency over GPT-5.4
  • DeepSeek V4 Pro (DeepSeek) — A 1.6T-parameter MoE flagship with 49B active params and 1M token context, optimized for advanced reasoning, code, and long-running agent workflows
  • DeepSeek V4 Flash (DeepSeek) — A 284B-parameter / 13B-active MoE accelerator with 1M token context, built for high throughput and low latency at an aggressive price point

New Models · Apr 21, 2026

🤖 New Models

  • Kimi K2.6 (Moonshot AI) — Moonshot’s smartest Kimi yet, with across-the-board upgrades to code, reasoning, and visual understanding
  • GPT Image 2 (OpenAI) — Next-generation image model with richer, more accurate detail

New Models · Apr 16, 2026

🤖 New Models

  • Claude Opus 4.7 (Anthropic) — Anthropic’s new flagship — another step up in reasoning and writing quality

Campaign · Apr 15, 2026

🎁 GPT April Rebate — Up to $250 Back

  • Window — Apr 15 – Apr 25, 11 days only
  • Rebate — Flat 25% back across the GPT lineup, six tiers, up to $250
  • Redemption — Credits never expire; redeem in one click after the campaign ends
  • Teams — Member spend is pooled automatically to unlock higher tiers

Campaign page: GPT April Rebate .


v1.0.55-20260407

🎁 Gift Card System

Enter a gift card code on the Wallet  page — balance credits instantly. The most elegant way to give someone AI as a gift.

  • Privacy by default — Transaction records show only the last four digits of the card
  • Safe by design — Multi-layer anti-abuse protection with end-to-end encryption

🔍 Model Verify Tool

First, let’s set the record straight: OfoxAI is not a reseller gateway.

  • Entity — Operated by NICE TALK PTE. LTD. (a global LLM platform)
  • Licensing — Official authorization from model providers
  • Compute — Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, Volcano Engine — direct from the cloud providers
  • Routing — Edge CDN straight to each provider, no repackaging, no model swapping

So users can verify model authenticity on any LLM gateway, we’ve released a free tool. Point it at any API base + key, and it tells you whether the model has been substituted.

Tool: Model Verify . Works on any platform, not just OfoxAI.


v1.0.54-20260403

💳 Payments and Top-Ups, Upgraded

  • Airwallex, alongside Stripe — more choice for international payments
  • USD, CNY, or SGD — settle in the currency you already think in
  • Top-up cap raised to $10,000 — headroom for larger customers
  • $3 first-top-up bonus via partner referral — users referred by a partner get $3 credit on their first top-up, automatically

🏢 Enterprise Page — Spend More, Save More

Automatic rebates when your monthly spend hits a threshold. No application. No sales call. Credit lands on the first of next month.

TierMonthly SpendRebate
Bronze$1,000+3%
Silver$5,000+4%
Gold$10,000+5%
Platinum$20,000+7%

Stacks with these enterprise capabilities:

  • 0% platform fee — pay the model provider’s list price
  • Global edge routing — Tokyo / Singapore / Frankfurt POPs
  • 99.99% availability SLA — multi-region redundancy with auto-failover
  • Zero content retention — prompts and responses are not logged, not used for training

See: Enterprise .

🤖 New Models

  • GLM-5V-Turbo (Zhipu) — Turbo-accelerated variant of GLM’s multimodal line
  • Qwen3.6 Plus (Alibaba Bailian) — Latest Plus tier of Qwen3.6

v1.0.47-20260327

🏷️ One Model, Many Names

Short names, legacy IDs — call a model however you want. Migration becomes a no-op. The router normalizes aliases automatically.

A few examples:

Canonical IDAliases
anthropic/claude-opus-4.7claude-opus-4.7 · claude-opus-4-7 · claude-opus-4-7-20260416
anthropic/claude-sonnet-4.6claude-sonnet-4.6 · claude-sonnet-4-6 · claude-sonnet-4-6-20260217
openai/gpt-5.4-progpt-5.4-pro
openai/gpt-5.4gpt-5.4
moonshotai/kimi-k2.6kimi-k2.6
z-ai/glm-5.1glm-5.1

Fetch the full alias list via GET https://api.ofox.ai/v1/models — every model carries its aliases array in the response.

🖼️ Per-Image Billing

The Images API now bills per generated image, with transparent pricing. Standard sizes map to each provider’s native dimensions automatically — no client-side changes required.

📊 Image Usage, Fully Visible

Image generation is now a first-class dimension on the dashboard, usage, cost, and rankings pages. Monthly image spend is visible at a glance.

🤖 New Models

  • GLM 5.1 (Zhipu) — Next-generation GLM with across-the-board capability upgrades

Invitation links shortened to /x/your-code. Easier to remember, easier to share.


v1.0.39-20260320

🔄 Model Fallback — Automatic on Upstream Errors

When the primary model returns a 4xx or 5xx, the gateway automatically tries up to three fallback models. Works across OpenAI, Anthropic, and Gemini. Zero client-side changes. See the Fallback docs.

⚔️ OfoxAI vs OpenRouter, Side by Side

OpenRouter charges 5.5% per top-up. We don’t. Same 100+ models, and you keep 10%+ more once you pass $1,000/month in spend. Full breakdown: OfoxAI vs OpenRouter .

🤖 New Models


v1.0.36-20260313

🎊 March Claude Rebate

A clean 20% rebate across every tier. Copy the coupon OFOXAI2603 with one click from the campaign modal.

Top-UpRebateYou Get
$20$4$24
$50$10$60
$100$20$120
$200$40$240
$500$100$600

Campaign page: Claude Spring, Round 2 .

🤖 New Models

🖼️ Embeddings, Every Modality

Gemini Embedding now handles text, image, audio, and video across all four modalities. Direct integrations with Qwen and Volcengine multimodal embeddings ship simultaneously.

⚡ Usage Data, Fresh by the Hour

Usage aggregation moved from daily to hourly. Spend shows up on the dashboard shortly after it happens.

💰 Clearer Coupons

Every order now shows discount and gift amounts at a glance.


v1.0.32-20260303

🎉 March Claude Campaign Goes Live

The dashboard gains a campaign banner and a live spend-progress bar. Coupon errors are now localized in English and Chinese. Campaign page: Claude Spring, Round 1 .

🤖 New Models

🏷️ Navigation Refresh

  • “My Billing” → “My Wallet” — a closer match to how users think about the page
  • “Models” → “Model Plaza” — framed as a catalog to browse
  • Blog link added to the header

v1.0.30-20260226

🔒 One-Click GitHub Login

A new GitHub OAuth option on the sign-in page. The system remembers your last login method for next time. Settings supports binding, unbinding, and GitHub profile sync.

🤖 New Models

📱 Mobile-Responsive Console

Users, Organizations, and Orders modules are now fully mobile-responsive. Collapsible sidebar, smart column hiding, and a touch-friendly experience on small screens.


v1.0.27-20260217

📊 Your Analytics Dashboard

Three interactive charts for Usage, Cost, and Requests. See monthly trends, rank your models, and combine filters across Provider, Model, User, API Key, and time range. Which model is doing the heavy lifting? Now it’s obvious.

🤖 New Models

🌐 Aligned with OpenAI

chat/completions without stream now defaults to non-streaming — exactly like OpenAI. Your code? Unchanged.


v1.0.24-20260212

🤖 New Models

🎊 First-Login Welcome

On first login, the welcome modal presents all three API endpoints — OpenAI, Anthropic, Gemini — with one-click copy. Paired with a burst of confetti, because first impressions matter to developers too.

🧠 Provider Affinity Cache

When the same user switches between different models, the gateway prefers the same underlying provider. Prompt cache hit rate climbs, responses get faster, costs come down.

🎟️ Angel Referral Program

Full referral system shipped: card-based UI, one-click join dialog, and usage-history table. Two-way rewards for both inviter and invitee, plus one-click personal invite poster generation.


v1.0.20-20260206

🤖 New Models

  • Claude Opus 4.6 (Anthropic) — Anthropic’s new flagship, raising the bar on reasoning and writing once more

🌍 English / Chinese Parity

Over 1,100 translation keys shipped. Full English / Chinese parity across the platform. Language preference is remembered via cookie.

🔍 Web Search Billing

Web Search tool calls across OpenAI, Anthropic, and Gemini are now accurately billed, per invocation.

📊 Dashboard Refresh

  • Personalized greeting by username, instead of a generic “Hi”
  • Weekly usage stats replace the single-day view
  • API Key display, three modes: none, masked, or full

💵 Clearer Pricing Display

$0.6000 automatically drops trailing zeros, showing as $0.6. Low-balance error messages are now in dollar format — easier to read, no mental math.

📚 Documentation Site Launched


v1.0.1 ~ v1.0.9 · Jan 20 – Feb 1, 2026 — Two Weeks of Laying the Foundation

We didn’t take a breath after launch. Every release in these two weeks made the platform more stable, more precise, and easier to plug in.

💻 Claude Code, First-Class

We build with Claude Code ourselves. On Jan 21, the gateway shipped full Claude Code compatibility — point the API base at OfoxAI, swap the sk-*** key, and every Claude model just works.

🧠 Thinking Blocks

Thinking blocks — the model’s reasoning chain — now flow through end-to-end for Claude and Gemini. You see how the model thinks, not just the answer.

🌐 Native Gemini Protocol

Beyond OpenAI compatibility — Gemini’s native generateContent API is live. Google’s official SDK connects directly, with no translation loss.

💵 Multi-Currency Stripe

CNY, SGD, and more — in addition to USD. Exchange-rate snapshots are stored per order. Asia-Pacific users can now pay in their local currency.

🎯 Billing Precision to 6 Decimals

NanoDollar-level precision. Even an API call that costs a fraction of a cent is recorded and billed accurately. No rounding away large customer savings. No shortchanging small ones.


v1.0.0 · Jan 16, 2026 — The Gateway Goes Live

“From today on, one hundred models. One key.”

This is the day the OfoxAI  platform opened to the public.

🚀 Day-One Capabilities

  • Three protocols, one surfaceOpenAI, Anthropic, and Gemini, all natively compatible. Zero code changes
  • 100+ modelsClaude , GPT , Gemini , DeepSeek , Qwen , and more — unified behind a single key. Full catalog: Model Plaza 
  • Smart routing — Provider × Model level routing chooses the fastest, steadiest path automatically. See Provider Routing
  • API keys, self-serve — Create, rotate, and observe usage from the Dashboard 
  • Pay-as-you-go — The model provider’s list price. Zero platform fee. See Pricing
  • Stripe checkout — Credit-card top-ups, balance tracked in real time
  • Global edge — Tokyo, Singapore, and Frankfurt points of presence

🌐 The Infrastructure Underneath

Not a reseller gateway. A platform. Requests flow through edge CDN straight to Azure, AWS, Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine.


Day 1 · Dec 27, 2025 — How It Began

“Give developers the simplest way to reach the smartest models in the world.”

🦊 The First Line of Code

Late December 2025, a single small commit laid down the first line of OfoxAI’s code:

feat: initialize ofox-studio monorepo

⚡ The Moment We Knew

Three days later, we got two things working at the same time: Claude on AWS Bedrock, and GPT on Azure — two hyperscalers, two top-tier models, directly connected, no reseller in the middle.

When both responses landed in the terminal at the same moment, we knew: this is going to work.

This wasn’t a demo-grade adapter. This was real multi-cloud direct connectivity. Google Cloud, Alibaba Cloud, Z.AI, Moonshot, and Volcano Engine followed, one after another. “Not a reseller gateway, a platform” — that principle was set in stone from Day 3.

🌱 The Starting Point

commit 0001

One line of code, one direction. Make the world’s smartest intelligence accessible to anyone.

Engines, ignite.


Last updated on