No. The hosted model on Z.ai's Coding Plan starts at roughly $10/month (Lite tier) once the Coding Plan API opens the week of June 22, 2026. MIT-licensed open weights are scheduled for release that same week on Hugging Face under zai-org — self-hosting is free of license fees but you pay your own GPU bill (the HF model card now lists GLM 5.2 as a 753B-total-parameter MoE; Zhipu does not separately break out the active-parameter count).

Does GLM 5.2 work with Claude Code?

Yes, once the Coding Plan API opens the week of June 22, 2026. Z.ai's launch documentation describes an Anthropic-compatible endpoint at `https://api.z.ai/api/anthropic` specifically for Claude Code, and an OpenAI-compatible Coding Plan endpoint at `https://api.z.ai/api/coding/paas/v4` for the other seven clients in the launch announcement (Cline, OpenCode, Roo Code, Goose, Crush, OpenClaw, Kilo Code). For Claude Code you set `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN`; for the rest, set `OPENAI_BASE_URL` and `OPENAI_API_KEY`. Endpoint URLs published in the announcement; treat the exact paths as provisional until the dashboard goes live.

How big is the 1M-token context window in practice?

1,000,000 tokens of input and up to 131,072 tokens of output. That's roughly the source of a mid-sized monolith plus tests in a single request, but the practical limit is your latency budget — long-context calls take longer to first token and cost more even at hosted Coding Plan rates.

Has GLM 5.2 published SWE-bench or LiveCodeBench scores?

No. As of June 14, 2026 — 24 hours after launch — Zhipu has not published SWE-bench Verified, LiveCodeBench, HumanEval, or Aider polyglot results. Independent third-party benchmarks are pending. Treat any claim that GLM 5.2 'beats Claude' as unverified until benchmark numbers land.

When do the MIT weights drop, and where?

Zhipu's launch post says 'next week' from June 13 — meaning the week of June 22, 2026 (the same window the Z.ai Coding Plan API opens). Track huggingface.co/zai-org for the GLM-5.2 repo. The MIT license means commercial use, modification, and redistribution are all allowed.

Can I use GLM 5.2 through ofox?

Not at the time of writing. ofox's catalog (verified June 15, 2026 against ofox.ai/en/models) lists DeepSeek V4 Pro, Kimi K2.6, and Qwen3 Coder Next as managed Chinese coding alternatives, but GLM 5.2 is not yet on the model list. If you want a managed Chinese coding model on a single OpenAI-compatible endpoint today, DeepSeek V4 Pro is the closest analog.

What's the difference between GLM 5.2 'High' and 'Max' thinking modes?

Zhipu ships only two thinking presets — High and Max. There is no Low / Auto tier like other reasoning models. Max consumes more thinking tokens and is intended for multi-hour agentic refactors; High is the default for normal coding turns. Tier choice does not change weights, only the reasoning budget.

Will running GLM 5.2 weights locally save money vs the Coding Plan?

Only at very high volume. The HF model card lists GLM 5.2 as a 753B-total-parameter MoE (active-parameter count not separately published), which puts the model out of reach for single-GPU inference and pushes you toward 8x H100 or equivalent for full-precision production throughput. Below ~3,000 prompts/week, the $30/mo Pro tier of the Coding Plan is cheaper than electricity plus depreciation on a self-hosted node.

How to Access GLM 5.2 (2026): Step-by-Step API Setup & Pricing

Zhipu just announced a frontier-class coding model with a 1M-token context window, MIT-licensed weights, and a $10/month entry price — all in one drop. The Z.ai Coding Plan API and the MIT weights both open the week of June 22, 2026. If you’ve been waiting for an open-weights Claude Code competitor that you can actually fork, the next seven days are when to read this guide, decide tier, and pre-stage your client config so you can wire it up on day one.

Why Now: The Reverse-Narrative Window Is Open

GLM 5.2 didn’t land in a vacuum. The 24 hours around its release are the reason this article exists — and the reason “should I switch?” is no longer hypothetical for some readers.

June 12, 2026 — Anthropic received an export-control directive from the US Department of Commerce restricting access to Claude Fable 5 and Mythos 5 for foreign nationals (inside or outside the US). The trigger was a security finding raised through Amazon: CEO Andy Jassy escalated jailbreak research to senior administration officials, including Treasury Secretary Scott Bessent (Fortune, Semafor). Rather than ship a US-only variant, Anthropic withdrew both models from public availability.

June 13, 2026 — same day the Anthropic withdrawal hit news cycles — Zhipu released GLM 5.2. Jie Tang (Tsinghua, GLM team lead) posted to X with the line “GLM-5.2 is Fully Open, Frontier Intelligence Belongs to Everyone”, framing the launch as a direct response: “the sudden restriction of certain frontier models is deeply regrettable… access to frontier models is abruptly cut off for non-technical reasons” (Jie Tang on X, June 13). The post traveled — roughly 898K views and Hacker News front page within 36 hours.

Side	Move	Date
US (Commerce + Anthropic)	Export-control directive → Anthropic withdraws Fable 5 + Mythos 5 from public availability	Notice June 12, public June 13, 2026
China (Zhipu)	Ships GLM 5.2 + announces MIT weights drop within 7 days	June 13, 2026
Public signal	Jie Tang tweet — ~898K views, Hacker News front page	June 13–15, 2026

One nuance worth being precise about: Fable 5 was not deprecated, sunset, or retired by Anthropic. It was withdrawn after a US government export-control order, and Anthropic publicly disputed the severity of the underlying jailbreak finding that triggered the order (Tom’s Hardware). If you write or read about this elsewhere, “Anthropic shut down Fable” is the wrong framing.

For most readers this geopolitics is irrelevant — you pick a coding model on price and benchmarks. But three concrete things change in the next 30 days, and they decide whether the rest of this article is worth your time:

Hedge value: if your team was on Claude Fable for coding workflows and you are outside the US, GLM 5.2 is the first frontier-class coding model that is licensed (MIT, weights drop next week) for you to fork and self-host. “Open weights as political insurance” is no longer abstract.
Pricing pressure: open-weights frontier models put a ceiling on hosted subscription prices. Expect Anthropic, OpenAI, and Google to soften coding-plan tiers within ~60 days, regardless of whether GLM 5.2 benchmarks competitively.
Tooling parity: Z.ai shipped Claude Code drop-in support on day one (the dedicated /api/anthropic endpoint, covered in the Drop-in section below). The standard 2026 coding-CLI workflow no longer locks you into a single model family.

If none of those three apply, skip to the setup section. If any of them do, the rest of this article is the operational path: the 10-minute access flow you wire up once the Z.ai API opens the week of June 22, the drop-in replacement for Claude Code, and the self-host plan for when MIT weights drop the same week.

A Note on Availability (Read This First)

Zhipu’s June 13, 2026 launch was an announcement plus documentation, not a dashboard you can sign into the same day. Two access surfaces unlock on the next Z.ai release wave:

Z.ai Coding Plan API — opens the week of June 22, 2026. Account creation, Coding Plan tier selection, API key issuance, and the /api/anthropic + /api/coding/paas/v4 endpoints all light up in that window. Until then, the endpoint URLs in this guide are the ones published in the launch post; treat them as provisional until you can hit them.
MIT-licensed open weights — drop the same week under huggingface.co/zai-org/GLM-5.2. The HF repo is currently a placeholder; the config.json that confirms architecture and the BF16 / FP8 shards land on that calendar.

This guide is structured so you can do the planning work this week (pick a tier, pre-stage env vars, decide drop-in vs clean install) and execute setup in ~10 minutes the day the API turns on. If you need something working today, jump to the Alternatives section — ofox already serves DeepSeek V4 Pro / Kimi K2.6 / Qwen3 Coder Next on a single endpoint.

What You Get with GLM 5.2 (The 30-Second Answer)

Item	Value
What you can do today (June 13–21, 2026)	Read this guide, pick a Coding Plan tier, pre-stage `~/.claude/settings.json` or `OPENAI_BASE_URL` env, and queue a waitlist on `z.ai` if available
What you can do once API opens (week of June 22, 2026)	Use GLM 5.2 inside Claude Code, Cline, OpenCode, OpenClaw, Goose, Crush, Roo Code, or Kilo Code via Z.ai’s hosted Coding Plan; self-host the MIT-licensed weights from `huggingface.co/zai-org` (753B-total-parameter MoE per the HF model card)
Time to first call once keys go live	~10 minutes (sign up → API key → CLI config → smoke test)
Minimum cost	~$10/mo Lite tier; ~$30/mo Pro for ~2,000 prompts/week
What you need	A Z.ai account, an OpenAI-compatible coding client (or any tool that accepts a custom `base_url`), and 8 GB of patience for the first long-context call
What you can’t do (yet)	Quote SWE-bench numbers (Zhipu didn’t publish any), get a 5-tier thinking preset (only High and Max), or pull the weights through ofox (DeepSeek V4 Pro is the closest managed analog)

Decision Frame: When GLM 5.2 Is Worth Your Setup Time

Use this section to bail out before reading the rest of the article.

When to use GLM 5.2

You’re running multi-file refactors on a monolith and have been bumping into the 200K context ceiling of competing coding agents
Your compliance team requires open, auditable model weights — MIT is one of the friendliest open-source licenses in the LLM space
You want a Chinese-origin coding model to hedge against US-side access restrictions — GLM 5.2 launched the day Anthropic withdrew Claude Fable 5 + Mythos 5 after a US Commerce export-control directive (Why Now above has the full timeline)

When NOT to use GLM 5.2

You need a model with published benchmarks before you ship it to a production team. Zhipu has not released SWE-bench, LiveCodeBench, or Aider numbers as of June 14, 2026 — independent benchmarks are days away at minimum
You already pay for Claude Code with Sonnet/Opus and don’t have a specific gap GLM fills. Switching costs (tool config, prompt re-tuning, eval re-runs) are not worth the ~$10/mo savings unless context-window is the actual bottleneck
You want a single managed endpoint that hands you GLM, GPT-5.5, and Claude Opus 4.8 behind one API key. GLM 5.2 isn’t on ofox yet (verified June 15, 2026) — if endpoint consolidation matters more than this specific model, see the Alternatives section

Stop rule

If you’ve never run into a 200K-token context limit on a real task in the last 30 days, you do not need GLM 5.2. Stop reading and revisit when Zhipu publishes a benchmark or ofox lists the model — whichever comes first.

System Requirements

Before you start the setup, confirm you have:

A Z.ai account with payment method on file (Coding Plan billed monthly, USD or RMB)
An OpenAI-compatible coding CLI — one of: Claude Code v0.x, Cline ≥ 3.x, OpenCode, Roo Code, Goose, Crush, OpenClaw, Kilo Code. Each supports a custom base_url and model-name override
Network egress to api.z.ai — verify with curl -I https://api.z.ai/api/paas/v4/ (you should get an HTTP response, not a connection error)
A side branch in your repo for the first run. Long-context coding agents are smart enough to delete unrelated files when given a vague prompt — never point one at main on your first day

If you want to self-host the weights when they drop the week of June 22, 2026, additional requirements:

8x H100 80GB GPUs or equivalent for full-precision production throughput. The HF model card confirms GLM 5.2 is a 753B-total-parameter MoE (active-parameter count not separately published). Community-built lower-VRAM GGUF quants (2-bit to 4-bit) are already available for single-machine inference
vLLM or SGLang as the inference server (community examples will arrive on the HF repo; check huggingface.co/zai-org/GLM-5.2 once published)
Disk for the weight shards — estimate ~1.5 TB BF16 / ~860 GB FP8 if the GLM-5 lineage shape holds; treat as a planning placeholder, not a procurement number, until the HF repo confirms

Step-by-Step Setup (Hosted, ~10 Minutes — Once API Opens)

The Z.ai Coding Plan API opens the week of June 22, 2026. Steps 1–4 below execute in ~10 minutes the day the dashboard goes live; until then, you can pre-stage your CLI config (Step 3) and queue a waitlist on z.ai if available.

flowchart LR
  A[Sign up Z.ai] --> B[Pick Coding Plan tier]
  B --> C[Generate API key]
  C --> D[Configure CLI base_url + model]
  D --> E[First smoke test]
  E --> F[Wire repo, run a real task]

Go to https://z.ai and create an account. Pick a Coding Plan tier:

Tier	Approx. price	Quota	Best fit
Lite	~$10/mo	~400 prompts/week	Personal tinkering, light side projects
Pro	~$30/mo	~2,000 prompts/week	Solo dev, daily coding agent use
Max	~$80/mo	~8,000 prompts/week	Heavy agentic refactors, multi-hour autonomous runs
Team	Seat-based	Org-wide pooling	3+ developers sharing quota

Expected result: account dashboard with a “Coding Plan” entry showing your tier and remaining quota.

Step 2: Generate an API key

Inside the Z.ai dashboard, open API Keys → Create new key. Scope it to “Coding Plan” only — Z.ai exposes other paid endpoints (general chat, vision) that share your wallet but should not share the same key.

export ZAI_API_KEY="zai-..."

Expected result: a key starting with zai-. Drop it into your shell’s secrets file or 1Password — Z.ai shows the full key exactly once.

Step 3: Configure your coding CLI

Z.ai exposes two compat endpoints, and you pick the one that matches your client. Claude Code talks the Anthropic protocol; the other seven launch-day clients (Cline, OpenCode, Roo Code, Goose, Crush, OpenClaw, Kilo Code) speak the OpenAI chat-completions shape.

For Claude Code (Anthropic-compatible endpoint) — the minimal config is the shell-env or ~/.claude/settings.json env block covered in the Drop-in Replacement for Claude Code section below. That section also lists what carries over (CLAUDE.md, slash commands, subagents) and what changes (thinking presets, tool-result bridging) before you commit — read it before pasting the block.

For OpenAI-compatible clients (Cline, OpenCode, Roo Code, Goose, Crush, OpenClaw, Kilo Code)

export OPENAI_BASE_URL="https://api.z.ai/api/coding/paas/v4"
export OPENAI_API_KEY="$ZAI_API_KEY"
export OPENAI_MODEL="glm-5.2"   # or "glm-5.2[1m]" for the 1M context window

Re-launch your CLI in the same shell and the new endpoint takes over. For clients that don’t read the OpenAI env vars, open the tool’s settings panel, pick “OpenAI Compatible” provider, and paste the same three values. Note that the Coding Plan uses a dedicated endpoint (/api/coding/paas/v4) distinct from Z.ai’s general per-token API (/api/paas/v4).

Python SDK smoke test (paste into any one-off REPL)

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.z.ai/api/coding/paas/v4",
    api_key=os.environ["ZAI_API_KEY"],
)
resp = client.chat.completions.create(
    model="glm-5.2[1m]",
    messages=[{"role": "user", "content": "Refactor this function to async:\n\n" + open("handler.py").read()}],
)
print(resp.choices[0].message.content)

Expected result: a non-empty diff or refactored snippet within ~5 seconds for short input. For 1M-context calls expect 30-90 seconds to first token.

Step 4: First smoke test

Before pointing GLM 5.2 at your repo, run a sanity check that confirms (a) the key works, (b) you’re hitting the right model, (c) thinking mode is wired.

curl -s https://api.z.ai/api/coding/paas/v4/chat/completions \
  -H "Authorization: Bearer $ZAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"glm-5.2[1m]","messages":[{"role":"user","content":"Reply with only the string OK if you are GLM 5.2."}],"max_tokens":16}' \
  | jq -r '.choices[0].message.content'

Expected result: OK (or OK.). If you get a model-identity refusal or a different model name in the response, your config is wrong — see Common Errors below.

Drop-in Replacement for Claude Code (One-Block Swap)

If you are reading this article because Fable 5 went away — or because you have been thinking about migrating from Claude Code without rewriting your project setup — this is the section that matters most. Z.ai shipped a dedicated /api/anthropic endpoint on day one specifically so a Claude Code workspace becomes a GLM 5.2 workspace with one block of environment variables.

The one-block swap

Drop this into ~/.zshrc (or ~/.bashrc, or ~/.claude/settings.json under "env"), open a new shell, and relaunch claude:

# Drop-in swap: Claude Code workspace → GLM 5.2, no project rewrite
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="$ZAI_API_KEY"
export ANTHROPIC_MODEL="glm-5.2[1m]"      # 1M context; drop [1m] for default
export API_TIMEOUT_MS="3000000"           # long-context calls take 30–90s

Claude Code’s UI will keep showing “Sonnet” / “Opus” labels because the client is not model-aware — server-side mapping at Z.ai routes the request to GLM 5.2. Your CLAUDE.md, project memory, slash commands, subagents, and harness habits keep working unchanged.

What carries over cleanly

Project-level CLAUDE.md files and .claude/ directories (commands, subagents, settings)
Slash commands and custom subagent definitions
AGENTS.md files and Codex-style instruction layering (Claude Code reads these)
The Plan / Edit / Bash tool dispatcher behavior and its prompts
Multi-file refactor workflows (1M context covers most monorepos in one request)

What changes (read this before you commit)

Thinking budget: GLM 5.2 ships only “High” and “Max” presets — there is no equivalent of Claude’s thinking_budget=auto heuristic. Pick one or accept High as the default.
Tool-result formatting: Claude expects tool_result blocks in a specific shape. Z.ai’s bridge translates 95%+ of common patterns, but occasionally drops nested-content blocks on long agentic loops. If you see assistant turns repeating a tool call instead of acknowledging it, that’s the failure mode — fall back to the OpenAI-compatible endpoint (/api/coding/paas/v4) and use Cline or OpenCode for that workflow.
Latency profile: first-token latency for 1M-context calls is 30–90 seconds, vs 5–15 for Claude on equivalent-size prompts. The API_TIMEOUT_MS=3000000 line above is mandatory, not optional — Claude Code defaults will kill the connection on long Plan-mode calls.
Quota model: you are now spending Coding Plan quota, not Claude Plan quota. The bursty agent-loop pattern that drains a Claude weekly cap in a few hours also drains a Lite-tier GLM plan; budget Pro or Max for sustained work.

Pick this path vs. a clean Cline setup

Pick the drop-in swap if	Pick a clean Cline / OpenCode install if
You have 3+ slash commands, tuned subagents, or a `CLAUDE.md` you’ve iterated on for weeks	You are starting a new project with no Claude Code investment
Your team standardized on Claude Code’s UI and switching tools means re-onboarding engineers	Your other tooling (lint, telemetry) speaks OpenAI-style requests
You want to A/B GLM 5.2 against your current Claude workflow without burning a sprint day	You hit the tool-result bridging issue above and the workaround is more friction than a tool switch

Revert path (do this before you commit)

unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_MODEL and restart Claude Code. The claude CLI picks up Anthropic’s defaults again. No state inside your project is touched by the swap — it lives entirely in the shell environment.

Common Errors During Setup

Error	Likely cause	Fix
`401 invalid_api_key`	Key scoped to wrong product or pasted with whitespace	Regenerate with “Coding Plan” scope; copy-paste through a non-stripping clipboard
`model not found` for `glm-5.2` or `glm-5.2[1m]`	Z.ai uses `glm-5.2` for the standard context window and the `[1m]` suffix as a model alias that switches the request to the 1M-context configuration	Use `glm-5.2[1m]` when you need the full 1M window; plain `glm-5.2` for default-context calls. Both are valid model IDs against the Coding Plan endpoint
`429 Too Many Requests` after a few minutes of work	Lite tier quota (~400 prompts/week) burned by an agent loop	Upgrade to Pro, or reduce agent iteration loops via `max_iterations`
Empty response body, no error	Thinking budget exceeded `max_tokens`	Raise `max_tokens` to ≥ 4096; thinking models stream reasoning then answer
Tool-use call returned as raw JSON in the assistant text	Z.ai’s OpenAI compat doesn’t auto-parse tool_use unless `tools` field present in request	Pass `tools` array even on the first turn; or use Anthropic-compatible endpoint if your client supports it
504 / timeout on multi-file refactor	Long-context (>500K tokens) first-token latency can exceed default client timeout	Raise the CLI’s `requestTimeoutMs` to 600000 (10 min) for 1M-context calls

Team / Multi-Developer Configuration

If 3+ developers need to share a quota, the Team tier of the Coding Plan does seat-based pooling — but the setup pattern differs from solo:

One API key per developer, billed against the same org wallet — never share a single key across machines (it’s the fastest way to burn quota on something you can’t trace)
A shared .env.team checked into a private secrets repo, containing only OPENAI_BASE_URL=https://api.z.ai/api/coding/paas/v4 and OPENAI_MODEL=glm-5.2[1m] — keep API keys out of git
A budget guard in CI: have your coding-agent CI step abort if the per-PR completion token count exceeds N (your call — start at 200K and adjust by Friday)
Quota observability: Z.ai’s dashboard shows per-key usage; for programmatic polling, the Coding Plan exposes a quota endpoint at https://api.z.ai/api/monitor/usage/quota/limit covering the 5-hour token cycle, weekly quota, and monthly MCP usage — pull it into your existing observability stack (Datadog, Honeycomb)

If your org cannot route through a Chinese API endpoint (egress control, compliance), the practical pattern is to mirror the same OpenAI-compatible config against a different upstream — see Alternatives.

Advanced: The MIT Open-Weights Plan

Zhipu’s launch announcement commits the MIT-licensed weights for “next week” — meaning the week of June 22, 2026, the same window the Z.ai Coding Plan API opens. The HF org is huggingface.co/zai-org; track the GLM-5.2 repo for the actual drop.

What MIT actually buys you:

Commercial use, modification, redistribution — no usage caps, no per-token fees once you’re hosting it
Fine-tuning rights — you can train LoRAs or full fine-tunes on your own codebase and ship the result
Forks — if Zhipu disables a feature you depend on (or, more likely, raises prices), community forks remain operable

What MIT does not buy you:

A free lunch on inference compute. At 753B total parameters (per the HF model card), full-precision production throughput is still in 8x-H100 territory, with strong dependence on quantization quality
Future model updates — the MIT release is point-in-time; GLM 5.3 may or may not be open
Anthropic-quality safety tuning — Z.ai’s RLHF is its own house style, expect different refusal boundaries

The realistic path for most teams: stay on the hosted Coding Plan for the next 30-60 days, watch the community quantize the weights into 4-bit and 2-bit variants, then re-evaluate self-hosting once a single-node config exists.

Alternatives: Managed Open-Weights Coding Models on ofox

If you want a single OpenAI-compatible endpoint that already covers managed Chinese coding models — without waiting for the GLM 5.2 weights drop or building your own H100 cluster — ofox lists three solid alternatives as of June 15, 2026:

Model	ofox API ID	Strength	When to pick over GLM 5.2
DeepSeek V4 Pro	`deepseek/deepseek-v4-pro`	Coding-tuned flagship, broad community track record	You want a model with published benchmarks (DeepSeek’s are public; GLM 5.2’s aren’t yet)
Qwen3 Coder Next	`bailian/qwen3-coder-next`	Latest Alibaba coding-specific tier, multilingual code	You’re shipping to a multilingual Chinese/Japanese codebase and want first-party Qwen support
Kimi K2.6	`moonshotai/kimi-k2.6`	Long-context with strong recall	You need long-context that’s verified, not “claimed but un-benchmarked”

You wire any of these in with the same config shape used for GLM 5.2 — just swap the base URL and model ID:

# Same Cline / OpenCode config, different upstream
export OPENAI_BASE_URL="https://api.ofox.io/v1"
export OPENAI_MODEL="deepseek/deepseek-v4-pro"

This is the single-endpoint pattern: one key, many models, no per-vendor signup. See the ofox model catalog for current pricing and capability flags. When GLM 5.2 lands on ofox (it isn’t yet — verified June 15, 2026), you’ll switch by changing one string.

Watching Z.ai Status and Quota

Two things to wire in week one:

Z.ai status page — bookmark it the day you sign up; a new product’s first 30 days always include at least one rate-limit tuning or quota-counting bug
Per-PR usage instrumentation — log usage.total_tokens from every API response into your existing PR-level telemetry (Datadog, Honeycomb, your call). Coding agents drift toward burning quota on rabbit-hole refactors, and the only way you’ll catch it is at the PR level

References

Codersera: “GLM 5.2 Just Launched: 1M Context, Coding-First, Open Weights Next Week (Day-One Brief)” — https://codersera.com/blog/glm-5-2-release-1m-context-coding-2026/
AI Weekly: “Zhipu Deploys GLM 5.2 to All GLM Coding Plan Tiers With 1M-Token Context” — https://aiweekly.co/node/2946
Agent-Wars: “Zhipu ships GLM 5.2 with a 1M-token context and no benchmarks” — https://agent-wars.com/news/2026-06-14-glm-5-2-million-token-context
ofox model catalog snapshot — https://ofox.io/en/models
Hugging Face org for weights — https://huggingface.co/zai-org (GLM-5.2 repo pending publication as of June 15, 2026)
Jie Tang on X — “GLM-5.2 is Fully Open, Frontier Intelligence Belongs to Everyone” — https://x.com/jietang/status/2065784751345287314 (June 13, 2026; ~898K views by June 15)
Fortune: “A warning from Amazon led the White House to shut down Anthropic’s Mythos model” — https://fortune.com/2026/06/14/how-a-warning-from-amazon-led-the-white-house-to-shut-down-anthropics-mythos-model/
Semafor: “White House move to limit Anthropic linked to concerns about Chinese access to Mythos” — https://www.semafor.com/article/06/13/2026/white-house-move-to-limit-anthropic-linked-to-concerns-about-chinese-access-to-mythos
Tom’s Hardware: US government warned Anthropic that Fable 5 had been jailbroken — https://www.tomshardware.com/tech-industry/artificial-intelligence/trump-adviser-david-sacks-says-anthropic-refused-to-fix-fable-5-jailbreak-before-us-export-controls

The thing that makes this drop different isn’t the million-token context — Anthropic and Google were already there. It’s that GLM 5.2 is the first frontier-class coding model where you can read the weights, audit the training license under MIT, and run a fork on your own metal — without giving up sub-second response times on the hosted plan while you migrate. The next 30 days will tell us whether the benchmarks back the marketing.