GLM-5.2 Open Source: 750B Params, MIT License, 1M Context

wpnews.pro

cd /news/large-language-models/glm-5-2-open-source-750b-params-mit-… · home › topics › large-language-models › article

[ARTICLE · art-42330] src=byteiota.com ↗ pub=2026-06-28T07:08Z topic=large-language-models verified=true sentiment=↑ positive

GLM-5.2 Open Source: 750B Params, MIT License, 1M Context

Z.ai open-sourced GLM-5.2 on June 17 under an MIT license, a 744B-parameter sparse MoE model with a 1M-token context window that outperforms GPT-5.5 on multiple coding benchmarks while costing about one-sixth the price. The model scores 62.1 on SWE-bench Pro versus GPT-5.5's 58.6, and its API costs $2.40 per million tokens blended compared to GPT-5.5's $13.33, offering a viable open-source alternative for coding agents.

read4 min views1 publishedJun 28, 2026

GLM-5.2 Open Source: 750B Params, MIT License, 1M Context — Image: Byteiota (auto-discovered)

Z.ai open-sourced GLM-5.2 on June 17 under an MIT license — full commercial use, no royalties, no acceptable-use restrictions. The model scores 62.1 on SWE-bench Pro against GPT-5.5’s 58.6, and the API costs $2.40 per million tokens blended versus $13.33 for GPT-5.5. If you are running coding agents at OpenAI prices, you now have a real alternative you can download, self-host, and fine-tune on your own data today.

What GLM-5.2 Actually Is #

GLM-5.2 is a 744B-parameter sparse Mixture-of-Experts model — roughly 40B parameters activate per token, keeping inference costs well below what the headline number implies. It has a 1-million-token context window built for long-horizon agentic tasks: large codebase analysis, full-repo debugging, regulatory document review. Z.ai built it explicitly as a coding agent flagship, and the benchmarks back that up.

The technical feature that makes the 1M context economically viable is IndexShare — a sparse attention optimization that reuses the same token index across every four layers instead of recomputing it per layer. This cuts per-token FLOPs by 2.9x at 1M context. The result is that running a million-token prompt does not cost disproportionately more than a short one, which has historically killed long-context adoption at scale.

The Benchmark Numbers #

Here is how GLM-5.2 compares against GPT-5.5 on the benchmarks that matter for agentic work:

Benchmark	GLM-5.2	GPT-5.5	Winner
SWE-bench Pro	62.1	58.6	GLM-5.2
FrontierSWE	74.4%	72.6%	GLM-5.2
PostTrainBench	34.3%	25.0%	GLM-5.2
MCP-Atlas (tool use)	77.0	75.3	GLM-5.2
Terminal-Bench 2.1	81.0	84.0	GPT-5.5

SWE-bench Pro tests against real GitHub issues with full repository context — not synthetic puzzles. GLM-5.2 leads on all four agentic coding benchmarks and trails only on Terminal-Bench, which skews toward general-purpose terminal tasks. For agent-driven coding specifically, GLM-5.2 now holds the lead on most open benchmarks.

The Cost Gap Is the Real Story #

GLM-5.2’s API runs at $1.40 per million input tokens and $4.40 output — blended at a 2:1 ratio, that is $2.40 per million. GPT-5.5 comes in at $5.00 input and $30.00 output, or $13.33 blended. At 100,000 requests per day on average 3,000-token prompts, that works out to $21,600 per month versus $120,000. At scale, that difference changes the economics of AI-powered products.

Self-hosting removes the per-token cost entirely. The FP8 weights are on HuggingFace at zai-org/GLM-5.2-FP8 and run on vLLM, SGLang, or transformers. You will need around 800GB of NVMe storage. The MIT license means you can fine-tune on proprietary data, run air-gapped, and commercialize the output with no royalties and no approval from Z.ai. If Z.ai changes its pricing tomorrow, your self-hosted deployment is unaffected.

huggingface-cli download zai-org/GLM-5.2-FP8 --local-dir ./glm5-2-fp8 --repo-type model

Drop-In Compatibility With Your Current Tools #

Z.ai ships an OpenAI-compatible API endpoint. If you are already using Claude Code, Cline, Roo Code, Goose, OpenCode, Crush, OpenClaw, or Kilo Code, switching to GLM-5.2 is a base-URL change in your config — no SDK swap, no code rewrite. Vercel integrated it into their AI Gateway within three days of the June 13 release. Guillermo Rauch described the coding output as “genuinely impressed, almost shocked.” A three-day turnaround from open-source release to production integration is not a normal thing.

What It Does Not Do #

GLM-5.2 has no vision support — text and code only. If your workflows depend on image input or multimodal reasoning, it is not a replacement for GPT-4o or Claude Opus 4.8 in those scenarios. The model has significant Chinese-language training data; for tasks requiring deep linguistic nuance in European languages, test it against your specific workload before committing. And self-hosting 744B parameters is not a weekend project — you need real infrastructure to support it.

The Bigger Pattern #

GLM-5.2 is the third open-source release in 18 months to genuinely close the gap with frontier proprietary models — after DeepSeek R1 for reasoning and DSpark for inference speed. Each follows the same pattern: a lab open-sources something that should not be free at that quality level, the developer community stress-tests it within days, and proprietary providers respond with price cuts. That cycle is accelerating, and GLM-5.2 makes the case that you do not need to pay premium closed-model prices to run competitive coding agents. The weights are available now.

source & further reading

byteiota.com — original article Exploitarium: 130 0-Days Dropped—Two Are Critical Now Next.js 16.3: Instant Navigations and Agent DevTools Are Here LLM Model Routing in 2026: Cut AI Costs 70% With Smart Model Selection

~/api · this article 200

$curl api.wpnews.pro/v1/news/glm-5-2-open-source-750b…

Read original on byteiota.com → byteiota.com/glm-5-2-open-source-750b-params-mit…

mentioned entities

Z.ai

GLM-5.2

GPT-5.5

HuggingFace

Vercel

Guillermo Rauch

MIT

IndexShare

metadata

slugglm-5-2-open-source-750b-params-mit-license-1m-context

topic#large-language-models

secondary3 topics

sentimentpositive

canonicalbyteiota.com

navigation

← prevKorea, Japan defense chiefs agre…

── more in #large-language-models 4 stories · sorted by recency

letsdatascience.com · 27 Jun · #large-language-models

Chinese Models Narrow Gap With Anthropic and OpenAI

byteiota.com · 25 Jun · #large-language-models

GLM-5.2 Beats GPT-5.5 at Coding for One-Sixth the Price

github.com · 28 Jun · #large-language-models

AI Berkshire

byteiota.com · 28 Jun · #large-language-models

LLM Model Routing in 2026: Cut AI Costs 70% With Smart Model Selection

── more on @z.ai 3 stories trending now

wpnews · 25 May · #artificial-intelligence

Maia-3: free and open source

wpnews · 28 May · #ai-startups

[AINews] Cognition raises $1B in $26B Series D

wpnews · 5 Jun · #ai-agents

Miasma Worm Targets AI Coding Agents via GitHub Repos

sponsored brought to you by zahid.host 4,200+ EU-deployed projects

reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main

→ Live at https://your-agent.zahid.host ✓

Get free account → Pricing

from €0/mo · no card required