{"slug": "glm-5-2-open-source-750b-params-mit-license-1m-context", "title": "GLM-5.2 Open Source: 750B Params, MIT License, 1M Context", "summary": "Z.ai open-sourced GLM-5.2 on June 17 under an MIT license. The 744B-parameter sparse MoE model has a 1M-token context window and outperforms GPT-5.5 on multiple coding benchmarks while costing about one-sixth the price. It scores 62.1 on SWE-bench Pro versus GPT-5.5's 58.6, and its API costs $2.40 per million tokens blended compared to GPT-5.5's $13.33, offering a viable open-source alternative for coding agents.", "body_md": "Z.ai open-sourced GLM-5.2 on June 17 under an MIT license: full commercial use, no royalties, no acceptable-use restrictions. The model scores 62.1 on SWE-bench Pro against GPT-5.5’s 58.6, and the API costs $2.40 per million tokens blended versus $13.33 for GPT-5.5. If you are running coding agents at OpenAI prices, you now have a real alternative you can download, self-host, and fine-tune on your own data today.\n\n## What GLM-5.2 Actually Is\n\nGLM-5.2 is a 744B-parameter sparse Mixture-of-Experts model. Roughly 40B parameters activate per token, keeping inference costs well below what the headline number implies. It has a 1-million-token context window built for long-horizon agentic tasks: large codebase analysis, full-repo debugging, regulatory document review. Z.ai built it explicitly as a coding agent flagship, and the benchmarks back that up.\n\nThe technical feature that makes the 1M context economically viable is [IndexShare](https://sebastianraschka.com/blog/2026/glm-5-2-indexshare.html), a sparse attention optimization that reuses the same token index across each group of four layers instead of recomputing it per layer. This cuts per-token FLOPs by 2.9x at 1M context. 
The result is that running a million-token prompt does not cost disproportionately more than a short one; that disproportionate cost is what has historically killed long-context adoption at scale.\n\n## The Benchmark Numbers\n\nHere is how GLM-5.2 compares against GPT-5.5 on the benchmarks that matter for agentic work:\n\n| Benchmark | GLM-5.2 | GPT-5.5 | Winner |\n|---|---|---|---|\n| SWE-bench Pro | 62.1 | 58.6 | GLM-5.2 |\n| FrontierSWE | 74.4% | 72.6% | GLM-5.2 |\n| PostTrainBench | 34.3% | 25.0% | GLM-5.2 |\n| MCP-Atlas (tool use) | 77.0 | 75.3 | GLM-5.2 |\n| Terminal-Bench 2.1 | 81.0 | 84.0 | GPT-5.5 |\n\n[SWE-bench Pro](https://groundy.com/articles/glm-5-2-benchmarks-what-62-1-swe-bench-pro-and-99-2-aime-actually-mean/) tests against real GitHub issues with full repository context, not synthetic puzzles. GLM-5.2 leads on all four agentic coding benchmarks and trails only on Terminal-Bench, which skews toward general-purpose terminal tasks. For agent-driven coding specifically, GLM-5.2 now holds the lead on four of the five benchmarks above.\n\n## The Cost Gap Is the Real Story\n\nGLM-5.2’s API runs at $1.40 per million input tokens and $4.40 per million output tokens. Blended at a 2:1 input-to-output ratio, that is (2 × $1.40 + $4.40) / 3 = $2.40 per million. GPT-5.5 comes in at $5.00 input and $30.00 output, or $13.33 blended. At 100,000 requests per day averaging 3,000 tokens each, roughly 9 billion tokens over a 30-day month, that works out to $21,600 per month versus about $120,000. At scale, that difference changes the economics of AI-powered products.\n\nSelf-hosting replaces the per-token fee with your own hardware costs. The FP8 weights are on [HuggingFace at zai-org/GLM-5.2-FP8](https://huggingface.co/zai-org/GLM-5.2-FP8) and run on vLLM, SGLang, or transformers. You will need around 800GB of NVMe storage. The MIT license means you can fine-tune on proprietary data, run air-gapped, and commercialize the output with no royalties and no approval from Z.ai. 
If Z.ai changes its pricing tomorrow, your self-hosted deployment is unaffected.\n\n```\nhuggingface-cli download zai-org/GLM-5.2-FP8 --local-dir ./glm5-2-fp8 --repo-type model\n```\n\n## Drop-In Compatibility With Your Current Tools\n\nZ.ai ships an OpenAI-compatible API endpoint. If you are already using Claude Code, Cline, Roo Code, Goose, OpenCode, Crush, OpenClaw, or Kilo Code, switching to GLM-5.2 is a base-URL change in your config: no SDK swap, no code rewrite. [Vercel integrated it into their AI Gateway within three days](https://venturebeat.com/technology/z-ais-open-weights-glm-5-2-beats-gpt-5-5-on-multiple-long-horizon-coding-benchmarks-for-1-6th-the-cost/) of the June 13 release. Vercel CEO Guillermo Rauch said he was “genuinely impressed, almost shocked” by the coding output. A three-day turnaround from open-source release to production integration is unusual.\n\n## What It Does Not Do\n\nGLM-5.2 has no vision support: it handles text and code only. If your workflows depend on image input or multimodal reasoning, it is not a replacement for GPT-4o or Claude Opus 4.8 in those scenarios. The model has significant Chinese-language training data; for tasks requiring deep linguistic nuance in European languages, test it against your specific workload before committing. And self-hosting 744B parameters is not a weekend project: you need real infrastructure to support it.\n\n## The Bigger Pattern\n\nGLM-5.2 is the third open-source release in 18 months to genuinely close the gap with frontier proprietary models, after DeepSeek R1 for reasoning and DSpark for inference speed. Each follows the same pattern: a lab open-sources something that should not be free at that quality level, the developer community stress-tests it within days, and proprietary providers respond with price cuts. That cycle is accelerating, and GLM-5.2 makes the case that you do not need to pay premium closed-model prices to run competitive coding agents. 
[The weights are available now.](https://huggingface.co/zai-org/GLM-5.2)", "url": "https://wpnews.pro/news/glm-5-2-open-source-750b-params-mit-license-1m-context", "canonical_source": "https://byteiota.com/glm-5-2-open-source-750b-params-mit-license-1m-context/", "published_at": "2026-06-28 07:08:29+00:00", "updated_at": "2026-06-28 07:11:15.806159+00:00", "lang": "en", "topics": ["large-language-models", "ai-products", "ai-tools", "ai-agents"], "entities": ["Z.ai", "GLM-5.2", "GPT-5.5", "HuggingFace", "Vercel", "Guillermo Rauch", "MIT", "IndexShare"], "alternates": {"html": "https://wpnews.pro/news/glm-5-2-open-source-750b-params-mit-license-1m-context", "markdown": "https://wpnews.pro/news/glm-5-2-open-source-750b-params-mit-license-1m-context.md", "text": "https://wpnews.pro/news/glm-5-2-open-source-750b-params-mit-license-1m-context.txt", "jsonld": "https://wpnews.pro/news/glm-5-2-open-source-750b-params-mit-license-1m-context.jsonld"}}