Zhipu AI has launched GLM-4.6, a significant replace to the GLM sequence, specializing in agent workflows, long-context inference, and sensible coding duties. The mannequin raises the enter window 200k token in 128K most outputtargets token consumption within the utilized job, Open weight For native deployment.

So, what precisely is it new?
- Context + Output Restrict: 200k Enter context and 128K Most output token.
- Actual-world coding outcomes: With growth CC Bench (Multi-turn duties carried out by human evaluators in remoted Docker environments), GLM-4.6 has been reported Practically parity with Claude Sonnet 4 (48.6% win) and makes use of ~15% token and GLM-4.5 Full the duty. Activity prompts and agent trajectories are revealed for inspection.
- Benchmark positioning: Zhipu summarises “clear advantages” over GLM-4.5 throughout eight public benchmarks with Claude Sonnet 4/4.6 and state parity. I will level that out too GLM-4.6 delays Sonnet 4.5 in coding– Notes that will help you choose a mannequin.
- Ecosystem availability: GLM-4.6 is obtainable through Z.AI API and OpenRouter;Integrates with standard coding brokers (Claude Code, Cline, Roo Code, Kilo Code), and current coding plan customers can improve by switching to mannequin identify
glm-4.6. - Open Weights + License:Hugging the face mannequin card listing License: MIT and Mannequin measurement: 355b parameters (MOE) Comes with BF16/F32 tensor. (MOE “Complete Parameter” is just not equal to the energetic parameter per token. For 4.6 on the cardboard, the energetic param diagram is just not listed.)
- Native reasoning: vllm and sglang Native servings are supported. The burden is on Hugging my face and ModelScope.


abstract
In GLM-4.6, incremental is the fabric step. 200k context window, about 15% token discount on CC bench vs. GLM-4.5, parity job wind fee with Claude Sonnet 4, instant availability through Z.AI, OpenRouter and open weight artifacts for native serving.


FAQ
1) What are the boundaries for the context and output token?
GLM-4.6 helps a 200k Enter context and 128K Most output token.
2) Is open weight out there and below which license is it out there?
sure. Embracing face mannequin card listing Open weight and License: MIT And a 357B-Parameter MOE Configuration (BF16/F32 tensor).
3) How does GLM-4.6 examine with GLM-4.5 and Claude Sonnet 4 for utilized duties?
With growth CC Bench,GLM-4.6 Report ~15% token and GLM-4.5 and Privilege with Claude Sonnet 4 (48.6% victory).
4) Can I run glm-4.6 regionally?
sure. Zhipu gives weight Hugging face/mannequin scope and doc native inferences vllm and sglang;Workstation-class {hardware} has emerged with the quantization of group.
Please verify github page, Embracing face model card and Technical details. Please be at liberty to verify GitHub pages for tutorials, code and notebooks. Additionally, please be at liberty to comply with us Twitter And do not forget to hitch us 100k+ ml subreddit And subscribe Our Newsletter.
Asif Razzaq is CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, ASIF is dedicated to leveraging the probabilities of synthetic intelligence for social advantages. His newest efforts are the launch of MarkTechPost, a man-made intelligence media platform. That is distinguished by its detailed protection of machine studying and deep studying information, and is simple to know by a technically sound and large viewers. The platform has over 2 million views every month, indicating its reputation amongst viewers.
🔥[Recommended Read] Nvidia AI Open-Sources Vipe (Video Pause Engine): A robust and versatile 3D video annotation device for spatial AI

