Today we're introducing Gemma 4 12B, our newest model, designed to bring agentic multimodal intelligence directly to your laptop. Gemma 4 12B packs powerful capabilities into a reduced memory footprint to bridge the gap between the edge-friendly E4B and the more advanced 26B Mixture of Experts (MoE). It is also the first mid-sized model to feature native audio input.
Thanks to our developer community, Gemma 4 models now have over 150 million downloads. You have built everything from wearable robotic arms that provide physical assistance to enterprise-grade AI security. We can't wait to see what you build with this latest addition.
Here is a summary of what makes Gemma 4 12B unique:
- Novel integrated architecture: There is no separate multimodal encoder. Vision and audio inputs flow directly into the LLM backbone.
- Advanced reasoning: It achieves benchmark performance close to the 26B model and enables powerful multi-step reasoning and agentic workflows.
- Laptop compatible: Small enough to run locally with just 16 GB of VRAM or unified memory.
- Open and accessible: Released under the Apache 2.0 license with support from across the developer ecosystem.
- Drafter included: Gemma 4 12B ships with a multi-token prediction (MTP) drafter to reduce latency.
Together, these features bring advanced multimodal capabilities to everyday hardware without sacrificing speed or reasoning. Now let's take a closer look at how Gemma 4 12B achieves this.
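To give a feel for why a drafter can reduce latency, here is a minimal sketch of a generic draft-and-verify loop with greedy acceptance. It is not the Gemma 4 12B implementation: `toy_target`, `toy_drafter`, and the parameter `k` are hypothetical stand-ins, and a real system would verify all drafted tokens in one batched forward pass of the large model rather than one call per token.

```python
from typing import Callable, List

# A "model" here is just a function from a token prefix to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: the target model produces one token per step."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(target: NextToken, drafter: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Draft up to k tokens cheaply, then keep the prefix the target agrees with."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The drafter proposes a short continuation.
        draft: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            draft.append(drafter(tokens + draft))
        # 2. The target checks each proposed token in order.
        for i, proposed in enumerate(draft):
            expected = target(tokens + draft[:i])
            if proposed != expected:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(draft[:i])
                tokens.append(expected)
                break
        else:
            # Every drafted token was accepted.
            tokens.extend(draft)
    return tokens


# Hypothetical toy models: the drafter agrees with the target except after token 5.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] * 3 + 1) % 7


def toy_drafter(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 5 else toy_target(prefix)


baseline = greedy_decode(toy_target, [1], 10)
speculative = speculative_decode(toy_target, toy_drafter, [1], 10)
print(speculative == baseline)
```

With greedy acceptance the output matches what the target model would have produced on its own; the potential saving comes from accepting several drafted tokens per verification step when the drafter is usually right.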
Run state-of-the-art agents locally
Gemma 4 12B offers performance close to the larger 26B MoE model on standard benchmarks, but with less than half the total memory footprint. It is small enough to run locally on a consumer laptop with 16 GB of RAM, enabling powerful multimodal and agentic experiences on device.
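As a rough sanity check on the 16 GB figure, the sketch below estimates the memory needed for model weights alone at a few precisions. It assumes a nominal 12 billion parameters (inferred from the model name, not stated in this post) and ignores activations, the KV cache, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: nominal parameter count taken from the "12B" in the model name.
PARAMS_12B = 12e9

estimates = {bits: weight_memory_gib(PARAMS_12B, bits) for bits in (16, 8, 4)}
for bits, gib in estimates.items():
    print(f"{bits}-bit weights: ~{gib:.1f} GiB")
```

Under these assumptions, 16-bit weights alone would exceed 16 GB, while 8-bit or 4-bit weights would leave headroom. This post does not say which precision the 16 GB figure refers to.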

