Right now we’re releasing the secure model of Gemini 2.5 Flash-Lite, the quickest and lowest value ($0.10/M enter, $0.40/M output) mannequin within the Gemini 2.5 mannequin household. We constructed 2.5 Flash-Lite to push the frontiers of intelligence per greenback with native inference capabilities that may be optionally toggled for extra demanding use circumstances. Constructing on the momentum of two.5 Professional and a pair of.5 Flash, this mannequin completes the set of two.5 fashions prepared to be used in large-scale manufacturing environments.
Most cost-effective and quickest 2.5 mannequin ever
Gemini 2.5 Flash-Lite balances efficiency and price with out compromising high quality, particularly for delay-sensitive duties resembling translation and classification.
The traits are as follows.
- Finest-in-class pace: Gemini 2.5 Flash-Lite has decrease latency than each 2.0 Flash-Lite and a pair of.0 Flash throughout a large pattern of prompts.
- Value effectivity: That is our lowest value 2.5 mannequin thus far, with enter tokens of $0.10/$1 million and output tokens of $0.40, permitting us to deal with giant numbers of requests affordably. We have additionally decreased the worth of audio inputs by 40% for the reason that preview launch.
- Good and small: It reveals general larger high quality than 2.0 Flash-Lite throughout a variety of benchmarks together with coding, math, science, reasoning, and multimodal understanding.
- Full performance: Whenever you construct with 2.5 Flash-Lite, you get entry to a 1 million token context window, a controllable thought price range, and help for native instruments like grounding with Google search, code execution, and URL context.
Gemini 2.5 Flash-Lite in motion
Because the launch of two.5 Flash-Lite, we’ve got already seen some extremely profitable deployments. Listed below are a few of our favorites.
- satrit is reworking the way in which satellite tv for pc information is processed, constructing a distributed area computing platform for real-time summarization of on-orbit telemetry, autonomous job administration, and intersatellite communications evaluation. 2.5 Flash-Lite pace allows 45% discount in latency Vital onboard diagnostics and 30% discount in energy consumption in comparison with the baseline mannequin.
- Hey Gene Use AI to create avatars to your video content material, leverage Gemini 2.5 Flash-Lite to automate video planning, and analyze and optimize your content material. Translate movies into over 180 languages. This permits us to supply a world and personalised expertise to our customers.
- document hound Convert your product demos into documentation utilizing Gemini 2.5 Flash-Lite. Course of lengthy movies and extract 1000’s of screenshots With low latency. This turns footage into complete documentation and coaching information for AI brokers a lot sooner than conventional strategies.
- evertune It helps you perceive how your model is represented throughout AI fashions. Gemini 2.5 Flash-Lite is their revolutionary product that dramatically quickens evaluation and report technology. Its quick efficiency lets you shortly scan, synthesize, and ship giant quantities of mannequin output to your shoppers. Dynamic and well timed insights.
You can begin utilizing 2.5 Flash-Lite by specifying “gemini-2.5-flash-lite” in your code. In case you are utilizing the preview model, you’ll be able to change to “gemini-2.5-flash-lite” which is identical underlying mannequin. Flash-Lite preview aliases will probably be eliminated on August twenty fifth.
Prepared to start out constructing? Strive the secure model of Gemini 2.5 Flash-Lite at this time Google AI Studio and Vertex AI.

