In December, we first brought native image output in Gemini 2.0 Flash to trusted testers. Today, we're making it available for developer experimentation across all regions currently supported by Google AI Studio. You can test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.
Gemini 2.0 Flash combines multimodal input, enhanced reasoning, and natural language understanding to create images.
Here are some examples of where 2.0 Flash's multimodal output shines.
1. Text and images together
Use Gemini 2.0 Flash to tell a story and have it illustrate that story with pictures, keeping the characters and settings consistent throughout. When you give feedback, the model retells the story or changes the style of its drawings.
Story and illustration generation in Google AI Studio
2. Conversational image editing
Gemini 2.0 Flash helps you edit images over many turns of natural language dialogue, which is great for iterating toward the perfect image or exploring different ideas together.
Multi-turn conversational image editing that maintains context throughout the conversation in Google AI Studio
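As a rough sketch of what this kind of iterative editing could look like in code, the helper below sends each instruction as a new turn and keeps track of the most recent image bytes. The function name `edit_in_turns` is hypothetical and not part of any SDK. It assumes a chat-like object with a `send_message` method, such as one created with `client.chats.create(...)` in the Google Gen AI Python SDK, and it assumes each response exposes `candidates[0].content.parts` with image bytes in `inline_data.data`.

```python
def edit_in_turns(chat, instructions):
    """Send each instruction as a new turn; return (instruction, latest_image) pairs.

    Assumption: `chat` keeps the conversation history itself, so each new
    instruction is interpreted in the context of the earlier turns.
    """
    latest_image = None
    history = []
    for instruction in instructions:
        response = chat.send_message(instruction)
        # Assumption: the first candidate holds the text and image parts.
        for part in response.candidates[0].content.parts:
            inline = getattr(part, "inline_data", None)
            if inline is not None:
                latest_image = inline.data  # raw image bytes
        # If a turn returns no image, the previous image is carried forward.
        history.append((instruction, latest_image))
    return history
```

Because the helper only relies on `send_message`, it can be tried against a stub object before it is pointed at a real chat session.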
3. World understanding
Unlike many other image generation models, Gemini 2.0 Flash leverages world knowledge and enhanced reasoning to create the right image. This makes it well suited to creating realistic, detailed imagery, such as illustrations for a recipe. As with all language models, it strives for accuracy, but its knowledge is broad and general, not absolute or complete.
Interleaved text and image output for a recipe in Google AI Studio
4. Text rendering
Most image generation models struggle to accurately render long sequences of text, often producing poorly formatted, illegible, or misspelled characters. Internal benchmarks show that 2.0 Flash has stronger text rendering than leading competing models, making it well suited to creating advertisements, social posts, or even invitations.
Image output with long text rendering in Google AI Studio
Start creating images with Gemini today
Get started with Gemini 2.0 Flash via the Gemini API. Learn more about image generation in the documentation.
from google import genai
from google.genai import types

# Replace the placeholder with your own Gemini API key.
client = genai.Client(api_key="GEMINI_API_KEY")

# Request interleaved text and image output from the experimental model.
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3d digital art style. "
        "For each scene, generate an image."
    ),
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)
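The request above returns text and images interleaved in a single response. One possible way to consume it, offered as a minimal sketch rather than the official pattern, is to walk the parts in order, print the text, and write each image to disk. The helper name `save_parts`, the output directory, and the file naming scheme are all hypothetical. The sketch assumes the response exposes `candidates[0].content.parts`, where each part has either `text` or an `inline_data` object with `data` (bytes) and `mime_type`.

```python
from pathlib import Path

# Assumed mapping from MIME type to file extension; extend as needed.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_parts(response, out_dir="story_output"):
    """Print text parts and write image parts to disk, in response order."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    # Assumption: the first candidate carries the interleaved parts.
    for index, part in enumerate(response.candidates[0].content.parts):
        if getattr(part, "text", None):
            print(part.text)
        elif getattr(part, "inline_data", None) is not None:
            blob = part.inline_data
            # Fall back to .png when the MIME type is missing or unknown.
            ext = EXTENSIONS.get(blob.mime_type, ".png")
            target = out_path / f"scene_{index:02d}{ext}"
            target.write_bytes(blob.data)
            saved.append(target)
    return saved
```

Calling `save_parts(response)` after the request would print the story text and return the paths of the saved scene images, numbered by their position in the response.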
Whether you're building AI agents, developing apps with beautiful visuals like illustrated interactive stories, or brainstorming visual ideas in conversation, Gemini 2.0 Flash lets you add text and image generation with a single model. We look forward to seeing what developers create with native image output, and your feedback will help us finalize a production-ready version soon.

