Gemini Provider

Image generation via Google's Gemini native generateContent API with responseModalities=["IMAGE"]. Good for general-purpose generation with a generous free tier.

Setup

Set your Google API key:

```bash
IMAGE_GENERATION_MCP_GOOGLE_API_KEY=AIza...
```

The provider registers automatically when this variable is set. Get a key at Google AI Studio.
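As a rough illustration of registration gated on an environment variable, the sketch below adds a provider entry only when the key is present. The function `maybe_register_gemini` and the `registry` dict are hypothetical names used for illustration; only the variable name comes from this page, and the real registration code may look different.

```python
import os

# Variable name taken from the setup instructions above.
ENV_VAR = "IMAGE_GENERATION_MCP_GOOGLE_API_KEY"


def maybe_register_gemini(registry: dict, environ=None) -> bool:
    """Add a 'gemini' entry to the registry only when the API key is set.

    Hypothetical helper: the registry shape is an assumption.
    """
    env = os.environ if environ is None else environ
    api_key = env.get(ENV_VAR, "").strip()
    if not api_key:
        return False  # variable unset or empty: provider stays unregistered
    registry["gemini"] = {"api_key": api_key}
    return True
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.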

Supported models

| Model | Notes |
| --- | --- |
| gemini-2.5-flash-image | Default; stable GA release |
| gemini-3.1-flash-image-preview | Preview; successor to 2.5 flash; faster |
| gemini-3-pro-image-preview | Preview; highest quality; best for complex scenes |

Use list_providers to see which models are available on your API key.

Aspect ratios and sizes

Gemini natively supports 14 aspect ratios — all are passed through directly:

| Aspect ratio | Notes |
| --- | --- |
| 1:1 | Square (default) |
| 16:9 | Landscape |
| 9:16 | Portrait |
| 3:2 | Photo landscape |
| 2:3 | Photo portrait |
| 3:4 | Portrait |
| 4:3 | Landscape |
| 4:5 | Portrait |
| 5:4 | Landscape |
| 4:1 | Ultra-wide banner |
| 1:4 | Ultra-tall banner |
| 8:1 | Extreme panorama |
| 1:8 | Extreme vertical |
| 21:9 | Cinematic ultra-wide |
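Because the ratios above are passed through unchanged, a caller may want to validate a value before sending a request. The following is a minimal sketch of such a check; `validate_aspect_ratio` and `orientation` are hypothetical helpers, not part of the provider's API, and raising on an unknown ratio is an assumed behaviour.

```python
# The 14 ratios listed in the table above.
SUPPORTED_ASPECT_RATIOS = frozenset({
    "1:1", "16:9", "9:16", "3:2", "2:3", "3:4", "4:3",
    "4:5", "5:4", "4:1", "1:4", "8:1", "1:8", "21:9",
})


def validate_aspect_ratio(value: str = "1:1") -> str:
    """Return the ratio unchanged if it is supported, else raise ValueError."""
    if value not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {value!r}")
    return value


def orientation(value: str) -> str:
    """Classify a supported ratio as square, landscape, or portrait."""
    width, height = (int(part) for part in validate_aspect_ratio(value).split(":"))
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"
```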

Quality levels

The quality parameter controls resolution, model reasoning, and response modalities:

| Quality | Image size | Thinking | Response modalities | Cost |
| --- | --- | --- | --- | --- |
| standard | 1K | Minimal (default) | Image only | Free tier |
| hd | 2K | High (model reasons about composition before rendering) | Text + Image | Thinking tokens billed |

How hd works: When quality="hd" is set, the provider enables thinking_level="High" on supported models (gemini-3.1-flash-image-preview, gemini-3-pro-image-preview). The model reasons through the prompt, plans composition, and may generate interim images before producing the final result. This significantly improves output quality for complex prompts with multiple elements, layouts, or text.

Note: gemini-2.5-flash-image does not support thinking — hd still increases resolution to 2K but skips the thinking configuration for this model.
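The quality mapping described above can be sketched as a small lookup. This is an illustration under assumptions, not the provider's actual code: `build_quality_settings` and the dictionary keys are hypothetical, `None` stands for "thinking left at the model default", and keeping Text + Image modalities for hd on models without thinking support is a guess based on the table.

```python
# Models that accept a thinking configuration, per the description above.
THINKING_MODELS = frozenset({
    "gemini-3.1-flash-image-preview",
    "gemini-3-pro-image-preview",
})


def build_quality_settings(quality: str, model: str) -> dict:
    """Translate a quality level into illustrative request settings."""
    if quality == "standard":
        return {
            "image_size": "1K",
            "thinking_level": None,  # left at the model's minimal default
            "response_modalities": ["IMAGE"],
        }
    if quality == "hd":
        return {
            "image_size": "2K",
            # Thinking is only configured on models that support it.
            "thinking_level": "High" if model in THINKING_MODELS else None,
            # Assumption: modalities follow the table for every model.
            "response_modalities": ["TEXT", "IMAGE"],
        }
    raise ValueError(f"unknown quality level: {quality!r}")
```

With this sketch, `build_quality_settings("hd", "gemini-2.5-flash-image")` still selects 2K but leaves `thinking_level` unset, matching the note above.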

Negative prompts

Gemini does not have native negative prompt support. When a negative_prompt is provided, it is appended to the prompt as:

```

Avoid: {negative_prompt}
```
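A minimal sketch of that append step is shown below. The helper name `merge_negative_prompt` is hypothetical, and the blank-line separator and whitespace trimming are assumptions about how the template is joined to the prompt.

```python
from typing import Optional


def merge_negative_prompt(prompt: str, negative_prompt: Optional[str] = None) -> str:
    """Append an 'Avoid:' clause to the prompt when a negative prompt is given."""
    if not negative_prompt or not negative_prompt.strip():
        return prompt  # nothing to avoid: prompt is sent unchanged
    # Assumed separator: one blank line between the prompt and the clause.
    return f"{prompt}\n\nAvoid: {negative_prompt.strip()}"
```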

Background transparency

Not supported. The background parameter is silently ignored — all images are generated with an opaque background.

Prompt style

Gemini works best with natural language descriptions:

A professional product photo of white sneakers on a clean white background, studio lighting, sharp focus, commercial photography style

Avoid CLIP-style tag lists (those work better with Stable Diffusion).

Per-call model selection

The model parameter on generate_image overrides the provider's default model for a single request:

generate_image(prompt="...", provider="gemini", model="gemini-2.5-flash-image")

Use list_providers to discover available models and their capabilities.

Capability discovery

At startup, the provider returns a static list of known image-capable Gemini models. Unlike OpenAI, the Gemini models.list() API does not reliably filter to image-generation models, so the known model list is maintained in the provider code.
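One way to picture a static model list combined with a per-call override is the sketch below. The names `KNOWN_MODELS`, `DEFAULT_MODEL`, and `resolve_model` are hypothetical, and rejecting unknown model names is an assumed behaviour; the real provider may simply forward the name to the API.

```python
from typing import Optional

# Static list mirroring the "Supported models" table; the first entry is the default.
KNOWN_MODELS = (
    "gemini-2.5-flash-image",
    "gemini-3.1-flash-image-preview",
    "gemini-3-pro-image-preview",
)
DEFAULT_MODEL = KNOWN_MODELS[0]


def resolve_model(model: Optional[str] = None) -> str:
    """Pick the model for one request: the override if given, else the default."""
    if model is None:
        return DEFAULT_MODEL
    if model not in KNOWN_MODELS:
        # Assumption: unknown names are rejected locally.
        raise ValueError(f"unknown Gemini image model: {model!r}")
    return model
```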

Cost

Gemini has a generous free tier (check Google AI pricing for current limits). The provider is not in paid_providers by default — no confirmation prompt is shown before use. Set IMAGE_GENERATION_MCP_PAID_PROVIDERS=gemini,openai if you want cost confirmation for Gemini.
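To illustrate how a comma-separated value such as `gemini,openai` might be interpreted, here is a small sketch. `paid_providers` and `needs_confirmation` are hypothetical helpers; trimming whitespace and lowercasing names are assumptions, and the server's default when the variable is unset is not modelled here.

```python
import os

PAID_ENV_VAR = "IMAGE_GENERATION_MCP_PAID_PROVIDERS"


def paid_providers(environ=None) -> frozenset:
    """Parse the comma-separated provider list into a set of names."""
    env = os.environ if environ is None else environ
    raw = env.get(PAID_ENV_VAR, "")
    # Skip empty items so "gemini,,openai" and trailing commas are tolerated.
    return frozenset(name.strip().lower() for name in raw.split(",") if name.strip())


def needs_confirmation(provider: str, environ=None) -> bool:
    """Return True when the provider is listed as paid."""
    return provider.strip().lower() in paid_providers(environ)
```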

SynthID watermark

All Gemini-generated images include an invisible SynthID watermark added by Google. This is automatic and cannot be disabled.

Error handling

| Error | Cause | Resolution |
| --- | --- | --- |
| Content policy rejection | Prompt violates Gemini safety policy | Modify the prompt to comply with Google's usage policies |
| Connection error | Cannot reach Gemini API | Check network connectivity and API key validity |
| No image in response | Model returned text instead of image | Try rephrasing the prompt or use a different model |
| API error (HTTP 429) | Rate limited | Wait and retry; consider reducing request frequency |
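For the rate-limit case, "wait and retry" is commonly implemented with exponential backoff. The sketch below shows that pattern on the caller's side; `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on HTTP 429, and the attempt count and delays are arbitrary illustrative values.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


def call_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the retry logic can be exercised without real delays.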