Sonusahani.com

Gemini API Pricing Calculator

Estimate the cost of Google's Gemini models in real time. Choose your preferred unit and enter exactly what you plan to send.


Estimated Cost

Total Input Cost: $0.00267
Total Output Cost: $0.00800
Caching / Storage: $0.00000
Total: $0.0107

Prices update in real-time. Results under $0.0001 are rounded up. Official Google AI usage rates apply.
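
The arithmetic behind the estimate above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names are hypothetical, and the reading of "rounded up" (any nonzero result below $0.0001 is displayed as $0.0001) is an assumption.

```python
def total_cost(input_cost: float, output_cost: float, caching_cost: float = 0.0) -> float:
    """Sum the three line items shown in the estimate."""
    return input_cost + output_cost + caching_cost


def display_cost(cost: float) -> str:
    """Format a cost in dollars with four decimal places."""
    # Assumption: nonzero results below $0.0001 are displayed as $0.0001.
    if 0 < cost < 0.0001:
        return "$0.0001"
    return f"${cost:.4f}"


total = total_cost(0.00267, 0.00800, 0.0)
print(display_cost(total))  # $0.0107
```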

All Gemini Models

Gemini 3.1 Pro Preview

The latest performance, intelligence, and usability improvements intended for multimodal understanding, agentic capabilities, and vibe-coding.

Gemini 3.1 Flash-Lite Preview

The most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.

Gemini 3.1 Flash Live Preview

Low-latency, audio-to-audio model optimized for real-time dialogue with acoustic nuance detection, numeric precision, and multimodal awareness.

Gemini 3.1 Flash Image Preview

Designed for speed and efficiency, this image generation model is effective for quick, interactive responses and high throughput.

Gemini 3.1 Flash TTS Preview

A 3.1 Flash Text-to-Speech audio model optimized for price-performant, low-latency, and controllable speech generation.

Gemini 3 Flash Preview

An intelligent model built for speed, combining frontier intelligence with superior search and grounding.

Gemini 3 Pro Image Preview

A native image generation model optimized for speed, flexibility, and contextual understanding.

Gemini 2.5 Pro

State-of-the-art multipurpose model that excels at coding and complex reasoning tasks.

Gemini 2.5 Flash

A hybrid reasoning model which supports a 1M token context window and has thinking budgets.

Gemini 2.5 Flash-Lite

The smallest and most cost-effective model built for at-scale usage.

Gemini 2.5 Flash-Lite Preview

The latest version of 2.5 Flash-Lite optimized for cost-efficiency, high throughput, and high quality.

Gemini 2.5 Flash Native Audio

Live API native audio models optimized for natural audio outputs with better pacing, verbosity, and mood.

Gemini 2.5 Flash Image

Native image generation model optimized for contextual understanding.

Gemini 2.5 Flash Preview TTS

Text-to-speech audio model optimized for low-latency generation.

Gemini 2.5 Pro Preview TTS

Pro text-to-speech audio model optimized for powerful, natural speech generation formats.

Gemini 2.0 Flash & Flash-Lite

Deprecated legacy models that will be shut down on June 1, 2026.

Imagen 4

The latest image generation architecture featuring significantly better text rendering and overall image quality.

Veo 3.1

The latest video generation model available to developers.

Veo 3 & Veo 2

Stable video generation models available on the paid tier of the API.

Lyria 3

Google's family of music generation models offering clip previews and full songs.

Gemini Embedding 2 Preview

The first multimodal embedding model mapping text, images, video, audio, and PDFs into a unified embedding space.

Gemini Embedding

Classic embeddings model for text-only use cases.

Gemini Robotics-ER 1.6 & 1.5

Embodied Reasoning thinking models that enhance robot abilities to understand and interact with the physical world.

Gemini 2.5 Computer Use

Model optimized for building browser control agents that automate tasks autonomously.

Gemma 4

A lightweight, state-of-the-art open model built from the same core technology as Gemini.

Understanding Token Constraints

Tokens, Words, and Characters

LLMs process text in "tokens". The API charges per 1 million tokens. A token is roughly 4 characters in English, so 100 tokens is about 75 words. This calculator converts your word or character counts into tokens behind the scenes and applies the per-token rates, so you don't have to guess.
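
As a rough sketch of that conversion, the rules of thumb above (4 characters per token, 75 words per 100 tokens) can be written out directly. The function names and the $2.00 per million rate in the usage line are placeholders for illustration, not published prices.

```python
import math

CHARS_PER_TOKEN = 4          # rule of thumb for English text
WORDS_PER_100_TOKENS = 75    # 100 tokens is about 75 words


def tokens_from_characters(characters: int) -> int:
    """Estimate tokens from a character count, rounding up."""
    return math.ceil(characters / CHARS_PER_TOKEN)


def tokens_from_words(words: int) -> int:
    """Estimate tokens from a word count, rounding up."""
    return math.ceil(words * 100 / WORDS_PER_100_TOKENS)


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of a token count at a given price per 1 million tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical rate of $2.00 per 1M tokens, for illustration only.
print(cost_usd(tokens_from_words(750), 2.00))
```

These are estimates only; actual token counts depend on the tokenizer and the language of the text.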

The 200k Context Tier

For high-tier models like Gemini 3.1 Pro and 2.5 Pro, the price per token doubles once your prompt exceeds 200,000 tokens. Processing massive context windows (like full codebases or books) in a single request requires significantly more compute.
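
A minimal sketch of the tier rule, assuming the higher rate applies to the entire prompt once it crosses the threshold (rather than only to the tokens beyond it). The base price passed in is a placeholder, not a published rate.

```python
TIER_THRESHOLD_TOKENS = 200_000


def input_cost_usd(prompt_tokens: int,
                   base_price_per_million: float,
                   long_context_multiplier: float = 2.0) -> float:
    """Input cost with the per-token price doubled above the threshold."""
    rate = base_price_per_million
    # Assumption: the whole prompt is billed at the higher rate,
    # not just the portion above 200,000 tokens.
    if prompt_tokens > TIER_THRESHOLD_TOKENS:
        rate *= long_context_multiplier
    return prompt_tokens / 1_000_000 * rate


# Hypothetical base rate of $2.00 per 1M input tokens.
print(input_cost_usd(200_000, 2.00))  # at the threshold: base rate
print(input_cost_usd(400_000, 2.00))  # above the threshold: doubled rate
```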

Images & Multimodality

If you upload images alongside text, they are mapped to fixed token sizes before being billed. A standard image consumes roughly 258 tokens for smaller models, but for precise vision tasks, high-res images (like 4K) can consume up to 2,520 tokens per image.
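
To get a feel for the numbers, the two figures quoted above can be turned into a quick upper-bound estimate. This sketch treats every high-res image as costing the full 2,520 tokens, which is a simplifying assumption; the function name is hypothetical.

```python
STANDARD_IMAGE_TOKENS = 258      # rough figure for a standard image
HIGH_RES_IMAGE_TOKENS = 2_520    # quoted upper bound for high-res images


def image_tokens(standard_images: int, high_res_images: int = 0) -> int:
    """Upper-bound token estimate for a batch of images."""
    return (standard_images * STANDARD_IMAGE_TOKENS
            + high_res_images * HIGH_RES_IMAGE_TOKENS)


print(image_tokens(10))      # 2580
print(image_tokens(2, 1))    # 3036
```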

Audio Billing

Gemini processes audio natively without transcribing it to text first. Audio is billed in standard tokens just like text: Google treats 1 second of audio as exactly 25 tokens. The calculator automates this conversion.
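
The audio conversion is a single multiplication. A small sketch, using the 25 tokens per second figure stated above (the function name is hypothetical):

```python
AUDIO_TOKENS_PER_SECOND = 25


def audio_tokens(seconds: int) -> int:
    """Tokens billed for a clip of the given length in whole seconds."""
    return seconds * AUDIO_TOKENS_PER_SECOND


print(audio_tokens(60))      # one minute -> 1500
print(audio_tokens(3600))    # one hour   -> 90000
```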

Frequently Asked Questions

Details based on official Google AI Studio billing policies.

Are grounding tools like Google Search and Google Maps free?

They offer generous free tiers. For Gemini 3 models, you receive 5,000 free grounding prompts per month (shared across models). After you exhaust the limit, it costs $14.00 per 1,000 search queries. Note that a single prompt might trigger multiple independent search queries under the hood.
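
A rough sketch of the grounding charge, using the figures above. It simplifies by counting the free allowance and the paid usage in the same unit (search queries), even though the allowance is stated in prompts and one prompt can trigger several queries; treat the result as an approximation.

```python
FREE_QUERIES_PER_MONTH = 5_000
PRICE_PER_1000_QUERIES = 14.00


def grounding_cost_usd(monthly_queries: int) -> float:
    """Monthly grounding cost after the free allowance is used up."""
    # Simplifying assumption: free allowance and billing use the same unit.
    billable = max(0, monthly_queries - FREE_QUERIES_PER_MONTH)
    return billable / 1000 * PRICE_PER_1000_QUERIES


print(grounding_cost_usd(5_000))   # 0.0
print(grounding_cost_usd(6_000))   # 14.0
```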

How does Context Caching save me money?

If you repeatedly send the exact same large system prompt, codebase, or document context, you can 'cache' those tokens. You pay a heavily discounted rate to process them the first time, and then a flat hourly storage fee (ranging from $1.00 to $4.50 depending on the model tier). Every chat request after that avoids paying the full input rate for that cached text, which can reduce costs by over 80%.
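
The comparison can be sketched as follows. Several things here are assumptions for illustration: that cached tokens are billed at a fixed fraction of the normal input rate on each request, that storage is charged per 1 million cached tokens per hour, and every number in the usage example (input price, cached fraction, request count).

```python
def cost_without_cache(context_tokens: int, requests: int,
                       input_price_per_million: float) -> float:
    """Every request pays the full input rate for the shared context."""
    return requests * context_tokens / 1_000_000 * input_price_per_million


def cost_with_cache(context_tokens: int, requests: int,
                    input_price_per_million: float,
                    cached_rate_fraction: float,
                    storage_price_per_million_per_hour: float,
                    hours: float) -> float:
    """Discounted reads of the cached context plus hourly storage."""
    # Assumption: cached reads cost a fixed fraction of the input rate.
    read_cost = (requests * context_tokens / 1_000_000
                 * input_price_per_million * cached_rate_fraction)
    # Assumption: storage is billed per 1M cached tokens per hour.
    storage_cost = (context_tokens / 1_000_000
                    * storage_price_per_million_per_hour * hours)
    return read_cost + storage_cost


# Hypothetical scenario: 100k-token context, 1,000 requests in one hour,
# $2.00 per 1M input tokens, cached reads at 10% of that, $4.50 storage.
plain = cost_without_cache(100_000, 1_000, 2.00)
cached = cost_with_cache(100_000, 1_000, 2.00, 0.10, 4.50, 1.0)
print(f"savings: {1 - cached / plain:.1%}")
```

Under these made-up inputs the cached path comes out well over 80% cheaper, but the benefit shrinks quickly if the cache sits idle, since storage is paid by the hour regardless of traffic.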

Do I pay for Code Execution features?

Code Execution itself is completely free to run on Google's sandbox. However, the exact Python code the model generates is billed as standard Output Tokens, and the terminal result passed back from the sandbox is billed as standard Input Tokens as the model continues its iterative reasoning process.
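
In other words, the only charges are the tokens on either side of the sandbox. A minimal sketch, with a hypothetical function name and placeholder prices:

```python
def code_execution_cost_usd(generated_code_tokens: int,
                            sandbox_result_tokens: int,
                            input_price_per_million: float,
                            output_price_per_million: float) -> float:
    """Token charges around a (free) sandbox run."""
    # Generated Python code is billed as output tokens.
    output_cost = generated_code_tokens / 1_000_000 * output_price_per_million
    # The result passed back to the model is billed as input tokens.
    input_cost = sandbox_result_tokens / 1_000_000 * input_price_per_million
    return output_cost + input_cost


# Hypothetical rates: $2.00 input and $12.00 output per 1M tokens.
print(code_execution_cost_usd(500, 200, 2.00, 12.00))
```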

How much do the AI video/image generations (Veo / Imagen 4) cost?

The Veo models are billed by the second: Veo 3.1 Standard is $0.40 per second of video. Imagen 4 generates standard images at $0.04 per image, while the Ultra generation tier operates at $0.06 per image.
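
Using the rates quoted above, a quick estimate looks like this. The function names are hypothetical, and only the Veo 3.1 Standard and Imagen 4 standard/Ultra rates from this answer are covered.

```python
VEO_31_STANDARD_PER_SECOND = 0.40
IMAGEN_4_STANDARD_PER_IMAGE = 0.04
IMAGEN_4_ULTRA_PER_IMAGE = 0.06


def veo_cost_usd(seconds: float) -> float:
    """Cost of a Veo 3.1 Standard clip of the given length."""
    return seconds * VEO_31_STANDARD_PER_SECOND


def imagen_cost_usd(images: int, ultra: bool = False) -> float:
    """Cost of a batch of Imagen 4 images on the standard or Ultra tier."""
    rate = IMAGEN_4_ULTRA_PER_IMAGE if ultra else IMAGEN_4_STANDARD_PER_IMAGE
    return images * rate


print(f"${veo_cost_usd(8):.2f}")                    # 8-second clip
print(f"${imagen_cost_usd(100):.2f}")               # 100 standard images
print(f"${imagen_cost_usd(100, ultra=True):.2f}")   # 100 Ultra images
```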

Is it completely free to experiment in Google AI Studio?

Yes! Google AI Studio usage remains free of charge for developers operating within the standard free limits. The calculated pricing primarily applies if you enable billing or are routing production API workloads through Google Cloud Vertex AI or paid AI Studio projects.

Are the Gemini 2.0 models still supported?

Google has marked older versions like Gemini 2.0 Flash and 2.0 Flash-Lite as deprecated. They provide backward compatibility right now, but they will be permanently shut down on June 1, 2026. It is highly recommended to migrate to the 2.5 or 3.1 families.