
Instantly estimate the cost of Google's Gemini models. Choose your preferred unit and enter exactly what you plan to send.
Leave at 0 if you are sending standard prompts without the cache API.
Prices update in real-time. Results under $0.0001 are rounded up. Official Google AI usage rates apply.
The latest performance, intelligence, and usability improvements for multimodal understanding, agentic capabilities, and vibe coding.
The most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.
Low-latency, audio-to-audio model optimized for real-time dialogue with acoustic nuance detection, numeric precision, and multimodal awareness.
Designed for speed and efficiency, this image generation model is effective for quick, interactive responses and high throughput.
A 3.1 Flash Text-to-Speech audio model optimized for price-performant, low-latency, and controllable speech generation.
An intelligent model built for speed, combining frontier intelligence with superior search and grounding.
A native image generation model optimized for speed, flexibility, and contextual understanding.
State-of-the-art multipurpose model that excels at coding and complex reasoning tasks.
A hybrid reasoning model that supports a 1M-token context window and configurable thinking budgets.
The smallest and most cost-effective model built for at-scale usage.
The latest version of 2.5 Flash-Lite optimized for cost-efficiency, high throughput, and high quality.
Live API native audio models optimized for natural audio outputs with better pacing, verbosity, and mood.
Native image generation model optimized for contextual understanding.
Text-to-speech audio model optimized for low-latency generation.
Pro text-to-speech audio model optimized for powerful, natural speech generation.
Deprecated legacy models that will be permanently shut down on June 1, 2026.
The latest image generation architecture featuring significantly better text rendering and overall image quality.
The latest video generation model available to developers.
Stable video generation models available on the paid tier of the API.
Google's family of music generation models offering clip previews and full songs.
The first multimodal embedding model mapping text, images, video, audio, and PDFs into a unified embedding space.
Classic embeddings model for text-only use cases.
Embodied Reasoning thinking models that enhance robot abilities to understand and interact with the physical world.
Model optimized for building browser control agents that automate tasks autonomously.
A lightweight, state-of-the-art open model built from the same research and technology as the Gemini models.
LLMs process text in "tokens," and the API charges per 1 million tokens. A token is roughly 4 characters in English, so 100 tokens is about 75 words. This calculator converts your word or character counts into token billing behind the scenes so you don't have to guess.
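The conversion above can be sketched in a few lines, assuming the ~4-characters-per-token rule of thumb (real tokenizers vary by model and language):

```python
# Rough token estimates, assuming ~4 characters per token and
# ~0.75 words per token, as described above. Real tokenizers vary.
def tokens_from_chars(char_count: int) -> int:
    return max(1, round(char_count / 4))

def tokens_from_words(word_count: int) -> int:
    # 100 tokens is about 75 words, so divide by 0.75
    return round(word_count / 0.75)

print(tokens_from_chars(400))  # 100
print(tokens_from_words(75))   # 100
```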
For high-tier models like Gemini 3.1 Pro and 2.5 Pro, the price per token doubles once your prompt exceeds 200,000 tokens. Processing a massive context window (like a full codebase or book) in one pass requires significantly more GPU power per attention head.
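Under the tiering described above, a cost function might look like the sketch below; the $1.25 per 1M-token base rate is purely illustrative, not a quoted price:

```python
# Long-context tier: once the prompt crosses 200k tokens, the whole
# prompt bills at double the base rate (per the description above).
TIER_THRESHOLD = 200_000

def input_cost(tokens: int, base_rate_per_m: float) -> float:
    rate = base_rate_per_m * (2 if tokens > TIER_THRESHOLD else 1)
    return tokens / 1_000_000 * rate

# Illustrative $1.25 per 1M-token base rate:
print(input_cost(100_000, 1.25))  # 0.125
print(input_cost(300_000, 1.25))  # 0.75  (2x rate on the full prompt)
```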
If you upload images alongside text, they are mapped to fixed token sizes before being billed. A standard image consumes roughly 258 tokens for smaller models, but for precise vision tasks, high-res images (like 4K) can consume up to 2,520 tokens per image.
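A minimal sketch of per-image token accounting, treating the two figures quoted above as fixed bounds:

```python
# Fixed per-image token costs quoted above.
STANDARD_IMAGE_TOKENS = 258    # typical image
HIGH_RES_IMAGE_TOKENS = 2_520  # upper bound for 4K-class images

def image_input_tokens(n_images: int, high_res: bool = False) -> int:
    per_image = HIGH_RES_IMAGE_TOKENS if high_res else STANDARD_IMAGE_TOKENS
    return n_images * per_image

print(image_input_tokens(3))                 # 774
print(image_input_tokens(2, high_res=True))  # 5040
```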
Gemini processes audio natively without transcribing it to text first. Audio is billed in standard tokens just like text: Google treats 1 second of audio as exactly 25 tokens. The calculator automates this conversion.
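The conversion is trivial to automate; this sketch uses the per-second rate quoted above:

```python
TOKENS_PER_AUDIO_SECOND = 25  # conversion rate quoted above

def audio_tokens(seconds: float) -> int:
    return round(seconds * TOKENS_PER_AUDIO_SECOND)

print(audio_tokens(60))  # 1500 tokens for a one-minute clip
```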
Details based on official Google AI Studio billing policies.
Grounding with Google Search has a generous free tier. For Gemini 3 models, you receive 5,000 free grounding prompts per month (shared across models). After you exhaust the limit, it costs $14.00 per 1,000 search queries. Note that a single prompt might trigger multiple independent search queries under the hood.
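The billing logic can be sketched as follows; it simplifies by treating prompts and search queries as one-to-one, which, as noted above, is not guaranteed:

```python
# Simplified: assumes one search query per prompt; a single prompt
# can actually trigger several queries under the hood.
FREE_PROMPTS_PER_MONTH = 5_000
RATE_PER_1K_QUERIES = 14.00  # USD, after the free tier

def grounding_cost(queries_this_month: int) -> float:
    billable = max(0, queries_this_month - FREE_PROMPTS_PER_MONTH)
    return billable / 1_000 * RATE_PER_1K_QUERIES

print(grounding_cost(4_000))   # 0.0 (inside the free tier)
print(grounding_cost(12_000))  # 98.0
```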
If you repeatedly send the exact same large system prompt, codebase, or document context, you can 'cache' those tokens. You pay a heavily discounted rate to process them the first time, and then a flat hourly storage fee (ranging from $1.00 to $4.50 depending on the model tier). Every chat request after that avoids paying the full input rate for that cached text, which can reduce costs by over 80%.
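A back-of-the-envelope comparison, using hypothetical rates (a $1.25/M full input rate, a $0.125/M cached-read rate, and $1.00/hour storage; real per-model rates differ):

```python
def cached_session_cost(requests: int, ctx_m_tokens: float,
                        full_rate: float, cached_rate: float,
                        storage_per_hour: float, hours: float) -> float:
    first_write = ctx_m_tokens * full_rate                 # initial processing
    cached_reads = (requests - 1) * ctx_m_tokens * cached_rate
    return first_write + cached_reads + storage_per_hour * hours

# 50 requests against a 1M-token context over a 2-hour session:
without = 50 * 1.0 * 1.25                                  # every request at full rate
with_cache = cached_session_cost(50, 1.0, 1.25, 0.125, 1.00, 2)
print(without, with_cache)  # 62.5 9.375 -> roughly an 85% saving
```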
Code Execution itself is completely free to run on Google's sandbox. However, the exact Python code the model generates is billed as standard Output Tokens, and the terminal result passed back from the sandbox is billed as standard Input Tokens as the model continues its iterative reasoning process.
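In other words, a single tool-use round trip bills like the sketch below, with purely illustrative input and output rates:

```python
# Sandbox execution is free; you pay output tokens for the generated
# code and input tokens for the returned result. Rates are illustrative.
def code_exec_cost(generated_tokens: int, result_tokens: int,
                   input_rate_per_m: float, output_rate_per_m: float) -> float:
    return (generated_tokens / 1e6 * output_rate_per_m
            + result_tokens / 1e6 * input_rate_per_m)

print(round(code_exec_cost(500, 200, 1.25, 10.0), 5))  # 0.00525
```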
The Veo models are billed by the second: Veo 3.1 Standard is $0.40 per second of video. Imagen 4 generates standard images at $0.04 per image, while the Ultra generation tier operates at $0.06 per image.
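Those per-unit prices make media costs easy to tally; a sketch using the figures quoted above:

```python
VEO_31_STANDARD_PER_SECOND = 0.40   # USD per second of video
IMAGEN_4_STANDARD_PER_IMAGE = 0.04  # USD per standard image
IMAGEN_4_ULTRA_PER_IMAGE = 0.06     # USD per Ultra-tier image

def media_cost(video_seconds: float = 0, standard_images: int = 0,
               ultra_images: int = 0) -> float:
    return (video_seconds * VEO_31_STANDARD_PER_SECOND
            + standard_images * IMAGEN_4_STANDARD_PER_IMAGE
            + ultra_images * IMAGEN_4_ULTRA_PER_IMAGE)

print(round(media_cost(video_seconds=8, standard_images=4), 2))  # 3.36
```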
Yes! Google AI Studio usage remains free of charge for developers operating within the standard free limits. The calculated pricing primarily applies if you enable billing or route production API workloads through Google Cloud Vertex AI or paid AI Studio projects.
Google has marked older versions like Gemini 2.0 Flash and 2.0 Flash-Lite as deprecated. They provide backward compatibility right now, but they will be permanently shut down on June 1, 2026. It is highly recommended to migrate to the 2.5 or 3.1 families.