
OpenAI API Pricing Calculator

Calculate generation limits and costs dynamically across OpenAI's API. The calculator supports GPT-5.4, the o3 reasoning models, and previous-generation GPT-4 architectures.

Input: $2.50 | Cached: $0.25 | Output: $15 (per 1M tokens)


Estimated Cost (sample calculation)

Total Input Cost: $0.0033
Cached Input Cost: $0.00
Output Generation: $0.01
Total Cost: $0.0133

Prices update in real time. Results under $0.0001 are rounded up.
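The calculator's arithmetic is simple to reproduce. A minimal sketch in Python, using GPT-5.4's rates from above (the token counts in the example are hypothetical, chosen to match the sample figures):

```python
def estimate_cost(input_tokens, cached_tokens, output_tokens,
                  input_rate=2.50, cached_rate=0.25, output_rate=15.00):
    """Estimate a request's cost in USD. Rates are per 1M tokens;
    cached_tokens is the portion of the input served from cache."""
    fresh = (input_tokens - cached_tokens) * input_rate / 1_000_000
    cached = cached_tokens * cached_rate / 1_000_000
    output = output_tokens * output_rate / 1_000_000
    return fresh + cached + output

# 1,320 fresh input tokens and 667 output tokens at GPT-5.4 rates
cost = estimate_cost(1320, 0, 667)  # ≈ $0.0133, as in the sample above
```

Swapping in any model's rates from the table below gives that model's estimate.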

Overview of Core Models

All prices are USD per 1M tokens.

GPT-5.4 (<272K context)

Category: Flagship

Input: $2.5 | Cached: $0.25 | Output: $15

GPT-5.4 Mini

Category: Flagship

Input: $0.75 | Cached: $0.075 | Output: $4.5

GPT-5.4 Nano

Category: Flagship

Input: $0.2 | Cached: $0.02 | Output: $1.25

GPT-5.4 Pro

Category: Flagship

Input: $30 | Cached: N/A | Output: $180

GPT-5.2

Category: Flagship

Input: $1.75 | Cached: $0.175 | Output: $14

GPT-5.1

Category: Flagship

Input: $1.25 | Cached: $0.125 | Output: $10

GPT-5

Category: Flagship

Input: $1.25 | Cached: $0.125 | Output: $10

GPT-5 Mini

Category: Flagship

Input: $0.25 | Cached: $0.025 | Output: $2

o3

Category: Reasoning

Input: $2 | Cached: $0.5 | Output: $8

o3 Pro

Category: Reasoning

Input: $20 | Cached: N/A | Output: $80

o3 Mini

Category: Reasoning

Input: $1.1 | Cached: $0.55 | Output: $4.4

o4 Mini

Category: Reasoning

Input: $1.1 | Cached: $0.275 | Output: $4.4

o1

Category: Reasoning

Input: $15 | Cached: $7.5 | Output: $60

o1 Pro

Category: Reasoning

Input: $150 | Cached: N/A | Output: $600

o1 Mini

Category: Reasoning

Input: $1.1 | Cached: $0.55 | Output: $4.4

GPT-4o

Category: Previous Gen

Input: $2.5 | Cached: $1.25 | Output: $10

GPT-4o Mini

Category: Previous Gen

Input: $0.15 | Cached: $0.075 | Output: $0.6

GPT-4 Turbo

Category: Previous Gen

Input: $10 | Cached: N/A | Output: $30

GPT-3.5 Turbo

Category: Previous Gen

Input: $0.5 | Cached: N/A | Output: $1.5

o3 Deep Research

Category: Specialized

Input: $10 | Cached: $2.5 | Output: $40

o4 Mini Deep Research

Category: Specialized

Input: $2 | Cached: $0.5 | Output: $8

Computer Use Preview

Category: Specialized

Input: $3 | Cached: N/A | Output: $12

Understanding OpenAI Pricing

Regional Processing

Regional processing endpoints (used to meet data residency requirements) typically incur a 10% price uplift for flagship generation models such as GPT-5.4, GPT-5.4 Mini, and GPT-5.4 Pro.
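The uplift is a flat multiplier on the base rate; a small sketch, assuming the 10% figure above applies uniformly:

```python
REGIONAL_UPLIFT = 0.10  # 10% price uplift for regional endpoints, per the note above

def regional_rate(base_rate):
    """Apply the regional-processing uplift to a per-1M-token rate."""
    return base_rate * (1 + REGIONAL_UPLIFT)

# GPT-5.4 input: $2.50 → $2.75 per 1M tokens on a regional endpoint
```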

Reasoning Tokens

Models like o1, o3, and their Mini variants consume hidden "reasoning tokens" during execution. These are billed at standard output rates, so the cost of a response includes both the visible content and the model's internal reasoning steps.
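In practice this means the billable output is visible tokens plus reasoning tokens. A sketch using o3's $8 / 1M output rate from the table (the token counts are hypothetical):

```python
def o3_output_cost(visible_tokens, reasoning_tokens, output_rate=8.00):
    """Output cost for o3: hidden reasoning tokens are billed
    at the same per-1M-token output rate as visible tokens."""
    return (visible_tokens + reasoning_tokens) * output_rate / 1_000_000

# A 500-token answer backed by 4,000 reasoning tokens bills as 4,500 output tokens
cost = o3_output_cost(500, 4000)
```

This is why reasoning models can cost several times their nominal rate per visible token on hard problems.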

Web Search Preview

Connecting generation tasks to web search adds a flat per-call fee on top of standard token costs: $10.00 per 1,000 calls for reasoning models and $25.00 per 1,000 calls for non-reasoning models.
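The surcharge scales linearly with call volume; a minimal sketch using the two rates above (the model-class keys are hypothetical labels, not API values):

```python
# Web search preview fee, USD per 1,000 calls, per the rates above
SEARCH_RATE_PER_1K = {"reasoning": 10.00, "non_reasoning": 25.00}

def search_surcharge(calls, model_class):
    """Flat web-search fee, billed on top of normal token costs."""
    return calls * SEARCH_RATE_PER_1K[model_class] / 1000

# 200 searches with a reasoning model adds $2.00 before any token charges
```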

Frequently Asked Questions

How does context caching work for OpenAI?

For capable models like GPT-4o and GPT-5.4, OpenAI can reuse cached blocks of tokens when a prompt shares an identical prefix with a recent request, applying a steep discount to the input cost. For instance, GPT-5.4 offers a 90% discount on cached input: the rate drops from $2.50 to $0.25 per 1M tokens.
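The savings depend on how much of the prompt hits the cache. A sketch at GPT-5.4's rates, assuming the cached portion is a prefix of the prompt:

```python
def cached_prompt_cost(prompt_tokens, cached_prefix_tokens,
                       input_rate=2.50, cached_rate=0.25):
    """GPT-5.4 input cost (USD) when a prefix of the prompt hits the cache."""
    fresh = prompt_tokens - cached_prefix_tokens
    return (fresh * input_rate + cached_prefix_tokens * cached_rate) / 1_000_000

# A fully cached 100K-token prompt costs $0.025 instead of $0.25
```

For long, repeated system prompts or shared document context, this discount dominates the input bill.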

Are tool calls billed separately?

Tokens consumed to construct and process tool calls are billed alongside normal input and output tokens at the selected model's per-token rates. However, some specialized tools, such as File Search, also carry a discrete per-call fee (e.g., $2.50 per 1,000 calls).
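So a tool-using request has two components: normal token billing plus any discrete tool fees. A sketch using the File Search rate cited above (token counts and rates are illustrative):

```python
FILE_SEARCH_RATE_PER_1K = 2.50  # USD per 1,000 File Search tool calls

def request_cost_with_tools(input_tokens, output_tokens, tool_calls,
                            input_rate=2.50, output_rate=15.00):
    """Total cost: token billing at the model's rates plus the
    discrete per-call File Search fee."""
    token_cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    tool_fee = tool_calls * FILE_SEARCH_RATE_PER_1K / 1000
    return token_cost + tool_fee
```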

How are images and multimodal inputs priced?

Unlike plain text, multimodal inputs such as images or video frames are priced by tiling the image into a grid: each standard high-resolution tile costs roughly 170 tokens, plus a fixed base of 85 tokens per image.
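A sketch of that tiling arithmetic, assuming 512-pixel tiles (the tile dimension is an assumption; the page states only the per-tile and base token figures, and real billing also rescales oversized images first):

```python
import math

BASE_TOKENS = 85    # fixed per-image footprint, per the figure above
TILE_TOKENS = 170   # per high-res tile, per the figure above
TILE_SIZE = 512     # assumed tile dimension in pixels (not stated on the page)

def image_tokens(width, height):
    """Approximate billable tokens for one high-res image input."""
    tiles = math.ceil(width / TILE_SIZE) * math.ceil(height / TILE_SIZE)
    return BASE_TOKENS + tiles * TILE_TOKENS

# A 1024x1024 image → 4 tiles → 85 + 4 * 170 = 765 tokens
```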

What is the "Computer Use" model?

OpenAI offers a specialized computer-use-preview model, priced at $3 / 1M input tokens and $12 / 1M output tokens, that automates web actions and software interactions, guided autonomously by visual context from the on-screen interface.