Grok API Pricing Calculator

Estimate xAI API costs for Grok-4.20, including reasoning tokens, cached prompts, the Batch API discount, vision and audio usage, and server-side tool calls.

1. Choose Grok Endpoint

Input: $2 / 1M | Cached: $0.20 / 1M | Output: $6 / 1M

2. Select Value Unit

LLM Context Execution

Standard Input Base

Output & Reasoning

Cached Prompt Tokens (Hits)

Reused context costs $0.20 per 1M tokens rather than $2.

Batch API Discount (-50%)

Processes requests within 24 hours at half the standard rates. Does not apply to tool calls.
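As a rough illustration of the Batch rule above, the sketch below halves only the token portion of a bill and leaves tool charges untouched. The rates are the ones shown in this calculator; the function name `estimate_cost` and its parameters are hypothetical and not part of any xAI SDK.

```python
# Rates as listed in the calculator (USD per 1M tokens).
INPUT_RATE = 2.00
OUTPUT_RATE = 6.00
BATCH_DISCOUNT = 0.50  # 50% off token charges only


def estimate_cost(input_tokens, output_tokens, tool_cost=0.0, batch=False):
    """Estimate a bill in USD. Hypothetical helper, not an xAI API."""
    token_cost = (
        input_tokens / 1_000_000 * INPUT_RATE
        + output_tokens / 1_000_000 * OUTPUT_RATE
    )
    if batch:
        # The discount is applied to tokens; tool charges are added at full price.
        token_cost *= 1 - BATCH_DISCOUNT
    return token_cost + tool_cost


standard = estimate_cost(1_000_000, 1_000_000, tool_cost=5.0)
batched = estimate_cost(1_000_000, 1_000_000, tool_cost=5.0, batch=True)
print(f"standard: ${standard:.2f}, batch: ${batched:.2f}")
```

With 1M input and 1M output tokens plus $5 of tool calls, only the $8 token charge is halved in this model.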

Server-Side Tool Invocations

Enter the estimated number of times each server-side tool is called across your pipeline runs:

Web Search: $5 / 1k calls

X Search (Tweets): $5 / 1k calls

Code Execution (Sandboxed): $5 / 1k calls

File Attachments Access: $10 / 1k calls

Collections Search (RAG): $2.50 / 1k calls
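The per-call tool charges can be summed with a small lookup table. This is a minimal sketch using the prices listed above; the dictionary keys and the `tool_cost` function are illustrative names chosen here, not identifiers from the xAI API.

```python
# USD per 1,000 calls, as listed in the calculator.
# The key names are hypothetical labels for this example.
TOOL_RATES_PER_1K = {
    "web_search": 5.00,
    "x_search": 5.00,
    "code_execution": 5.00,
    "file_attachments": 10.00,
    "collections_search": 2.50,
}


def tool_cost(calls):
    """Return the flat tool charge in USD for a {tool_name: call_count} mapping."""
    total = 0.0
    for name, count in calls.items():
        if name not in TOOL_RATES_PER_1K:
            raise KeyError(f"unknown tool: {name}")
        total += TOOL_RATES_PER_1K[name] * count / 1000
    return total


# 200 web searches ($1.00) plus 1,000 collection searches ($2.50).
print(tool_cost({"web_search": 200, "collections_search": 1000}))  # 3.5
```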

API Estimation Matrix

Standard Base Query: $0.0027

Cached Reads Offset: $0.00

Generative Output + Reasoning: $0.004

Automated Tooling Integrations: $0.00

Total Pipeline Cost: $0.0067

Grok Infrastructure & Pricing Policies

Grok-4 Reasoning Tokens

When you run the "Reasoning" variants of Grok-4, the model's internal reasoning steps are not shown in the response, but they are counted as tokens and billed at the Output rate alongside the visible response.
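To make the effect concrete, the following sketch bills hidden reasoning tokens at the same $6 per 1M Output rate as the visible completion. The function and parameter names are assumptions for illustration; how a real response reports its reasoning token count is not covered here.

```python
OUTPUT_RATE = 6.00  # USD per 1M output tokens, from the calculator


def output_cost(visible_tokens, reasoning_tokens):
    """Bill visible and reasoning tokens together at the Output rate (USD)."""
    billable = visible_tokens + reasoning_tokens
    return billable / 1_000_000 * OUTPUT_RATE


# A 500-token answer preceded by 1,500 reasoning tokens is billed as 2,000 output tokens.
print(f"${output_cost(500, 1500):.4f}")
```

In this example the reasoning tokens account for three quarters of the output charge even though none of them appear in the response.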

Cached Execution Mechanics

xAI automatically caches repeated prompt content within your session; you do not need to set any cache parameters or headers on the request. On Grok-4.20, input tokens read from the cache are billed at $0.20 per 1M tokens instead of $2.
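A short sketch of the resulting input charge, assuming you already know how many of the prompt tokens were cache hits. The split between cached and uncached tokens is an input to this example; the helper name `input_cost` is hypothetical.

```python
INPUT_RATE = 2.00    # USD per 1M uncached input tokens
CACHED_RATE = 0.20   # USD per 1M cached input tokens


def input_cost(prompt_tokens, cached_tokens):
    """Input charge in USD when `cached_tokens` of the prompt were cache hits."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    return (uncached * INPUT_RATE + cached_tokens * CACHED_RATE) / 1_000_000


no_cache = input_cost(100_000, 0)
with_cache = input_cost(100_000, 80_000)
print(f"no cache: ${no_cache:.3f}, 80% cached: ${with_cache:.3f}")
```

For a 100,000-token prompt with 80,000 cached tokens, the input charge falls from $0.200 to $0.056 under these rates.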

Server-Side Tool Calls

Unlike OpenAI, which bills web lookups by raw tokens alone, xAI adds a flat per-call charge (e.g. $5 per 1,000 tool calls) whenever the model runs a Web Search or reads X posts (Twitter data). The retrieved content is also fed back into the model's context for that request.

Frequently Asked Questions

Does the Batch API discount apply to all models?

No. The 50% Batch discount applies only to standard text LLM requests. Image and video generation is billed at its regular rate regardless of whether it is queued.

What are Usage Guideline Violation fees?

If a request is rejected by xAI's safety filters before generation starts, it incurs a flat $0.05 fee, which replaces the normal token charges for that request.

How is Voice Agent API usage billed?

Real-time WebSocket sessions on the "Voice Agent API" (e.g. two-way speech with tool callbacks) are billed by connection time at $3.00 per hour, with the duration rounded up to the nearest minute.
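The per-minute arithmetic can be sketched as follows. This assumes each session is rounded up to the next whole minute on its own, which is one reading of the rule above; the function name `voice_cost` is hypothetical.

```python
import math

VOICE_RATE_PER_HOUR = 3.00  # USD, from the FAQ above


def voice_cost(session_seconds):
    """Cost in USD of one voice session, rounded up to whole minutes (assumed)."""
    if session_seconds < 0:
        raise ValueError("session_seconds must be non-negative")
    billed_minutes = math.ceil(session_seconds / 60)
    return billed_minutes * VOICE_RATE_PER_HOUR / 60


# A 90-second call is billed as 2 minutes at $0.05 per minute.
print(f"${voice_cost(90):.2f}")
```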
