Sonusahani.com

Grok API Pricing Calculator

Estimate xAI API costs for Grok-4.20, covering reasoning tokens, Batch API discounts, vision and audio usage, and server-side tool calls.

Input: $2 / 1M | Cached: $0.20 / 1M | Output: $6 / 1M

LLM Context Execution
Reused (cached) context costs $0.20 per 1M tokens rather than $2.
Server-Side Tool Invocations

Enter the estimated total number of times your agent calls each server-side tool across your pipeline runs:

Web Search: $5/1k
X Search (Tweets): $5/1k
Code Execution (Sandboxed): $5/1k
File Attachments Access: $10/1k
Collections Search (RAG): $2.50/1k

API Estimation Matrix

Standard Base Query: $0.0027
Cached Reads Offset: $0.00
Generative Output + Reasoning: $0.004
Automated Tooling Integrations: $0.00
Total Pipeline Cost: $0.0067
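As a rough sketch of how the line items above could add up, the snippet below applies the per-1M-token rates from the top of the page. The token counts (1,350 uncached input tokens, 667 output tokens) are hypothetical values chosen only because they reproduce the sample figures; the function name and the dictionary keys are likewise illustrative, not part of any xAI API.

```python
# Rates from the page header, in USD per 1M tokens.
INPUT_PER_M = 2.00
CACHED_PER_M = 0.20
OUTPUT_PER_M = 6.00


def estimate(uncached_input_tokens, cached_tokens, output_tokens, tool_cost=0.0):
    """Return the four line items and their total, in USD."""
    base = uncached_input_tokens * INPUT_PER_M / 1_000_000
    cached = cached_tokens * CACHED_PER_M / 1_000_000
    output = output_tokens * OUTPUT_PER_M / 1_000_000
    total = base + cached + output + tool_cost
    return {
        "base": base,
        "cached": cached,
        "output": output,
        "tools": tool_cost,
        "total": total,
    }


# Hypothetical token counts chosen to match the sample figures above.
matrix = estimate(uncached_input_tokens=1_350, cached_tokens=0, output_tokens=667)
for name, usd in matrix.items():
    print(f"{name:>7}: ${usd:.4f}")
```

Output tokens here are assumed to include any reasoning tokens, since both are billed at the output rate.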

Grok Infrastructure & Pricing Policies

Grok-4 Reasoning Tokens

When you run a "Reasoning" variant of Grok-4, the model's internal reasoning steps are not shown in the response, but the tokens they consume are counted and billed at the output rate, on top of the tokens in the final visible response.
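To make the billing effect concrete, here is a minimal sketch that treats hidden reasoning tokens exactly like visible output tokens. It assumes the $6 per 1M output rate from the top of the page; the function name and the token counts are made up for illustration.

```python
OUTPUT_PER_M = 6.00  # USD per 1M output tokens (from the page header)


def output_cost(visible_tokens: int, reasoning_tokens: int) -> float:
    """Bill hidden reasoning tokens at the same rate as visible output."""
    billable = visible_tokens + reasoning_tokens
    return billable * OUTPUT_PER_M / 1_000_000


# Hypothetical request: 500 visible tokens preceded by 2,000 reasoning tokens.
with_reasoning = output_cost(visible_tokens=500, reasoning_tokens=2_000)
without_reasoning = output_cost(visible_tokens=500, reasoning_tokens=0)
print(f"with reasoning:    ${with_reasoning:.4f}")
print(f"without reasoning: ${without_reasoning:.4f}")
```

In this made-up case the reasoning tokens account for most of the output charge even though none of them appear in the response.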

Cached Execution Mechanics

xAI caches repeated prompt content automatically; you do not need to set any cache parameters or headers on the request. On Grok-4.20, input tokens read from the cache are billed at $0.20 per 1M instead of $2 per 1M.
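The effect of caching on the input bill can be sketched as a split between cached and uncached prompt tokens. The rates are the ones quoted above; the function name, the 100,000-token prompt, and the 90% cache-hit share are assumptions for illustration only.

```python
INPUT_PER_M = 2.00   # USD per 1M uncached input tokens
CACHED_PER_M = 0.20  # USD per 1M cached input tokens


def input_cost(prompt_tokens: int, cached_tokens: int) -> float:
    """Price a prompt where `cached_tokens` of its tokens were cache hits."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    return (uncached * INPUT_PER_M + cached_tokens * CACHED_PER_M) / 1_000_000


# Hypothetical 100,000-token prompt, first with no cache hits, then with 90%.
cold = input_cost(100_000, 0)
warm = input_cost(100_000, 90_000)
print(f"cold: ${cold:.3f}  warm: ${warm:.3f}")
```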

Server-Side Tool Calls

Server-side tools are not billed by tokens alone. Running a Web Search or an X Search (posts on X, formerly Twitter) adds a flat per-call charge (e.g. $5 per 1,000 tool calls), and the retrieved content is also fed back into the model's context for the rest of the query loop.
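A minimal sketch of the flat per-call charge, using the per-1,000-call rates listed in the calculator. The dictionary keys and the call counts are hypothetical labels, not xAI tool identifiers, and the sketch leaves out the token cost of any content the tools return.

```python
# USD per 1,000 calls, as listed in the calculator.
TOOL_RATE_PER_1K = {
    "web_search": 5.00,
    "x_search": 5.00,
    "code_execution": 5.00,
    "file_attachments": 10.00,
    "collections_search": 2.50,
}


def tool_cost(calls: dict) -> float:
    """Sum the flat per-call charges for a mapping of tool name -> call count."""
    total = 0.0
    for tool, count in calls.items():
        total += count * TOOL_RATE_PER_1K[tool] / 1_000
    return total


# Hypothetical run: 200 web searches and 1,000 collections searches.
cost = tool_cost({"web_search": 200, "collections_search": 1_000})
print(f"${cost:.2f}")
```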

Frequently Asked Questions

Does the Batch API discount apply to all models?

No. The 50% Batch discount applies only to standard text LLM requests. Image and video generation is billed at its regular rate, whether or not the request goes through the batch queue.
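The rule can be sketched as a simple conditional. The `kind` labels ("text", "image", "video") and the function name are hypothetical, introduced only to illustrate which requests receive the 50% markdown.

```python
BATCH_DISCOUNT = 0.50  # 50% off, text LLM requests only


def apply_batch(cost_usd: float, kind: str, batched: bool) -> float:
    """Halve the cost for batched text requests; leave everything else as-is."""
    if batched and kind == "text":
        return cost_usd * (1 - BATCH_DISCOUNT)
    return cost_usd


print(apply_batch(10.0, "text", batched=True))    # discounted
print(apply_batch(10.0, "image", batched=True))   # not discounted
print(apply_batch(10.0, "text", batched=False))   # not batched, no discount
```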

What are Usage Guideline Violation fees?

If a request is blocked by xAI's pre-generation safety filters for violating the usage guidelines, a flat $0.05 fee is charged for that request, independent of its token count.

How is Voice Agent API usage billed?

For real-time WebSocket connections streaming to the "Voice Agent API" (e.g. two-way speech with tool callbacks), billing is based on total connection time, rounded up to the nearest minute, at $3.00 per hour ($0.05 per minute).
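As a sketch of the time-based billing described above, the snippet below rounds a session's duration up to whole minutes and prices it at the hourly rate. Rounding up per session is an assumption about how "up to the nearest minute" is applied; the function name and the durations are illustrative.

```python
import math

VOICE_PER_HOUR = 3.00  # USD per hour of connection time


def voice_cost(session_seconds: float) -> float:
    """Round the session up to whole minutes (assumed), then bill pro rata."""
    billable_minutes = math.ceil(session_seconds / 60)
    return billable_minutes * VOICE_PER_HOUR / 60


# Hypothetical sessions: 61 seconds bills as 2 minutes, 10 minutes as 10.
short = voice_cost(61)
longer = voice_cost(600)
print(f"${short:.2f}  ${longer:.2f}")
```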