AI API Cost Estimator
Calculate your monthly AI agent costs with precision.
Based on your usage below
Model Selection
Prices shown per 1M tokens · Input / Output
Daily Usage
Conversations: Back-and-forth chats your agent handles each day, such as customer questions, internal pings, and anything else that looks like a thread.
Research: Heavier jobs where the agent gathers, reads, and summarizes information, such as competitor scans, market briefs, and prospect digs.
Coding: Sessions where the agent writes, edits, or reviews code. Long contexts make these the most expensive runs.
Heartbeats: Tiny scheduled pings that keep the agent awake and watching for triggers. Small individually, big in aggregate.
Cron: Scheduled background runs, such as daily reports, nightly syncs, and weekly cleanups. Predictable and easy to budget for.
Sub-agents: Times the main agent delegates work to a specialist sub-agent. Use a cheaper model here without sacrificing main-agent quality.
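As a rough sketch of how daily counts like these turn into a monthly figure: multiply runs per day by tokens per run, price the input and output tokens per 1M, and scale by the days in a month. Every number below (token sizes per run, the $3 / $15 prices, the 30-day month) is a hypothetical placeholder, not this estimator's actual defaults or any provider's real rates.

```python
from dataclasses import dataclass

# All values here are illustrative assumptions, not real provider rates
# or the estimator's defaults.
DAYS_PER_MONTH = 30


@dataclass
class Activity:
    name: str
    runs_per_day: int
    input_tokens: int   # assumed average input tokens per run
    output_tokens: int  # assumed average output tokens per run


def monthly_cost(activity, input_price, output_price):
    """Monthly USD cost of one activity; prices are USD per 1M tokens."""
    per_run = (activity.input_tokens * input_price
               + activity.output_tokens * output_price) / 1_000_000
    return per_run * activity.runs_per_day * DAYS_PER_MONTH


activities = [
    Activity("Conversations", runs_per_day=20, input_tokens=4_000, output_tokens=500),
    Activity("Heartbeats", runs_per_day=48, input_tokens=500, output_tokens=50),
]

for a in activities:
    print(f"{a.name}: ${monthly_cost(a, input_price=3.0, output_price=15.0):.2f}")
```

Swapping in a cheaper price pair for one activity (for example, sub-agent runs) is how the per-category model choice would show up in a calculation like this.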
Cost Breakdown
- Conversations: $28.80 (34%)
- Coding: $19.80 (23%)
- Research: $16.20 (19%)
- Cron: $10.13 (12%)
- Heartbeats: $5.40 (6%)
- Sub-agents: $4.05 (5%)
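Each percentage in the breakdown is that category's share of the total. A minimal sketch using the dollar figures shown above; the `shares` helper is a made-up name for illustration, and rounding to whole percents is an assumption about how the page displays them.

```python
# Dollar amounts copied from the cost breakdown above.
breakdown = {
    "Conversations": 28.80,
    "Coding": 19.80,
    "Research": 16.20,
    "Cron": 10.13,
    "Heartbeats": 5.40,
    "Sub-agents": 4.05,
}


def shares(costs):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(costs.values())
    return {name: round(100 * cost / total) for name, cost in costs.items()}


total = round(sum(breakdown.values()), 2)
percent = shares(breakdown)

for name, cost in breakdown.items():
    print(f"{name}: ${cost:.2f} ({percent[name]}%)")
print(f"Total: ${total:.2f}")
```

Because each share is rounded independently, the displayed percentages need not add up to exactly 100; with these figures they sum to 99.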
Glossary
- Token
- A chunk of text the model reads or writes. Roughly 4 characters or three-quarters of a word in English. “Hello, world!” is about 4 tokens.
- Input Tokens
- Everything you send the model: your message, the system prompt, conversation history, tool definitions, and any documents in context. Priced lower than output.
- Output Tokens
- Everything the model generates back. Usually 3–5× the price of input tokens because generation is more compute-intensive than reading.
- MTok
- Short for one million tokens. Pricing is shown per MTok; for example, “$3 / MTok” means $3 per million tokens.
- Prompt Caching
- Storing the unchanging part of your prompt (system instructions, long documents) so the model skips re-processing it on every request. Can cut the input bill by up to ~90% on the cached portion, depending on the provider; some providers also charge a premium to write to the cache.
- Context Window
- The maximum number of tokens a model can consider at one time. A bigger window handles longer threads and larger documents, but the more of it you fill, the more tokens are billed per call.
- Main Agent
- The primary model your assistant uses for user-facing work — conversations, research, coding. Quality matters most here.
- Sub-Agent
- A secondary model the main agent delegates narrower tasks to. Usually a cheaper model since the work is bounded and well-defined.
- Heartbeat
- A lightweight scheduled ping that keeps an agent loop alive between user inputs. Small per-call, but adds up at scale.
- Cron Job
- A task that runs automatically on a schedule — hourly, daily, weekly. Named after the Unix cron utility.
- Batch API
- A discounted lane for non-realtime work. Submit a batch, get results within about 24 hours, and pay roughly 50% less, depending on the provider. Great for nightly jobs and back-office automation.
- Sub-Agent Spawn
- Each time the main agent kicks off a sub-agent to handle a specific task. One spawn = one delegated job.
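Several of the glossary terms combine into one back-of-envelope calculation: estimate tokens from characters, price input and output per MTok, discount the cached share of the input, and apply the batch discount when the job is not realtime. The sketch below assumes the rules of thumb stated in the glossary (about 4 characters per token, ~90% off cached input, 50% off for batch). The function names and the $3 / $15 prices are hypothetical, cache-write surcharges are ignored, and whether cache and batch discounts stack varies by provider.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rule of thumb from the glossary: roughly 4 characters per token."""
    return math.ceil(len(text) / 4)


def request_cost(input_tokens, output_tokens, input_price, output_price,
                 cached_tokens=0, cache_discount=0.90,
                 batch=False, batch_discount=0.50):
    """USD cost of one request; prices are USD per MTok (1M tokens).

    Discount rates are assumptions taken from the glossary, not
    guaranteed provider rates.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    # Cached input is billed at a reduced rate; fresh input at full price.
    billable_input = fresh_tokens + cached_tokens * (1 - cache_discount)
    cost = (billable_input * input_price
            + output_tokens * output_price) / 1_000_000
    if batch:
        # Assumes the batch discount applies to the whole request.
        cost *= 1 - batch_discount
    return cost


print(estimate_tokens("Hello, world!"))
print(f"{request_cost(10_000, 1_000, 3.0, 15.0):.4f}")
print(f"{request_cost(10_000, 1_000, 3.0, 15.0, cached_tokens=8_000):.4f}")
print(f"{request_cost(10_000, 1_000, 3.0, 15.0, cached_tokens=8_000, batch=True):.4f}")
```

For exact counts, use the provider's own tokenizer or the usage figures returned with each API response rather than the character-based estimate.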
Estimates only. Real costs depend on your actual token usage and provider rates.