AI API Cost Estimator | Apollo[Claw]

Estimated Monthly Cost

$84

Based on your usage below

Model Selection

Main Agent: Drives conversations, research, coding, cron

Sub-Agent: Spawned for delegated tasks

Heartbeat: Lightweight check-ins

Prices shown per 1M tokens · Input / Output

Daily Usage

Daily Conversations: 20

Back-and-forth chats your agent handles each day — customer questions, internal pings, anything that looks like a thread.

Range: 0 to 1000

Research Tasks: 3

Heavier jobs where the agent gathers, reads, and summarizes information — competitor scans, market briefs, prospect digs.

Range: 0 to 200

Coding Sessions: 2

Sessions where the agent writes, edits, or reviews code. Long context windows make these the most expensive runs.

Range: 0 to 150

Heartbeats/Day: 20

Tiny scheduled pings that keep the agent awake and watching for triggers. Small individually, big in aggregate.

Range: 0 to 2000

Cron Jobs: 5

Scheduled background runs — daily reports, nightly syncs, weekly cleanups. Predictable and easy to budget for.

Range: 0 to 500

Sub-agent Spawns: 3

Times the main agent delegates work to a specialist sub-agent. Use a cheaper model here without sacrificing main-agent quality.

Range: 0 to 300

Cost Breakdown

Current Setup

$84 /mo

Main Claude Sonnet 4.6 · Sub Claude Haiku 4.5 · Heartbeat Claude Sonnet 4.6

Optimized Setup

$77 /mo

Sub-agents on Flash, Heartbeats on Nano

Save $7.66/mo (9%)

Where the money goes (monthly)

Category        Monthly   Share
Conversations   $28.80    34%
Coding          $19.80    23%
Research        $16.20    19%
Cron            $10.13    12%
Heartbeats      $5.40     6%
Sub-agents      $4.05     5%
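
The figures in the breakdown can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary values and the $7.66 savings figure are copied from this page, while the names `breakdown`, `shares`, and `savings_pct` are hypothetical, and rounding to whole percents is an assumption about how the page displays its numbers.

```python
# Monthly cost per category in USD, copied from the breakdown above.
breakdown = {
    "Conversations": 28.80,
    "Coding": 19.80,
    "Research": 16.20,
    "Cron": 10.13,
    "Heartbeats": 5.40,
    "Sub-agents": 4.05,
}


def shares(costs):
    """Return each category's share of the total as a whole-number percent."""
    total_cost = sum(costs.values())
    return {name: round(100 * cost / total_cost) for name, cost in costs.items()}


# Total for the current setup (about $84.38, displayed as $84).
total = sum(breakdown.values())

# Savings quoted for the optimized setup.
savings = 7.66
optimized_total = total - savings          # about $76.72, displayed as $77
savings_pct = round(100 * savings / total)  # whole-percent savings

if __name__ == "__main__":
    print(f"Current:   ${total:.2f}/mo")
    print(f"Optimized: ${optimized_total:.2f}/mo (save {savings_pct}%)")
    for name, pct in shares(breakdown).items():
        print(f"{name:<14} {pct}%")
```

Run as a script, this prints the two totals and the per-category shares, which should line up with the rounded figures shown above.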

Glossary

Token: A chunk of text the model reads or writes. Roughly 4 characters or three-quarters of a word in English. “Hello, world!” is about 4 tokens.
Input Tokens: Everything you send the model: your message, the system prompt, conversation history, tool definitions, and any documents in context. Priced lower than output.
Output Tokens: Everything the model generates back. Usually 3–5× the price of input tokens because generation is more compute-intensive than reading.
MTok: Short for one million tokens. Pricing is shown per MTok; for example, an input price of “$3 / MTok” means $3 per million input tokens.
Prompt Caching: Storing the unchanging part of your prompt (system instructions, long documents) so the model skips re-processing it on every request. Depending on the provider, this can cut the input bill by up to ~90% on the cached portion.
Context Window: The maximum number of tokens a model can consider at one time. Bigger window = handles longer threads and larger documents, but more tokens billed per call.
Main Agent: The primary model your assistant uses for user-facing work — conversations, research, coding. Quality matters most here.
Sub-Agent: A secondary model the main agent delegates narrower tasks to. Usually a cheaper model since the work is bounded and well-defined.
Heartbeat: A lightweight scheduled ping that keeps an agent loop alive between user inputs. Small per-call, but adds up at scale.
Cron Job: A task that runs automatically on a schedule — hourly, daily, weekly. Named after the Unix cron utility.
Batch API: A discounted lane for non-realtime work. Submit a batch, get results within 24 hours, pay 50% less. Great for nightly jobs and back-office automation.
Sub-Agent Spawn: Each time the main agent kicks off a sub-agent to handle a specific task. One spawn = one delegated job.
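
Several of the glossary terms combine into a simple per-request cost formula. The following is a minimal sketch under stated assumptions, not the calculator's actual implementation: the `Pricing` class, the function names, and the example prices ($3 input and $15 output per MTok) are hypothetical, the 4-characters-per-token rule is only a rough estimate, and the caching model ignores any surcharge a provider may apply when first writing to the cache.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-MTok prices in USD, not any vendor's real price list."""
    input_per_mtok: float
    output_per_mtok: float


def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters per token rule of thumb."""
    return max(1, math.ceil(len(text) / 4))


def request_cost(
    pricing: Pricing,
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int = 0,
    cache_discount: float = 0.90,  # assumed ~90% off the cached portion
    batch_discount: float = 0.0,   # set to 0.50 to model a batch lane
) -> float:
    """Estimate the USD cost of one request.

    Simplification: cached input tokens are billed at (1 - cache_discount)
    of the normal input rate, and cache-write surcharges are ignored.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    billable_input = fresh_tokens + cached_tokens * (1 - cache_discount)
    input_cost = billable_input * pricing.input_per_mtok / 1_000_000
    output_cost = output_tokens * pricing.output_per_mtok / 1_000_000
    return (input_cost + output_cost) * (1 - batch_discount)


if __name__ == "__main__":
    example = Pricing(input_per_mtok=3.0, output_per_mtok=15.0)  # assumed prices
    print(estimate_tokens("Hello, world!"))
    print(request_cost(example, 10_000, 1_000))
    print(request_cost(example, 10_000, 1_000, cached_tokens=8_000))
    print(request_cost(example, 10_000, 1_000, batch_discount=0.50))
```

With these assumed prices, a request with 10,000 input tokens and 1,000 output tokens costs about $0.045. Caching 8,000 of the input tokens brings it to roughly $0.023, and the batch discount alone brings it to about $0.0225.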

Talk to Apollo Claw

Spending more than you'd like?

We've cut client AI bills 40-70% through smarter model routing, prompt caching, and right-sizing every agent for the job. Book a 30-minute discovery call and we'll show you exactly where the savings are.