LLM Cost Tracker Documentation
LLM Cost Tracker is a drop-in SDK and dashboard that answers "where is my AI bill coming from" in under 5 minutes. You wrap your existing LLM calls with one function, and immediately get per-group, per-user, and per-model cost attribution in a live dashboard. You can set monthly spend targets per group, and enforce hard spend limits by user or tier automatically in the SDK.
New here? Start with the Quickstart — you'll have data flowing in under 5 minutes.
How it works
LLM Cost Tracker has three capabilities:
- Observe — a live view of your LLM spend broken down by group, user, model, and time. Know which part of your product is driving the bill and why.
- Plan — set monthly spend targets per group. Track actual vs plan. Get alerted when something is drifting before the month ends.
- Enforce — set per-user or per-tier spend limits. The SDK checks and enforces them on every call. No homegrown counter required.
All three are powered by a single lightweight SDK wrapper — one import, one function call, zero change to your response handling.
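As a rough illustration of the wrapper idea, the sketch below wraps a call, records an event, checks a per-user limit, and hands the response back unchanged. This page does not show the SDK's real API, so `Tracker`, `track`, `cost_of`, `SpendLimitExceeded`, and the in-memory storage are all hypothetical stand-ins, not the actual interface.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


class SpendLimitExceeded(Exception):
    """Raised when a user's recorded spend has already reached their limit."""


@dataclass
class Tracker:
    """Hypothetical in-memory stand-in for the SDK (not its real API)."""
    user_limits_usd: Dict[str, float] = field(default_factory=dict)
    spend_usd: Dict[str, float] = field(default_factory=dict)
    events: List[Dict[str, Any]] = field(default_factory=list)

    def track(self, call: Callable[[], Any], *, group: str, user_id: str,
              cost_of: Callable[[Any], float]) -> Any:
        # Enforce: refuse the call if this user is already at or over the limit.
        limit = self.user_limits_usd.get(user_id)
        spent = self.spend_usd.get(user_id, 0.0)
        if limit is not None and spent >= limit:
            raise SpendLimitExceeded(
                f"{user_id} has spent {spent:.4f} of {limit:.4f} USD")

        # Observe: time the call and record what happened.
        start = time.perf_counter()
        response = call()
        latency_ms = (time.perf_counter() - start) * 1000.0
        cost_usd = cost_of(response)

        self.spend_usd[user_id] = spent + cost_usd
        self.events.append({
            "group": group,
            "user_id": user_id,
            "cost_usd": cost_usd,
            "latency_ms": latency_ms,
        })
        # The response is returned untouched, so calling code does not change.
        return response


def fake_llm_call() -> Dict[str, Any]:
    # Stand-in for a real provider call; the cost field is made up.
    return {"text": "hello", "cost_usd": 0.03}


tracker = Tracker(user_limits_usd={"user-1": 0.05})
reply = tracker.track(fake_llm_call, group="support-chat", user_id="user-1",
                      cost_of=lambda r: r["cost_usd"])
```

In this sketch the limit is checked before the call runs, so a user can overshoot by at most the cost of one call. Whether the real SDK checks before or after the call is not stated here.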
What it tracks
Every tracked call captures:
- Model used
- Input and output token counts
- Cost in USD (calculated automatically per model)
- Latency in milliseconds
- Group — what part of your app made the call. Use a feature name, team, workflow, client, or any string that makes sense for your product.
- User ID — which of your users triggered the call
- Tier — your pricing tier for this user, used for budget template enforcement
- Prompt version — label your prompt variants to compare cost before and after changes
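To make the field list concrete, here is a minimal sketch of one tracked event as a record, plus a helper that sums cost by any field, which is the kind of grouping the dashboard breakdowns imply. The class name, field names, and the `cost_by` helper are assumptions for illustration. The SDK's actual schema is not shown on this page.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class TrackedEvent:
    """Hypothetical shape of one tracked call; field names are assumed."""
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: Optional[float]  # None when the model's price is unknown
    latency_ms: float
    group: str
    user_id: str
    tier: str
    prompt_version: str


def cost_by(events: Iterable[TrackedEvent], field_name: str) -> Dict[str, float]:
    """Sum known costs per distinct value of the given field."""
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        if event.cost_usd is not None:  # skip events with unknown cost
            totals[getattr(event, field_name)] += event.cost_usd
    return dict(totals)


# Made-up sample data, not real usage or pricing.
events = [
    TrackedEvent("model-a", 1200, 300, 0.5, 840.0, "search", "u1", "free", "v1"),
    TrackedEvent("model-a", 900, 250, 0.25, 610.0, "search", "u2", "pro", "v2"),
    TrackedEvent("model-b", 400, 100, 0.125, 320.0, "summarize", "u1", "free", "v1"),
]

per_group = cost_by(events, "group")
per_user = cost_by(events, "user_id")
```

Passing `"prompt_version"` as the field name gives the before-and-after comparison described in the last bullet.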
Supported models
LLM Cost Tracker automatically calculates cost for the following models:
- Anthropic: claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5, and legacy Claude 4.x models
- OpenAI: gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-5-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-pro, o4-mini
- Google: gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
- xAI: grok-4.20, grok-4, grok-4-fast-reasoning, grok-4-fast-non-reasoning, grok-code-fast-1
- DeepSeek: deepseek-v4-flash, deepseek-v4-pro
- Meta: meta-llama/llama-4-scout, meta-llama/llama-4-maverick, meta-llama/llama-3.3-70b-instruct
- Perplexity: sonar, sonar-pro, sonar-reasoning-pro, sonar-deep-research
If a model string is not recognised, the event is still logged with token counts — cost will show as unknown. New models are added regularly. Pricing last verified 2026-05-15.
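The fallback for unrecognised models can be pictured as a price-table lookup that returns no cost when the model string is missing. The sketch below assumes separate input and output rates per million tokens, which is a common pricing shape but not confirmed here as the tracker's formula. The table contents, model name, and function name are placeholders, not real pricing or SDK API.

```python
from typing import Dict, Optional, Tuple

# Placeholder prices in USD per 1M tokens: (input, output). Not real pricing.
PRICES_PER_MILLION: Dict[str, Tuple[float, float]] = {
    "example-model-a": (2.0, 8.0),
}


def estimate_cost_usd(model: str, input_tokens: int,
                      output_tokens: int) -> Optional[float]:
    """Return the estimated cost, or None if the model is not in the table."""
    prices = PRICES_PER_MILLION.get(model)
    if prices is None:
        # Unknown model: token counts can still be logged, cost stays unknown.
        return None
    input_price, output_price = prices
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


known = estimate_cost_usd("example-model-a", 1_000_000, 500_000)
unknown = estimate_cost_usd("some-unlisted-model", 1_000, 200)
```

Returning `None` rather than `0.0` keeps unknown costs distinguishable from free calls when totals are computed later.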
Next steps
- Quickstart — get data flowing in 5 minutes
- SDK reference — full API documentation
- Dashboard guide — understanding your data
- Budget enforcement — set spend limits by user or tier