Introduction

LLM Cost Tracker Documentation

LLM Cost Tracker is a drop-in SDK and dashboard that answers "where is my AI bill coming from?" in under 5 minutes. Wrap your existing LLM calls with one function and you immediately get per-group, per-user, and per-model cost attribution in a live dashboard. You can also set monthly spend targets per group and have the SDK automatically enforce hard spend limits per user or per tier.

New here? Start with the Quickstart — you'll have data flowing in under 5 minutes.

How it works

LLM Cost Tracker has three capabilities:

  • Observe — a live view of your LLM spend broken down by group, user, model, and time. Know which part of your product is driving the bill and why.
  • Plan — set monthly spend targets per group. Track actual vs plan. Get alerted when something is drifting before the month ends.
  • Enforce — set per-user or per-tier spend limits. The SDK checks and enforces them on every call. No homegrown counter required.
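As a rough picture of what the Enforce capability replaces, the sketch below shows a per-user limit check keyed by tier. It is a minimal illustration under stated assumptions, not the SDK's implementation: `SpendLimiter`, `SpendLimitExceeded`, the in-memory counter, and the limit values are all hypothetical names and choices made for this example.

```python
from collections import defaultdict


class SpendLimitExceeded(Exception):
    """Hypothetical error raised when a user has used up their limit."""


class SpendLimiter:
    """Illustrative in-memory spend limiter; not the SDK's API."""

    def __init__(self, limits_usd_by_tier):
        # Example shape: {"free": 1.0, "pro": 25.0}. Tiers without an entry are unlimited.
        self.limits = dict(limits_usd_by_tier)
        self.spent = defaultdict(float)  # user_id -> USD spent so far

    def check(self, user_id, tier):
        """Raise before a call if the user has already reached their tier limit."""
        limit = self.limits.get(tier)
        if limit is not None and self.spent[user_id] >= limit:
            raise SpendLimitExceeded(
                f"user {user_id!r} reached the {tier!r} limit of ${limit:.2f}"
            )

    def record(self, user_id, cost_usd):
        """Add the cost of a completed call to the user's running total."""
        self.spent[user_id] += cost_usd
```

In this sketch you would call `check` before each LLM call and `record` after it. A real deployment would also need shared storage and a monthly reset, which the example leaves out.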

All three are powered by a single lightweight SDK wrapper: one import, one function call, and no changes to your response handling.
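To make the wrapper idea concrete, here is a minimal sketch of the pattern: a function that runs your existing call, records metadata on the side, and hands back the provider's response untouched. The name `track`, its parameters, and the list-based `sink` are assumptions made for illustration; see the Quickstart for the SDK's real import and function signature.

```python
import time


def track(call, *, group, user_id=None, tier=None, sink=None):
    """Run `call()` and return its result unchanged, recording metadata.

    Hypothetical helper for illustration only; not the SDK's actual API.
    """
    start = time.perf_counter()
    response = call()  # your existing LLM call, untouched
    latency_ms = (time.perf_counter() - start) * 1000
    if sink is not None:
        # A real SDK would send this to the dashboard; here we append to a list.
        sink.append(
            {
                "group": group,
                "user_id": user_id,
                "tier": tier,
                "latency_ms": latency_ms,
            }
        )
    return response
```

Because the response object is returned as-is, code that reads it afterwards does not need to change. For example, `reply = track(lambda: client_call(prompt), group="support-chat", user_id="u_123", sink=events)` leaves `reply` exactly as `client_call` produced it, where `client_call` and `events` stand in for your own function and list.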

What it tracks

Every tracked call captures:

  • Model used
  • Input and output token counts
  • Cost in USD (calculated automatically per model)
  • Latency in milliseconds
  • Group — what part of your app made the call. Use a feature name, team, workflow, client, or any string that makes sense for your product.
  • User ID — which of your users triggered the call
  • Tier — your pricing tier for this user, used for budget template enforcement
  • Prompt version — label your prompt variants to compare cost before and after changes
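One way to picture a tracked call is as a flat record with one field per item above. The sketch below is an assumed shape, not the SDK's actual schema: the class name `TrackedCall`, the field names, and the helper `total_cost_by` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackedCall:
    """Illustrative event record; field names are assumptions."""

    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: Optional[float]  # None when the model's pricing is unknown
    latency_ms: float
    group: str
    user_id: Optional[str] = None
    tier: Optional[str] = None
    prompt_version: Optional[str] = None


def total_cost_by(events, field):
    """Sum known costs per value of `field`, e.g. "group" or "prompt_version"."""
    totals = defaultdict(float)
    for event in events:
        if event.cost_usd is not None:  # skip events with unknown cost
            totals[getattr(event, field)] += event.cost_usd
    return dict(totals)
```

Grouping the same records by `prompt_version` instead of `group` gives the before-and-after cost comparison that the prompt version label is meant for.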

Supported models

LLM Cost Tracker automatically calculates cost for the following models:

  • Anthropic: claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5, and legacy Claude 4.x models
  • OpenAI: gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-5-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o3-pro, o4-mini
  • Google: gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
  • xAI: grok-4.20, grok-4, grok-4-fast-reasoning, grok-4-fast-non-reasoning, grok-code-fast-1
  • DeepSeek: deepseek-v4-flash, deepseek-v4-pro
  • Meta: meta-llama/llama-4-scout, meta-llama/llama-4-maverick, meta-llama/llama-3.3-70b-instruct
  • Perplexity: sonar, sonar-pro, sonar-reasoning-pro, sonar-deep-research

If a model string is not recognised, the event is still logged with its token counts, but its cost is shown as unknown. New models are added regularly. Pricing last verified 2026-05-15.
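The behaviour described above can be sketched as a price-table lookup. This example assumes cost is computed from separate input and output rates per million tokens, which is a common pricing scheme but is not confirmed here as the tracker's exact method. The model names and prices in `PRICES_PER_MILLION` are placeholders, not real rates, and `cost_usd` is a hypothetical function name.

```python
from typing import Optional

# Placeholder prices in USD per 1M tokens as (input, output). Not real rates.
PRICES_PER_MILLION = {
    "example-model-small": (1.00, 4.00),
    "example-model-large": (5.00, 20.00),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the call's cost in USD, or None if the model is not in the table."""
    prices = PRICES_PER_MILLION.get(model)
    if prices is None:
        return None  # unrecognised model: token counts are kept, cost is unknown
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

Returning `None` instead of `0.0` for an unrecognised model keeps unknown costs distinguishable from free calls when totals are computed later.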


Next steps