Observe

Your AI invoice is one line item.
Your dashboard shouldn't be.

Every LLM call tagged by group, user, and model. See exactly which part of your product is driving your bill — in real time, with zero infrastructure to manage.

Real-time data
No proxy
Zero latency added
What you can see

Every dimension that matters.

Tag your calls once. The dashboard breaks down spend across every axis automatically.

Cost by group
Tag calls with any string — a feature name, team, workflow, or client. The dashboard shows total cost, avg cost per call, avg tokens, and latency for every group.
Cost by user
Rank every user by total spend. The top 3% often drive 35% of your bill. Click any user to see their full call history, cost by group, and spend over time.
Cost by model
See which models are running and what they cost. Compare GPT-4o vs Claude Sonnet side-by-side — tokens, latency, and dollars in one view.
Wasted spend
Calls that returned zero output tokens — errors, timeouts, empty responses. Flagged automatically on every call log row. Know your real effective cost.
Deploy deltas
Tag your deploys with a promptVersion string. See exactly how each prompt change affected cost per call, token usage, and latency — before and after, side by side.
Latency tracking
Wall-clock milliseconds from call initiation to response completion, captured on every event. See average latency per group and spot slowdowns tied to model or prompt changes.
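
As a rough illustration of the breakdowns described above, per-group totals, averages, and wasted spend could be derived from raw call events along these lines. This is a minimal sketch: the `CallEvent` shape, its field names, and `summarizeByGroup` are hypothetical and are not the dashboard's actual schema or code.

```typescript
// Hypothetical event shape; field names are assumptions for illustration only.
interface CallEvent {
  group: string;
  costUsd: number;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface GroupSummary {
  calls: number;
  totalCostUsd: number;
  avgCostUsd: number;
  avgTokens: number;
  avgLatencyMs: number;
  wastedCostUsd: number; // spend on calls that returned zero output tokens
}

function summarizeByGroup(events: CallEvent[]): Map<string, GroupSummary> {
  // First pass: accumulate running sums per group.
  const sums = new Map<
    string,
    { calls: number; cost: number; tokens: number; latency: number; wasted: number }
  >();
  for (const e of events) {
    const s = sums.get(e.group) ?? { calls: 0, cost: 0, tokens: 0, latency: 0, wasted: 0 };
    s.calls += 1;
    s.cost += e.costUsd;
    s.tokens += e.inputTokens + e.outputTokens;
    s.latency += e.latencyMs;
    if (e.outputTokens === 0) s.wasted += e.costUsd; // error, timeout, or empty response
    sums.set(e.group, s);
  }

  // Second pass: turn sums into averages.
  const summary = new Map<string, GroupSummary>();
  sums.forEach((s, group) => {
    summary.set(group, {
      calls: s.calls,
      totalCostUsd: s.cost,
      avgCostUsd: s.cost / s.calls,
      avgTokens: s.tokens / s.calls,
      avgLatencyMs: s.latency / s.calls,
      wastedCostUsd: s.wasted,
    });
  });
  return summary;
}
```

Subtracting `wastedCostUsd` from `totalCostUsd` gives the spend that actually produced output, which is one way to read "real effective cost".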
How it works

One import.
Everything changes.

The SDK wraps your existing LLM client. Your API calls go directly to the provider — no proxy, no extra hop. After the call resolves, the SDK reads token counts and metadata from the response object and posts the event asynchronously. Zero latency added. Zero new failure modes.
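
To make the flow above concrete, here is a minimal sketch of a wrapper that calls the provider directly, times the call, reads token counts from the response, and reports the event without blocking the caller. It is not the SDK's implementation: `trackedCallSketch`, `TrackedEvent`, and the `report` callback are hypothetical names, and the `usage.input_tokens` / `usage.output_tokens` fields assume an Anthropic-style response object.

```typescript
// Assumed response shape (Anthropic-style usage fields).
interface ProviderResponse {
  model: string;
  usage: { input_tokens: number; output_tokens: number };
}

// Hypothetical event payload, for illustration only.
interface TrackedEvent {
  group: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

type Reporter = (event: TrackedEvent) => Promise<void>;

async function trackedCallSketch<R extends ProviderResponse>(
  call: () => Promise<R>, // the unchanged provider call; no proxy in between
  group: string,
  report: Reporter,
): Promise<R> {
  const start = Date.now();
  const res = await call();
  const latencyMs = Date.now() - start;

  const event: TrackedEvent = {
    group,
    model: res.model,
    inputTokens: res.usage.input_tokens,
    outputTokens: res.usage.output_tokens,
    latencyMs,
  };

  // Fire and forget: the report is not awaited, and any reporting error
  // (sync or async) is swallowed so tracking cannot fail the caller.
  Promise.resolve()
    .then(() => report(event))
    .catch(() => {});

  return res; // same object the provider returned
}
```

Because the report is scheduled only after the provider call has resolved and is never awaited, a slow or failing tracking endpoint in this sketch delays nothing and throws nothing on the caller's side.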

1
Install the SDK
npm install @llmcosttracker/sdk — no other dependencies.
2
Wrap your call
Replace your existing call with trackedCall(). Same return value, nothing else changes.
3
Tag with a group
Pass group: "search" — or any string. Your first insight appears within seconds.
search.ts
// Before — no visibility
const res = await anthropic.messages.create(params)

// After — full cost attribution
import { trackedCall } from '@llmcosttracker/sdk'

const res = await trackedCall({
  client: anthropic,
  group: 'search',
  userId: session.userId,
  apiKey: process.env.LLMCOSTTRACKER_API_KEY,
  params,
})

// res is identical — nothing else changes
Real origin story

"Within 10 minutes of turning it on, we spotted a 5× cost variance between two searches."

DGA search
$0.0123
3,700 tokens
Teamsters search
$0.0608
18,500 tokens
Brian
Founder, contractclues.com
The insight that started this

You can't fix what you can't see.

contractclues.com was a RAG-based contract reference tool for film/TV production. The Anthropic bill kept climbing. Without per-call attribution, there was no way to know why.

Ten minutes after adding cost tracking, the answer was obvious: one search was pulling 18,500 input tokens against another's 3,700. Same search feature, 5× the cost. A context window issue that would have been invisible forever without per-call data.

Find your variance →
Supported providers

Works with every major LLM provider.

One SDK. Automatic cost calculation for every model, every provider.

Anthropic
Claude Opus, Sonnet, Haiku
OpenAI
GPT-4o, GPT-4 Turbo, o3, o4-mini
Google
Gemini 2.5 Pro, Flash, Flash-Lite
xAI
Grok 4, Grok Code
DeepSeek
DeepSeek V4 Flash, Pro
Meta
Llama 4 Scout, Maverick
Perplexity
Sonar, Sonar Pro, Deep Research
OpenRouter
Any OpenAI-compatible endpoint

Unknown models are still logged with token counts — cost shows as unknown.
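
One way to picture this behavior: cost is a lookup against a per-model price table, and a model missing from the table yields no cost while its token counts are kept. The sketch below is an assumption about how such a lookup could work, not the SDK's code; `PRICES`, `costUsd`, and the `example-model` rates are placeholders, not real provider pricing.

```typescript
interface Price {
  inputPerMTok: number; // USD per million input tokens
  outputPerMTok: number; // USD per million output tokens
}

// Placeholder rates for illustration only; not real provider pricing.
const PRICES = new Map<string, Price>([
  ["example-model", { inputPerMTok: 2, outputPerMTok: 8 }],
]);

// Returns the cost in USD, or null when the model has no known price.
function costUsd(model: string, inputTokens: number, outputTokens: number): number | null {
  const price = PRICES.get(model);
  if (!price) return null; // unknown model: tokens are still logged, cost stays unknown
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1e6;
}
```

Returning `null` rather than `0` keeps an unpriced call distinguishable from a free one when totals are summed later.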

See your first insight in 5 minutes.

One import. No infrastructure. Your LLM costs broken down by group, user, and model before your next standup.

Start free →
View quickstart
© 2026 LLM COST TRACKER
hello@llmcosttracker.com