Observe

Your AI invoice is one line item.
Your dashboard shouldn't be.

Every LLM call tagged by group, user, and model. See exactly which part of your product is driving your bill — in real time, with zero infrastructure to manage.

Start free →

View quickstart

Real-time data

No proxy

Zero latency added

What you can see

Every dimension that matters.

Tag your calls once. The dashboard breaks down spend across every axis automatically.

Cost by group

Tag calls with any string — a feature name, team, workflow, or client. The dashboard shows total cost, avg cost per call, avg tokens, and latency for every group.
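
To make the per-group numbers concrete, here is a minimal sketch of that roll-up. The `CallEvent` shape and the `summarizeByGroup` helper are hypothetical, chosen for illustration; they are not the SDK's schema or the dashboard's actual query.

```typescript
// Hypothetical event shape: field names are assumptions, not the SDK's schema.
interface CallEvent {
  group: string;
  costUsd: number;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface GroupSummary {
  calls: number;
  totalCostUsd: number;
  avgCostUsd: number;
  avgTokens: number;
  avgLatencyMs: number;
}

function summarizeByGroup(events: CallEvent[]): Map<string, GroupSummary> {
  // Accumulate raw sums per group first, then divide once at the end.
  const sums = new Map<string, { calls: number; cost: number; tokens: number; latency: number }>();
  for (const e of events) {
    const s = sums.get(e.group) ?? { calls: 0, cost: 0, tokens: 0, latency: 0 };
    s.calls += 1;
    s.cost += e.costUsd;
    s.tokens += e.inputTokens + e.outputTokens;
    s.latency += e.latencyMs;
    sums.set(e.group, s);
  }

  const summaries = new Map<string, GroupSummary>();
  sums.forEach((s, group) => {
    summaries.set(group, {
      calls: s.calls,
      totalCostUsd: s.cost,
      avgCostUsd: s.cost / s.calls,
      avgTokens: s.tokens / s.calls,
      avgLatencyMs: s.latency / s.calls,
    });
  });
  return summaries;
}
```

Because the group is just a string key, the same roll-up works whether you tag by feature, team, workflow, or client.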

Cost by user

Rank every user by total spend. The top 3% often drive 35% of your bill. Click any user to see their full call history, cost by group, and spend over time.

Cost by model

See which models are running and what they cost. Compare GPT-4o vs Claude Sonnet side by side: tokens, latency, and dollars in one view.

Wasted spend

Calls that returned zero output tokens — errors, timeouts, empty responses. Flagged automatically on every call log row. Know your real effective cost.
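
One way to read "effective cost" is total spend divided by the calls that actually produced output. The sketch below assumes that definition; the `LoggedCall` shape and `wastedSpend` helper are hypothetical names used for illustration, not part of the SDK.

```typescript
// Hypothetical log row: only the two fields this calculation needs.
interface LoggedCall {
  costUsd: number;
  outputTokens: number;
}

interface WasteReport {
  totalUsd: number;
  wastedUsd: number;
  // Total spend divided by calls that returned output; null if none did.
  effectiveCostPerCallUsd: number | null;
}

function wastedSpend(calls: LoggedCall[]): WasteReport {
  let totalUsd = 0;
  let wastedUsd = 0;
  let productiveCalls = 0;

  for (const call of calls) {
    totalUsd += call.costUsd;
    if (call.outputTokens === 0) {
      // Zero output tokens: the call was billed but produced nothing usable.
      wastedUsd += call.costUsd;
    } else {
      productiveCalls += 1;
    }
  }

  return {
    totalUsd,
    wastedUsd,
    effectiveCostPerCallUsd: productiveCalls > 0 ? totalUsd / productiveCalls : null,
  };
}
```

Under this definition, three calls costing $0.02, $0.01, and $0.03, where the middle one returned nothing, average $0.02 per call on paper but cost $0.03 per useful call.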

Deploy deltas

Tag your deploys with a promptVersion string. See exactly how each prompt change affected cost per call, token usage, and latency — before and after, side by side.
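
A before-and-after comparison like this comes down to averaging each metric per version and taking the percent change. The sketch below assumes a hypothetical `VersionedCall` shape; only the `promptVersion` name comes from this page, and how the SDK accepts it is not shown here.

```typescript
// Hypothetical event shape; only the promptVersion name comes from this page.
interface VersionedCall {
  promptVersion: string;
  costUsd: number;
  totalTokens: number;
  latencyMs: number;
}

interface Averages {
  costUsd: number;
  totalTokens: number;
  latencyMs: number;
}

// Average each metric over the calls tagged with one prompt version.
function averagesFor(calls: VersionedCall[], version: string): Averages | null {
  const matching = calls.filter((c) => c.promptVersion === version);
  if (matching.length === 0) return null;

  const mean = (pick: (c: VersionedCall) => number): number =>
    matching.reduce((sum, c) => sum + pick(c), 0) / matching.length;

  return {
    costUsd: mean((c) => c.costUsd),
    totalTokens: mean((c) => c.totalTokens),
    latencyMs: mean((c) => c.latencyMs),
  };
}

// Percent change from the old value to the new one; negative means a reduction.
// Not meaningful when `before` is 0.
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}
```

Comparing `averagesFor(calls, 'v1')` with `averagesFor(calls, 'v2')` through `percentChange` gives one delta per metric, so a prompt change that cuts tokens but raises latency shows up as two separate numbers.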

Latency tracking

Wall-clock ms from call initiation to response complete, captured on every event. See avg latency per group and spot slowdowns tied to model or prompt changes.

How it works

One import.
Everything changes.

The SDK wraps your existing LLM client. Your API calls go directly to the provider — no proxy, no extra hop. After the call resolves, the SDK reads token counts and metadata from the response object and posts the event asynchronously. Zero latency added. Zero new failure modes.
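
The pattern described above can be sketched in a few lines. This is an illustration of the general wrap-then-report approach, not the SDK's source: `trackCall`, `readUsage`, `report`, and the event fields are all hypothetical names.

```typescript
// Illustrative only: every name below is hypothetical, not the SDK's API.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface TrackedEvent extends Usage {
  group: string;
  latencyMs: number;
}

async function trackCall<T>(
  group: string,
  call: () => Promise<T>,                         // your existing provider call
  readUsage: (response: T) => Usage,              // pulls token counts off the response
  report: (event: TrackedEvent) => Promise<void>, // posts the event somewhere
): Promise<T> {
  const start = Date.now();
  const response = await call(); // goes straight to the provider, no proxy in between
  const latencyMs = Date.now() - start;

  try {
    const event: TrackedEvent = { group, latencyMs, ...readUsage(response) };
    // Fire and forget: the report is not awaited, and a rejected report is
    // swallowed so a tracking failure never reaches the caller.
    report(event).catch(() => undefined);
  } catch (_err) {
    // Reading usage or starting the report failed; the response is still returned.
  }

  return response;
}
```

The caller gets the provider's response back unchanged, and the only work added before it returns is reading two numbers off an object that is already in memory.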

Install the SDK

npm install @llmcosttracker/sdk — no other dependencies.

Wrap your call

Replace your existing call with trackedCall(). Same return value, nothing else changes.

Tag with a group

Pass group: "search" — or any string. Your first insight appears within seconds.

search.ts

// Before — no visibility
const res = await anthropic.messages.create(params)

// After — full cost attribution
import { trackedCall } from '@llmcosttracker/sdk'

const res = await trackedCall({
  client: anthropic,
  group: 'search',
  userId: session.userId,
  apiKey: process.env.LLMCOSTTRACKER_API_KEY,
  params,
})

// res is identical — nothing else changes

Real origin story

"Within 10 minutes of turning it on, we spotted a 5× cost variance between two searches."

DGA search

$0.0123

3,700 tokens

Teamsters search

$0.0608

18,500 tokens

Brian

Founder, contractclues.com

The insight that started this

You can't fix what you can't see.

contractclues.com was a RAG-based contract reference tool for film/TV production. The Anthropic bill kept climbing. Without per-call attribution, there was no way to know why.

Ten minutes after adding cost tracking, the answer was obvious: one search was pulling 18,500 input tokens against another's 3,700. Same search feature, 5× the cost. A context window issue that would have been invisible forever without per-call data.

Find your variance →

Supported providers

Works with every major LLM provider.

One SDK. Automatic cost calculation for every model, every provider.

Anthropic

Claude Opus, Sonnet, Haiku

OpenAI

GPT-4o, GPT-4 Turbo, o3, o4-mini

Google

Gemini 2.5 Pro, Flash, Flash-Lite

xAI

Grok 4, Grok Code

DeepSeek

DeepSeek V4 Flash, Pro

See your first insight in 5 minutes.

One import. No infrastructure. Your LLM costs broken down by group, user, and model before your next standup.