Enforce

Set a spend limit per user or tier.
The SDK enforces it automatically.

SDK-level spend enforcement lets you set hard limits per user and per tier — enforced before the API call is made, with zero infrastructure to manage and no new failure modes introduced.

Pre-call enforcement · No infrastructure · Zero new failure modes

The problem

Alerts tell you after the fact.
Enforcement stops it before it happens.

A single misconfigured prompt, a retry storm, or a heavy free-tier user can blow a monthly budget in hours. You need a ceiling, not just a notification.

Problem 01
Provider alerts fire too late
Anthropic and OpenAI alert you when your account-level bill hits a threshold — after the spend has already happened. By the time the email arrives, the damage is done. You need enforcement at the call level, not the invoice level.
Problem 02
Building it yourself takes weeks
A spend counter, a reset job, a per-user lookup on every call, a dashboard to monitor it, an alerting system — that's a small project on its own. It doesn't ship features. And you'll rebuild it every time your pricing model changes.
Problem 03
Free tier abuse is invisible without it
Free users explore aggressively. Without per-user enforcement, one power user on a free plan can generate more LLM spend than ten paying customers. You won't see it until the bill arrives.
Three actions

Choose how enforcement behaves.

Every limit has an action. Start with warn while you're getting set up. Move to block once you've added error handling.

Start here
warn
Call proceeds. You get notified.
The LLM call goes through. The onBudgetWarning callback fires with current spend, limit, and budget config. An enforcement event is logged in your dashboard. Recommended default — safe to deploy without any error handling changes.
When to use: Start here. Deploy warn first, monitor the dashboard, then switch to block once you've validated your config.
block
Call is stopped. Error is thrown.
The LLM call never happens. LLMBudgetExceededError is thrown before the API request is made. You must catch it in your application code and return a graceful message to your user. The call costs you nothing.
When to use: Use once you've added LLMBudgetExceededError handling. The safest option for free-tier users where margin matters.
dry_run
Call proceeds. Console warning logged.
Behaves like warn, but logs a warning to the console instead of firing the onBudgetWarning callback. The call always goes through and no error is thrown.
When to use: Use in staging or when you want to validate that your budget config is correct before enabling real enforcement.
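To make the difference between the three actions concrete, here is a minimal local sketch of the decision each one makes once spend reaches the limit. It is not the SDK's internal code: the names enforce, Budget, Hooks, and BudgetExceededError (a stand-in for LLMBudgetExceededError) are hypothetical, and treating "spend at or above the limit" as the trigger is an assumption.

```typescript
// Hypothetical local model of the three actions; not the SDK's actual internals.
type Action = "warn" | "block" | "dry_run";

interface Budget {
  limitUsd: number;
  action: Action;
}

interface Hooks {
  onBudgetWarning: (spendUsd: number, limitUsd: number) => void;
  log: (message: string) => void;
}

// Stand-in for the SDK's LLMBudgetExceededError so the sketch runs on its own.
class BudgetExceededError extends Error {
  spendUsd: number;
  limitUsd: number;

  constructor(spendUsd: number, limitUsd: number) {
    super(`Budget exceeded: ${spendUsd} of ${limitUsd} USD`);
    this.name = "BudgetExceededError";
    this.spendUsd = spendUsd;
    this.limitUsd = limitUsd;
  }
}

// Returns true when the LLM call should proceed; throws when it must not.
function enforce(spendUsd: number, budget: Budget, hooks: Hooks): boolean {
  // Assumption: the limit counts as reached when spend is at or above it.
  if (spendUsd < budget.limitUsd) {
    return true;
  }
  if (budget.action === "block") {
    // block: stop before any provider request is made
    throw new BudgetExceededError(spendUsd, budget.limitUsd);
  }
  if (budget.action === "warn") {
    // warn: notify through the callback, then let the call proceed
    hooks.onBudgetWarning(spendUsd, budget.limitUsd);
  } else {
    // dry_run: log only, no callback, then let the call proceed
    hooks.log(`[dry_run] budget reached: ${spendUsd} >= ${budget.limitUsd}`);
  }
  return true;
}
```

Only block changes control flow, which is why it is the one action that needs error handling in place before you enable it.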
handling-blocks.ts · action: block
import { trackedCall, LLMBudgetExceededError } from '@llmcosttracker/sdk'

try {
  const response = await trackedCall({
    client, params, apiKey,
    userId: session.userId,
    tier: session.plan, // enforcement looks up this tier's limit
  })
  return response
} catch (err) {
  if (err instanceof LLMBudgetExceededError) {
    // call never happened — return gracefully
    return { error: 'limit_reached', message: 'Monthly AI limit reached.' }
  }
  throw err // re-throw anything else
}
Precedence model

Configure once per tier.
Override per user.

Tier templates apply to every user on that tier automatically — configure free at $5/month and every free user gets that limit without any per-user setup.

When a specific user needs a different limit — a power user you want to accommodate, a suspicious account you want to cap tighter — create a per-user override. It takes precedence over the tier template automatically, with no code changes required.

1. Per-user budget
Exact userId match, highest priority
2. Tier template
Matches the tier tag on the call
3. No enforcement
Call proceeds normally
Tier templates
Tier         Limit / mo   Action
free         $5.00        block
pro          $50.00       warn
enterprise   $500.00      dry_run

Per-user overrides (override the tier template)
User ID      Limit / mo   Action   Note
user_4421    $0.50        block    abuse suspect
user_0089    $200.00      warn     power user
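
The precedence order can be sketched as a simple lookup: per-user override first, then tier template, then no enforcement. This is an illustration rather than the SDK's implementation. The function resolveBudget and the in-memory maps are hypothetical; the limits and actions are the example values from the tables above.

```typescript
type Action = "warn" | "block" | "dry_run";

interface Budget {
  limitUsd: number;
  action: Action;
}

// Example values taken from the tier template table.
const tierTemplates = new Map<string, Budget>([
  ["free", { limitUsd: 5, action: "block" }],
  ["pro", { limitUsd: 50, action: "warn" }],
  ["enterprise", { limitUsd: 500, action: "dry_run" }],
]);

// Example values taken from the per-user override table.
const userOverrides = new Map<string, Budget>([
  ["user_4421", { limitUsd: 0.5, action: "block" }],
  ["user_0089", { limitUsd: 200, action: "warn" }],
]);

// Hypothetical resolver: returns the budget that applies, or null for "no enforcement".
function resolveBudget(userId: string, tier?: string): Budget | null {
  // 1. Per-user budget: an exact userId match wins.
  const override = userOverrides.get(userId);
  if (override !== undefined) {
    return override;
  }
  // 2. Tier template: matched on the tier tag passed with the call.
  if (tier !== undefined) {
    const template = tierTemplates.get(tier);
    if (template !== undefined) {
      return template;
    }
  }
  // 3. No enforcement: the call proceeds normally.
  return null;
}
```

With these values, user_4421 on the free tier resolves to the $0.50 override rather than the $5.00 template, and a call with no matching override or tier is not enforced at all.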
The safety guarantee

An enforcement failure never
breaks your app.

Enforcement is pre-call and local. If our service has an issue, your LLM calls proceed normally — we fail open, not closed.

Enforcement is pre-call
The spend check happens in the SDK before the API request is made. If the limit is reached, the call is blocked locally — it never touches the provider. If the limit is not reached, the call goes directly to the provider as normal.
Our downtime is never your downtime
Spend counters are maintained in memory in the SDK. They sync to the dashboard asynchronously. If our servers are unreachable, enforcement still works — and your LLM calls still go through. We are never in your critical path.
Logging failures are silent
Cost events are posted to the dashboard asynchronously after the call resolves. If that post fails for any reason, the error is caught silently. A logging failure will never throw, never surface to your users, and never affect the response.
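
The "logging failures are silent" behavior follows a common fire-and-forget pattern. Here is a minimal sketch of that pattern, assuming a hypothetical postEvent function and CostEvent shape; it illustrates the idea and is not the SDK's actual code.

```typescript
// Hypothetical shape of a cost event; the real payload may differ.
interface CostEvent {
  userId: string;
  costUsd: number;
}

// Hypothetical transport that posts an event to the dashboard.
type PostEvent = (event: CostEvent) => Promise<void>;

// Run the LLM call, then report its cost without awaiting or surfacing failures.
async function callAndLog<T>(
  llmCall: () => Promise<T>,
  event: CostEvent,
  postEvent: PostEvent,
): Promise<T> {
  // Provider errors still propagate to the caller as usual.
  const response = await llmCall();

  // Fire and forget: the response is returned without waiting for the post.
  // Starting from Promise.resolve() means a synchronous throw inside
  // postEvent is also turned into a rejection and swallowed below.
  void Promise.resolve()
    .then(() => postEvent(event))
    .catch(() => {
      /* intentionally silent: a logging failure must not affect the response */
    });

  return response;
}
```

Because the post is never awaited, a slow or unreachable dashboard adds no latency to the response, and a rejected post has nowhere to surface.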
How to add it

One field.
Full enforcement.

If you're already passing userId on your calls, add tier and you're done. Configure the limits in the dashboard — no deploy required to change them.

1. Configure limits in the dashboard
Go to Enforcement → New template. Set a tier label, spend limit, window, and action. Takes 60 seconds.
2. Add the tier tag to your calls
Pass tier: session.plan on every trackedCall. That's the only code change required.
3. Start with warn
Deploy with action: warn first. Monitor the enforcement log. Add LLMBudgetExceededError handling. Then switch to block.
4. Override per user as needed
Any user that needs a different limit gets a per-user override in the dashboard. No code changes, no deploys.
Step 1 · warn · Safe to deploy now
await trackedCall({
  client, params, apiKey,
  userId: session.userId,
  tier: session.plan,
  // no other changes needed
})

Configure the tier template in the dashboard. Calls proceed, onBudgetWarning fires, events log. Monitor for a few days.

once you've validated
Step 2 · block · Add error handling first
try {
  await trackedCall({ /* ...same fields as Step 1... */ tier: session.plan })
} catch (err) {
  if (err instanceof LLMBudgetExceededError) {
    // call never happened: return gracefully
    return { error: 'limit_reached' }
  }
  throw err // re-throw anything else
}

Switch the dashboard config to block. No redeploy of your app needed — the SDK picks up the new action automatically.

Stop the next runaway bill
before it starts.

Start with warn. Add your tier templates. Switch to block when you're ready. The whole setup takes under 10 minutes.

© 2026 LLM COST TRACKER · hello@llmcosttracker.com