Enforce

Set a spend limit per user or tier.
The SDK enforces it automatically.

SDK-level spend enforcement lets you set hard limits per user and per tier — enforced before the API call is made, with zero infrastructure to manage and no new failure modes introduced.

Start free → · View quickstart

Pre-call enforcement

No infrastructure

Zero new failure modes

The problem

Alerts tell you after the fact.
Enforcement stops it before it happens.

A single misconfigured prompt, a retry storm, or a heavy free-tier user can blow a monthly budget in hours. You need a ceiling, not just a notification.

Problem 01

Provider alerts fire too late

Anthropic and OpenAI alert you when your account-level bill hits a threshold — after the spend has already happened. By the time the email arrives, the damage is done. You need enforcement at the call level, not the invoice level.

Problem 02

Building it yourself takes weeks

A spend counter, a reset job, a per-user lookup on every call, a dashboard to monitor it, an alerting system — that's a small project on its own. It doesn't ship features. And you'll rebuild it every time your pricing model changes.

Problem 03

Free tier abuse is invisible without it

Free users explore aggressively. Without per-user enforcement, one power user on a free plan can generate more LLM spend than ten paying customers. You won't see it until the bill arrives.

Three actions

Choose how enforcement behaves.

Every limit has an action. Start with warn while you're getting set up. Move to block once you've added error handling.

Start here

warn

Call proceeds. You get notified.

The LLM call goes through. The onBudgetWarning callback fires with current spend, limit, and budget config. An enforcement event is logged in your dashboard. Recommended default — safe to deploy without any error handling changes.

When to use: Start here. Deploy warn first, monitor the dashboard, then switch to block once you've validated your config.
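
As a rough sketch of what a warn handler could look like: the payload shape below (`spend`, `limit`, `budget`) is an assumption based on the description above, not the SDK's documented type, and `formatBudgetWarning` is a made-up helper. How the callback is registered with the SDK is not shown here.

```typescript
// Hypothetical payload shape. The page says the callback receives current
// spend, limit, and budget config; the field names here are assumptions.
interface BudgetWarning {
  spend: number; // spend so far in the current window, in USD
  limit: number; // configured limit, in USD
  budget: { scope: "user" | "tier"; key: string };
}

// Turn a warning into one log line you can send to your own logger.
function formatBudgetWarning(w: BudgetWarning): string {
  const pct = w.limit > 0 ? Math.round((w.spend / w.limit) * 100) : 0;
  return `[budget:${w.budget.scope}:${w.budget.key}] $${w.spend.toFixed(2)} of $${w.limit.toFixed(2)} (${pct}%)`;
}

// A handler you could pass wherever the SDK accepts onBudgetWarning.
const onBudgetWarning = (w: BudgetWarning): void => {
  console.warn(formatBudgetWarning(w));
};
```

Because the call still goes through under warn, a handler like this only needs to record the event; it should not throw.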

block

Call is stopped. Error is thrown.

The LLM call never happens. LLMBudgetExceededError is thrown before the API request is made. Must be caught in your application code — return a graceful message to your user. The call costs you nothing.

When to use: Once you've added LLMBudgetExceededError handling. The safest option for free-tier users where margin matters.

dry_run

Call proceeds. Console warning logged.

Behaves like warn but logs to console instead of firing the callback. The call always goes through. No error thrown. Use this in staging or when validating a new budget config before going live in production.

When to use: In staging, or whenever you want to validate that your budget config is correct before enabling real enforcement.
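
The three actions can be summarized as one small decision function. This is a sketch of the behavior described above, not the SDK's code: `decide` and `Outcome` are made-up names, and treating the limit as reached when spend is equal to or above it is an assumption.

```typescript
type Action = "warn" | "block" | "dry_run";

interface Outcome {
  callProceeds: boolean; // is the LLM request sent?
  signal: "none" | "callback" | "error" | "console"; // how you find out
}

// Decide what happens to one call, given spend so far and the limit.
function decide(action: Action, spendUsd: number, limitUsd: number): Outcome {
  // Under the limit: every action lets the call through untouched.
  if (spendUsd < limitUsd) {
    return { callProceeds: true, signal: "none" };
  }
  switch (action) {
    case "warn": // call goes through, onBudgetWarning fires
      return { callProceeds: true, signal: "callback" };
    case "block": // call is stopped, LLMBudgetExceededError is thrown
      return { callProceeds: false, signal: "error" };
    case "dry_run": // call goes through, warning goes to the console
      return { callProceeds: true, signal: "console" };
  }
}
```

Only block ever stops a call, which is why warn and dry_run are safe to deploy before any error handling exists.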

handling-blocks.ts · action: block

import { trackedCall, LLMBudgetExceededError } from '@llmcosttracker/sdk'

try {
  const response = await trackedCall({
    client, params, apiKey,
    userId: session.userId,
    tier: session.plan, // enforcement looks up this tier's limit
  })
  return response
} catch (err) {
  if (err instanceof LLMBudgetExceededError) {
    // call never happened — return gracefully
    return { error: 'limit_reached', message: 'Monthly AI limit reached.' }
  }
  throw err // re-throw anything else
}

Precedence model

Configure once per tier.
Override per user.

Tier templates apply to every user on that tier automatically — configure free at $5/month and every free user gets that limit without any per-user setup.

When a specific user needs a different limit — a power user you want to accommodate, a suspicious account you want to cap tighter — create a per-user override. It takes precedence over the tier template automatically, with no code changes required.

Per-user budget

Exact userId match — highest priority

Tier template

Matches the tier tag on the call

No enforcement

Call proceeds normally

Tier templates

Tier         Limit / mo   Action
free         $5.00        block
pro          $50.00       warn
enterprise   $500.00      dry_run

Per-user overrides (override the tier template)

User ID                     Limit / mo   Action
user_4421 (abuse suspect)   $0.50        block
user_0089 (power user)      $200.00      warn
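
The precedence order can be sketched as a two-step lookup. The values below are the example limits from this page; `resolveBudget`, `Budget`, and the two maps are made-up names for illustration, since the real lookup happens inside the SDK against your dashboard config.

```typescript
type Action = "warn" | "block" | "dry_run";

interface Budget {
  limitUsd: number; // monthly limit in USD
  action: Action;
}

// Example tier templates from the table above.
const tierTemplates = new Map<string, Budget>([
  ["free", { limitUsd: 5, action: "block" }],
  ["pro", { limitUsd: 50, action: "warn" }],
  ["enterprise", { limitUsd: 500, action: "dry_run" }],
]);

// Example per-user overrides from the table above.
const userOverrides = new Map<string, Budget>([
  ["user_4421", { limitUsd: 0.5, action: "block" }],
  ["user_0089", { limitUsd: 200, action: "warn" }],
]);

// 1. exact userId match, 2. tier template, 3. no enforcement (undefined).
function resolveBudget(userId: string, tier?: string): Budget | undefined {
  const override = userOverrides.get(userId);
  if (override !== undefined) return override;
  if (tier !== undefined) return tierTemplates.get(tier);
  return undefined;
}
```

With this data, `user_4421` on the free tier resolves to the $0.50 override rather than the $5.00 template, and a call with an unknown tier and no override resolves to no enforcement.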

The safety guarantee

Our enforcement failing never
breaks your app.

Enforcement is pre-call and local. If our service has an issue, your LLM calls proceed normally — we fail open, not closed.

Enforcement is pre-call

The spend check happens in the SDK before the API request is made. If the limit is reached, the call is blocked locally — it never touches the provider. If the limit is not reached, the call goes directly to the provider as normal.
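
A minimal sketch of what "pre-call" means in practice: the check runs first, and the function that would contact the provider is only invoked if the check passes. `guardedCall` and `BudgetExceededError` are stand-ins invented for this example, not the SDK's internals, and the `>=` comparison is an assumption.

```typescript
// Stand-in for the SDK's LLMBudgetExceededError.
class BudgetExceededError extends Error {
  readonly spendUsd: number;
  readonly limitUsd: number;

  constructor(spendUsd: number, limitUsd: number) {
    super(`spend $${spendUsd} has reached limit $${limitUsd}`);
    this.name = "BudgetExceededError";
    this.spendUsd = spendUsd;
    this.limitUsd = limitUsd;
    // Keeps instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Run `send` only if the budget allows it. `send` is whatever would
// contact the provider; it is never invoked when the limit is reached.
function guardedCall<T>(spendUsd: number, limitUsd: number, send: () => T): T {
  if (spendUsd >= limitUsd) {
    throw new BudgetExceededError(spendUsd, limitUsd);
  }
  return send();
}
```

Since the error is raised before `send` runs, a blocked call produces no provider request and therefore no provider cost.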

Our downtime is never your downtime

Spend counters are maintained in memory in the SDK. They sync to the dashboard asynchronously. If our servers are unreachable, enforcement still works — and your LLM calls still go through. We are never in your critical path.
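
To illustrate why a local counter keeps working when a remote service is unreachable, here is a sketch of an in-memory spend counter. `SpendCounter` is a made-up class for this example and says nothing about how the SDK actually stores or resets its counters.

```typescript
// Per-user running totals held in process memory. Reading or updating
// them involves no network call, so a check cannot fail on connectivity.
class SpendCounter {
  private readonly totals = new Map<string, number>();

  // Add the cost of a finished call and return the new total.
  add(userId: string, costUsd: number): number {
    const next = this.get(userId) + costUsd;
    this.totals.set(userId, next);
    return next;
  }

  // Current total for a user; unknown users have spent nothing.
  get(userId: string): number {
    return this.totals.get(userId) ?? 0;
  }

  // True once the user's total has reached the limit.
  isOver(userId: string, limitUsd: number): boolean {
    return this.get(userId) >= limitUsd;
  }

  // Clear all totals, for example at the start of a new window.
  reset(): void {
    this.totals.clear();
  }
}
```

In this sketch the totals live only in the current process, so the enforcement check is a plain map lookup.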

Logging failures are silent

Cost events are posted to the dashboard asynchronously after the call resolves. If that post fails for any reason, the error is caught silently. A logging failure will never throw, never surface to your users, and never affect the response.
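
The "silent failure" behavior is a standard fire-and-forget pattern. A sketch, assuming a hypothetical `post` function that sends one event to a dashboard; `postQuietly` and `CostEvent` are names invented for this example.

```typescript
interface CostEvent {
  userId: string;
  costUsd: number;
}

// Fire-and-forget: schedule the post, swallow any failure, return at once.
// Wrapping in Promise.resolve().then(...) means both a synchronous throw
// and a rejected promise from `post` end up in the same catch.
function postQuietly(
  post: (event: CostEvent) => Promise<void> | void,
  event: CostEvent,
): void {
  Promise.resolve()
    .then(() => post(event))
    .catch(() => {
      // Intentionally ignored: a logging failure must not reach the caller.
    });
}
```

The caller never awaits the post, so a slow or failing dashboard request cannot delay or break the response already returned to the user.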

How to add it

One field.
Full enforcement.

If you're already passing userId on your calls, add tier and you're done. Configure the limits in the dashboard — no deploy required to change them.

Configure limits in the dashboard

Go to Enforcement → New template. Set a tier label, spend limit, window, and action. Takes 60 seconds.

Add the tier tag to your calls

Pass tier: session.plan on every trackedCall. That's the only code change required.

Start with warn

Deploy with action: warn first. Monitor the enforcement log. Add LLMBudgetExceededError handling. Then switch to block.

Override per user as needed

Any user that needs a different limit gets a per-user override in the dashboard. No code changes, no deploys.

Step 1 · warn · Safe to deploy now

await trackedCall({
  client, params, apiKey,
  userId: session.userId,
  tier: session.plan,
  // no other changes needed
})

Configure the tier template in the dashboard. Calls proceed, onBudgetWarning fires, events log. Monitor for a few days.

Once you've validated your config

Step 2 · block · Add error handling first

try {
  await trackedCall({ ..., tier: session.plan })
} catch (err) {
  if (err instanceof LLMBudgetExceededError)
    return { error: 'limit_reached' }
  throw err
}

Switch the dashboard config to block. No redeploy of your app needed — the SDK picks up the new action automatically.

Stop the next runaway bill
before it starts.

Start with warn. Add your tier templates. Switch to block when you're ready. The whole setup takes under 10 minutes.

Start free → · Read the docs
