
trackedStream()

Wraps a streaming LLM call and logs cost data once the stream completes. Supports Anthropic and OpenAI-compatible providers (OpenAI, DeepSeek, xAI, Perplexity). Returns the same stream that the underlying call would return.

Token usage is not available until a stream completes. For Anthropic, LLM Cost Tracker calls stream.finalMessage() to capture usage after the last chunk. For OpenAI-compatible providers, it reads usage from the final chunk, which is included when stream_options: { include_usage: true } is set. Both happen automatically.
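To illustrate the "usage arrives with the final chunk" pattern described above, here is a minimal sketch of a pass-through wrapper. It is not the SDK's implementation: the `Chunk` and `Usage` shapes, `withUsage`, `fakeStream`, and `collect` are all hypothetical names invented for this example, and the chunk shape is only loosely modelled on OpenAI-compatible responses.

```typescript
// Hypothetical shapes for illustration only (not part of the SDK).
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

interface Chunk {
  choices: { delta: { content?: string } }[];
  usage?: Usage | null;
}

// Pass every chunk through unchanged, remember the last usage payload seen,
// and report it once the source stream is exhausted.
async function* withUsage(
  source: AsyncIterable<Chunk>,
  onUsage: (usage: Usage) => void,
): AsyncGenerator<Chunk> {
  let usage: Usage | null = null;
  for await (const chunk of source) {
    if (chunk.usage) usage = chunk.usage;
    yield chunk;
  }
  // Only reached when the consumer reads the stream to the end.
  if (usage) onUsage(usage);
}

// Stand-in for a provider stream: two content chunks, then a usage-only chunk.
async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { choices: [{ delta: { content: 'Hel' } }] };
  yield { choices: [{ delta: { content: 'lo' } }] };
  yield { choices: [], usage: { prompt_tokens: 12, completion_tokens: 2 } };
}

// Consume the wrapped stream the same way the unwrapped one would be consumed.
async function collect(): Promise<{ text: string; usage: Usage | null }> {
  let text = '';
  let recorded: Usage | null = null;
  const stream = withUsage(fakeStream(), (u) => { recorded = u; });
  for await (const chunk of stream) {
    // The usage-only chunk in this stand-in has no choices, so guard the index.
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return { text, usage: recorded };
}
```

In this sketch the consumer's loop does not change, and the usage callback fires only if the stream is read to the end. That is one plausible reason cost data is logged on completion rather than at call time.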

Usage

    import { trackedStream } from '@llmcosttracker/sdk'

    const stream = await trackedStream({
      client: anthropic,
      group: 'chat',
      userId: session.userId,
      tier: session.plan,
      apiKey: 'lct_live_your_key_here',
      params: {
        model: 'claude-sonnet-4-6',
        messages,
        max_tokens: 1024,
        stream: true,
      },
    })

    // Use the stream exactly as normal
    for await (const chunk of stream) {
      process.stdout.write(chunk.delta?.text ?? '')
    }

Parameters

Identical to trackedCall(): the same options, including apiKey, group, userId, tier, and budget tags.

Budget enforcement with streaming

Budget enforcement works the same way as with trackedCall(), with one important difference: the spend counter is incremented after the stream completes, not before. Enforcement therefore checks the counter as it stands when the call is initiated, not mid-stream.

In practice, a user at 99% of their limit can still start a stream. The block fires on their next call, once the counter has been updated. If this matters for your use case, use action: "warn" and handle approaching limits proactively via onBudgetWarning.
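The check-then-increment ordering can be made concrete with a small simulation. This is a sketch under stated assumptions, not the SDK's internals: `SpendCounter`, `runStream`, and `simulate` are hypothetical, amounts are in integer cents, "over limit" is assumed to mean spent >= limit, and a blocked call is shown as a return value where the SDK would throw.

```typescript
// Hypothetical in-memory spend counter (illustration only).
class SpendCounter {
  private spent = 0;

  constructor(private readonly limit: number) {}

  // Consulted when a call is initiated.
  isOverLimit(): boolean {
    return this.spent >= this.limit; // assumption: the limit is inclusive
  }

  // Applied only after a stream has completed.
  add(cost: number): void {
    this.spent += cost;
  }
}

type Outcome = 'completed' | 'blocked';

// Check first, stream, then increment: the order described above.
function runStream(counter: SpendCounter, cost: number): Outcome {
  if (counter.isOverLimit()) return 'blocked';
  // ... the stream would be consumed here; the counter is untouched meanwhile ...
  counter.add(cost);
  return 'completed';
}

function simulate(): Outcome[] {
  const counter = new SpendCounter(100); // limit: 100 cents
  counter.add(99); // the user is already at 99% of the limit
  // First call starts below the limit and pushes the counter to 104.
  // Second call sees the updated counter and is blocked.
  return [runStream(counter, 5), runStream(counter, 5)];
}
```

Under these assumptions the first stream runs even though it overshoots the limit, and only the second call is refused.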

If the budget action is block and the limit has already been exceeded, trackedStream() throws LLMBudgetExceededError before initiating the stream:

    import { trackedStream, LLMBudgetExceededError } from '@llmcosttracker/sdk'

    try {
      const stream = await trackedStream({ client, params, apiKey, userId, tier })
      for await (const chunk of stream) {
        process.stdout.write(chunk.delta?.text ?? '')
      }
    } catch (err) {
      if (err instanceof LLMBudgetExceededError) {
        // return a graceful response to your user
        return { error: 'limit_reached', message: 'Monthly AI usage limit reached.' }
      }
      throw err
    }

See Handling blocks for full details.

Returns

The exact stream from the underlying provider call. Use it exactly as you would without tracking.

Google Gemini streaming is not yet supported by trackedStream(). Use trackedCall() for Gemini calls.
