Budget Tracker Patterns
Awareness-based tracking for AI generation tool calls. Tracks context consumption, detects runaway patterns, and surfaces budget status to the model. Does NOT block tool calls based on budget - existing step limits handle that.
BudgetState Interface
Core state tracked during generation:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
export interface BudgetState {
  maxTokens: number;
  usedTokens: number;
  toolCallCount: number;
  searchHistory: Array<{
    query: string;
    resultCount: number;
    topScore: number;
  }>;
  toolCallCounts: Record<string, number>; // Per-tool counts for rate limiting
}
```
Initialize:
```typescript
// `let`, not `const`: each update below reassigns the variable.
let budget = createBudgetState(maxTokens);
```
Update immutably:
```typescript
budget = recordUsage(budget, tokens);
budget = recordToolCall(budget, toolName);
budget = recordSearch(budget, query, resultCount, topScore);
```
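The helper bodies are not shown in this document. A minimal sketch of how they might look, assuming each one returns a fresh object and never mutates its input (the implementations below are hypothetical and may differ from the real `budgetTracker.ts`):

```typescript
// Hypothetical sketch: the real helpers live in budgetTracker.ts and may differ.
interface SearchRecord {
  query: string;
  resultCount: number;
  topScore: number;
}

interface BudgetState {
  maxTokens: number;
  usedTokens: number;
  toolCallCount: number;
  searchHistory: SearchRecord[];
  toolCallCounts: Record<string, number>;
}

function createBudgetState(maxTokens: number): BudgetState {
  return {
    maxTokens,
    usedTokens: 0,
    toolCallCount: 0,
    searchHistory: [],
    toolCallCounts: {},
  };
}

// Each update copies the previous state instead of mutating it.
function recordUsage(state: BudgetState, tokens: number): BudgetState {
  return { ...state, usedTokens: state.usedTokens + tokens };
}

function recordToolCall(state: BudgetState, toolName: string): BudgetState {
  const previous = state.toolCallCounts[toolName] ?? 0;
  return {
    ...state,
    toolCallCount: state.toolCallCount + 1,
    toolCallCounts: { ...state.toolCallCounts, [toolName]: previous + 1 },
  };
}

function recordSearch(
  state: BudgetState,
  query: string,
  resultCount: number,
  topScore: number,
): BudgetState {
  return {
    ...state,
    searchHistory: [...state.searchHistory, { query, resultCount, topScore }],
  };
}
```

Because every helper returns a new object, an earlier snapshot of the state stays valid for logging or comparison after later updates.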
Timeouts
A wall-clock timeout wrapper prevents stuck tools:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
export async function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number,
  operation: string,
): Promise<T> {
  let timeoutId: ReturnType<typeof setTimeout>;

  const timeoutPromise = new Promise<never>((_, reject) => {
    timeoutId = setTimeout(() => {
      reject(new TimeoutError(operation, timeoutMs));
    }, timeoutMs);
  });

  try {
    const result = await Promise.race([promise, timeoutPromise]);
    clearTimeout(timeoutId!);
    return result;
  } catch (error) {
    clearTimeout(timeoutId!);
    throw error;
  }
}
```
Tool timeouts (milliseconds):
```typescript
export const TOOL_TIMEOUTS: Record<string, number> = {
  searchAll: 30000, // 30s - multiple parallel searches
  searchFiles: 15000,
  searchNotes: 15000,
  searchTasks: 15000,
  searchKnowledgeBank: 15000,
  queryHistory: 15000,
  urlReader: 120000, // 2min - external fetch slow
  codeExecution: 120000, // 2min - code execution takes time
  youtubeVideo: 300000, // 5min - video processing slow
  weather: 60000, // 1min - external API
  calculator: 5000,
  datetime: 1000,
  default: 30000,
};
```
Usage:
```typescript
const timeout = getToolTimeout(toolName);
const result = await withTimeout(toolPromise, timeout, `${toolName} call`);
```
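Neither `TimeoutError` nor `getToolTimeout` is defined in the excerpts above. A rough sketch of what they could look like, assuming the error carries the operation name and duration and that unknown tools fall back to the default timeout (both bodies are hypothetical, and the table is abbreviated):

```typescript
// Hypothetical sketch; the real TimeoutError and getToolTimeout may differ.
class TimeoutError extends Error {
  readonly operation: string;
  readonly timeoutMs: number;

  constructor(operation: string, timeoutMs: number) {
    super(`${operation} timed out after ${timeoutMs}ms`);
    this.name = "TimeoutError";
    this.operation = operation;
    this.timeoutMs = timeoutMs;
  }
}

// Abbreviated copy of the timeout table, for illustration only.
const TOOL_TIMEOUTS: Record<string, number> = {
  searchAll: 30000,
  urlReader: 120000,
  calculator: 5000,
};

// Assumed fallback, matching the `default` entry in the full table.
const FALLBACK_TIMEOUT_MS = 30000;

function getToolTimeout(toolName: string): number {
  // Tools without an explicit entry use the fallback.
  return TOOL_TIMEOUTS[toolName] ?? FALLBACK_TIMEOUT_MS;
}
```

Keeping the operation name on the error makes it possible to report which tool hung without parsing the message string.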
Rate Limiting
Per-tool call limits prevent tool abuse while allowing reasonable usage:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
const TOOL_RATE_LIMITS: Record<string, number> = {
  searchAll: 5,
  searchFiles: 5,
  searchNotes: 5,
  searchTasks: 5,
  searchKnowledgeBank: 5,
  queryHistory: 5,
  urlReader: 3,
  codeExecution: 2,
  weather: 3,
  default: 10,
};
```
Check before calling:
```typescript
const { limited, message } = isToolRateLimited(budget, toolName);
if (limited) {
  return { error: message };
}
budget = recordToolCall(budget, toolName);
```
When the limit has been reached, the check returns:
```typescript
{
  limited: true,
  message: "searchAll limit reached (5/5). Try a different approach."
}
```
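The body of `isToolRateLimited` is not shown here. One way it might work, assuming it compares the per-tool count in `toolCallCounts` against the limit table and falls back to the default limit for unlisted tools (a hypothetical sketch with an abbreviated table):

```typescript
// Hypothetical sketch; the real isToolRateLimited may differ.
const TOOL_RATE_LIMITS: Record<string, number> = {
  searchAll: 5,
  urlReader: 3,
  codeExecution: 2,
};

// Assumed fallback, matching the `default` entry in the full table.
const DEFAULT_RATE_LIMIT = 10;

// Only the field the check needs; the full BudgetState has more.
interface RateLimitState {
  toolCallCounts: Record<string, number>;
}

interface RateLimitResult {
  limited: boolean;
  message?: string;
}

function isToolRateLimited(
  state: RateLimitState,
  toolName: string,
): RateLimitResult {
  const limit = TOOL_RATE_LIMITS[toolName] ?? DEFAULT_RATE_LIMIT;
  const used = state.toolCallCounts[toolName] ?? 0;

  if (used >= limit) {
    return {
      limited: true,
      message: `${toolName} limit reached (${used}/${limit}). Try a different approach.`,
    };
  }
  return { limited: false };
}
```

The check reads the count but does not increment it, which is why the calling code records the tool call separately after the check passes.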
Token Estimation
Rough estimates for context tracking (chars/4 approximation):
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
const TOOL_TOKEN_ESTIMATES: Record<string, number> = {
  searchAll: 800,
  searchFiles: 400,
  searchNotes: 300,
  searchTasks: 300,
  searchKnowledgeBank: 500,
  queryHistory: 400,
  urlReader: 1500,
  codeExecution: 600,
  calculator: 100,
  datetime: 50,
  weather: 200,
  default: 300,
};
```
Get estimate before tool execution:
```typescript
const estimatedCost = estimateToolCost(toolName);
```
Record actual usage after:
```typescript
const actualTokens = countTokens(toolResult);
budget = recordUsage(budget, actualTokens);
```
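The chars/4 approximation can be sketched as follows. Both function bodies are assumptions rather than the real implementation: `countTokens` here serializes non-string results to JSON before measuring, and `estimateToolCost` falls back to the default estimate for unlisted tools.

```typescript
// Hypothetical sketch; the real estimateToolCost and countTokens may differ.
const TOOL_TOKEN_ESTIMATES: Record<string, number> = {
  searchAll: 800,
  urlReader: 1500,
  datetime: 50,
};

// Assumed fallback, matching the `default` entry in the full table.
const DEFAULT_TOKEN_ESTIMATE = 300;

function estimateToolCost(toolName: string): number {
  return TOOL_TOKEN_ESTIMATES[toolName] ?? DEFAULT_TOKEN_ESTIMATE;
}

// Roughly 4 characters per token; good enough for awareness, not for billing.
function countTokens(value: unknown): number {
  // JSON.stringify returns undefined for undefined and functions, hence the fallback.
  const text =
    typeof value === "string" ? value : (JSON.stringify(value) ?? "");
  return Math.ceil(text.length / 4);
}
```

Rounding up keeps short results from counting as zero tokens, so the tracked usage errs slightly high rather than low.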
Context Getting Full Detection
Tiered warnings based on usage:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
export function formatStatus(state: BudgetState): string {
  const percentUsed = getContextPercent(state);
  const toolCount = state.toolCallCount;

  if (percentUsed >= 70) {
    return `[Budget Critical: ~${percentUsed}% context, ${toolCount} tools]
Answer now with current info or ask user for clarification.`;
  }

  if (percentUsed >= 50) {
    return `[Budget: ~${percentUsed}% context, ${toolCount} tools]
Prioritize essential searches only.`;
  }

  return `[Context: ${toolCount} tool calls, ~${percentUsed}% of context used]`;
}
```
Check programmatically:
```typescript
if (isContextGettingFull(budget)) {
  // >= 50% used - inject warning into prompt
}
```
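`getContextPercent` and `isContextGettingFull` are used above without their definitions. A plausible sketch, assuming the percentage is rounded to a whole number and the warning threshold is the 50% tier from `formatStatus` (hypothetical bodies, including the zero-budget guard):

```typescript
// Hypothetical sketch; the real helpers may round or guard differently.
interface ContextUsage {
  maxTokens: number;
  usedTokens: number;
}

// Assumed to match the lower warning tier used by formatStatus.
const CONTEXT_WARNING_PERCENT = 50;

function getContextPercent(state: ContextUsage): number {
  // Guard against division by zero when no budget was configured.
  if (state.maxTokens <= 0) return 0;
  return Math.round((state.usedTokens / state.maxTokens) * 100);
}

function isContextGettingFull(state: ContextUsage): boolean {
  return getContextPercent(state) >= CONTEXT_WARNING_PERCENT;
}
```

For example, 400 of 1000 tokens is 40% and stays quiet, while 500 of 1000 crosses the 50% tier and triggers the warning.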
Truncation Strategy
Preserve structure while managing context:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
export const MIN_TOOL_CALLS_FOR_TRUNCATION = 2; // Don't truncate first tool call

export function truncateToolResult(
  result: unknown,
  maxChars: number = 500,
): unknown {
  // JSON.stringify returns undefined for undefined/functions, so fall back to "".
  const str = JSON.stringify(result) ?? "";
  if (str.length <= maxChars) return result;

  // Arrays: keep first 3 items, each with a third of the budget
  if (Array.isArray(result)) {
    return result
      .slice(0, 3)
      .map((item) => truncateToolResult(item, Math.floor(maxChars / 3)));
  }

  // Strings: truncate with marker
  if (typeof result === "string") {
    return `${result.slice(0, maxChars)}... [truncated]`;
  }

  // Objects: recursively truncate each value, splitting the budget evenly across keys
  if (typeof result === "object" && result !== null) {
    const truncated: Record<string, unknown> = {};
    const keys = Object.keys(result);
    if (keys.length === 0) return result;
    const charPerKey = Math.floor(maxChars / keys.length);
    for (const key of keys) {
      truncated[key] = truncateToolResult(
        (result as Record<string, unknown>)[key],
        charPerKey,
      );
    }
    return truncated;
  }

  return result;
}
```
Apply only once at least MIN_TOOL_CALLS_FOR_TRUNCATION tool calls have been recorded:
```typescript
if (budget.toolCallCount >= MIN_TOOL_CALLS_FOR_TRUNCATION) {
  toolResult = truncateToolResult(toolResult);
}
```
Search Quality Detection
Detect diminishing returns from repeated searches:
```typescript
// From packages/backend/convex/lib/budgetTracker.ts
export const LOW_QUALITY_SCORE_THRESHOLD = 0.7;

export function formatSearchWarning(state: BudgetState): string | null {
  const { searchHistory } = state;
  if (searchHistory.length === 0) return null;

  const latest = searchHistory[searchHistory.length - 1];

  // Repeated query (case- and whitespace-insensitive)
  const isDuplicate = searchHistory
    .slice(0, -1)
    .some((h) => h.query.toLowerCase().trim() === latest.query.toLowerCase().trim());
  if (isDuplicate) {
    return `Already searched "${latest.query}". Try different terms or answer with current info.`;
  }

  // Decreasing quality (3+ searches): strictly falling scores ending below 0.5
  if (searchHistory.length >= 3) {
    const last3 = searchHistory.slice(-3).map((h) => h.topScore);
    if (last3[0] > last3[1] && last3[1] > last3[2] && last3[2] < 0.5) {
      return "Search quality declining. Consider different approach or ask user.";
    }
  }

  // Many searches (4+), regardless of result quality
  if (searchHistory.length >= 4) {
    return "Multiple searches performed. Consider answering with current info.";
  }

  return null;
}
```
Check for stuck patterns:
```typescript
export function shouldSuggestAskUser(state: BudgetState): boolean {
  const { searchHistory } = state;

  // 3+ searches with all low quality results
  if (searchHistory.length >= 3) {
    const recentScores = searchHistory.slice(-3).map((h) => h.topScore);
    if (recentScores.every((s) => s < 0.7)) return true;
  }

  // Budget critical (>= 70%)
  if (getContextPercent(state) >= 70) return true;

  return false;
}
```
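How these signals reach the model is not shown in this document. One possible approach, assuming the status line, the optional search warning, and the ask-user flag are joined into a single notice appended to the prompt (`buildBudgetNotice` and its wording are hypothetical):

```typescript
// Hypothetical helper; not part of budgetTracker.ts as far as this document shows.
function buildBudgetNotice(
  status: string, // e.g. the output of formatStatus
  searchWarning: string | null, // e.g. the output of formatSearchWarning
  suggestAskUser: boolean, // e.g. the output of shouldSuggestAskUser
): string {
  const lines: string[] = [status];

  // Only mention search problems when there is something to report.
  if (searchWarning !== null) {
    lines.push(searchWarning);
  }

  // Nudge, don't force: the model still decides what to do next.
  if (suggestAskUser) {
    lines.push("If the available information is insufficient, ask the user.");
  }

  return lines.join("\n");
}
```

Keeping the notice as plain text fits the awareness-only design: nothing is blocked, and the model simply sees the current state before its next step.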
Key Files
- packages/backend/convex/lib/budgetTracker.ts - Budget tracking, timeouts, truncation
- packages/backend/convex/generation/tools.ts - Tool building with budget integration
Pass budget state to tools for rate limiting:
```typescript
// From packages/backend/convex/generation/tools.ts
export interface BuildToolsConfig {
  ctx: ActionCtx;
  userId: Id<"users">;
  conversationId: Id<"conversations">;
  budgetState?: {
    current: BudgetState;
    update: (newState: BudgetState) => void;
  };
  // ... other config
}

export function buildTools(config: BuildToolsConfig): Record<string, unknown> {
  const { ctx, userId, conversationId, budgetState } = config;
  const tools: Record<string, unknown> = {};

  // Pass budgetState to tools that need it (searchAll, etc.).
  // searchCache is not shown in this excerpt.
  tools.searchAll = createSearchAllTool(
    ctx,
    userId,
    conversationId,
    searchCache,
    budgetState, // Tool can check limits and update state
  );

  // ... other tools
  return tools;
}
```
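The `{ current, update }` shape lets a tool read the latest immutable state and publish a replacement. A minimal sketch of how a tool might use it, assuming `update` swaps in the new state so that later reads of `current` see it (`createBudgetHandle` and `tryReserveToolCall` are hypothetical names, and the state is reduced to the one field needed here):

```typescript
// Hypothetical sketch of a tool-side guard built on the { current, update } handle.
interface ToolCounts {
  toolCallCounts: Record<string, number>;
}

interface BudgetHandle {
  current: ToolCounts;
  update: (newState: ToolCounts) => void;
}

function createBudgetHandle(initial: ToolCounts): BudgetHandle {
  const handle: BudgetHandle = {
    current: initial,
    update(newState) {
      // Replace the state object; the previous one is never mutated.
      handle.current = newState;
    },
  };
  return handle;
}

// Returns an error message when the tool is over its limit, otherwise
// records the call and returns null.
function tryReserveToolCall(
  handle: BudgetHandle | undefined,
  toolName: string,
  limit: number,
): string | null {
  if (!handle) return null; // Budget tracking is optional, as in BuildToolsConfig.

  const used = handle.current.toolCallCounts[toolName] ?? 0;
  if (used >= limit) {
    return `${toolName} limit reached (${used}/${limit}). Try a different approach.`;
  }

  handle.update({
    toolCallCounts: { ...handle.current.toolCallCounts, [toolName]: used + 1 },
  });
  return null;
}
```

A tool would call `tryReserveToolCall` first and return `{ error: message }` when it yields a message, which mirrors the check-then-record pattern shown in the rate limiting section.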
Avoid
- Don't block tool calls based on budget - use it for awareness only
- Don't skip the timeout wrapper - it prevents generation hangs
- Don't ignore rate limits - they prevent tool abuse
- Don't truncate the first tool call - the model needs full initial context
- Don't ignore search quality warnings - they indicate stuck patterns