feat(google): track and expose cached input tokens #1077
base: main
Conversation
- Extract and parse cachedContentTokenCount from usageMetadata
- Include cached_tokens details in transformStreamingToOpenai output

This enables detailed tracking of cached content tokens for better usage analytics.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
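As a rough sketch of what such an extraction could look like (the interface and function names here are hypothetical; only the usageMetadata field names come from Google's response shape):

```typescript
// Hypothetical sketch: pull token counts out of a Gemini-style
// usageMetadata object. `??` is used so a genuine count of 0 is
// preserved and only a missing field becomes null.
interface UsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  thoughtsTokenCount?: number;
  cachedContentTokenCount?: number;
}

interface TokenUsage {
  promptTokens: number | null;
  completionTokens: number | null;
  reasoningTokens: number | null;
  cachedTokens: number | null;
}

function extractTokenUsage(meta: UsageMetadata | undefined): TokenUsage {
  return {
    promptTokens: meta?.promptTokenCount ?? null,
    completionTokens: meta?.candidatesTokenCount ?? null,
    reasoningTokens: meta?.thoughtsTokenCount ?? null,
    cachedTokens: meta?.cachedContentTokenCount ?? null,
  };
}
```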
❗ Preview Environment deployment failed on Bunnyshell. See: Environment Details | Pipeline Logs
Walkthrough
The pull request extends token usage tracking across three token-handling modules to capture cached input token counts.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Possibly related PRs
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
apps/gateway/src/chat/tools/extract-token-usage.ts (1 hunks)
apps/gateway/src/chat/tools/parse-provider-response.ts (1 hunks)
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Always use top-level import; never use require() or dynamic import()
Files:
apps/gateway/src/chat/tools/parse-provider-response.ts
apps/gateway/src/chat/tools/extract-token-usage.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
apps/{gateway,api}/**/*.ts
📄 CodeRabbit inference engine (AGENTS.md)
apps/{gateway,api}/**/*.ts: Use Hono for HTTP routing in Gateway and API services
Use Zod schemas for request/response validation in server routes
Files:
apps/gateway/src/chat/tools/parse-provider-response.ts
apps/gateway/src/chat/tools/extract-token-usage.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Never use any or as any in this TypeScript project unless absolutely necessary
Always use top-level import; do not use require or dynamic import()
Files:
apps/gateway/src/chat/tools/parse-provider-response.ts
apps/gateway/src/chat/tools/extract-token-usage.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
{apps/{api,gateway}/**/*.ts,packages/db/**/*.ts}
📄 CodeRabbit inference engine (CLAUDE.md)
For read operations, use db().query.<table>.findMany() or db().query.<table>.findFirst()
Files:
apps/gateway/src/chat/tools/parse-provider-response.ts
apps/gateway/src/chat/tools/extract-token-usage.ts
apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
🔇 Additional comments (2)
apps/gateway/src/chat/tools/extract-token-usage.ts (1)
27-27: LGTM! Correct null coalescing operator used.
The use of ?? null correctly preserves 0 as a valid cached token count, only defaulting to null when the field is undefined or null. This is consistent with the surrounding token extraction logic (lines 23, 24, 26).

apps/gateway/src/chat/tools/transform-streaming-to-openai.ts (1)
299-303: LGTM! Consistent pattern for optional token details.
The conditional spreading correctly follows the existing pattern used for reasoning_tokens (lines 296-298, 350-352, 390-392), where optional fields are only included when present and non-zero. This maintains backward compatibility and aligns with OpenAI's convention of omitting optional fields when not applicable.
The identical block appears in three streaming contexts (content delta, finish reason, and fallback), which is appropriate given the different scenarios being handled.
Also applies to: 353-357, 393-397
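The conditional-spread pattern the reviewer describes could look roughly like this (the function name and parameter list are assumptions for illustration; only the output field names follow the OpenAI-style usage object discussed above):

```typescript
// Sketch: include optional detail objects only when the count is
// present and non-zero, matching OpenAI's convention of omitting
// fields that don't apply.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  completion_tokens_details?: { reasoning_tokens: number };
  prompt_tokens_details?: { cached_tokens: number };
}

function buildUsage(
  promptTokens: number,
  completionTokens: number,
  reasoningTokens: number | null,
  cachedTokens: number | null,
): OpenAIUsage {
  return {
    prompt_tokens: promptTokens,
    completion_tokens: completionTokens,
    total_tokens: promptTokens + completionTokens,
    // Spread an empty object when the count is null or 0, so the
    // key is absent from the serialized payload entirely.
    ...(reasoningTokens
      ? { completion_tokens_details: { reasoning_tokens: reasoningTokens } }
      : {}),
    ...(cachedTokens
      ? { prompt_tokens_details: { cached_tokens: cachedTokens } }
      : {}),
  };
}
```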
promptTokens = json.usageMetadata?.promptTokenCount || null;
completionTokens = json.usageMetadata?.candidatesTokenCount || null;
reasoningTokens = json.usageMetadata?.thoughtsTokenCount || null;
cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;
Inconsistent nullish handling - use ?? null instead of || null.
The || null operator treats 0 as falsy and converts it to null, which is incorrect for token counts. A cached token count of 0 (no cached tokens) is distinct from null (not available) and should be preserved for accurate token accounting.
In extract-token-usage.ts line 27, the same field extraction correctly uses ?? null. Consider aligning this file with that pattern.
Apply this diff:
- cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;
+ cachedTokens = json.usageMetadata?.cachedContentTokenCount ?? null;

Note: This same pattern appears throughout this file (lines 181-183, 215-218, etc.), so consider addressing it more broadly for consistency with extract-token-usage.ts.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;
cachedTokens = json.usageMetadata?.cachedContentTokenCount ?? null;
🤖 Prompt for AI Agents
In apps/gateway/src/chat/tools/parse-provider-response.ts around line 184 (and
similarly lines ~181-183, 215-218), the code uses `|| null` which converts valid
zero counts to null; change these uses to the nullish coalescing operator `??
null` so that 0 is preserved while undefined/null become null, and scan the file
to replace other occurrences of `|| null` for token-count or usage fields to
match the pattern used in extract-token-usage.ts.
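The distinction the review calls out is easy to demonstrate in isolation:

```typescript
// `||` treats 0 as falsy and falls through to the right-hand side;
// `??` only falls back when the left-hand side is null or undefined,
// so a genuine zero token count survives.
const usageMetadata: { cachedContentTokenCount?: number } = {
  cachedContentTokenCount: 0,
};

const withOr = usageMetadata.cachedContentTokenCount || null;      // null — the 0 is lost
const withNullish = usageMetadata.cachedContentTokenCount ?? null; // 0 — preserved
```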
Summary
Changes
Token extraction
Provider response parsing
Streaming payload transformation
Rationale
Test plan
🌿 Generated by Terry
ℹ️ Tag @terragon-labs to ask questions and address PR feedback
📎 Task: https://www.terragonlabs.com/task/637c3997-7ae3-43a6-9fdf-e95d69b2bfeb
Summary by CodeRabbit
New Features