
Conversation

@steebchen
Member

@steebchen steebchen commented Oct 26, 2025

Summary

  • Tracks and exposes cached input tokens (cachedContentTokenCount) for Google model usage
  • Propagates cached tokens through extraction, parsing, and payload transformation

Changes

Token extraction

  • Updated extract-token-usage.ts to read usageMetadata.cachedContentTokenCount and store it as cachedTokens

Provider response parsing

  • Updated parse-provider-response.ts to read json.usageMetadata.cachedContentTokenCount and store it as cachedTokens

Streaming payload transformation

  • Updated transform-streaming-to-openai.ts to include prompt_tokens_details.cached_tokens when a cached token count is present, across all relevant payload branches (a sketch of the pattern follows below)
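All three files follow the same pass-through pattern. As an illustration only (not the actual diff), a minimal TypeScript sketch of the extraction step could look like the following; the GoogleUsageMetadata and TokenUsage shapes and the extractGoogleTokenUsage name are assumptions made for this example:

// Sketch only: read Google-style usage metadata and surface cachedContentTokenCount
// as cachedTokens. Type and function names are assumptions, not the repository's code.
interface GoogleUsageMetadata {
	promptTokenCount?: number;
	candidatesTokenCount?: number;
	thoughtsTokenCount?: number;
	cachedContentTokenCount?: number;
}

interface TokenUsage {
	promptTokens: number | null;
	completionTokens: number | null;
	reasoningTokens: number | null;
	cachedTokens: number | null;
}

function extractGoogleTokenUsage(usageMetadata?: GoogleUsageMetadata): TokenUsage {
	return {
		promptTokens: usageMetadata?.promptTokenCount ?? null,
		completionTokens: usageMetadata?.candidatesTokenCount ?? null,
		reasoningTokens: usageMetadata?.thoughtsTokenCount ?? null,
		// `?? null` keeps an explicit 0 and only maps undefined/null to null.
		cachedTokens: usageMetadata?.cachedContentTokenCount ?? null,
	};
}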

Rationale

  • Improves accuracy of token accounting by including cached content tokens from Google models

Test plan

  • Verify that when a Google model returns cachedContentTokenCount, it is captured in the token usage stats
  • Verify that the OpenAI-formatted payload includes prompt_tokens_details.cached_tokens when a cached count is available
  • Ensure behavior remains unchanged when cachedContentTokenCount is absent (a test sketch follows below)
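As a sketch of how the first and last checks could be automated (the vitest harness, the extractTokenUsage name, and its input shape are assumptions, not the repository's actual test setup):

import { describe, expect, it } from "vitest";
// Hypothetical import path and helper name, used only to illustrate the test plan.
import { extractTokenUsage } from "./extract-token-usage";

describe("Google cached token tracking", () => {
	it("captures cachedContentTokenCount when present", () => {
		const usage = extractTokenUsage({
			usageMetadata: { promptTokenCount: 120, cachedContentTokenCount: 80 },
		});
		expect(usage.cachedTokens).toBe(80);
	});

	it("keeps behavior unchanged when cachedContentTokenCount is absent", () => {
		const usage = extractTokenUsage({ usageMetadata: { promptTokenCount: 120 } });
		expect(usage.cachedTokens).toBeNull();
	});
});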

🌿 Generated by Terry


ℹ️ Tag @terragon-labs to ask questions and address PR feedback

📎 Task: https://www.terragonlabs.com/task/637c3997-7ae3-43a6-9fdf-e95d69b2bfeb

Summary by CodeRabbit

New Features

  • Token usage tracking now includes cached content token counts from Google and Vertex AI providers, exposing cached token information alongside existing prompt and total token counts in usage metrics.

- Extract and parse cachedContentTokenCount from usageMetadata
- Include cached_tokens details in transformStreamingToOpenai output

This enables detailed tracking of cached content tokens for better usage analytics.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@bunnyshell

bunnyshell bot commented Oct 26, 2025

❗ Preview Environment deployment failed on Bunnyshell

See: Environment Details | Pipeline Logs

Available commands (reply to this comment):

  • 🚀 /bns:deploy to redeploy the environment
  • /bns:delete to remove the environment

@coderabbitai
Contributor

coderabbitai bot commented Oct 26, 2025

Walkthrough

The pull request extends token usage tracking across three token-handling modules to capture cachedContentTokenCount from Google and Vertex AI provider responses, making cached token metrics available in token usage metadata and OpenAI-formatted streaming responses.

Changes

Google cached token count support
Files: apps/gateway/src/chat/tools/extract-token-usage.ts, parse-provider-response.ts, transform-streaming-to-openai.ts
Summary: Adds extraction of cachedContentTokenCount from Google provider metadata and surfaces it as cachedTokens in token usage structures. In transform-streaming-to-openai.ts, cached token details are conditionally embedded into usage.tokens.prompt_tokens_details.cached_tokens when the field is present in usage metadata.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Changes follow a consistent, homogeneous pattern of extracting and passing through a new field across the token pipeline
  • All modifications are additive and non-breaking, gated on field existence
  • Low logic density with straightforward data flow transformations


Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 66.67%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The pull request title "feat(google): track and expose cached input tokens" directly and accurately summarizes the main change across all three modified files. The changes consistently focus on capturing cached token counts (cachedContentTokenCount) from Google's responses and propagating this data through the token extraction, parsing, and streaming transformation layers. The title is concise, uses conventional commit format, includes the relevant provider scope (Google), and clearly conveys the primary objective without vagueness or unnecessary details.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch terragon/fix-google-model-cached-tokens-71qih2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot changed the title from "Track and expose cached input tokens for Google model" to "feat(google): track and expose cached input tokens" on Oct 26, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09f8a8f and 4fe1e56.

📒 Files selected for processing (3)
  • apps/gateway/src/chat/tools/extract-token-usage.ts (1 hunks)
  • apps/gateway/src/chat/tools/parse-provider-response.ts (1 hunks)
  • apps/gateway/src/chat/tools/transform-streaming-to-openai.ts (3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

Always use top-level import; never use require() or dynamic import()

Files:

  • apps/gateway/src/chat/tools/parse-provider-response.ts
  • apps/gateway/src/chat/tools/extract-token-usage.ts
  • apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
apps/{gateway,api}/**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

apps/{gateway,api}/**/*.ts: Use Hono for HTTP routing in Gateway and API services
Use Zod schemas for request/response validation in server routes

Files:

  • apps/gateway/src/chat/tools/parse-provider-response.ts
  • apps/gateway/src/chat/tools/extract-token-usage.ts
  • apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Never use any or as any in this TypeScript project unless absolutely necessary
Always use top-level import; do not use require or dynamic import()

Files:

  • apps/gateway/src/chat/tools/parse-provider-response.ts
  • apps/gateway/src/chat/tools/extract-token-usage.ts
  • apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
{apps/{api,gateway}/**/*.ts,packages/db/**/*.ts}

📄 CodeRabbit inference engine (CLAUDE.md)

For read operations, use db().query.<table>.findMany() or db().query.<table>.findFirst()

Files:

  • apps/gateway/src/chat/tools/parse-provider-response.ts
  • apps/gateway/src/chat/tools/extract-token-usage.ts
  • apps/gateway/src/chat/tools/transform-streaming-to-openai.ts
🔇 Additional comments (2)
apps/gateway/src/chat/tools/extract-token-usage.ts (1)

27-27: LGTM! Correct null coalescing operator used.

The use of ?? null correctly preserves 0 as a valid cached token count, only defaulting to null when the field is undefined or null. This is consistent with the surrounding token extraction logic (lines 23, 24, 26).

apps/gateway/src/chat/tools/transform-streaming-to-openai.ts (1)

299-303: LGTM! Consistent pattern for optional token details.

The conditional spreading correctly follows the existing pattern used for reasoning_tokens (lines 296-298, 350-352, 390-392), where optional fields are only included when present and non-zero. This maintains backward compatibility and aligns with OpenAI's convention of omitting optional fields when not applicable.

The identical block appears in three streaming contexts (content delta, finish reason, and fallback), which is appropriate given the different scenarios being handled.

Also applies to: 353-357, 393-397
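For illustration, the conditional-spread pattern described above could be sketched as follows (values and variable names are made up for the example; this is not the file's code):

// Only attach prompt_tokens_details when a cached token count is present and non-zero,
// mirroring how reasoning_tokens is handled elsewhere in the transformer.
const cachedTokens: number | null = 80; // e.g. taken from the extracted usage metadata

const usage = {
	prompt_tokens: 120,
	completion_tokens: 45,
	total_tokens: 165,
	...(cachedTokens
		? { prompt_tokens_details: { cached_tokens: cachedTokens } }
		: {}),
};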

In apps/gateway/src/chat/tools/parse-provider-response.ts:

promptTokens = json.usageMetadata?.promptTokenCount || null;
completionTokens = json.usageMetadata?.candidatesTokenCount || null;
reasoningTokens = json.usageMetadata?.thoughtsTokenCount || null;
cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;

⚠️ Potential issue | 🟡 Minor

Inconsistent nullish handling - use ?? null instead of || null.

The || null operator treats 0 as falsy and converts it to null, which is incorrect for token counts. A cached token count of 0 (no cached tokens) is distinct from null (not available) and should be preserved for accurate token accounting.

In extract-token-usage.ts line 27, the same field extraction correctly uses ?? null. Consider aligning this file with that pattern.
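A three-line illustration of the difference:

const count = 0;                    // provider reported zero cached tokens
const withOr = count || null;      // null: the legitimate 0 is discarded
const withNullish = count ?? null; // 0: preserved; only undefined/null become null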

Apply this diff:

-			cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;
+			cachedTokens = json.usageMetadata?.cachedContentTokenCount ?? null;

Note: This same pattern appears throughout this file (lines 181-183, 215-218, etc.), so consider addressing it more broadly for consistency with extract-token-usage.ts.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-cachedTokens = json.usageMetadata?.cachedContentTokenCount || null;
+cachedTokens = json.usageMetadata?.cachedContentTokenCount ?? null;
🤖 Prompt for AI Agents
In apps/gateway/src/chat/tools/parse-provider-response.ts around line 184 (and
similarly lines ~181-183, 215-218), the code uses `|| null` which converts valid
zero counts to null; change these uses to the nullish coalescing operator `??
null` so that 0 is preserved while undefined/null become null, and scan the file
to replace other occurrences of `|| null` for token-count or usage fields to
match the pattern used in extract-token-usage.ts.

