feat: support bedrock prompt caching #1059

Merged
merged 2 commits into from
Apr 25, 2025

Conversation

b4s36t4
Contributor

@b4s36t4 b4s36t4 commented Apr 24, 2025

Code Quality type: new feature

Author Description

Description

🔄 What Changed

This PR adds support for prompt caching in Bedrock models by refactoring the existing Anthropic caching mechanism to be more generic and implementing cache point markers and token usage tracking for Bedrock providers.

🔍 Impact of the Change

Enables prompt caching for Bedrock models, which can improve performance and reduce token costs: specific parts of a prompt can be marked for caching, and cache read/write token usage is tracked in the response.
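
As a rough illustration of the cache point idea, the sketch below turns content parts that carry `cache_control` into Bedrock Converse content blocks, emitting a `cachePoint` block after each marked part. The type names and the `toBedrockContent` helper are hypothetical and are not taken from this PR's diff; the gateway's actual logic lives in `getMessageTextContentArray()` and `getMessageContent()` and may differ.

```typescript
// Hypothetical types for illustration only; the gateway's real request
// types are defined in src/types/requestBody.ts.
interface CacheControl {
  type: "ephemeral";
}

interface TextPart {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Bedrock Converse content blocks: a text block or a cache point marker.
type BedrockBlock = { text: string } | { cachePoint: { type: "default" } };

// Emit each text part, followed by a cachePoint block whenever the part
// carries cache_control. Everything before the marker becomes cacheable.
function toBedrockContent(parts: TextPart[]): BedrockBlock[] {
  const blocks: BedrockBlock[] = [];
  for (const part of parts) {
    blocks.push({ text: part.text });
    if (part.cache_control) {
      blocks.push({ cachePoint: { type: "default" } });
    }
  }
  return blocks;
}

const exampleBlocks = toBedrockContent([
  {
    type: "text",
    text: "Long, reusable system instructions...",
    cache_control: { type: "ephemeral" },
  },
  { type: "text", text: "What is the capital of France?" },
]);
```

Placing the marker after the marked part mirrors how Anthropic's `cache_control` applies to the prefix ending at that block; whether the PR handles non-text parts the same way is not stated here.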

📁 Total Files Changed

3 files modified with 92 additions and 26 deletions:

  • src/providers/anthropic/chatComplete.ts
  • src/providers/bedrock/chatComplete.ts
  • src/types/requestBody.ts

🧪 Test Added

N/A - No tests were added in this PR.

🔒 Security Vulnerabilities

N/A - No security vulnerabilities were introduced.

Motivation

To extend the existing caching mechanism to work with Bedrock providers, improving performance and reducing costs.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Testing

Screenshots (if applicable)

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

N/A

Quality Recommendations

  1. Add unit tests to verify the caching functionality works as expected

  2. Add documentation comments explaining the caching mechanism and how to use it

  3. Add error handling for potential issues with cache point insertion

  4. Consider adding a validation function to ensure cache_control is properly formatted
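
Recommendation 4 could be approached with a small type guard like the one below. This is a minimal sketch, assuming that `{ type: "ephemeral" }` is the only accepted shape for `cache_control`; `CacheControlMarker` and `isValidCacheControl` are hypothetical names that do not appear in this PR.

```typescript
// Assumed shape of a well-formed cache_control value (hypothetical name).
type CacheControlMarker = { type: "ephemeral" };

// Type guard: true only for a non-null object whose `type` is "ephemeral".
function isValidCacheControl(value: unknown): value is CacheControlMarker {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  return (value as { type?: unknown }).type === "ephemeral";
}

// Example: skip cache point insertion when the marker is malformed.
function shouldInsertCachePoint(part: { cache_control?: unknown }): boolean {
  return part.cache_control !== undefined && isValidCacheControl(part.cache_control);
}
```

A guard of this kind would also give the unit tests in recommendation 1 an easy first target, since it is a pure function with no provider dependencies.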

Sequence Diagram

sequenceDiagram
    participant Client
    participant Gateway
    participant BedrockProvider
    participant AnthropicProvider
    participant ModelAPI

    Client->>Gateway: Send request with cache_control
    Note over Gateway: Request contains messages or tools with cache_control
    
    alt Anthropic Provider
        Gateway->>AnthropicProvider: Process request
        AnthropicProvider->>AnthropicProvider: Transform messages with PromptCache
        AnthropicProvider->>ModelAPI: Send request with cache points
    else Bedrock Provider
        Gateway->>BedrockProvider: Process request
        BedrockProvider->>BedrockProvider: getMessageTextContentArray() with cache points
        BedrockProvider->>BedrockProvider: getMessageContent() with cache points
        BedrockProvider->>BedrockProvider: Transform tools with cache points
        BedrockProvider->>ModelAPI: Send request with cache points
        ModelAPI->>BedrockProvider: Return response with cache token usage
        BedrockProvider->>BedrockProvider: BedrockChatCompleteResponseTransform()
        Note over BedrockProvider: Add cache_read_input_tokens and cache_creation_input_tokens
        BedrockProvider->>Gateway: Return formatted response with cache metrics
    end
    
    Gateway->>Client: Return response with cache usage information
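
The response-side step in the diagram, adding `cache_read_input_tokens` and `cache_creation_input_tokens`, can be sketched as a usage mapping. The input field names follow the Bedrock Converse `usage` object; the `GatewayUsage` shape and the `mapUsage` helper are assumptions for illustration and are not the code of `BedrockChatCompleteResponseTransform()`.

```typescript
// Token usage as returned by the Bedrock Converse API.
interface BedrockUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadInputTokens?: number;
  cacheWriteInputTokens?: number;
}

// Assumed gateway-side usage shape (hypothetical; OpenAI-style names plus
// the two cache fields named in the diagram).
interface GatewayUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Copy the base counts and attach cache metrics only when Bedrock reports them.
function mapUsage(usage: BedrockUsage): GatewayUsage {
  const mapped: GatewayUsage = {
    prompt_tokens: usage.inputTokens,
    completion_tokens: usage.outputTokens,
    total_tokens: usage.totalTokens,
  };
  if (usage.cacheReadInputTokens !== undefined) {
    mapped.cache_read_input_tokens = usage.cacheReadInputTokens;
  }
  if (usage.cacheWriteInputTokens !== undefined) {
    mapped.cache_creation_input_tokens = usage.cacheWriteInputTokens;
  }
  return mapped;
}
```

This sketch passes the base counts through unchanged. Whether the merged code folds cached tokens into `prompt_tokens` is not stated in the description, so that choice is left open here.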

@b4s36t4
Contributor Author

b4s36t4 commented Apr 24, 2025

/matter review

I encountered an error while trying to review this PR. Please try again later.

@VisargD VisargD merged commit 51430c9 into Portkey-AI:main Apr 25, 2025
1 check passed
2 participants