
Conversation

@b4s36t4 (Contributor) commented Apr 24, 2025

Labels: Code Quality, type: new feature

Author Description

🔄 What Changed

This PR adds support for prompt caching in Bedrock models. It refactors the existing Anthropic caching mechanism to be more generic, and implements cache point markers and cache token usage tracking for the Bedrock provider.
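For context, a hedged sketch of what a request that opts into caching could look like on the gateway's chat-completions surface. Only the `cache_control` marker itself comes from this PR; the model ID and the exact content shape are illustrative assumptions:

```typescript
// Sketch of a request with an Anthropic-style cache marker (assumed shape).
const request = {
  model: 'anthropic.claude-3-5-sonnet-20240620-v1:0',
  messages: [
    {
      role: 'system',
      content: [
        {
          type: 'text',
          text: '...a long, reusable system prompt...',
          // Everything up to and including this block becomes a cacheable
          // prefix that later requests can reuse.
          cache_control: { type: 'ephemeral' },
        },
      ],
    },
    { role: 'user', content: 'Summarize the latest changes.' },
  ],
};
```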

🔍 Impact of the Change

Enables prompt caching for Bedrock models, which can improve performance and reduce token costs: specific parts of a prompt can be marked for caching, and cache read/write token usage is reported back in the response.
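As a rough illustration of the tracking side, the transformed response's usage object might look like the following. The two cache fields are the ones the sequence diagram below says `BedrockChatCompleteResponseTransform()` adds; the surrounding OpenAI-style field names are assumptions:

```typescript
// Assumed usage shape after the Bedrock response transform runs.
const usage = {
  prompt_tokens: 1024,
  completion_tokens: 128,
  total_tokens: 1152,
  cache_read_input_tokens: 896, // prompt tokens served from the cache
  cache_creation_input_tokens: 0, // prompt tokens written to the cache
};
```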

📁 Total Files Changed

3 files modified with 92 additions and 26 deletions:

  • src/providers/anthropic/chatComplete.ts
  • src/providers/bedrock/chatComplete.ts
  • src/types/requestBody.ts

🧪 Tests Added

N/A - No tests were added in this PR.

🔒 Security Vulnerabilities

N/A - No security vulnerabilities were introduced.

Motivation

To extend the existing caching mechanism to work with Bedrock providers, improving performance and reducing costs.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Testing

Screenshots (if applicable)

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

N/A

Quality Recommendations

  1. Add unit tests to verify the caching functionality works as expected

  2. Add documentation comments explaining the caching mechanism and how to use it

  3. Add error handling for potential issues with cache point insertion

  4. Consider adding a validation function to ensure cache_control is properly formatted (a minimal sketch follows this list)
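A minimal sketch of what recommendation 4 could look like, assuming the Anthropic-style `cache_control` marker; `isValidCacheControl` is a hypothetical helper, not part of this PR:

```typescript
type CacheControl = { type: 'ephemeral' };

// Hypothetical guard: accept only the cache_control shape the providers
// understand, so malformed markers fail fast instead of being forwarded.
function isValidCacheControl(value: unknown): value is CacheControl {
  return (
    typeof value === 'object' &&
    value !== null &&
    (value as Record<string, unknown>).type === 'ephemeral'
  );
}
```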

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant BedrockProvider
    participant AnthropicProvider
    participant ModelAPI

    Client->>Gateway: Send request with cache_control
    Note over Gateway: Request contains messages or tools with cache_control

    alt Anthropic Provider
        Gateway->>AnthropicProvider: Process request
        AnthropicProvider->>AnthropicProvider: Transform messages with PromptCache
        AnthropicProvider->>ModelAPI: Send request with cache points
    else Bedrock Provider
        Gateway->>BedrockProvider: Process request
        BedrockProvider->>BedrockProvider: getMessageTextContentArray() with cache points
        BedrockProvider->>BedrockProvider: getMessageContent() with cache points
        BedrockProvider->>BedrockProvider: Transform tools with cache points
        BedrockProvider->>ModelAPI: Send request with cache points
        ModelAPI->>BedrockProvider: Return response with cache token usage
        BedrockProvider->>BedrockProvider: BedrockChatCompleteResponseTransform()
        Note over BedrockProvider: Add cache_read_input_tokens and cache_creation_input_tokens
        BedrockProvider->>Gateway: Return formatted response with cache metrics
    end

    Gateway->>Client: Return response with cache usage information
```
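For readers unfamiliar with the Bedrock branch above, here is a hedged sketch of the kind of transform it describes: an Anthropic-style `cache_control` marker on a content part becomes an explicit `cachePoint` block in the Bedrock Converse payload. The helper name and types below are assumptions, not the PR's actual functions:

```typescript
type ContentPart = {
  type: string;
  text?: string;
  cache_control?: { type: 'ephemeral' };
};

type BedrockBlock = { text: string } | { cachePoint: { type: 'default' } };

// Hypothetical transform in the spirit of getMessageTextContentArray().
function toBedrockContent(parts: ContentPart[]): BedrockBlock[] {
  const blocks: BedrockBlock[] = [];
  for (const part of parts) {
    // Copy the text content through unchanged.
    if (part.text) blocks.push({ text: part.text });
    // A cache_control marker on a part becomes an explicit cache point
    // block immediately after it in the Bedrock payload.
    if (part.cache_control) blocks.push({ cachePoint: { type: 'default' } });
  }
  return blocks;
}
```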

@b4s36t4 (Contributor, Author) commented Apr 24, 2025

/matter review

@matter-code-review (Contributor) commented:
I encountered an error while trying to review this PR. Please try again later.

@VisargD merged commit 51430c9 into Portkey-AI:main on Apr 25, 2025
1 check passed
