anthropic pdf #1074
Merged
@narengogi - Can you please fix the examples added in the description? Not sure if they were updated by Matter bot or were present in the original description.

here are request bodies to test
pdf url
{
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 200,
    "stream": false,
    "messages": [
        {
            "role": "system",
            "content": "Say Hi"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "what is in this pdf"
                },
                {
                    "type": "file",
                    "file": {
                        "file_url": "https://pdfobject.com/pdf/sample.pdf",
                        "mime_type": "application/pdf"
                    }
                }
            ]
        }
    ]
}
pdf base64
{
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 200,
    "stream": false,
    "messages": [
        {
            "role": "system",
            "content": "Say Hi"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "what is in this pdf"
                },
                {
                    "type": "file",
                    "file": {
                        "mime_type": "application/pdf",
                        "file_data": "BASE_64_PDF"
                    }
                }
            ]
        }
    ]
}
plain text document
{
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 200,
    "stream": false,
    "messages": [
        {
            "role": "system",
            "content": "Say Hi"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "what is in this pdf"
                },
                {
                    "type": "file",
                    "file": {
                        "mime_type": "text/plain",
                        "file_data": "hello how are you sir"
                    }
                }
            ]
        }
    ]
}
              
VisargD approved these changes on May 7, 2025
Author Description
A couple of things in this PR:
I've added two new parameters, file_url and mime_type, to the file content part of the chat completions request body.
I've tested with Anthropic, Vertex Anthropic, and Bedrock Anthropic.
Note: Vertex Anthropic does not support plain text or PDF URLs; Bedrock does not support plain text and accepts only S3 URLs.
here are request bodies to test
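pdf url
{ "model": "claude-3-5-sonnet-20240620", "max_tokens": 200, "stream": false, "messages": [ { "role": "system", "content": "Say Hi" }, { "role": "user", "content": [ { "type": "text", "text": "what is in this pdf" }, { "type": "file", "file": { "file_url": "https://pdfobject.com/pdf/sample.pdf", "mime_type": "application/pdf" } } ] } ] }
pdf base64
{ "model": "claude-3-5-sonnet-20240620", "max_tokens": 200, "stream": false, "messages": [ { "role": "system", "content": "Say Hi" }, { "role": "user", "content": [ { "type": "text", "text": "what is in this pdf" }, { "type": "file", "file": { "mime_type": "application/pdf", "file_data": "BASE_64_PDF" } } ] } ] }
plain text document
{ "model": "claude-3-5-sonnet-20240620", "max_tokens": 200, "stream": false, "messages": [ { "role": "system", "content": "Say Hi" }, { "role": "user", "content": [ { "type": "text", "text": "what is in this pdf" }, { "type": "file", "file": { "mime_type": "text/plain", "file_data": "hello how are you sir" } } ] } ] }
Summary By MatterAI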
🔄 What Changed
This PR adds support for PDF files in Anthropic and Bedrock providers by introducing two new parameters in the chat completions request body file content part: file_url and mime_type. The implementation supports both PDF URLs and base64-encoded PDFs, as well as plain text documents.
🔍 Impact of the Change
This enhancement allows users to send PDF documents to Anthropic and Bedrock models for analysis. Provider support varies: Vertex Anthropic doesn't support plain text or PDF URLs, while Bedrock doesn't support plain text and accepts only S3 URLs for PDFs.
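The provider differences described here can be captured in a small lookup, which may help when deciding which of the three request bodies to send to which provider. This is only a sketch: the `Provider` and `FileInput` names, the `SUPPORT` table, and `isSupported` are hypothetical and do not appear in the PR. The `true` entries are inferred from the description (anything not listed as unsupported is assumed to work), not confirmed against the code.

```typescript
// Hypothetical names; the matrix restates the limitations noted in this PR.
type Provider = 'anthropic' | 'vertex-anthropic' | 'bedrock';
type FileInput = 'pdf_url' | 'pdf_base64' | 'plain_text';

const SUPPORT: Record<Provider, Record<FileInput, boolean>> = {
  // Assumption: Anthropic accepts all three, since no limitation is listed.
  anthropic: { pdf_url: true, pdf_base64: true, plain_text: true },
  // Vertex Anthropic: no plain text, no PDF URLs.
  'vertex-anthropic': { pdf_url: false, pdf_base64: true, plain_text: false },
  // Bedrock: no plain text; URLs are accepted only when they point to S3.
  bedrock: { pdf_url: true, pdf_base64: true, plain_text: false },
};

function isSupported(
  provider: Provider,
  input: FileInput,
  fileUrl?: string
): boolean {
  if (!SUPPORT[provider][input]) return false;
  if (provider === 'bedrock' && input === 'pdf_url') {
    // Bedrock only supports S3 URLs, so any other scheme is rejected.
    return fileUrl !== undefined && /^s3:\/\//.test(fileUrl);
  }
  return true;
}
```

For example, `isSupported('bedrock', 'pdf_url', 'https://pdfobject.com/pdf/sample.pdf')` is rejected under these assumptions because the URL is not an S3 URI.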
📁 Total Files Changed
3 files changed with 120 additions and 3 deletions:
src/providers/anthropic/chatComplete.ts: Added PDF handling for Anthropic
src/providers/bedrock/chatComplete.ts: Added PDF handling for Bedrock
src/types/requestBody.ts: Updated content type interfaces
🧪 Test Added
Manual testing was performed with Anthropic, Vertex Anthropic, and Bedrock Anthropic providers using various request bodies (PDF URL, PDF base64, plain text).
🔒 Security Vulnerabilities
N/A
Type of Change
How Has This Been Tested?
Screenshots (if applicable)
N/A
Checklist
Related Issues
N/A
Quality Recommendations
Add error handling for unsupported file types or formats
Add validation for file_url and file_data to ensure they are properly formatted
Consider adding unit tests to verify the PDF handling functionality
Add documentation comments for the new interfaces and functions
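The first two recommendations could look roughly like the following. This is a minimal sketch, not code from the PR: `FileContentPart`, `SUPPORTED_MIME_TYPES`, and `validateFilePart` are hypothetical names, and the specific rules (accepted URL schemes, the base64 check, which MIME types count as supported) are assumptions chosen to match the request bodies in this conversation.

```typescript
// Shape of the file content part, taken from the request bodies in this PR.
interface FileContentPart {
  type: 'file';
  file: { file_url?: string; file_data?: string; mime_type?: string };
}

// Assumption: only the two MIME types exercised in the PR are accepted.
const SUPPORTED_MIME_TYPES: Record<string, boolean> = {
  'application/pdf': true,
  'text/plain': true,
};

// Returns a list of problems; an empty list means the part looks usable.
function validateFilePart(part: FileContentPart): string[] {
  const errors: string[] = [];
  const { file_url, file_data, mime_type } = part.file;

  if (!mime_type || !SUPPORTED_MIME_TYPES[mime_type]) {
    errors.push(`unsupported mime_type: ${mime_type ?? 'missing'}`);
  }
  if (!file_url && !file_data) {
    errors.push('one of file_url or file_data is required');
  }
  // Assumption: http(s) and s3 are the only schemes worth accepting.
  if (file_url && !/^(https?|s3):\/\/\S+$/.test(file_url)) {
    errors.push('file_url must be an http(s) or s3 URL');
  }
  // PDFs passed inline are expected to be base64; plain text is passed as-is.
  if (
    file_data &&
    mime_type === 'application/pdf' &&
    !/^[A-Za-z0-9+\/]+={0,2}$/.test(file_data)
  ) {
    errors.push('file_data must be base64 for application/pdf');
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a caller report every issue with a request in one response; whether the gateway would prefer that is a design choice left open here.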
Sequence Diagram
sequenceDiagram
    participant Client
    participant Gateway
    participant AnthropicProvider
    participant BedrockProvider
    Client->>Gateway: POST /chat/completions
    Note over Client,Gateway: Request with file_url or file_data
    alt Anthropic Provider
        Gateway->>AnthropicProvider: transformAndAppendFileContentItem()
        Note over AnthropicProvider: Process file content based on type
        alt PDF URL
            AnthropicProvider-->>AnthropicProvider: Create AnthropicUrlPdfContentItem
            Note over AnthropicProvider: { type: 'document', source: { type: 'url', url: file_url } }
        else PDF Base64
            AnthropicProvider-->>AnthropicProvider: Create AnthropicBase64PdfContentItem
            Note over AnthropicProvider: { type: 'document', source: { type: 'base64', data: file_data, media_type: mime_type } }
        else Plain Text
            AnthropicProvider-->>AnthropicProvider: Create AnthropicPlainTextContentItem
            Note over AnthropicProvider: { type: 'document', source: { type: 'text', data: file_data, media_type: mime_type } }
        end
    else Bedrock Provider
        Gateway->>BedrockProvider: getMessageContent()
        Note over BedrockProvider: Process file content based on type
        alt File URL (S3)
            BedrockProvider-->>BedrockProvider: Create document with s3Location
            Note over BedrockProvider: { document: { format: fileFormat, name: UUID, source: { s3Location: { uri: file_url } } } }
        else File Data
            BedrockProvider-->>BedrockProvider: Create document with bytes
            Note over BedrockProvider: { document: { format: fileFormat, name: UUID, source: { bytes: file_data } } }
        end
    end
    Gateway-->>Client: Return completion response
#1075
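The two branches of the diagram can be sketched as a pair of pure mapping functions. This is an illustration built only from the shapes shown in the diagram notes; it is not the code in `chatComplete.ts`. The function names `toAnthropicDocument` and `toBedrockDocument` are hypothetical, the document name is passed in by the caller (the diagram indicates a UUID), and deriving the Bedrock `format` from the MIME subtype is an assumption.

```typescript
// Input shape, taken from the request bodies in this PR.
interface FilePart {
  type: 'file';
  file: { file_url?: string; file_data?: string; mime_type?: string };
}

// Output shapes, copied from the notes in the sequence diagram.
type AnthropicDocument =
  | { type: 'document'; source: { type: 'url'; url: string } }
  | { type: 'document'; source: { type: 'base64'; data: string; media_type: string } }
  | { type: 'document'; source: { type: 'text'; data: string; media_type: string } };

type BedrockDocument = {
  document: {
    format: string;
    name: string;
    source: { s3Location: { uri: string } } | { bytes: string };
  };
};

function toAnthropicDocument(part: FilePart): AnthropicDocument | undefined {
  const { file_url, file_data, mime_type } = part.file;
  if (file_url) {
    return { type: 'document', source: { type: 'url', url: file_url } };
  }
  if (!file_data || !mime_type) return undefined;
  if (mime_type === 'text/plain') {
    return { type: 'document', source: { type: 'text', data: file_data, media_type: mime_type } };
  }
  return { type: 'document', source: { type: 'base64', data: file_data, media_type: mime_type } };
}

function toBedrockDocument(part: FilePart, name: string): BedrockDocument | undefined {
  const { file_url, file_data, mime_type } = part.file;
  // Assumption: the format is the MIME subtype, e.g. 'pdf' for 'application/pdf'.
  const format = mime_type ? mime_type.split('/')[1] : undefined;
  if (!format) return undefined;
  if (file_url) {
    return { document: { format, name, source: { s3Location: { uri: file_url } } } };
  }
  if (file_data) {
    return { document: { format, name, source: { bytes: file_data } } };
  }
  return undefined;
}
```

Keeping the mappings free of side effects makes each branch of the diagram easy to unit test, which lines up with the recommendation above to add tests for the PDF handling.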