Skip to content

Fix/cohere embed handle null encoding format #1213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

narengogi
Copy link
Collaborator

@narengogi narengogi commented Jul 7, 2025

#1212

Testing done:

  • made requests to cohere embeddings with the following values for encoding_format
    • null
    • no param
    • "float"

Copy link

matter-code-review bot commented Jul 7, 2025

Code Quality

Description

Summary By MatterAI MatterAI logo

🔄 What Changed

This pull request modifies the transform function within the encoding_format configuration for BedrockCohereEmbedConfig, BedrockTitanEmbedConfig, and CohereEmbedConfig. The change specifically enhances the handling of the encoding_format parameter. Previously, it primarily expected an array or implicitly converted a single string. The updated logic now explicitly checks if params.encoding_format is an array or a string. If it's a string, it's wrapped in an array. If it's neither an array nor a string (e.g., null or undefined), the function now correctly returns undefined, aligning with the updated return type string[] | undefined.

🔍 Impact of the Change

This change improves the robustness and type safety of the embedding configurations. It prevents potential runtime errors or unexpected behavior when the encoding_format parameter is provided as a single string or is null/undefined. By explicitly handling these cases, the system becomes more resilient to varied input formats, ensuring consistent and predictable behavior for embedding requests.

📁 Total Files Changed

3 files were modified in this pull request.

🧪 Test Added

While no specific tests were detailed in the PR description, this type of bug fix typically requires unit tests to verify the correct handling of different encoding_format input types (array, string, null, undefined). It's assumed that existing or new unit tests would cover these scenarios to ensure the fix is effective and does not introduce regressions.

🔒Security Vulnerabilities

No security vulnerabilities were detected in the changes introduced by this pull request.

Motivation

This change addresses an issue where the encoding_format parameter might not be consistently handled when provided as a single string or when it's null/undefined, leading to potential runtime issues or incorrect behavior in embedding requests.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Testing

Screenshots (if applicable)

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

#1212

Sequence Diagram

sequenceDiagram
participant EmbedAPI as "Embed API Caller"
participant TransformFunction as "encoding_format.transform"

EmbedAPI->>+TransformFunction: transform(params: { encoding_format: any })
Note over TransformFunction: Handles 'embedding_types' or 'embeddingTypes'

alt params.encoding_format is Array
    TransformFunction-->>-EmbedAPI: string[] (e.g., ["int8", "binary"])
else params.encoding_format is String
    TransformFunction-->>-EmbedAPI: string[] (e.g., ["int8"])
else params.encoding_format is Null/Undefined/Other
    TransformFunction-->>-EmbedAPI: undefined
end

Copy link

@matter-code-review matter-code-review bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR fixes an important issue with handling string encoding_format values in both Bedrock and Cohere embedding providers. The changes look good overall, but I've identified a few improvements to make the code more robust.

@narengogi narengogi requested a review from VisargD July 7, 2025 07:13
@narengogi narengogi linked an issue Jul 9, 2025 that may be closed by this pull request
@VisargD VisargD merged commit b8d0ba5 into Portkey-AI:main Jul 10, 2025
1 of 2 checks passed
Copy link

Important

PR Review Skipped

PR review skipped as per the configuration setting. Run a manually review by commenting /matter review

💡Tips to use Matter AI

Command List

  • /matter summary: Generate AI Summary for the PR
  • /matter review: Generate AI Reviews for the latest commit in the PR
  • /matter review-full: Generate AI Reviews for the complete PR
  • /matter release-notes: Generate AI release-notes for the PR
  • /matter : Chat with your PR with Matter AI Agent
  • /matter remember : Generate AI memories for the PR
  • /matter explain: Get an explanation of the PR
  • /matter help: Show the list of available commands and documentation
  • Need help? Join our Discord server: https://discord.gg/fJU5DvanU3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nulls are not handled properly for cohere encoding_format
2 participants