Skip to content

Conversation

hopeyen
Copy link
Contributor

@hopeyen hopeyen commented Mar 11, 2025

Why are these changes needed?

https://eigenlabs.slack.com/archives/C06JZQHN5R7/p1741641849473049

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@hopeyen hopeyen marked this pull request as ready for review March 11, 2025 13:56
@hopeyen hopeyen requested a review from jianoaix March 11, 2025 13:56
Copy link
Contributor

@jianoaix jianoaix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to drop the s3 blobstore based existence check function?

if s.blobStore.CheckBlobExists(ctx, blobKey) {

// check if blob already exists
_, err = s.blobMetadataStore.GetBlobMetadata(ctx, blobKey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetBlobMetadata will actually fetch the data, we may make it more efficient by just checking the existence of keys without fetching data.
I think DynamoDB can do this by just projecting the pk, something like

queryInput := &dynamodb.QueryInput{
    TableName: aws.String(s.tableName),
    KeyConditionExpression: ...
    ExpressionAttributeValues: ...
    Limit: aws.Int32(1), // Only check for existence
    ProjectionExpression: aws.String("PK"), // Only fetch the key attribute
}

May need to add an interface to BlobMetadata store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, I've added something like this!

return fmt.Errorf("blob already exists: %s", blobKey.Hex())
}

// Check if the error is NOT "metadata not found" - which would be a real error
if err != nil && !errors.Is(err, common.ErrMetadataNotFound) {
return fmt.Errorf("failed to check blob existence: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it's not an InvalidArgument error, but an Internal error. Note all the errors from this validateDispersalRequest are treated as InvalidArgument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I moved the blob existence check out of validateDispersalRequest, and typed the errors with more granularity. lmk if that looks okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea that looks fine

@hopeyen hopeyen requested a review from jianoaix March 12, 2025 05:53
Copy link
Contributor

@jianoaix jianoaix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -265,6 +265,33 @@ func (s *BlobMetadataStore) GetBlobMetadata(ctx context.Context, blobKey corev2.
return metadata, nil
}

// DoesBlobExist checks if a blob exists without fetching the entire metadata.
func (s *BlobMetadataStore) DoesBlobExist(ctx context.Context, blobKey corev2.BlobKey) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the previous name in s3 store may fit well CheckBlobExists

Limit: aws.Int32(1),
}

result, err := s.dynamoDBClient.QueryWithInput(ctx, queryInput)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use GetItem? Might be slightly more efficient than Query

// CheckBlobExists checks if a blob exists without fetching the entire metadata.
func (s *BlobMetadataStore) CheckBlobExists(ctx context.Context, blobKey corev2.BlobKey) (bool, error) {
// Use GetItem with ProjectionExpression to minimize data transfer
item, err := s.dynamoDBClient.GetItem(ctx, s.tableName, map[string]types.AttributeValue{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use projection as well for GetItem I think: https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/dynamodb#GetItemInput

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current dynamodb client interface doesn't support projection but I've added it:)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG! Can we make it GetItemWithInput (similar to QueryWithInput) and passing in GetItemInput as param, so in the future if there are other needs to customize, it'll be easy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, makes sense! updated

Copy link
Contributor

@jianoaix jianoaix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@hopeyen hopeyen merged commit 54a44b3 into master Mar 14, 2025
12 checks passed
@hopeyen hopeyen deleted the hope/blob-key-check-metadata branch March 14, 2025 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants