-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: Add s3 storage-based registry store in Go feature server #5336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add s3 storage-based registry store in Go feature server #5336
Conversation
_, err := r.getRegistryProto() | ||
if err != nil { | ||
if _, ok := r.registryStore.(*FileRegistryStore); ok { // S3에는 굳이 연동할 필요 없어 보임. 오히려 이로 인해 정상적이던 s3 내의 레지스트리 파일이 초기화된 파일로 덮어쓰기 될 수도 있지 않음? | ||
if _, ok := r.registryStore.(*FileRegistryStore); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't apply if statement
in case that registry store is S3RegistryStore
. because this initialized empty registry store is likely to overwrite normal registry file in s3 storage. Of course, this case is when the err
of r.getRegistryProto()
method is the network error not abnormal file in s3 storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you mind adding a test for this please?
@franciscojavierarceo thanks for review. But I think about how to write test-code for s3 registry store. As you konw, for testing s3 registry, any s3 bucket needs to exist. Of course, in my localhost test, my own s3 bucket exists but in the test of this public repository, I thins that a single s3 bucket will be public.. it is right? Or, how to make mock s3 bucket? Can you advise to me about how to add test-code for s3 registry store? |
Hi @iamcodingcat , thank you for this feature. I know add unit test/integration test here is difficult. by any chance, can you post any results/logs from your local run here as a reference for us? So far, the only concern from my side is the authentication and the credential handling for accessing the S3. |
import (
"bytes"
"context"
"errors"
"io"
"testing"
"github.com/aws/aws-sdk-go-v2/service/s3"
"github.com/aws/aws-sdk-go-v2/service/s3/types"
"github.com/feast-dev/feast/go/internal/feast/registry"
"github.com/stretchr/testify/assert"
)
// Define a mock S3 client
type MockS3Client struct {
GetObjectFn func(ctx context.Context, input *s3.GetObjectInput) (*s3.GetObjectOutput, error)
DeleteObjectFn func(ctx context.Context, input *s3.DeleteObjectInput) (*s3.DeleteObjectOutput, error)
}
func (m *MockS3Client) GetObject(ctx context.Context, input *s3.GetObjectInput) (*s3.GetObjectOutput, error) {
if m.GetObjectFn != nil {
return m.GetObjectFn(ctx, input)
}
return nil, errors.New("not implemented")
}
func (m *MockS3Client) DeleteObject(ctx context.Context, input *s3.DeleteObjectInput) (*s3.DeleteObjectOutput, error) {
if m.DeleteObjectFn != nil {
return m.DeleteObjectFn(ctx, input)
}
return nil, errors.New("not implemented")
} func NewMockS3RegistryStore(mockClient *MockS3Client, filePath string) *registry.S3RegistryStore {
return ®istry.S3RegistryStore{
filePath: filePath,
s3Client: mockClient,
}
} |
Hi. @shuchu thanks for review. I attached my console std output after running ![]() |
@franciscojavierarceo thanks for your guide. I try to add test-code and if I have a question, I will ask for it. |
@shuchu @franciscojavierarceo
At present, when this error occurs, the server is still running not panic. Of course, when that error occurs, it seems that the health check endpoint (/health) also does not return a normal response. So even if the Go feature server is deployed in a Kubernetes environment as a Deployment, the readinessProbe would fail, and traffic likely wouldn’t be routed to the pods with the error during a rolling update. Still, would it be better to explicitly panic in order to clearly indicate the error? |
ifeq ($(PB_ARCH), aarch64) | ||
PB_ARCH=aarch_64 | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In linux based arm64 architecture platform, output of uname -m
command is aarch64
. If this if-statement doesn't exist, PB_ARCH
value is aarch64
and this cause error when downloading protobuf release download. Because arm architecture name of protobuf release file name is aarch_64
not aarch64
. please refer below link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shuchu @franciscojavierarceo Hi! I add test-code for s3 registry store with mocking s3 client. I ask for you to review it.
}) | ||
if err != nil { | ||
return nil, err | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't apply panic
, a failed test case seems to normal.
}) | ||
if err != nil { | ||
return err | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I don't apply panic
, a failed test case seems to normal.
This is automatically resolved.. I am confused at... anyway it is normal. you ignore this comment! |
@iamcodingcat can you fix the DCO? you just have to click button in UI. |
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
Signed-off-by: iamcodingcat <[email protected]>
9e54a97
to
4862622
Compare
@franciscojavierarceo thnaks for review approval. I fixed DCO. I request your review again :) |
Hi @iamcodingcat it looks like this PR broke the master branch, would you mind reopening it so I can trigger the integration tests? So sorry about that 😢 |
@franciscojavierarceo Sure. I reopened #5352 . please check it :) |
…-dev#5336) * fix: upgrade protobuf version, make `protos` directory beforehand Signed-off-by: iamcodingcat <[email protected]> * feat: add aws s3 storage based registry store Signed-off-by: iamcodingcat <[email protected]> * chore: add aws s3 api related pkgs Signed-off-by: iamcodingcat <[email protected]> * style: remove my custom comment Signed-off-by: iamcodingcat <[email protected]> * refact: separate s3 registry file from `local.go` Signed-off-by: iamcodingcat <[email protected]> * feat: add if-statement in Makefile on linux arm64 os-platform Signed-off-by: iamcodingcat <[email protected]> * feat: add test-code for s3 registry store Signed-off-by: iamcodingcat <[email protected]> --------- Signed-off-by: iamcodingcat <[email protected]> Co-authored-by: 조영훈 <[email protected]> Signed-off-by: Jacob Weinhold <[email protected]>
…r" (feast-dev#5351) Revert "feat: Add s3 storage-based registry store in Go feature server (feast-dev#5336)" This reverts commit abe18df. Signed-off-by: Jacob Weinhold <[email protected]>
# [0.50.0](v0.49.0...v0.50.0) (2025-06-30) ### Bug Fixes * Add asyncio to integration test ([#5418](#5418)) ([6765515](6765515)) * Add clickhouse to OFFLINE_STORE_CLASS_FOR_TYPE map ([#5251](#5251)) ([9ed2ffa](9ed2ffa)) * Add missing conn.commit() in SnowflakeOnlineStore.online_write_batch ([#5432](#5432)) ([a83dd85](a83dd85)) * Add transformers in required dependencies ([8cde460](8cde460)) * Allow custom annotations on Operator installed objects ([#5339](#5339)) ([44c7a76](44c7a76)) * Dask pulling of latest data ([#5229](#5229)) ([571d81f](571d81f)) * **dask:** preserve remote URIs (e.g. s3://) in DaskOfflineStore path resolution ([2561cfc](2561cfc)) * Fix Event loop is closed error on dynamodb test ([#5480](#5480)) ([fe0f671](fe0f671)) * Fix lineage entity filtering ([#5321](#5321)) ([0d05701](0d05701)) * Fix list saved dataset api ([833696c](833696c)) * Fix NumPy - PyArrow array type mapping in Trino offline store ([#5393](#5393)) ([9ba9ded](9ba9ded)) * Fix pandas 2.x compatibility issue of Trino offline store caused by removed Series.iteritems() method ([#5345](#5345)) ([61e3e02](61e3e02)) * Fix polling mechanism for TestApplyAndMaterialize ([#5451](#5451)) ([b512a74](b512a74)) * Fix remote rbac integration tests ([#5473](#5473)) ([10879ec](10879ec)) * Fix Trino offline store SQL in Jinja template ([#5346](#5346)) ([648c53d](648c53d)) * Fixed CurlGeneratorTab github theme type ([#5425](#5425)) ([5f15329](5f15329)) * Increase the Operator Manager memory limits and requests ([#5441](#5441)) ([6c94dbf](6c94dbf)) * Method signature for push_async is out of date ([#5413](#5413)) ([28c3379](28c3379)), closes [#5410](#5410) [#006BB4](https://github.com/feast-dev/feast/issues/006BB4) * Operator - support securityContext override at Pod level ([#5325](#5325)) ([33ea0f5](33ea0f5)) * Pybuild-deps throws errors w/ latest pip version ([#5311](#5311)) ([f2d6a67](f2d6a67)) * Reopen for integration test about add s3 storage-based registry store in Go feature server ([#5352](#5352)) ([ef75f61](ef75f61)) * resolve Python logger warnings ([#5361](#5361)) ([37d5c19](37d5c19)) * The ignore_paths not taking effect duration feast apply ([#5353](#5353)) ([e4917ca](e4917ca)) * Update generate_answer function to provide correct parameter format to retrieve function ([dc5b2af](dc5b2af)) * Update milvus connect function to work with remote instance ([#5382](#5382)) ([7e5e7d5](7e5e7d5)) * Updating milvus connect function to work with remote instance ([#5401](#5401)) ([b89fadd](b89fadd)) * Upperbound limit for protobuf generation ([#5309](#5309)) ([a114aae](a114aae)) ### Features * Add CLI, SDK, and API documentation page to Feast UI ([#5337](#5337)) ([203e888](203e888)) * Add dark mode toggle to Feast UI ([#5314](#5314)) ([ad02e46](ad02e46)) * Add data labeling tabs to UI ([#5410](#5410)) ([389ceb7](389ceb7)), closes [#006BB4](https://github.com/feast-dev/feast/issues/006BB4) * Add Decimal to allowed python scalar types ([#5367](#5367)) ([4777c03](4777c03)) * Add feast rag retriver functionality ([#5405](#5405)) ([0173033](0173033)) * Add feature view curl generator ([#5415](#5415)) ([7a5b48f](7a5b48f)) * Add feature view lineage tab and filtering to home page lineage ([#5308](#5308)) ([308255d](308255d)) * Add feature view tags to dynamo tags ([#5291](#5291)) ([3a787ac](3a787ac)) * Add HybridOnlineStore for multi-backend online store routing ([#5423](#5423)) ([ebd67d1](ebd67d1)) * Add max_file_size to Snowflake config ([#5377](#5377)) ([e8cdf5d](e8cdf5d)) * Add MCP (Model Context Protocol) support for Feast feature server ([#5406](#5406)) ([de650de](de650de)), closes [#5398](#5398) [#5382](#5382) [#5389](#5389) [#5401](#5401) * Add rag project to default dev UI ([#5323](#5323)) ([3b3e1c8](3b3e1c8)) * Add s3 storage-based registry store in Go feature server ([#5336](#5336)) ([abe18df](abe18df)) * Add support for data labeling in UI ([#5409](#5409)) ([d183c4b](d183c4b)), closes [#27](#27) * Added Lineage APIs to get registry objects relationships ([#5472](#5472)) ([be004ef](be004ef)) * Added rest-apis serving option for registry server ([#5342](#5342)) ([9740fd1](9740fd1)) * Added torch.Tensor as option for online and offline retrieval ([#5381](#5381)) ([0b4ae95](0b4ae95)) * Adding feast delete to CLI ([#5344](#5344)) ([19fe3ac](19fe3ac)) * Adding permissions to UI and refactoring some things ([#5320](#5320)) ([6f1b0cc](6f1b0cc)) * Allow to set registry server rest/grpc mode in operator ([#5364](#5364)) ([99afd6d](99afd6d)) * Allow to use env variable FEAST_FS_YAML_FILE_PATH and FEATURE_REPO_DIR ([#5420](#5420)) ([6a1b33a](6a1b33a)) * Enable materialization for ODFV Transform on Write ([#5459](#5459)) ([3d17892](3d17892)) * Improve search results formatting ([#5326](#5326)) ([18cbd7f](18cbd7f)) * Improvements to Lambda materialization engine ([#5379](#5379)) ([b486f29](b486f29)) * Make batch_source optional in PushSource ([#5440](#5440)) ([#5454](#5454)) ([ae7e20e](ae7e20e)) * Refactor materialization engine ([#5354](#5354)) ([f5c5360](f5c5360)) * Remote Write to Online Store completes client / server architecture ([#5422](#5422)) ([2368f42](2368f42)) * Serialization version 2 and below removed ([#5435](#5435)) ([9e50e18](9e50e18)) * SQLite online retrieval. Add timezone info into timestamp. ([#5386](#5386)) ([6b05153](6b05153)) * Support dual-mode REST and gRPC for Feast Registry Server ([#5396](#5396)) ([fd1f448](fd1f448)) * Support DynamoDB as online store in Go feature server ([#5464](#5464)) ([40d25c6](40d25c6)) * Update Spark Compute read source node to be able to use other data sources ([#5445](#5445)) ([a93d300](a93d300)) ### Reverts * Feat: Add CLI, SDK, and API documentation page to Feast UI" ([#5341](#5341)) ([b492f14](b492f14)), closes [#5337](#5337) * Revert "feat: Add s3 storage-based registry store in Go feature server" ([#5351](#5351)) ([d5d6766](d5d6766)), closes [#5336](#5336) * Revert "fix: Update milvus connect function to work with remote instance" ([#5398](#5398)) ([434dd92](434dd92)), closes [#5382](#5382)
# [0.50.0](v0.49.0...v0.50.0) (2025-07-01) ### Bug Fixes * Add asyncio to integration test ([#5418](#5418)) ([6765515](6765515)) * Add clickhouse to OFFLINE_STORE_CLASS_FOR_TYPE map ([#5251](#5251)) ([9ed2ffa](9ed2ffa)) * Add missing conn.commit() in SnowflakeOnlineStore.online_write_batch ([#5432](#5432)) ([a83dd85](a83dd85)) * Add transformers in required dependencies ([8cde460](8cde460)) * Allow custom annotations on Operator installed objects ([#5339](#5339)) ([44c7a76](44c7a76)) * Dask pulling of latest data ([#5229](#5229)) ([571d81f](571d81f)) * **dask:** preserve remote URIs (e.g. s3://) in DaskOfflineStore path resolution ([2561cfc](2561cfc)) * Fix Event loop is closed error on dynamodb test ([#5480](#5480)) ([fe0f671](fe0f671)) * Fix lineage entity filtering ([#5321](#5321)) ([0d05701](0d05701)) * Fix list saved dataset api ([833696c](833696c)) * Fix NumPy - PyArrow array type mapping in Trino offline store ([#5393](#5393)) ([9ba9ded](9ba9ded)) * Fix pandas 2.x compatibility issue of Trino offline store caused by removed Series.iteritems() method ([#5345](#5345)) ([61e3e02](61e3e02)) * Fix polling mechanism for TestApplyAndMaterialize ([#5451](#5451)) ([b512a74](b512a74)) * Fix remote rbac integration tests ([#5473](#5473)) ([10879ec](10879ec)) * Fix Trino offline store SQL in Jinja template ([#5346](#5346)) ([648c53d](648c53d)) * Fixed CurlGeneratorTab github theme type ([#5425](#5425)) ([5f15329](5f15329)) * Increase the Operator Manager memory limits and requests ([#5441](#5441)) ([6c94dbf](6c94dbf)) * Method signature for push_async is out of date ([#5413](#5413)) ([28c3379](28c3379)), closes [#5410](#5410) [#006BB4](https://github.com/feast-dev/feast/issues/006BB4) * Operator - support securityContext override at Pod level ([#5325](#5325)) ([33ea0f5](33ea0f5)) * Pybuild-deps throws errors w/ latest pip version ([#5311](#5311)) ([f2d6a67](f2d6a67)) * Reopen for integration test about add s3 storage-based registry store in Go feature server ([#5352](#5352)) ([ef75f61](ef75f61)) * resolve Python logger warnings ([#5361](#5361)) ([37d5c19](37d5c19)) * The ignore_paths not taking effect duration feast apply ([#5353](#5353)) ([e4917ca](e4917ca)) * Update generate_answer function to provide correct parameter format to retrieve function ([dc5b2af](dc5b2af)) * Update milvus connect function to work with remote instance ([#5382](#5382)) ([7e5e7d5](7e5e7d5)) * Updating milvus connect function to work with remote instance ([#5401](#5401)) ([b89fadd](b89fadd)) * Upperbound limit for protobuf generation ([#5309](#5309)) ([a114aae](a114aae)) ### Features * Add CLI, SDK, and API documentation page to Feast UI ([#5337](#5337)) ([203e888](203e888)) * Add dark mode toggle to Feast UI ([#5314](#5314)) ([ad02e46](ad02e46)) * Add data labeling tabs to UI ([#5410](#5410)) ([389ceb7](389ceb7)), closes [#006BB4](https://github.com/feast-dev/feast/issues/006BB4) * Add Decimal to allowed python scalar types ([#5367](#5367)) ([4777c03](4777c03)) * Add feast rag retriver functionality ([#5405](#5405)) ([0173033](0173033)) * Add feature view curl generator ([#5415](#5415)) ([7a5b48f](7a5b48f)) * Add feature view lineage tab and filtering to home page lineage ([#5308](#5308)) ([308255d](308255d)) * Add feature view tags to dynamo tags ([#5291](#5291)) ([3a787ac](3a787ac)) * Add HybridOnlineStore for multi-backend online store routing ([#5423](#5423)) ([ebd67d1](ebd67d1)) * Add max_file_size to Snowflake config ([#5377](#5377)) ([e8cdf5d](e8cdf5d)) * Add MCP (Model Context Protocol) support for Feast feature server ([#5406](#5406)) ([de650de](de650de)), closes [#5398](#5398) [#5382](#5382) [#5389](#5389) [#5401](#5401) * Add rag project to default dev UI ([#5323](#5323)) ([3b3e1c8](3b3e1c8)) * Add s3 storage-based registry store in Go feature server ([#5336](#5336)) ([abe18df](abe18df)) * Add support for data labeling in UI ([#5409](#5409)) ([d183c4b](d183c4b)), closes [#27](#27) * Added Lineage APIs to get registry objects relationships ([#5472](#5472)) ([be004ef](be004ef)) * Added rest-apis serving option for registry server ([#5342](#5342)) ([9740fd1](9740fd1)) * Added torch.Tensor as option for online and offline retrieval ([#5381](#5381)) ([0b4ae95](0b4ae95)) * Adding feast delete to CLI ([#5344](#5344)) ([19fe3ac](19fe3ac)) * Adding permissions to UI and refactoring some things ([#5320](#5320)) ([6f1b0cc](6f1b0cc)) * Allow to set registry server rest/grpc mode in operator ([#5364](#5364)) ([99afd6d](99afd6d)) * Allow to use env variable FEAST_FS_YAML_FILE_PATH and FEATURE_REPO_DIR ([#5420](#5420)) ([6a1b33a](6a1b33a)) * Enable materialization for ODFV Transform on Write ([#5459](#5459)) ([3d17892](3d17892)) * Improve search results formatting ([#5326](#5326)) ([18cbd7f](18cbd7f)) * Improvements to Lambda materialization engine ([#5379](#5379)) ([b486f29](b486f29)) * Make batch_source optional in PushSource ([#5440](#5440)) ([#5454](#5454)) ([ae7e20e](ae7e20e)) * Refactor materialization engine ([#5354](#5354)) ([f5c5360](f5c5360)) * Remote Write to Online Store completes client / server architecture ([#5422](#5422)) ([2368f42](2368f42)) * Serialization version 2 and below removed ([#5435](#5435)) ([9e50e18](9e50e18)) * SQLite online retrieval. Add timezone info into timestamp. ([#5386](#5386)) ([6b05153](6b05153)) * Support dual-mode REST and gRPC for Feast Registry Server ([#5396](#5396)) ([fd1f448](fd1f448)) * Support DynamoDB as online store in Go feature server ([#5464](#5464)) ([40d25c6](40d25c6)) * Update Spark Compute read source node to be able to use other data sources ([#5445](#5445)) ([a93d300](a93d300)) ### Reverts * Chore Release "chore(release): release 0.50.0" ([#5483](#5483)) ([0eef391](0eef391)) * Feat: Add CLI, SDK, and API documentation page to Feast UI" ([#5341](#5341)) ([b492f14](b492f14)), closes [#5337](#5337) * Revert "feat: Add s3 storage-based registry store in Go feature server" ([#5351](#5351)) ([d5d6766](d5d6766)), closes [#5336](#5336) * Revert "fix: Update milvus connect function to work with remote instance" ([#5398](#5398)) ([434dd92](434dd92)), closes [#5382](#5382)
What this PR does / why we need it:
This pull request integrates S3 storage as a registry store supported by the Go Feature Server, which is currently in alpha. At the moment, the Go Feature Server only supports a registry store based on the local file system. To make it suitable for production use in a service environment, we developed support for a registry store based on AWS S3 — a widely adopted and popular cloud storage solution.
I developed below things
*For your reference, the commit username younghun-jo-ilevit-com is also me.
Which issue(s) this PR fixes:
No. it is just for feature enhancement.
Misc