-
Notifications
You must be signed in to change notification settings - Fork 3
Msudhir/add vector update functionality #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Danny C <[email protected]>
update Catalog argument in athena connector Signed-off-by: Gyumin Lee <[email protected]> Co-authored-by: Gyumin Lee <[email protected]>
Signed-off-by: Danny C <[email protected]>
ensure correct precedence with the two operators Signed-off-by: Ben Fletcher <[email protected]>
Signed-off-by: Jiwon Park <[email protected]>
…east-dev#3677) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3630) * sql.py data_sources.data_source_name String(255) Extend the limit of the data_source_name field from 50 to 255. Signed-off-by: Ross Donnachie <[email protected]>
…east-dev#3680) feat: Optimize bytes processed when retrieving entity df schema to 0 Signed-off-by: Hai Nguyen <[email protected]>
…tore.plan() on python (feast-dev#3640) * fix! KeyError: __dummy on entityless fv Signed-off-by: williamfoschiera <[email protected]> * fix! join_keys typing. Signed-off-by: williamfoschiera <[email protected]> --------- Signed-off-by: williamfoschiera <[email protected]> Co-authored-by: williamfoschiera <[email protected]>
Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3675) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…-dev#3679) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.47.0...v1.53.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17) ### Bug Fixes * Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2)) * Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0)) * Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e)) * Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32)) * Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e)) * Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7)) * For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec)) * Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a)) * Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971)) * Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf)) * Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035)) ### Features * Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf)) * Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd))
* Add fully-qualified-table-name Redshift prop Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * Docstring Signed-off-by: Robin Neufeld <[email protected]> * Test fully_qualified_table_name Signed-off-by: Robin Neufeld <[email protected]> * Simplify logic Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * Test offline_write_batch Signed-off-by: Robin Neufeld <[email protected]> * Bump to trigger CI Signed-off-by: Robin Neufeld <[email protected]> * another bump for ci Signed-off-by: Robin Neufeld <[email protected]> --------- Signed-off-by: Robin Neufeld <[email protected]>
…SA role (feast-dev#3696) Add aws-sts dependency in java sdk Signed-off-by: harmeet-singh-discovery <[email protected]>
| collection_available = utility.has_collection(table_to_keep.name) | ||
| try: | ||
| if collection_available: | ||
| logger.error(f"Collection {table_to_keep.name} already exists.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make this log statement on "info" level. If a collection exists it is not an error and is expected.
|
|
||
|
|
||
| @dataclass | ||
| class MockFeatureView: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really necessary to use a mock? Can you use a VectorFeatureView instead?
|
|
||
| # Assert that logger.info was called with expected messages | ||
| expected_log_calls = [ | ||
| mocker.call("Closing the connection to Milvus"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verification through logs is not a good practice in unit tests. Look into methods of "connections" like get_connection_addr() https://github.com/milvus-io/pymilvus/blob/master/pymilvus/orm/connections.py#L396 to use for the assert statements
| assert len(utility.list_collections()) == 1 | ||
| assert utility.has_collection(self.collection_to_write) is True | ||
| assert ( | ||
| f"Collection {self.collection_to_write} has been created successfully." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to assert a specific log statement here
| partial=None, | ||
| ) | ||
|
|
||
| assert f"Collection {self.collection_to_write} already exists." in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert with list_collections() == 1 instead
| ) | ||
|
|
||
| online_store_creator.teardown() | ||
| MilvusOnlineStore().update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using the milvus online store here makes this test dependent on update() working as expected. Instead creating a collection via the pymilvus sdk directly would be better here to simulate that a collection already exists.
| if collection_available: | ||
| logger.error(f"Collection {table_to_keep.name} already exists.") | ||
| else: | ||
| Collection(name=table_to_keep.name, schema=table_to_keep.schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just catching this. FeatureView schemas are not the same as the schema that Milvus expects. We will need to write logic to translate that.
| ] | ||
| ) | ||
|
|
||
| MilvusOnlineStore().update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here I would also rather create the collection first through the pymilvus sdk so that we do not introduce tests that are interdependent.
| assert utility.has_collection(self.collection_to_delete) is False | ||
| assert len(utility.list_collections()) == 0 | ||
| assert ( | ||
| f"Collection {self.collection_to_delete} has been deleted successfully." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not assert log-statements. If the wording changes one has to go and update the tests.
| ) | ||
|
|
||
| assert ( | ||
| f"Collection {self.collection_to_delete} does not exist or is already deleted." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check with assert connections.has_connection(self.collection_to_delete) instead
…vus readable schema
| f"Collection {table_to_delete.name} has been deleted successfully." | ||
| ) | ||
| else: | ||
| return logger.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove return statement
| field_name = field.name | ||
| description = field.tags.get("description") | ||
| is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
| dimension = field.tags.get("dimension") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dimensions have to be retrieved from VectorFeatureView. Also, the name of the column holding the vector is defined separately in VectorFeatureView ("vector_field")
| is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
| dimension = field.tags.get("dimension") | ||
|
|
||
| if dimension is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
|
|
||
| # inheriting from FeatureView wouldn't work due to issue with conflicting proto classes | ||
| # therefore using composition instead | ||
| name: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature_view already has an attribute name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It did not reflect and threw an error when I tried to add name
|
|
||
| def setup_method(self, milvus_online_setup): | ||
| # Ensuring that the collections created are dropped before the tests are run | ||
| with PymilvusConnectionContext(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's ok to re-use MilvusConnectionManager in order to not duplicate code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to keep the test not dependent on any functionality in Milvus online store. This context manager is a very simple open and close connection, purely for testing purposes. Will not add in username, password, etc.
| tags={ | ||
| "is_primary": "False", | ||
| "description": "float32", | ||
| "dimension": "128", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking about it this is actually a great suggestion to use a tag to pass along the number of dimensions. I think we can do likewise with index algorithm and eliminate the need for a VectorFeatureView altogether. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thats a good idea
| ) | ||
|
|
||
| # Here we want to open and add a collection using pymilvus directly and close the connection, we need to check if the collection count remains 1 and exists. | ||
| with PymilvusConnectionContext(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the embedded Milvus actually persist data after it is stopped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does
| partial=None, | ||
| ) | ||
|
|
||
| assert "Collection abc does not exist or is already deleted." in caplog.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I'd rather assert against Milvus not having any collection. The update() method should really have a return value, but that is an issue in Feast itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
Signed-off-by: Francisco Javier Arceo <[email protected]>
# [0.49.0](feast-dev/feast@v0.48.0...v0.49.0) (2025-04-29) ### Bug Fixes * Adding brackets to unit tests ([c46fea3](feast-dev@c46fea3)) * Adding logic back for a step ([2bb240b](feast-dev@2bb240b)) * Adjustment for unit test action ([a6f78ae](feast-dev@a6f78ae)) * Allow get_historical_features with only On Demand Feature View ([feast-dev#5256](feast-dev#5256)) ([0752795](feast-dev@0752795)) * CI adjustment ([3850643](feast-dev@3850643)) * Embed Query configuration breaks when switching between DataFrame and SQL ([feast-dev#5257](feast-dev#5257)) ([32375a5](feast-dev@32375a5)) * Fix for proto issue in utils ([1b291b2](feast-dev@1b291b2)) * Fix milvus online_read ([feast-dev#5233](feast-dev#5233)) ([4b91f26](feast-dev@4b91f26)) * Fix tests ([431d9b8](feast-dev@431d9b8)) * Fixed Permissions object parameter in example ([feast-dev#5259](feast-dev#5259)) ([045c100](feast-dev@045c100)) * Java CI [#12](feast-dev#12) ([d7e44ac](feast-dev@d7e44ac)) * Java PR [#15](feast-dev#15) ([a5da3bb](feast-dev@a5da3bb)) * Java PR [#16](feast-dev#16) ([e0320fe](feast-dev@e0320fe)) * Java PR [#17](feast-dev#17) ([49da810](feast-dev@49da810)) * Materialization logs ([feast-dev#5243](feast-dev#5243)) ([4aa2f49](feast-dev@4aa2f49)) * Moving to custom github action for checking skip tests ([caf312e](feast-dev@caf312e)) * Operator - remove default replicas setting from Feast Deployment ([feast-dev#5294](feast-dev#5294)) ([e416d01](feast-dev@e416d01)) * Patch java pr [#14](feast-dev#14) ([592526c](feast-dev@592526c)) * Patch update for test ([a3e8967](feast-dev@a3e8967)) * Remove conditional from steps ([995307f](feast-dev@995307f)) * Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([feast-dev#5280](feast-dev#5280)) ([0ee3a1e](feast-dev@0ee3a1e)) * removing id ([268ade2](feast-dev@268ade2)) * Renaming workflow file ([5f46279](feast-dev@5f46279)) * Resolve `no pq wrapper` import issue ([feast-dev#5240](feast-dev#5240)) ([d5906f1](feast-dev@d5906f1)) * Update actions to remove check skip tests ([feast-dev#5275](feast-dev#5275)) ([b976f27](feast-dev@b976f27)) * Update docling demo ([446efea](feast-dev@446efea)) * Update java pr [#13](feast-dev#13) ([fda7db7](feast-dev@fda7db7)) * Update java_pr ([fa138f4](feast-dev@fa138f4)) * Update repo_config.py ([6a59815](feast-dev@6a59815)) * Update unit tests workflow ([06486a0](feast-dev@06486a0)) * Updated docs for docling demo ([768e6cc](feast-dev@768e6cc)) * Updating action for unit tests ([0996c28](feast-dev@0996c28)) * Updating github actions to filter at job level ([0a09622](feast-dev@0a09622)) * Updating Java CI ([c7c3a3c](feast-dev@c7c3a3c)) * Updating java pr to skip tests ([e997dd9](feast-dev@e997dd9)) * Updating workflows ([c66bcd2](feast-dev@c66bcd2)) ### Features * Add date_partition_column_format for spark source ([feast-dev#5273](feast-dev#5273)) ([7a61d6f](feast-dev@7a61d6f)) * Add Milvus tutorial with Feast integration ([feast-dev#5292](feast-dev#5292)) ([a1388a5](feast-dev@a1388a5)) * Add pgvector tutorial with PostgreSQL integration ([feast-dev#5290](feast-dev#5290)) ([bb1cbea](feast-dev@bb1cbea)) * Add ReactFlow visualization for Feast registry metadata ([feast-dev#5297](feast-dev#5297)) ([9768970](feast-dev@9768970)) * Add retrieve online documents v2 method into pgvector ([feast-dev#5253](feast-dev#5253)) ([6770ee6](feast-dev@6770ee6)) * Compute Engine Initial Implementation ([feast-dev#5223](feast-dev#5223)) ([64bdafd](feast-dev@64bdafd)) * Enable write node for compute engine ([feast-dev#5287](feast-dev#5287)) ([f9baf97](feast-dev@f9baf97)) * Local compute engine ([feast-dev#5278](feast-dev#5278)) ([8e06dfe](feast-dev@8e06dfe)) * Make transform on writes configurable for ingestion ([feast-dev#5283](feast-dev#5283)) ([ecad170](feast-dev@ecad170)) * Offline store update pull_all_from_table_or_query to make timestampfield optional ([feast-dev#5281](feast-dev#5281)) ([4b94608](feast-dev@4b94608)) * Serialization version 2 deprecation notice ([feast-dev#5248](feast-dev#5248)) ([327d99d](feast-dev@327d99d)) * Vector length definition moved to Feature View from Config ([feast-dev#5289](feast-dev#5289)) ([d8f1c97](feast-dev@d8f1c97))
What this PR does / why we need it:
Added updated functionality to MilvusOnlineStore. The update functionality is able to delete and create a collection.
Relevant tests added as well
Which issue(s) this PR fixes:
Fixes #