Skip to content

Conversation

franciscojavierarceo
Copy link
Member

@franciscojavierarceo franciscojavierarceo commented Apr 19, 2025

What this PR does / why we need it:

This PR introduces a new transform_on_write flag to various functions and classes in the Feast SDK. The flag allows users to enable or disable transformations of data when writing to the online or offline store. This feature improves flexibility in managing data transformations during feature ingestion.

This feature is particularly useful if you want to batch transform a set of data (e.g., using Spark) and upload that transformed data to the online store and then want to also offer transformation on writes with the Feast Feature Server using an On Demand transformation.

Changes Summary:

  • sdk/python/feast/feature_server.py
    • Added transform_on_write attribute to WriteToFeatureStoreRequest and PushFeaturesRequest classes.
    • Updated push and write_to_online_store functions to pass the transform_on_write parameter to downstream calls.
  • sdk/python/feast/feature_store.py
    • Added transform_on_write parameter to the push method and its docstring.
    • Updated the push method to pass the transform_on_write parameter to the write_to_online_store and write_to_offline_store methods.
    • Modified _get_feature_view_and_df_for_online_write and write_to_online_store functions to support the transform_on_write parameter.
  • sdk/python/tests/unit/test_on_demand_python_transformation.py
    • Fixed a typo in the constant ODFV_OTHER_STRING_CONSTANT.
    • Introduced new test cases for transform_on_write functionality to verify behavior when the flag is disabled (transform_on_write=False).
    • Added constants and test data for validating untransformed writes to the online store.

This enhancement provides better control over data transformations when pushing features, which can be particularly useful for scenarios where pre-transformed data is being ingested.

Which issue(s) this PR fixes:

#5196

Misc

@franciscojavierarceo franciscojavierarceo marked this pull request as ready for review April 19, 2025 01:51
@franciscojavierarceo franciscojavierarceo requested a review from a team as a code owner April 19, 2025 01:51
Copy link
Collaborator

@shuchu shuchu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let

@franciscojavierarceo franciscojavierarceo merged commit ecad170 into master Apr 19, 2025
32 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Apr 29, 2025
# [0.49.0](v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](c46fea3))
* Adding logic back for a step ([2bb240b](2bb240b))
* Adjustment for unit test action ([a6f78ae](a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([#5256](#5256)) ([0752795](0752795))
* CI adjustment ([3850643](3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([#5257](#5257)) ([32375a5](32375a5))
* Fix for proto issue in utils ([1b291b2](1b291b2))
* Fix milvus online_read ([#5233](#5233)) ([4b91f26](4b91f26))
* Fix tests ([431d9b8](431d9b8))
* Fixed Permissions object parameter in example ([#5259](#5259)) ([045c100](045c100))
* Java CI [#12](#12) ([d7e44ac](d7e44ac))
* Java PR [#15](#15) ([a5da3bb](a5da3bb))
* Java PR [#16](#16) ([e0320fe](e0320fe))
* Java PR [#17](#17) ([49da810](49da810))
* Materialization logs ([#5243](#5243)) ([4aa2f49](4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](caf312e))
* Operator - remove default replicas setting from Feast Deployment ([#5294](#5294)) ([e416d01](e416d01))
* Patch java pr [#14](#14) ([592526c](592526c))
* Patch update for test ([a3e8967](a3e8967))
* Remove conditional from steps ([995307f](995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([#5280](#5280)) ([0ee3a1e](0ee3a1e))
* removing id ([268ade2](268ade2))
* Renaming workflow file ([5f46279](5f46279))
* Resolve `no pq wrapper` import issue ([#5240](#5240)) ([d5906f1](d5906f1))
* Update actions to remove check skip tests ([#5275](#5275)) ([b976f27](b976f27))
* Update docling demo ([446efea](446efea))
* Update java pr [#13](#13) ([fda7db7](fda7db7))
* Update java_pr ([fa138f4](fa138f4))
* Update repo_config.py ([6a59815](6a59815))
* Update unit tests workflow ([06486a0](06486a0))
* Updated docs for docling demo ([768e6cc](768e6cc))
* Updating action for unit tests ([0996c28](0996c28))
* Updating github actions to filter at job level ([0a09622](0a09622))
* Updating Java CI ([c7c3a3c](c7c3a3c))
* Updating java pr to skip tests ([e997dd9](e997dd9))
* Updating workflows ([c66bcd2](c66bcd2))

### Features

* Add date_partition_column_format for spark source ([#5273](#5273)) ([7a61d6f](7a61d6f))
* Add Milvus tutorial with Feast integration ([#5292](#5292)) ([a1388a5](a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([#5290](#5290)) ([bb1cbea](bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([#5297](#5297)) ([9768970](9768970))
* Add retrieve online documents v2 method into  pgvector  ([#5253](#5253)) ([6770ee6](6770ee6))
* Compute Engine Initial Implementation ([#5223](#5223)) ([64bdafd](64bdafd))
* Enable write node for compute engine ([#5287](#5287)) ([f9baf97](f9baf97))
* Local compute engine ([#5278](#5278)) ([8e06dfe](8e06dfe))
* Make transform on writes configurable for ingestion ([#5283](#5283)) ([ecad170](ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([#5281](#5281)) ([4b94608](4b94608))
* Serialization version 2 deprecation notice ([#5248](#5248)) ([327d99d](327d99d))
* Vector length definition moved to Feature View from Config  ([#5289](#5289)) ([d8f1c97](d8f1c97))
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
…5283)

* feat: Make transform on writes configurable for batch ingestion

Signed-off-by: Francisco Javier Arceo <[email protected]>

* update test and fix bug to support skipping transformation on writes for ODFV

Signed-off-by: Francisco Javier Arceo <[email protected]>

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
Signed-off-by: Jacob Weinhold <[email protected]>
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
# [0.49.0](feast-dev/feast@v0.48.0...v0.49.0) (2025-04-29)

### Bug Fixes

* Adding brackets to unit tests ([c46fea3](feast-dev@c46fea3))
* Adding logic back for a step ([2bb240b](feast-dev@2bb240b))
* Adjustment for unit test action ([a6f78ae](feast-dev@a6f78ae))
* Allow get_historical_features with only On Demand Feature View ([feast-dev#5256](feast-dev#5256)) ([0752795](feast-dev@0752795))
* CI adjustment ([3850643](feast-dev@3850643))
* Embed Query configuration breaks when switching between DataFrame and SQL ([feast-dev#5257](feast-dev#5257)) ([32375a5](feast-dev@32375a5))
* Fix for proto issue in utils ([1b291b2](feast-dev@1b291b2))
* Fix milvus online_read ([feast-dev#5233](feast-dev#5233)) ([4b91f26](feast-dev@4b91f26))
* Fix tests ([431d9b8](feast-dev@431d9b8))
* Fixed Permissions object parameter in example ([feast-dev#5259](feast-dev#5259)) ([045c100](feast-dev@045c100))
* Java CI [feast-dev#12](feast-dev#12) ([d7e44ac](feast-dev@d7e44ac))
* Java PR [feast-dev#15](feast-dev#15) ([a5da3bb](feast-dev@a5da3bb))
* Java PR [feast-dev#16](feast-dev#16) ([e0320fe](feast-dev@e0320fe))
* Java PR [feast-dev#17](feast-dev#17) ([49da810](feast-dev@49da810))
* Materialization logs ([feast-dev#5243](feast-dev#5243)) ([4aa2f49](feast-dev@4aa2f49))
* Moving to custom github action for checking skip tests ([caf312e](feast-dev@caf312e))
* Operator - remove default replicas setting from Feast Deployment ([feast-dev#5294](feast-dev#5294)) ([e416d01](feast-dev@e416d01))
* Patch java pr [feast-dev#14](feast-dev#14) ([592526c](feast-dev@592526c))
* Patch update for test ([a3e8967](feast-dev@a3e8967))
* Remove conditional from steps ([995307f](feast-dev@995307f))
* Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([feast-dev#5280](feast-dev#5280)) ([0ee3a1e](feast-dev@0ee3a1e))
* removing id ([268ade2](feast-dev@268ade2))
* Renaming workflow file ([5f46279](feast-dev@5f46279))
* Resolve `no pq wrapper` import issue ([feast-dev#5240](feast-dev#5240)) ([d5906f1](feast-dev@d5906f1))
* Update actions to remove check skip tests ([feast-dev#5275](feast-dev#5275)) ([b976f27](feast-dev@b976f27))
* Update docling demo ([446efea](feast-dev@446efea))
* Update java pr [feast-dev#13](feast-dev#13) ([fda7db7](feast-dev@fda7db7))
* Update java_pr ([fa138f4](feast-dev@fa138f4))
* Update repo_config.py ([6a59815](feast-dev@6a59815))
* Update unit tests workflow ([06486a0](feast-dev@06486a0))
* Updated docs for docling demo ([768e6cc](feast-dev@768e6cc))
* Updating action for unit tests ([0996c28](feast-dev@0996c28))
* Updating github actions to filter at job level ([0a09622](feast-dev@0a09622))
* Updating Java CI ([c7c3a3c](feast-dev@c7c3a3c))
* Updating java pr to skip tests ([e997dd9](feast-dev@e997dd9))
* Updating workflows ([c66bcd2](feast-dev@c66bcd2))

### Features

* Add date_partition_column_format for spark source ([feast-dev#5273](feast-dev#5273)) ([7a61d6f](feast-dev@7a61d6f))
* Add Milvus tutorial with Feast integration ([feast-dev#5292](feast-dev#5292)) ([a1388a5](feast-dev@a1388a5))
* Add pgvector tutorial with PostgreSQL integration ([feast-dev#5290](feast-dev#5290)) ([bb1cbea](feast-dev@bb1cbea))
* Add ReactFlow visualization for Feast registry metadata ([feast-dev#5297](feast-dev#5297)) ([9768970](feast-dev@9768970))
* Add retrieve online documents v2 method into  pgvector  ([feast-dev#5253](feast-dev#5253)) ([6770ee6](feast-dev@6770ee6))
* Compute Engine Initial Implementation ([feast-dev#5223](feast-dev#5223)) ([64bdafd](feast-dev@64bdafd))
* Enable write node for compute engine ([feast-dev#5287](feast-dev#5287)) ([f9baf97](feast-dev@f9baf97))
* Local compute engine ([feast-dev#5278](feast-dev#5278)) ([8e06dfe](feast-dev@8e06dfe))
* Make transform on writes configurable for ingestion ([feast-dev#5283](feast-dev#5283)) ([ecad170](feast-dev@ecad170))
* Offline store update pull_all_from_table_or_query to make timestampfield optional ([feast-dev#5281](feast-dev#5281)) ([4b94608](feast-dev@4b94608))
* Serialization version 2 deprecation notice ([feast-dev#5248](feast-dev#5248)) ([327d99d](feast-dev@327d99d))
* Vector length definition moved to Feature View from Config  ([feast-dev#5289](feast-dev#5289)) ([d8f1c97](feast-dev@d8f1c97))

Signed-off-by: Jacob Weinhold <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants