@astronautas
Contributor

What this PR does / why we need it:

Adds an option to materialize only the latest values (essentially pushing deduplication down to the offline store), which reduces client memory consumption and end-to-end duration. The effect is especially noticeable for large-scale, latency-critical materializations (think hundreds of thousands of rows across ~150 feature views), as we observed in our ML project at cast.ai.

Which issue(s) this PR fixes:

#5707 (comment)

Misc

This will be configured via the feature store (repo) config file:

registry: registry.db
project: credit_scoring_aws
provider: local
online_store:
  type: sqlite
offline_store:
  type: file
entity_key_serialization_version: 3
materialization: # <- new option here
  pull_latest_features: true  # true | false; defaults to false
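
To illustrate what "materialize only the latest values" means, here is a minimal sketch in plain Python. It is not Feast's implementation: the function name `latest_per_entity`, the column names `driver_id` and `event_timestamp`, and the sample rows are all hypothetical. With the option enabled, this kind of deduplication is meant to happen inside the offline store rather than on the client.

```python
from datetime import datetime


def latest_per_entity(rows, entity_key="driver_id", ts_key="event_timestamp"):
    """Keep only the newest row for each entity key (hypothetical helper)."""
    latest = {}
    for row in rows:
        key = row[entity_key]
        # Replace the stored row only when this one is strictly newer.
        if key not in latest or row[ts_key] > latest[key][ts_key]:
            latest[key] = row
    return list(latest.values())


# Hypothetical feature rows: driver 1 has three versions, driver 2 has one.
rows = [
    {"driver_id": 1, "event_timestamp": datetime(2025, 11, 1), "trips": 10},
    {"driver_id": 1, "event_timestamp": datetime(2025, 11, 3), "trips": 12},
    {"driver_id": 2, "event_timestamp": datetime(2025, 11, 2), "trips": 7},
    {"driver_id": 1, "event_timestamp": datetime(2025, 11, 2), "trips": 11},
]

deduped = latest_per_entity(rows)
```

Without pushdown, all four rows would reach the client before being reduced to two. With pushdown, only the two surviving rows need to be transferred, which is where the memory and duration savings would come from.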

@astronautas astronautas requested a review from a team as a code owner November 5, 2025 11:29
@astronautas
Contributor Author

@HaoXuAI

@astronautas
Contributor Author

@franciscojavierarceo

Signed-off-by: lukas.valatka <[email protected]>
…e' of github.com:astronautas/feast into feat/add-selective-deduplicate-pushdown-to-offline-store
@astronautas
Contributor Author

Could we re-run the tests? The failure looks random, and this PR makes no changes to dependency management :/

@astronautas
Contributor Author

image

The dependency install step was skipped :what:

@astronautas
Contributor Author

astronautas commented Nov 5, 2025

Seems like the AWS creds have expired, @franciscojavierarceo

@franciscojavierarceo
Member

@ntkathole @jeremyary can you investigate?

Collaborator

@HaoXuAI HaoXuAI left a comment

I think it might be better to add the config to the fs.materialize API? That way you can customize each materialize call: enable the pushdown filter for the FeatureViews that need it, and leave it off for other processes that don't.

@astronautas
Copy link
Contributor Author

I think it might be better to add the config to the fs.materialize API? That way you can customize each materialize call: enable the pushdown filter for the FeatureViews that need it, and leave it off for other processes that don't.

Why not indeed. I'll check it out and tag you back.
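
One way to picture the reviewer's suggestion is a per-call argument that overrides the repo-level default. The sketch below is purely illustrative: `resolve_pull_latest` and its parameters are hypothetical names, not part of Feast's API, and it assumes that an unset per-call value falls back to the config file.

```python
from typing import Optional


def resolve_pull_latest(config_default: bool, per_call: Optional[bool] = None) -> bool:
    """Per-call value wins when given; otherwise fall back to the repo config.

    Hypothetical helper, shown only to illustrate the precedence rule.
    """
    return config_default if per_call is None else per_call


# Repo config leaves the option off, but one latency-critical job turns it on.
use_pushdown_for_job = resolve_pull_latest(config_default=False, per_call=True)

# Another job passes nothing and inherits the repo default.
use_pushdown_default = resolve_pull_latest(config_default=False)
```

Under this assumption, jobs that need the pushdown filter could opt in individually while everything else keeps the repo-wide behavior.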

@franciscojavierarceo franciscojavierarceo merged commit 8d77b72 into feast-dev:master Nov 11, 2025
17 of 18 checks passed
HaoXuAI pushed a commit that referenced this pull request Nov 12, 2025
…performance (#5713)

* add pull_all_from_table_or_query for clickhouse, to align with new materialization logic (calling it)

Signed-off-by: lukas.valatka <[email protected]>

* add option to select to materialize only latest values, for performance

Signed-off-by: lukas.valatka <[email protected]>

* enforce non optional params

Signed-off-by: lukas.valatka <[email protected]>

---------

Signed-off-by: lukas.valatka <[email protected]>
Co-authored-by: Lukas Valatka <[email protected]>
franciscojavierarceo pushed a commit that referenced this pull request Nov 13, 2025
# [0.57.0](v0.56.0...v0.57.0) (2025-11-13)

### Bug Fixes

* Improve trino to feast type mapping with (real,varchar,timestamp,decimal) ([#5691](#5691)) ([f855ad2](f855ad2))
* Materialize API - ODFV views not looked-up (thinks views non-existent) - crashes materialize ([#5716](#5716)) ([1b050b3](1b050b3))
* Support historical feature retrieval with start_date/end_date in RemoteOfflineStore ([#5703](#5703)) ([ad32756](ad32756))
* Thread safe Clickhouse offline store ([#5710](#5710)) ([5f446ed](5f446ed))

### Features

* Add annotations to cronjob CRDs ([#5701](#5701)) ([be6e6c2](be6e6c2))
* Add batch commit mode for MySQL OnlineStore ([#5699](#5699)) ([3cfe4eb](3cfe4eb))
* Add possibility to materialize only latest values, to increase performance ([#5713](#5713)) ([8d77b72](8d77b72))
* Support table format: Iceberg, Delta, and Hudi ([#5650](#5650)) ([2915ad1](2915ad1))