iamhatesz
Contributor

What this PR does / why we need it:

This PR adds a new contrib offline store backed by Clickhouse.

Which issue(s) this PR fixes:

Lack of Clickhouse support :)

Misc

The implementation is heavily based on the Postgres store and was tested against it. The resulting features were identical, to the extent that this is possible with two different backends (e.g., allowing for different data types).

I added a helper to run integration tests: make test-python-universal-clickhouse-offline. Unfortunately, 3 test cases are failing:

ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='float', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()
ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='bool', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()
ERROR sdk/python/tests/integration/registration/test_universal_types.py::test_feature_get_historical_features_types_match[TypeTestConfig(feature_dtype='datetime', feature_is_list=True, has_empty_list=False)-ParameterSet(values=(LOCAL:Clickhouse:RedisOnlineStoreCreator:python_fs:False,), marks=[], id=None)] - TypeError: object of type 'NoneType' has no len()

This is because Clickhouse doesn't support the Nullable(Array(...)) type. I could have added test_universal_types to the ignore list, but I thought it was worth keeping it enabled, as many other test cases from this test are passing.
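
A minimal sketch of the type constraint described above, assuming a hypothetical helper `clickhouse_column_type` (it is not part of this PR or of Feast). ClickHouse allows scalar columns to be wrapped in `Nullable(...)` but rejects `Nullable(Array(...))`, so a list-valued feature has to be declared as a plain `Array(...)` and a missing list cannot be stored as NULL.

```python
def clickhouse_column_type(base_type: str, is_list: bool) -> str:
    """Return a ClickHouse column type for a feature (hypothetical helper).

    Scalars are wrapped in Nullable(...) so missing values can be stored as NULL.
    Arrays are left unwrapped because ClickHouse rejects Nullable(Array(...)).
    """
    if is_list:
        return f"Array({base_type})"
    return f"Nullable({base_type})"


if __name__ == "__main__":
    print(clickhouse_column_type("Float64", is_list=False))
    print(clickhouse_column_type("Float64", is_list=True))
```

The asymmetry between the two branches is the point: a test that expects a missing list to come back as a null value has nothing to map that expectation onto for an array column.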

Signed-off-by: Tomasz Wrona <[email protected]>
@iamhatesz iamhatesz requested a review from a team as a code owner October 31, 2024 13:31
@zerafachris

Hi @iamhatesz, I patched the requirements on your branch to hopefully fix the errors being thrown by the checks.

Merge at your earliest convenience.

Looking forward to using this. ATM, running on PG and would like to move to CH asap

@franciscojavierarceo
Member

Bunch of stuff failing here, think you also need to rebase. Let me know if you need some help!

@iamhatesz
Contributor Author

@zerafachris @franciscojavierarceo thanks! I will take a look and push some updates.

…fline-store

# Conflicts:
#	sdk/python/docs/source/feast.infra.online_stores.contrib.rst
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.10-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.11-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
#	sdk/python/requirements/py3.9-requirements.txt
#	setup.py
Signed-off-by: Tomasz Wrona <[email protected]>
@iamhatesz iamhatesz force-pushed the feast-clickhouse-offline-store branch from 97678aa to bcb90ca Compare February 11, 2025 16:59
@iamhatesz
Contributor Author

Hey @franciscojavierarceo! I fixed some issues related to Python 3.9, regenerated lock files and merged the latest changes. Could you please approve the workflows to see if the tests are passing now?

@iamhatesz
Contributor Author

FAILED sdk/python/tests/integration/online_store/test_universal_online.py::test_async_online_retrieval_with_event_timestamps_dynamo[ParameterSet(values=(LOCAL:File:dynamodb:python_fs:False,), marks=[], id=None)] - botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: Event loop is closed

This failure doesn't seem to be related to the content of this PR.

@miroslawas
Contributor

Would love to have this functionality as part of Feast ❤️

@iamhatesz
Contributor Author

Hey @franciscojavierarceo! Could you please help me find a reviewer for this PR?

…fline-store

# Conflicts:
#	sdk/python/docs/source/feast.infra.offline_stores.contrib.rst
#	sdk/python/docs/source/feast.infra.utils.rst
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.10-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.11-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
#	sdk/python/requirements/py3.9-requirements.txt
#	setup.py
@iamhatesz iamhatesz force-pushed the feast-clickhouse-offline-store branch from 57c04ff to 80bf5d7 Compare March 6, 2025 10:13
@franciscojavierarceo
Member

Yes, reviewing now!

@franciscojavierarceo
Member

For the failed test cases, can you mark them as ignored for Clickhouse? It's probably worth documenting that Nullable Arrays aren't supported, yeah?

Apologies for the delay on this, frankly, this is embarrassing on my part. Do feel free to tag me directly going forward. I'll get this addressed ASAP, thank you for the contribution. 🙏

Signed-off-by: Tomasz Wrona <[email protected]>
@iamhatesz
Contributor Author

Hey @franciscojavierarceo! Thanks for the review. I hope I addressed all your concerns. I had to add the ignore rule as part of a fixture, since pytest -k doesn't seem to work with complex test case names (e.g., with parentheses). I hope it's not a problem, since I based it on a similar hack for Redshift.
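
A minimal sketch of the fixture-based ignore rule described above, under stated assumptions: the table `UNSUPPORTED_CASES` and the function `skip_reason` are hypothetical names for illustration, not the code added in this PR. The listed fragments mirror the three failing parametrizations quoted in the PR description.

```python
from typing import Optional

# Hypothetical skip table: offline store name -> substrings of parametrized
# test IDs that the store cannot pass. The entries mirror the three failing
# cases quoted in this PR (list features of float, bool and datetime).
UNSUPPORTED_CASES = {
    "Clickhouse": (
        "feature_dtype='float', feature_is_list=True",
        "feature_dtype='bool', feature_is_list=True",
        "feature_dtype='datetime', feature_is_list=True",
    ),
}


def skip_reason(store_name: str, test_id: str) -> Optional[str]:
    """Return a skip message if test_id is a known-unsupported case, else None."""
    for fragment in UNSUPPORTED_CASES.get(store_name, ()):
        if fragment in test_id:
            return f"{store_name} does not support: {fragment}"
    return None
```

In a pytest suite, a fixture could call `skip_reason` with `request.node.name` and pass any non-None result to `pytest.skip(...)`. Plain substring matching avoids `-k` expressions altogether, which the comment above reports as unreliable for test names containing parentheses.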

@masterlexa

Hello, we are really looking forward to this feature. Can you tell us when it will be added?

Member

@franciscojavierarceo franciscojavierarceo left a comment

lgtm!

@franciscojavierarceo
Member

@iamhatesz sorry one last conflict needs to be resolved 😞

…fline-store

# Conflicts:
#	sdk/python/requirements/py3.10-ci-requirements.txt
#	sdk/python/requirements/py3.11-ci-requirements.txt
#	sdk/python/requirements/py3.9-ci-requirements.txt
@iamhatesz
Contributor Author

@franciscojavierarceo I merged the latest changes and regenerated the lock files with make lock-python-dependencies-all, but that command also updates all packages to their latest versions. After torch was updated (which is completely irrelevant to this PR), two unit tests failed due to missing wheels. Let me know how I can fix this. How do you update the lock files, if not with that command?

The e2e test failure seems unrelated as well.

@franciscojavierarceo
Member

franciscojavierarceo commented Mar 12, 2025

@ntkathole @jyejare

@ntkathole
Member

ntkathole commented Mar 12, 2025

> @franciscojavierarceo merged the latest changes and regenerated lock files with make lock-python-dependencies-all but it also updates all packages to their latest versions. After updating torch (which is completely irrelevant to this PR), two unit tests failed due to missing wheels. Let me know how I can fix this. How do you guys update the lock files if not with that command?
>
> The e2e test failure seems unrelated as well.

@iamhatesz can you please pin torch to version 2.2.2 in setup.py and pyproject.toml? The failure happens because PyTorch 2.2.x is the last version that supports macOS x64.
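
A minimal sketch of what such a pin could look like in a setup.py-style dependency list. The variable name `TORCH_REQUIRED` and the helper `is_exactly_pinned` are assumptions for illustration and are not taken from Feast's actual setup.py.

```python
# Hypothetical excerpt of a setup.py-style dependency list; the variable name
# and the surrounding structure are assumptions, not Feast's actual setup.py.
TORCH_REQUIRED = [
    "torch==2.2.2",  # exact pin, per the review comment above
]


def is_exactly_pinned(requirement: str) -> bool:
    """Return True if the requirement string uses an exact '==' pin."""
    name, sep, version = requirement.partition("==")
    return bool(name.strip()) and sep == "==" and bool(version.strip())
```

An exact pin keeps a lock-file regeneration from pulling in a newer torch release, which is the upgrade that caused the missing-wheel failures mentioned in the quoted comment.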

Signed-off-by: Tomasz Wrona <[email protected]>
@iamhatesz
Contributor Author

@ntkathole done

Member

@ntkathole ntkathole left a comment

Looks good!

@franciscojavierarceo franciscojavierarceo merged commit 86794c2 into feast-dev:master Mar 12, 2025
32 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Apr 7, 2025
# [0.48.0](v0.47.0...v0.48.0) (2025-04-07)

### Bug Fixes

* Enhance integration logos display and styling in the UI ([#5221](#5221)) ([5799257](5799257))
* Fix space typo in push.md docs ([#5184](#5184)) ([81677b2](81677b2))
* Fixed integration tests for qdrant and milvus ([#5224](#5224)) ([d6b080d](d6b080d))
* Formatting trino ([760ec0e](760ec0e))
* Multiple fixes in retrieval of online documents ([#5168](#5168)) ([66ddd3e](66ddd3e))
* Operator route creation for Feast UI in OpenShift ([e3946b4](e3946b4))
* Remove entity_rows parameter from retrieve_online_documents_v2 call ([#5225](#5225)) ([2a2e304](2a2e304))
* Styling ([#5222](#5222)) ([34c393c](34c393c))
* typo in the chart ([bd3448b](bd3448b))
* Update milvus-quickstart and feature_store.yaml with correct Milvus Config ([#5200](#5200)) ([306acca](306acca))
* Update Qdrant online store paths in repo_config.py ([#5207](#5207)) ([ab35b0b](ab35b0b)), closes [#5206](#5206)
* Update the doc ([#5194](#5194)) ([726464e](726464e))
* Updated the operator-rabc example to test RBAC from a Kubernete pod ([#5147](#5147)) ([d23a1a5](d23a1a5))

### Features

* add `real`(float32) type for trino offline store ([#4749](#4749)) ([0947f96](0947f96))
* Add async DynamoDB timeout and retry configuration ([#5178](#5178)) ([2f3bcf5](2f3bcf5))
* Add CronJob capability to the Operator (feast apply & materialize-incremental) ([#5217](#5217)) ([285c0dc](285c0dc))
* Add RAG tutorial and Use Cases documentation ([#5226](#5226)) ([99f4004](99f4004))
* Added CLI for features, get historical and online features ([#5197](#5197)) ([4ab9f74](4ab9f74))
* Added export support in feast UI ([#5198](#5198)) ([b079553](b079553))
* Added global registry search support in Feast UI ([#5195](#5195)) ([f09ea49](f09ea49))
* Added UI for Features list ([#5192](#5192)) ([cc7fd47](cc7fd47))
* Adding blog on RAG with Milvus ([#5161](#5161)) ([b9e2e6c](b9e2e6c))
* Adding Docling RAG demo ([#5109](#5109)) ([569404b](569404b))
* Allow transformations on writes to output list of entities ([#5209](#5209)) ([955521a](955521a))
* Cache get_any_feature_view results ([#5175](#5175)) ([924b8a3](924b8a3))
* Clickhouse offline store ([#4725](#4725)) ([86794c2](86794c2))
* Enable keyword search for Milvus ([#5199](#5199)) ([ac44967](ac44967))
* Enable transformations on PDFs ([#5172](#5172)) ([3674971](3674971))
* Enable users to use Entity Query as CTE during historical retrieval ([#5202](#5202)) ([fe69eaf](fe69eaf))
* helm support more deployment config ([d575372](d575372))
* Improved CLI file structuring ([#5201](#5201)) ([972ed34](972ed34))
* Kickoff Transformation implementationtransformation code base ([#5181](#5181)) ([0083303](0083303))
* Make keep-alive timeout configurable for async DynamoDB connections ([#5167](#5167)) ([7f3e528](7f3e528))
* Operator mounts the odh-trusted-ca-bundle configmap when deployed on RHOAI or ODH ([d4d7b0d](d4d7b0d))
* Spark Transformation ([#5185](#5185)) ([be3d85c](be3d85c))
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
* Clickhouse offline store - initial working version

Signed-off-by: Tomasz Wrona <[email protected]>

* Remove untested `pull_all_from_table_or_query`

Signed-off-by: Tomasz Wrona <[email protected]>

* Reorder functions

Signed-off-by: Tomasz Wrona <[email protected]>

* Remove commented line

Signed-off-by: Tomasz Wrona <[email protected]>

* Fix frozen mypy errors

Signed-off-by: Tomasz Wrona <[email protected]>

* mypy fixes; remove online source creator

Signed-off-by: Tomasz Wrona <[email protected]>

* Remove commented code

Signed-off-by: Tomasz Wrona <[email protected]>

* Added docs

Signed-off-by: Tomasz Wrona <[email protected]>

* Python 3.9 deps

Signed-off-by: Tomasz Wrona <[email protected]>

* Python 3.10 deps

Signed-off-by: Tomasz Wrona <[email protected]>

* Python 3.11 deps (updated)

Signed-off-by: Tomasz Wrona <[email protected]>

* Remove unused ClickhouseOnlineStoreConfig

Signed-off-by: Tomasz Wrona <[email protected]>

* Regenerate requirements.txt files

Signed-off-by: Tomasz Wrona <[email protected]>

* Lint & format fixes

Signed-off-by: Tomasz Wrona <[email protected]>

* Regenerate requirements.txt files

Signed-off-by: Tomasz Wrona <[email protected]>

* Add clickhouse to pyproject.toml

Signed-off-by: Tomasz Wrona <[email protected]>

* Fix dependencies

Signed-off-by: Tomasz Wrona <[email protected]>

* Simplify names

Signed-off-by: Tomasz Wrona <[email protected]>

* Skip problematic Clickhouse tests

Signed-off-by: Tomasz Wrona <[email protected]>

* format & lint

Signed-off-by: Tomasz Wrona <[email protected]>

* Post-merge `make lock-python-dependencies-all`

Signed-off-by: Tomasz Wrona <[email protected]>

* Pin torch to 2.2.2

Signed-off-by: Tomasz Wrona <[email protected]>

---------

Signed-off-by: Tomasz Wrona <[email protected]>
Signed-off-by: Jacob Weinhold <[email protected]>
j-wine pushed a commit to j-wine/feast that referenced this pull request Jun 7, 2025
# [0.48.0](feast-dev/feast@v0.47.0...v0.48.0) (2025-04-07)

Signed-off-by: Jacob Weinhold <[email protected]>