docs/getting-started/genai.md (19 additions, 0 deletions)
@@ -104,6 +104,24 @@ This integration enables:
- Efficiently materializing features to vector databases
- Scaling RAG applications to enterprise-level document repositories

### Scaling with Ray Integration

Feast integrates with Ray to enable distributed processing for RAG applications:

* **Ray Compute Engine**: Distributed feature computation using Ray's task and actor model
* **Ray Offline Store**: Process large document collections and generate embeddings at scale
* **Ray Batch Materialization**: Efficiently materialize features from offline to online stores
* **Distributed Embedding Generation**: Scale embedding generation across multiple nodes

This integration enables:
- Distributed processing of large document collections
- Parallel embedding generation for millions of text chunks
- Kubernetes-native scaling for RAG applications
- Efficient resource utilization across multiple nodes
- Production-ready distributed RAG pipelines
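
A minimal sketch of how these pieces might be wired together in `feature_store.yaml` (the `type` strings and the Milvus pairing are assumptions drawn from the Ray template and this page's RAG framing, not a verbatim config):

```yaml
project: rag_project
provider: local
registry: data/registry.db
offline_store:
  type: ray          # assumption: Ray offline store for distributed document I/O
batch_engine:
  type: ray.engine   # assumption: Ray compute engine for transforms and materialization
online_store:
  type: milvus       # assumption: a vector-capable online store for RAG retrieval
```
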
For detailed information on building distributed RAG applications with Feast and Ray, see [Feast + Ray: Distributed Processing for RAG Applications](https://feast.dev/blog/feast-ray-distributed-processing/).
## Model Context Protocol (MCP) Support
Feast supports the Model Context Protocol (MCP), which enables AI agents and applications to interact with your feature store through standardized MCP interfaces. This allows seamless integration with LLMs and AI agents for GenAI applications.
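
As a sketch, MCP support is enabled through the feature server configuration; the keys below are assumptions based on Feast's MCP feature server documentation, not verbatim from this page:

```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
feature_server:
  type: mcp
  enabled: true
  mcp_enabled: true                       # assumption: MCP toggle key
  mcp_server_name: "feast-feature-store"  # assumption: advertised server name
```
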
@@ -158,6 +176,7 @@ For more detailed information and examples:
* [RAG Tutorial with Docling](../tutorials/rag-with-docling.md)
* [RAG Fine Tuning with Feast and Milvus](../../examples/rag-retriever/README.md)
docs/reference/compute-engine/ray.md (43 additions, 0 deletions)
@@ -2,6 +2,24 @@
The Ray compute engine is a distributed compute implementation that leverages [Ray](https://www.ray.io/) for executing feature pipelines including transformations, aggregations, joins, and materializations. It provides scalable and efficient distributed processing for both `materialize()` and `get_historical_features()` operations.
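
Once the engine is configured, nothing changes at the call site; the standard `FeatureStore` API dispatches the work to Ray. A minimal sketch (the repo path and date range are illustrative):

```python
from datetime import datetime, timedelta

from feast import FeatureStore

# Assumes feature_store.yaml in this directory selects the Ray compute engine.
store = FeatureStore(repo_path=".")

# Materialization of the last 7 days is executed by the configured engine.
store.materialize(
    start_date=datetime.utcnow() - timedelta(days=7),
    end_date=datetime.utcnow(),
)
```
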

## Quick Start with Ray Template

### Ray RAG Template - Batch Embedding at Scale

For RAG (Retrieval-Augmented Generation) applications with distributed embedding generation:

```bash
feast init -t ray_rag my_rag_project
cd my_rag_project/feature_repo
```

The Ray RAG template demonstrates:

- **Parallel Embedding Generation**: Uses Ray compute engine to generate embeddings across multiple workers
- **Vector Search Integration**: Works with Milvus for semantic similarity search
- **Complete RAG Pipeline**: Data → Embeddings → Search workflow

The Ray compute engine automatically distributes the embedding generation across available workers, making it ideal for processing large datasets efficiently.
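
From there, the usual Feast workflow drives the distributed run, assuming the template generates embeddings during materialization; a sketch (the timestamp is illustrative):

```bash
feast apply                                                    # register entities, views, and sources
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")  # distributed embedding generation
```
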
## Overview
The Ray compute engine provides:
@@ -365,6 +383,8 @@ batch_engine:
### With Feature Transformations
#### On-Demand Transformations
```python
from feast import FeatureView, Field
from feast.types import Float64
@@ -385,4 +405,27 @@ features = store.get_historical_features(
)
```
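
The hunk above elides the middle of the example; for a self-contained picture, a minimal on-demand transformation might look like the following sketch (the names, fields, and request source are illustrative assumptions):

```python
import pandas as pd

from feast import Field, RequestSource, on_demand_feature_view
from feast.types import Float64

# Hypothetical request-time inputs.
inputs = RequestSource(
    name="conv_rate_inputs",
    schema=[
        Field(name="conv_rate", dtype=Float64),
        Field(name="multiplier", dtype=Float64),
    ],
)

@on_demand_feature_view(
    sources=[inputs],
    schema=[Field(name="conv_rate_adjusted", dtype=Float64)],
)
def conv_rate_adjusted(features: pd.DataFrame) -> pd.DataFrame:
    # Row-level transformation applied at retrieval time.
    out = pd.DataFrame()
    out["conv_rate_adjusted"] = features["conv_rate"] * features["multiplier"]
    return out
```
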
#### Ray Native Transformations
For distributed transformations that leverage Ray's dataset and parallel processing capabilities, use `mode="ray"` in your `BatchFeatureView`:
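
A sketch of what that can look like; the exact `BatchFeatureView` constructor arguments and the udf contract (assumed here to take and return a `ray.data.Dataset`) are assumptions rather than verbatim API:

```python
from datetime import timedelta

import ray.data

from feast import BatchFeatureView, Entity, Field, FileSource
from feast.types import Float64

driver = Entity(name="driver", join_keys=["driver_id"])

source = FileSource(
    path="data/driver_stats.parquet",  # hypothetical dataset
    timestamp_field="event_timestamp",
)

def scale_conv_rate(ds: ray.data.Dataset) -> ray.data.Dataset:
    # Runs distributed across Ray workers, one batch per task.
    return ds.map_batches(
        lambda batch: {**batch, "conv_rate_scaled": batch["conv_rate"] * 100}
    )

driver_stats_ray = BatchFeatureView(
    name="driver_stats_ray",
    entities=[driver],
    ttl=timedelta(days=1),
    mode="ray",            # assumption: hands the udf Ray Datasets instead of pandas
    udf=scale_conv_rate,
    schema=[Field(name="conv_rate_scaled", dtype=Float64)],
    source=source,
)
```
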
For more information, see the [Ray documentation](https://docs.ray.io/en/latest/) and [Ray Data guide](https://docs.ray.io/en/latest/data/getting-started.html).
docs/reference/offline-stores/overview.md (22 additions, 22 deletions)
@@ -26,33 +26,33 @@ The first three of these methods all return a `RetrievalJob` specific to an offl
## Functionality Matrix
There are currently four core offline store implementations: `DaskOfflineStore`, `BigQueryOfflineStore`, `SnowflakeOfflineStore`, and `RedshiftOfflineStore`.
- There are several additional implementations contributed by the Feast community (`PostgreSQLOfflineStore`, `SparkOfflineStore`, and `TrinoOfflineStore`), which are not guaranteed to be stable or to match the functionality of the core implementations.
+ There are several additional implementations contributed by the Feast community (`PostgreSQLOfflineStore`, `SparkOfflineStore`, `TrinoOfflineStore`, and `RayOfflineStore`), which are not guaranteed to be stable or to match the functionality of the core implementations.
Details for each specific offline store, such as how to configure it in a `feature_store.yaml`, can be found [here](README.md).
Below is a matrix indicating which offline stores support which methods.
docs/reference/offline-stores/ray.md (17 additions, 0 deletions)
@@ -5,6 +5,23 @@
The Ray offline store is a data I/O implementation that leverages [Ray](https://www.ray.io/) for reading and writing data from various sources. It focuses on efficient data access operations, while complex feature computation is handled by the [Ray Compute Engine](../compute-engine/ray.md).
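
From the API's perspective it behaves like any other offline store: `get_historical_features` returns a `RetrievalJob` whose reads are backed by Ray. A minimal sketch (the feature reference and entity rows are illustrative):

```python
from datetime import datetime

import pandas as pd

from feast import FeatureStore

# Assumes feature_store.yaml selects the Ray offline store.
store = FeatureStore(repo_path=".")

entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": [datetime.utcnow(), datetime.utcnow()],
    }
)

# Data access runs through Ray under the hood; the result is a pandas DataFrame.
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate"],  # hypothetical feature reference
).to_df()
```
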
## Quick Start with Ray Template
The easiest way to get started with the Ray offline store is to use the built-in Ray template:

```bash
feast init -t ray my_ray_project
cd my_ray_project/feature_repo
```

This template includes:
- Pre-configured Ray offline store and compute engine setup
- Sample feature definitions optimized for Ray processing
- Demo workflow showcasing Ray capabilities
- Resource settings for local development

The template provides a complete working example with sample datasets and demonstrates both Ray offline store data I/O operations and Ray compute engine distributed processing.
docs/reference/online-stores/mysql.md (22 additions, 0 deletions)
@@ -28,6 +28,28 @@ online_store:
The full set of configuration options is available in [MySQLOnlineStoreConfig](https://rtd.feast.dev/en/master/#feast.infra.online_stores.mysql_online_store.MySQLOnlineStoreConfig).
## Batch write mode
By default, the MySQL online store performs row-by-row insert and commit for each feature record. While this ensures per-record atomicity, it can lead to significant overhead on write operations — especially on distributed SQL databases (for example, TiDB, which is MySQL-compatible and uses a consensus protocol).
To improve write performance, you can enable batch write mode by setting `batch_write` to `true` and configuring `batch_size`. This executes insert queries in batches and commits once per batch instead of committing each record individually.
{% code title="feature_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
online_store:
  type: mysql
  host: DB_HOST
  port: DB_PORT
  database: DB_NAME
  user: DB_USERNAME
  password: DB_PASSWORD
  batch_write: true
  batch_size: 100
```
{% endcode %}
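
Any bulk write to the online store benefits; a sketch (the `driver_stats` view and row count are made up): with `batch_size: 100`, the 1,000 rows below are committed in roughly 10 batches instead of 1,000 individual commits.

```python
from datetime import datetime

import pandas as pd

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# 1,000 rows for a hypothetical "driver_stats" feature view.
df = pd.DataFrame(
    {
        "driver_id": list(range(1000)),
        "conv_rate": [0.5] * 1000,
        "event_timestamp": [datetime.utcnow()] * 1000,
    }
)

# With batch_write: true and batch_size: 100, the MySQL online store
# groups these inserts and commits once per 100-row batch.
store.write_to_online_store(feature_view_name="driver_stats", df=df)
```
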
## Functionality Matrix
The set of functionality supported by online stores is described in detail [here](overview.md#functionality).