-
Notifications
You must be signed in to change notification settings - Fork 69
Updating feast example after merge in feast repo #426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating feast example after merge in feast repo #426
Conversation
WalkthroughThis update removes the custom Feast RAG retriever implementation and its supporting classes, refactors the notebook to use Feast’s built-in retriever and vector store modules, and updates configuration and documentation accordingly. The notebook now simplifies dependency installation, reorganizes data paths, adjusts imports and initialization to align with the new Feast integration, and clarifies dataset chunking logic. Changes
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
examples/kfto_feast_rag/README.md (1)
48-48
: Fix markdown formatting consistency.The static analysis tool has identified formatting inconsistencies in this line.
Apply this diff to fix the markdown formatting:
-* **__feature_repo/ragproject_repo.py__** +- **feature_repo/ragproject_repo.py**examples/kfto_feast_rag/rag_feast_kfto.ipynb (1)
28-38
: Consider pinning to a specific commit for reproducibility.Installing from the master branch is appropriate to access the latest RAG features. However, consider pinning to a specific commit hash for better reproducibility.
For better reproducibility, consider pinning to a specific commit:
-%pip install git+https://github.com/feast-dev/feast.git@master +%pip install git+https://github.com/feast-dev/feast.git@<specific_commit_hash>
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
examples/kfto_feast_rag/README.md
(1 hunks)examples/kfto_feast_rag/feast_rag_retriever.py
(0 hunks)examples/kfto_feast_rag/feature_repo/feature_store.yaml
(1 hunks)examples/kfto_feast_rag/rag_feast_kfto.ipynb
(6 hunks)
💤 Files with no reviewable changes (1)
- examples/kfto_feast_rag/feast_rag_retriever.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
examples/kfto_feast_rag/README.md
48-48: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
🔇 Additional comments (10)
examples/kfto_feast_rag/feature_repo/feature_store.yaml (1)
6-6
: LGTM! Proper URL scheme added.Adding the "http://" prefix to the host configuration improves clarity and aligns with standard URL formatting expectations for the Milvus connection.
examples/kfto_feast_rag/README.md (1)
48-48
: Filename correction aligns with notebook imports.Good catch! This correction ensures consistency with the import path used in the notebook (
ragproject_repo
).examples/kfto_feast_rag/rag_feast_kfto.ipynb (8)
23-23
: Dependency simplification approved.Removing
faiss-cpu
from the installation is appropriate since the updated Feast integration leverages Milvus for vector operations directly.
138-138
: Data organization improvement.Creating a dedicated
data
directory withinfeature_repo
improves project structure and organization.
166-166
: File path updated to match new directory structure.The updated path correctly uses the new
feature_repo/data
directory structure.
252-254
: Import updates reflect Feast RAG integration.The updated imports correctly use the new Feast-provided RAG modules instead of the custom implementation. The filename correction (
ragproject_repo
) maintains consistency with the actual file structure.
267-269
: Simplified API initialization with enhanced features.The updated
FeastVectorStore
initialization uses a cleaner API withrepo_path="."
and includes the necessarypassage_id
feature for proper retrieval functionality.
272-272
: Cleaner FeastIndex initialization.The simplified initialization without the
vector_store
argument reflects improved API design in the updated Feast integration.
284-285
: More explicit retriever configuration.The updated parameters (
feature_view
andfeatures
) provide clearer and more direct configuration compared to inferring these from a vector store object.
289-289
: Text field specification for retrieval.Adding the
text_field="passage_text"
parameter is essential for the retriever to correctly identify the text content field for RAG functionality.
d5e672b
to
c9e6cff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
c9e6cff
to
240d412
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
examples/kfto_feast_rag/README.md (1)
51-59
: README still claims a “custom” vector store / retriever — this is now stalePR description says the bespoke
FeastRAGRetriever
has been removed in favour of Feast’s built-in components, yet the bullet list continues to advertise:- **Vector Store**: Custom implementation with Feast integration - **Retriever**: Custom implementation with Feast integration extending HuggingFace's RagRetriever ... - Inference utilising a custom RagRetriever: FeastRagRetriever
Leaving this unchanged will confuse users who try to hunt down code that no longer exists. Update the wording to reflect the built-in Feast retriever/vector-store and remove references to the deleted
FeastRAGRetriever
.Example fix:
- - **Vector Store**: Custom implementation with Feast integration - - **Retriever**: Custom implementation with Feast integration extending HuggingFace's RagRetriever + - **Vector Store**: Feast’s built-in vector store backed by Milvus + - **Retriever**: Feast’s native retriever (no custom subclass needed) ... - - Inference utilising a custom RagRetriever: FeastRagRetriever + - Inference using Feast’s built-in retriever
🧹 Nitpick comments (1)
examples/kfto_feast_rag/README.md (1)
48-50
: Fix Markdown style & remove redundant underscores around filenameThe double-underscore trick isn’t needed when the line is already bolded, and
markdownlint
is complaining about the*
list marker.-* **__feature_repo/ragproject_repo.py__** + - **feature_repo/ragproject_repo.py**
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
examples/kfto_feast_rag/README.md
(1 hunks)examples/kfto_feast_rag/feast_rag_retriever.py
(0 hunks)examples/kfto_feast_rag/feature_repo/feature_store.yaml
(1 hunks)examples/kfto_feast_rag/rag_feast_kfto.ipynb
(7 hunks)
💤 Files with no reviewable changes (1)
- examples/kfto_feast_rag/feast_rag_retriever.py
🚧 Files skipped from review as they are similar to previous changes (2)
- examples/kfto_feast_rag/feature_repo/feature_store.yaml
- examples/kfto_feast_rag/rag_feast_kfto.ipynb
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
examples/kfto_feast_rag/README.md
48-48: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
/lgtm, just a nitpick
may be it's good to add name of vector store and retriever in Readme.file |
/lgtm |
240d412
to
99d68d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
examples/kfto_feast_rag/README.md (2)
23-26
: Tighten wording & formatting for model / store / retriever bullets– Minor Markdown polish: wrap class names in back-ticks so they render as code, drop the possessive “Feast’s” (already implied by the names), and keep wording parallel.
– No functional change, but this avoids future lint warnings and improves readability.- - **Generator Model**: `granite-3.2-2b-instruct` (configurable) - - **Vector Store**: Feast’s built-in FeastVectorStore backed by Milvus - - **Retriever**: Feast’s native rag retriever FeastRAGRetriever + - **Generator Model**: `granite-3.2-2b-instruct` (configurable) + - **Vector Store**: `FeastVectorStore` (Milvus backend) + - **Retriever**: `FeastRAGRetriever`
48-50
: Fix filename markup & satisfy markdown-lintThe double underscores around
feature_repo/ragproject_repo.py
are interpreted as “bold & italics” by Markdown and trigger MD050 / MD004.
Prefer inline-code styling for paths and keep the bullet consistent with the others.-* **__feature_repo/ragproject_repo.py__** +* `feature_repo/ragproject_repo.py`
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
examples/kfto_feast_rag/README.md
(2 hunks)examples/kfto_feast_rag/feast_rag_retriever.py
(0 hunks)examples/kfto_feast_rag/feature_repo/feature_store.yaml
(1 hunks)examples/kfto_feast_rag/rag_feast_kfto.ipynb
(7 hunks)
💤 Files with no reviewable changes (1)
- examples/kfto_feast_rag/feast_rag_retriever.py
🚧 Files skipped from review as they are similar to previous changes (2)
- examples/kfto_feast_rag/feature_repo/feature_store.yaml
- examples/kfto_feast_rag/rag_feast_kfto.ipynb
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
examples/kfto_feast_rag/README.md
48-48: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
🔇 Additional comments (1)
examples/kfto_feast_rag/README.md (1)
54-60
: Inconsistent wording: still refers to a “custom RagRetriever”Earlier you state the example now uses the built-in
FeastRAGRetriever
; calling it “custom” here is confusing. Please align with the updated wording.- - Inference utilising a custom RagRetriever: FeastRagRetriever + - Inference utilising the built-in `FeastRAGRetriever`
/lgtm |
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
99d68d9
to
a77fddf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/kfto_feast_rag/README.md (1)
48-48
: Fix markdown formatting issues.Static analysis flagged several formatting inconsistencies on this line that should be addressed for consistency.
Apply this diff to fix the formatting issues:
-* **__feature_repo/ragproject_repo.py__** +- **feature_repo/ragproject_repo.py**
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
examples/kfto_feast_rag/README.md
(2 hunks)examples/kfto_feast_rag/feast_rag_retriever.py
(0 hunks)examples/kfto_feast_rag/feature_repo/feature_store.yaml
(1 hunks)examples/kfto_feast_rag/rag_feast_kfto.ipynb
(7 hunks)
💤 Files with no reviewable changes (1)
- examples/kfto_feast_rag/feast_rag_retriever.py
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/kfto_feast_rag/feature_repo/feature_store.yaml
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
examples/kfto_feast_rag/README.md
48-48: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
48-48: Strong style
Expected: asterisk; Actual: underscore
(MD050, strong-style)
🔇 Additional comments (10)
examples/kfto_feast_rag/README.md (1)
23-25
: Component descriptions accurately reflect the migration to native Feast components.The updated descriptions correctly document the transition from custom implementations to Feast's built-in components, which aligns with the broader refactoring goals of this PR.
examples/kfto_feast_rag/rag_feast_kfto.ipynb (9)
23-23
: Installation simplified appropriately.Removing the explicit
faiss-cpu
dependency streamlines the installation process while maintaining functionality through the Feast dependencies.
52-58
: Clear explanation of the chunking strategy.The added markdown cell provides valuable context about the chunking approach, explaining the character limit and word-boundary preservation strategy.
65-96
: Improved chunking logic with better word boundary handling.The new chunking implementation correctly preserves word boundaries and ensures chunks don't exceed the character limit, which is an improvement over the previous approach.
140-147
: Good practice to organize data in dedicated directory.Creating a dedicated
data
directory withinfeature_repo
improves project organization and separation of concerns.
174-174
: Data path updated consistently with new directory structure.The parquet file path correctly reflects the new
feature_repo/data/
directory structure.
260-262
: Import statements correctly updated for native Feast components.The imports now correctly reference the native Feast modules and the corrected feature repository filename, aligning with the migration away from custom implementations.
275-278
: FeastVectorStore initialization updated appropriately.The initialization now uses
repo_path="."
instead of a store object and includes the passage_id in the features list, which is consistent with the native Feast implementation.
280-280
: FeastIndex initialization simplified correctly.Removing the
vector_store
argument simplifies the initialization consistent with the native Feast implementation.
292-298
: FeastRAGRetriever initialization updated with correct parameters.The retriever initialization now uses the correct parameters for the native Feast implementation, including the explicit
feature_view
andfeatures
parameters and the newtext_field
argument.
" if current_chunk_words:\n", | ||
" chunk_text = ' '.join(current_chunk_words)\n", | ||
" all_chunks.append(chunk_text)\n", | ||
" all_ids.append(f\"{examples['id'][i]}_{len(all_chunks)}\") # Unique ID for the chunk\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential issue with chunk ID generation.
The chunk ID generation uses len(all_chunks)
which includes the current chunk being added, potentially causing inconsistent or duplicate IDs.
Apply this diff to fix the ID generation:
- all_ids.append(f"{examples['id'][i]}_{len(all_chunks)}") # Unique ID for the chunk
+ all_ids.append(f"{examples['id'][i]}_{len(all_chunks) + 1}") # Unique ID for the chunk
- all_ids.append(f"{examples['id'][i]}_{len(all_chunks)}") # Unique ID for the chunk
+ all_ids.append(f"{examples['id'][i]}_{len(all_chunks) + 1}") # Unique ID for the chunk
Also applies to: 94-94
🤖 Prompt for AI Agents
In examples/kfto_feast_rag/rag_feast_kfto.ipynb at lines 83 and 94, the chunk ID
generation uses len(all_chunks) which counts the current chunk being added,
risking duplicate or inconsistent IDs. To fix this, replace len(all_chunks) with
the index of the chunk before appending it, ensuring each chunk ID is unique and
consistent by using the correct chunk index value.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: astefanutti The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Description
Following on from merging of these PR's (feast-dev/feast#5405, feast-dev/feast#5470) in the feast repo, this PR updates the feast rag example in accordance with these changes.
How Has This Been Tested?
I have tested this manually on the team cluster, by cloning this PR and running the notebook.
Merge criteria:
Summary by CodeRabbit