Feat: Add first-class LanceDB retrievers and prompt augmentation across LLM operators #460

shreyashankar · 2025-11-22T20:03:08Z

This change introduces first-class retrievers powered by LanceDB (OSS) and integrates retrieval-based context into LLM operations. Pipelines can now define retrievers once at the top level, specifying what to index and what to query via clear Jinja phrases (always using input.*). At runtime, any map, extract, reduce, or filter op can attach retriever: name; the op then fetches FTS, embedding, or hybrid results from LanceDB and augments the prompt through {{ retrieval_context }} (or, if the prompt omits that placeholder, by safely prepending the retrieved context). The implementation follows LanceDB’s native FTS and hybrid search patterns, including explicit FTS query mode and RRF-based hybrid reranking, in line with the docs: https://lancedb.com/docs/search/hybrid-search/.
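To make the augmentation rule above concrete, here is a minimal sketch of the "placeholder or prepend" behavior. It is not the code in this PR: `augment_prompt`, the bullet formatting, and the "Retrieved context:" header are all hypothetical, and it uses a plain regex substitution where the real implementation presumably renders the prompt with Jinja.

```python
import re

# Hypothetical helper for illustration only; not DocETL's actual code.
# Matches the {{ retrieval_context }} placeholder with optional inner spaces.
_PLACEHOLDER = re.compile(r"\{\{\s*retrieval_context\s*\}\}")


def augment_prompt(prompt_template: str, retrieved: list[str]) -> str:
    """Inject retrieved snippets into a prompt.

    If the template contains the placeholder, substitute the context there.
    Otherwise prepend the context so the op still sees it.
    """
    context = "\n".join(f"- {doc}" for doc in retrieved)
    if _PLACEHOLDER.search(prompt_template):
        # A callable replacement avoids backslash escapes in `context`
        # being interpreted by re.sub.
        return _PLACEHOLDER.sub(lambda _match: context, prompt_template)
    # Assumed fallback format; the real prepend text may differ.
    return f"Retrieved context:\n{context}\n\n{prompt_template}"
```

With this sketch, a prompt that names the placeholder controls where the context lands, while a prompt that omits it still receives the context at the top.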

Overall, this PR adds a RAG primitive without per-op overrides, keeps ops simple, and builds indexes once. It also ships documentation (a parameter reference and examples) and tests for FTS, embedding, and hybrid retrieval. The lancedb dependency is optional and installed via the “retrieval” extra.
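For readers unfamiliar with the RRF reranking used for hybrid retrieval, the sketch below shows generic reciprocal rank fusion in plain Python. It illustrates the technique only; it is not LanceDB's reranker or the code in this PR, and the function name `rrf_fuse` and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Higher fused score ranks first.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Example: one ranking from full-text search, one from embedding search.
fts_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = rrf_fuse([fts_hits, vector_hits])
```

Documents that appear in both lists ("b" and "c" here) rise above documents found by only one search mode, which is the point of hybrid reranking.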

shreyashankar added 2 commits November 22, 2025 02:23

feat: adding retrievers

53b4a17

feat: allow indexes to be built on datasets created by docetl pipelines

ba46f45
