Optimizing-rag-performance-in-librechat.md

voelspriet · web-flow · commit 081c059a4e31 · 2025-04-05T08:24:53.000+02:00
This PR adds a detailed blog post written by Henk van Ess that covers how to optimize Retrieval-Augmented Generation (RAG) performance in LibreChat.

The guide walks through:

- Improving vector database performance (PostgreSQL/pgvector)
- Chunking strategies (CHUNK_SIZE / CHUNK_OVERLAP)
- Embedding provider options (OpenAI, Azure, Ollama)
- Retrieval settings (RAG_API_TOP_K)
- Monitoring and server resource tips

It's designed to help developers fine-tune their LibreChat instances for speed and quality. All content is based on hands-on testing and is Markdown-formatted for blog use.
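
For reviewers less familiar with the chunking settings, here is a rough sketch of how a chunk size and an overlap interact. It is not LibreChat's actual splitter: the `chunk_text` function is hypothetical, and it assumes both values are measured in characters, which may not match how the RAG API interprets `CHUNK_SIZE` and `CHUNK_OVERLAP`.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters.

    Hypothetical illustration only; sizes are assumed to be in characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text, so the
        # final chunk is never a pure repeat of the previous chunk's tail.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(piece)
```

A larger overlap keeps more context across chunk boundaries but produces more chunks to embed and store, which is the trade-off the post asks readers to tune.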

Looking forward to feedback — happy to revise if needed!
diff --git a/pages/blog/optimizing-rag-performance-in-librechat.md b/pages/blog/optimizing-rag-performance-in-librechat.md
@@ -67,7 +67,7 @@ Check with `\di` again. Look for a `hnsw` or `ivfflat` index type.
 docker stats vectordb
 ```
 
-Watch for memory or CPU saturation. PostgreSQL benefits from abundant RAM.
+Watch for memory and/or CPU saturation. PostgreSQL benefits from abundant RAM.
 
 #### Optional: Set resource limits in `docker-compose.override.yml`