Issues with large document sets and source references #791
Unanswered
octoberweb69 asked this question in Q&A
Replies: 0 comments
Hi everyone,
I’m using Kotaemon and have run into two issues. I’d like to know whether others have seen the same behavior and whether there are fixes or configuration changes that help.
Issue 1: Source prioritization
I uploaded about 13,000 documents.
When I run queries, the system mostly selects documents that were uploaded first, while documents uploaded later are rarely considered.
It looks like the earliest indexed sources are being prioritized in some way.
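One rough way to put a number on this, assuming the upload order and the documents retrieved for each query can be exported as plain lists of names. The function and argument names below are hypothetical and are not part of Kotaemon:

```python
from statistics import mean


def upload_rank_bias(upload_order, retrieved_per_query):
    """Mean normalized upload position of all retrieved documents.

    upload_order: document names in the order they were uploaded (assumed export).
    retrieved_per_query: one list of retrieved document names per query.
    Returns a value in [0, 1]: 0 means only the first upload was retrieved,
    1 means only the last. If retrieval hit every document equally often,
    the value would be 0.5.
    """
    n = len(upload_order)
    scale = max(n - 1, 1)  # avoid division by zero for a single document
    position = {doc: i / scale for i, doc in enumerate(upload_order)}
    hits = [
        position[doc]
        for docs in retrieved_per_query
        for doc in docs
        if doc in position
    ]
    if not hits:
        raise ValueError("no retrieved documents matched the upload list")
    return mean(hits)
```

A value well below 0.5 over a varied set of queries would support the impression that early uploads are favored. It would not explain why, and a corpus whose early documents are simply more relevant would produce the same result.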
Issue 2: Source references in chatbot answers
In the chatbot’s response, the inline reference markers (bullets) for sources sometimes point to the wrong source document.
This makes it hard to verify the validity of the answer, since the source ↔ content mapping is inaccurate.
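A minimal sketch of how the mismatches could be counted, assuming the cited snippets and the full text of each source are available as plain strings. The data layout and function names are assumptions for illustration, not Kotaemon's API:

```python
def _normalize(text):
    # Lowercase and collapse whitespace so line breaks do not cause false mismatches.
    return " ".join(text.lower().split())


def find_mismatched_citations(citations, documents):
    """Return citations whose snippet does not occur in the cited document.

    citations: list of (source_name, quoted_snippet) pairs taken from an answer.
    documents: dict mapping source_name to the full text of that source.
    A citation to an unknown source is also reported as mismatched.
    """
    mismatched = []
    for source_name, snippet in citations:
        doc_text = documents.get(source_name, "")
        if _normalize(snippet) not in _normalize(doc_text):
            mismatched.append((source_name, snippet))
    return mismatched
```

This only catches verbatim quotes that are attached to the wrong file. Paraphrased evidence would need a fuzzy or embedding-based comparison instead.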
Has anyone else noticed these behaviors?
Do you have suggestions for fixes or any known workarounds?
Thanks! 🙏