Description
Describe the bug
There's an issue in the logic at:
tesseract.js/src/worker-script/index.js
Lines 101 to 104 in adcb5b8
if (path !== null) {
  const resp = await (isWebWorker ? fetch : adapter.fetch)(`${path}/${lang}.traineddata${gzip ? '.gz' : ''}`);
  data = await resp.arrayBuffer();
} else {
It doesn't check whether the response was successful, so the body of a failed response is cached indefinitely. This can happen if the server is down or there's a temporary 404 error, and it leaves clients broken until the IndexedDB cache is cleared.
This results in the following error being logged on all subsequent usages even if the server error was resolved:
tesseract-core.wasm.js:9 Error opening data file ./mrza.traineddata
tesseract-core.wasm.js:9 Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
tesseract-core.wasm.js:9 Failed loading language 'mrza'
tesseract-core.wasm.js:9 Tesseract couldn't load any languages!
To Reproduce
Use an invalid language (for example, mrza) and notice that the failed response is cached in IndexedDB.
Expected behavior
An error should be thrown.
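As a rough sketch of what that could look like (not necessarily the shape of the eventual fix), the fetch step could check the standard `Response.ok` flag before reading the body, so a failed response never reaches the cache. The helper name `fetchTrainedData` and its parameters are hypothetical and exist only for illustration; the fetch function is passed in so the snippet works with either `fetch` or `adapter.fetch`.

```javascript
// Hypothetical helper, not part of tesseract.js: name and signature are illustrative.
// fetchFn stands in for either the global fetch or adapter.fetch.
async function fetchTrainedData(fetchFn, path, lang, gzip) {
  const url = `${path}/${lang}.traineddata${gzip ? '.gz' : ''}`;
  const resp = await fetchFn(url);

  // Response.ok is true only for 2xx statuses, so a 404 or 5xx body
  // is rejected here instead of being returned to the caller.
  if (!resp.ok) {
    throw new Error(`Failed to fetch ${url}: response code ${resp.status}`);
  }

  return resp.arrayBuffer();
}
```

Because the function throws before returning any data, the caller has nothing to write to IndexedDB for a failed request, and a later attempt can fetch the file again once the server has recovered.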
Screenshots
In the screenshot below, mrza is a file that didn't exist. Notice the length of the Uint8Array: it's the body of a 404 response.
Additional context
I'll open a PR with a fix shortly.