Bangla OCR

Dataset Description:

BanglaWriting

Process

Preprocessing

The raw dataset is not preprocessed, so the word images have to be extracted first. Word images are cropped from the raw image folder using the provided JSON file, and each cropped image is binarized with Otsu's binarization during extraction. Each extracted file is named according to the pattern below.

"পরিবার 18__225_15_1.jpg" as "label wordNumberOfThePage__uniquePersonNumber_age_gender.extension"

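As a rough illustration of the two preprocessing ideas above, the sketch below picks an Otsu threshold for a grayscale crop and parses the filename pattern back into its fields. It is not the code in generator.py: the function names, the list-of-rows image representation, and the returned dictionary keys are assumptions made for this example. With OpenCV, the same binarization is normally done by passing the cv2.THRESH_OTSU flag to cv2.threshold; plain Python is used here only to show what the threshold selection does.

```python
import os


def otsu_threshold(pixels):
    """Return the Otsu threshold (0-255) for an iterable of 8-bit gray values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the lower class yet
        if w1 == 0:
            break     # every pixel is already in the lower class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(rows):
    """Binarize a grayscale image given as a list of rows of 0-255 ints."""
    t = otsu_threshold(p for row in rows for p in row)
    return [[255 if p > t else 0 for p in row] for row in rows]


def parse_filename(filename):
    """Split 'label wordNumber__person_age_gender.ext' into its fields.

    The key names are hypothetical; the gender value is kept as the raw
    integer code because the README does not say what the codes mean.
    """
    stem, extension = os.path.splitext(filename)
    label, rest = stem.rsplit(" ", 1)        # the label may itself contain spaces
    word_number, meta = rest.split("__")
    person_id, age, gender = meta.split("_")
    return {
        "label": label,
        "word_number": int(word_number),
        "person_id": int(person_id),
        "age": int(age),
        "gender": int(gender),
        "extension": extension.lstrip("."),
    }
```

For the example filename above, parse_filename returns the label "পরিবার" with word number 18, person 225, age 15 and gender code 1.
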
Model

CRNN = CNN + bidirectional GRU

Loss Function

CTC Loss
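
A network trained with CTC loss emits one class (or a special blank) per time step, so its output has to be collapsed before it reads as a word. The sketch below shows the common greedy rule: merge consecutive repeats, then drop blanks. It is an assumption about how decoding could be done, not the notebook's actual code. The function name and the index-to-character mapping are hypothetical, and the blank index 0 simply mirrors the default of torch.nn.CTCLoss.

```python
def ctc_greedy_decode(step_indices, alphabet, blank=0):
    """Collapse per-time-step class indices into a string.

    step_indices: best class index at each time step (e.g. an argmax).
    alphabet: hypothetical mapping from class index to character.
    blank: index reserved for the CTC blank symbol (assumed to be 0).
    """
    chars = []
    prev = None
    for idx in step_indices:
        # Keep a symbol only if it is not blank and not a repeat of the
        # immediately preceding time step.
        if idx != blank and idx != prev:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)
```

A blank between two identical indices is what lets a doubled character survive: the sequence 1, 1, 0, 1 decodes to two copies of character 1, while 1, 1, 1 decodes to one.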

Optimizer

Adam

Usage

Download the dataset from the provided link, unzip the "raw" file into the current directory, and run:

python generator.py

Finally, run the OCR.ipynb notebook.

Requirements

python==3.7.0
numpy==1.16.0
scikit-learn==0.23.2
opencv-python==4.4.0.46
torch==1.7.0
tqdm==4.53.0

Further Improvement can be done through:

Preprocessing such as skew correction, noise removal, thinning and skeletonization
Gathering and/or generating synthetic data
Making the dataset balanced
Using Focal CTC loss to overcome the class imbalance problem
Using edit distance to predict the nearest word
Using a better optimizer such as RAdam
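
The edit-distance idea in the list above could look roughly like the sketch below: compute the Levenshtein distance between the raw prediction and every word in a known vocabulary, then return the closest one. The function names and the vocabulary are hypothetical, and the comparison works on Unicode code points, which may not match how a reader perceives Bangla characters that are built from several code points.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_word(prediction, vocabulary):
    """Return the vocabulary word with the smallest edit distance."""
    return min(vocabulary, key=lambda word: edit_distance(prediction, word))
```

A linear scan like this is fine for a small word list; a larger vocabulary would likely need an index such as a BK-tree.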
