sazzadhrz/Bangla-OCR

Bangla OCR

Dataset Description:

BanglaWriting

Process

Preprocessing

The dataset is distributed unprocessed, so it needs preprocessing before training. Word images are extracted from the raw image folder using the provided JSON file. During extraction, each cropped image is binarized using Otsu’s binarization technique. The filenames of the extracted images follow the convention below.

"পরিবার 18__225_15_1.jpg" follows the pattern "label wordNumberOfThePage__uniquePersonNumber_age_gender.extension"
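The two preprocessing ideas above can be sketched in plain Python. This is only an illustration, not the code in generator.py: `parse_filename`, `WordSample`, and `otsu_threshold` are hypothetical names, the gender code is kept as a raw integer because its meaning is not stated here, and the threshold search works on a flat list of 8-bit grayscale values. With opencv-python, the same threshold is normally obtained by calling `cv2.threshold` with the `cv2.THRESH_BINARY + cv2.THRESH_OTSU` flags.

```python
import os
from typing import List, NamedTuple


class WordSample(NamedTuple):
    """Fields encoded in an extracted word-image filename (hypothetical container)."""
    label: str
    word_number: int
    person_id: int
    age: int
    gender: int  # raw code from the filename; its meaning is not documented here


def parse_filename(filename: str) -> WordSample:
    """Split 'label wordNo__personNo_age_gender.ext' into its parts."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    # The label and the metadata are separated by the last space in the stem.
    label, meta = stem.rsplit(" ", 1)
    word_number, person_part = meta.split("__")
    person_id, age, gender = person_part.split("_")
    return WordSample(label, int(word_number), int(person_id), int(age), int(gender))


def otsu_threshold(pixels: List[int]) -> int:
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and intensity sum of the class <= t
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t; nothing left to separate
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels: List[int], threshold: int) -> List[int]:
    """Map each pixel to 255 if it is above the threshold, else 0."""
    return [255 if p > threshold else 0 for p in pixels]
```

For example, `parse_filename("পরিবার 18__225_15_1.jpg")` yields the label "পরিবার", word number 18, person 225, age 15, and gender code 1.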

Model

  • CRNN = CNN + BiDirectional GRU

Loss Function

  • CTC Loss
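A model trained with CTC loss emits one score vector per time step, including a special blank class, and the text is recovered by collapsing that frame-level output. The sketch below shows greedy (best-path) decoding in plain Python. It is an illustration only: `ctc_greedy_decode` is a hypothetical helper, and the choice of index 0 for the blank is an assumption that matches the default of `torch.nn.CTCLoss`, not something confirmed for this repository.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding.

    frame_scores[t][c] is the score of class c at time step t.
    Class c > 0 corresponds to alphabet[c - 1]; class 0 is the blank.
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path: List[int] = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]

    # 2. Collapse consecutive repeats, then drop blanks.
    decoded = []
    prev = BLANK
    for idx in best_path:
        if idx != BLANK and idx != prev:
            decoded.append(alphabet[idx - 1])
        prev = idx
    return "".join(decoded)
```

A blank between two identical characters keeps both of them, which is how CTC represents doubled letters: the best path `a a blank a b b` decodes to "aab".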

Optimizer

  • Adam

Usage

Download the dataset from the provided link, unzip the "raw" file into the current directory, and run:

python generator.py

Finally, run the notebook.

Requirements

  • python==3.7.0
  • numpy==1.16.0
  • scikit-learn==0.23.2
  • opencv-python==4.4.0.46
  • torch==1.7.0
  • tqdm==4.53.0

Further improvements can be made through:

  • Preprocessing such as skew correction, noise removal, thinning and skeletonization
  • Gathering and/or generating synthetic data
  • Making the dataset balanced
  • Using Focal CTC loss to overcome class imbalance problem
  • Using edit distance to predict the nearest word
  • Using a better optimizer such as RAdam
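The edit-distance idea in the list above could look roughly like the following sketch, which snaps a raw prediction to the closest entry in a known word list. `levenshtein` and `nearest_word` are hypothetical helpers, and the vocabulary is assumed to be available as a list of strings. The distance is computed over Unicode code points, so a Bangla character built from several code points counts as several symbols.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(
                min(
                    prev[j] + 1,               # delete ca
                    cur[j - 1] + 1,            # insert cb
                    prev[j - 1] + (ca != cb),  # substitute, or match at no cost
                )
            )
        prev = cur
    return prev[-1]


def nearest_word(prediction: str, vocabulary: Iterable[str]) -> str:
    """Return the vocabulary entry with the smallest edit distance to the prediction."""
    return min(vocabulary, key=lambda word: levenshtein(prediction, word))
```

With this sketch, a prediction that drops one vowel sign, such as "পরিবর", maps back to "পরিবার" when that word is in the vocabulary. Ties are resolved in favor of the first matching entry, so the order of the word list matters.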

References

  1. Handwriting to Text Conversion using Time Distributed CNN and LSTM with CTC Loss Function

  2. Use PyTorch’s DataLoader with Variable Length Sequences for LSTM/GRU

  3. Data Preparation for Variable Length Input Sequences

  4. Captcha recognition using PyTorch (Convolutional-RNN + CTC Loss)

  5. Image Text Recognition

  6. Sequence Modeling: Recurrent and Recursive Nets

About

Bangla handwriting recognition utilizing BanglaWriting dataset.
