Detectnet, Multiple class object detection with an imbalanced dataset, Poor results (mAP) for one class

Hi there!

I'm using DIGITS for object detection with three custom classes and custom image sizes. My training dataset is unbalanced: 4k images for Class1, 1.5k images for Class2, and 400 images for Class3. Also, the bounding boxes are much smaller than 25x25 pixels.
First, I tried the following:

- When creating the dataset, I resized the images to 1024x1024 and set the custom classes to "DontCare,Class1,Class2,Class3". Resizing to 1024x1024 doubled the size of most of the images, and of their bounding boxes, to try to bring them toward the 50x50 - 400x400 range.
- I modified the prototxt, following the 2-class prototxt example, to allow 3-class detection. I also changed the image size to 1024x1024.
- Also in the prototxt, I set the probabilities in the data augmentation layer to zero. If I do not change this, the model never reaches a mAP greater than zero for any of the three classes.
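A minimal sketch of how the label boxes scale with the resize step described above. It assumes KITTI-style label lines (the format DIGITS uses for object detection datasets), where fields 4 to 7 hold left, top, right, bottom. The function names, the 512x512 source size, and the example label are hypothetical and only serve as illustration.

```python
def scale_kitti_line(line, sx, sy):
    """Scale the 2D box (fields 4-7: left, top, right, bottom) of one KITTI label line."""
    fields = line.split()
    left, top, right, bottom = (float(v) for v in fields[4:8])
    fields[4:8] = [
        f"{left * sx:.2f}", f"{top * sy:.2f}",
        f"{right * sx:.2f}", f"{bottom * sy:.2f}",
    ]
    return " ".join(fields)


def box_size(line):
    """Return (width, height) in pixels of the 2D box in a KITTI label line."""
    fields = line.split()
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return right - left, bottom - top


def in_range(line, lo=50.0, hi=400.0):
    """True if both box sides fall inside the 50-400 px range mentioned above."""
    width, height = box_size(line)
    return lo <= width <= hi and lo <= height <= hi


# Hypothetical 20x20 box in a 512x512 image that is resized to 1024x1024.
label = "Class3 0.0 0 0.0 100.00 120.00 120.00 140.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
scaled = scale_kitti_line(label, 1024 / 512, 1024 / 512)
print(box_size(scaled), in_range(scaled))  # (40.0, 40.0) False
```

Under these assumptions, a box smaller than 25x25 is still smaller than 50x50 after doubling, so a check like `in_range` may help to count how many boxes per class actually reach the target range.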

I have trained for approximately 850 epochs in total. First, I trained 100 epochs starting from the bvlc_googlenet.caffemodel weights. Then I trained 300 more epochs using the resulting model as the pre-trained model. Finally, I trained another 450 epochs, again using the previous model as the pre-trained model. With this approach I got the following mAP: **Class1 = 53.75%, Class2 = 24.42%, Class3 = 0.91%.**

After this, I tried to create a balanced dataset by reducing the number of images for Class1 and Class2 to match the number of images for Class3. The total number of images for this attempt was 1100. I trained 500 epochs and got the following mAP: **Class1 = 25.6%, Class2 = 24.22%, and Class3 = 5.55%.** This increased the mAP for Class3, but Class1 dropped by roughly half and Class2 stayed about the same.
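The balancing step above amounts to undersampling the larger classes. A minimal sketch, assuming each image belongs to exactly one class and that images are tracked as lists of file paths per class; `undersample` and the file names are hypothetical, and the per-class counts are the ones quoted earlier in this post.

```python
import random


def undersample(images_by_class, seed=0):
    """Randomly keep as many images per class as the smallest class has."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    target = min(len(paths) for paths in images_by_class.values())
    return {
        cls: sorted(rng.sample(paths, target))
        for cls, paths in images_by_class.items()
    }


# Hypothetical file lists using the per-class counts from this post.
dataset = {
    "Class1": [f"class1_{i:04d}.png" for i in range(4000)],
    "Class2": [f"class2_{i:04d}.png" for i in range(1500)],
    "Class3": [f"class3_{i:04d}.png" for i in range(400)],
}
balanced = undersample(dataset)
for cls, paths in balanced.items():
    print(cls, len(paths))
```

The trade-off is visible in the numbers above: undersampling discards most of the Class1 images, which may be why the Class1 mAP fell while Class3 improved.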

Later, I tried offline data augmentation for Class3. I generated 6 new images for each image by applying: horizontal flip, shear, rotation by 60 degrees, rotation by 10 degrees, and translation. With this, I increased my dataset size to 6k images. Training is still running, but after 45 epochs the preliminary results are: mAP: **Class1 = 30.98%, Class2 = 20.73%, and Class3 = 4.28%**. Below is the chart from DIGITS.

<img width="717" alt="image" src="https://user-images.githubusercontent.com/8478711/117376191-3b690d80-ae96-11eb-80e0-e459efad254b.png">
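With offline augmentation, the bounding boxes have to be transformed together with the pixels, otherwise the labels no longer match the objects. A minimal sketch for two of the transforms listed above (horizontal flip and rotation), assuming boxes are `(left, top, right, bottom)` in pixels and that rotation is about the image center. The function names are hypothetical, and the rotation direction must match whatever convention the image library uses.

```python
import math


def hflip_box(box, image_width):
    """Mirror a (left, top, right, bottom) box for a horizontally flipped image."""
    left, top, right, bottom = box
    return (image_width - right, top, image_width - left, bottom)


def rotate_box(box, angle_deg, image_width, image_height):
    """Rotate the box corners about the image center and return the
    axis-aligned box that encloses them (image coordinates, y pointing down)."""
    left, top, right, bottom = box
    cx, cy = image_width / 2.0, image_height / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    xs, ys = [], []
    for x, y in ((left, top), (right, top), (right, bottom), (left, bottom)):
        dx, dy = x - cx, y - cy
        xs.append(cx + dx * cos_t - dy * sin_t)
        ys.append(cy + dx * sin_t + dy * cos_t)
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 20x20 box in a 1024x1024 image.
box = (500, 300, 520, 320)
print(hflip_box(box, 1024))
rotated = rotate_box(box, 60, 1024, 1024)
print(rotated[2] - rotated[0], rotated[3] - rotated[1])
```

Note that the enclosing box grows under rotation: a 20x20 box rotated by 60 degrees needs an axis-aligned box of about 27x27, so rotated labels are looser than the originals. Boxes that end up partly outside the image after rotation or translation would also need clipping or removal, which this sketch does not handle.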

My friends, based on your experience, what could be the problem? What approach could I use to increase the mAP for Class3?

Thanks in advance!

Detectnet, Multiple class object detection with an imbalanced dataset, Poor results (mAP) for one class #2241

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Detectnet, Multiple class object detection with an imbalanced dataset, Poor results (mAP) for one class #2241

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions