Hi there!
I'm using DIGITS for object detection with three custom classes and custom image sizes. My training dataset is unbalanced: 4k images for Class1, 1.5k images for Class2, and 400 images for Class3. Also, the bounding boxes are much smaller than 25x25 pixels.
First, I tried the following:
- When creating the dataset, I resized the images to 1024x1024 and entered "DontCare,Class1,Class2,Class3" as the custom classes. Resizing to 1024x1024 doubled the size of most of the images and therefore of their bounding boxes (I was trying to get the boxes into the 50x50 - 400x400 range).
- I modified the prototxt, following the 2-class prototxt example, to allow 3-class detection. I also changed the image size in it to 1024x1024.
- Also in the prototxt, I set the probabilities in the data_augmentation layer to zero. Without this change, the model never gets a mAP greater than zero for any of the three classes.
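To make the resize arithmetic from the first step concrete, here is a minimal sketch. It is not the actual DIGITS preprocessing; the function names and the 512x512 source size are assumptions chosen only for illustration. It shows that a box smaller than 25x25 that gets doubled is still smaller than 50x50, so it would remain below the lower end of the 50x50 - 400x400 range.

```python
def scale_box(box, src_size, dst_size):
    """Scale an (xmin, ymin, xmax, ymax) box from src_size to dst_size.

    Sizes are (width, height) tuples in pixels.
    """
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


def box_size(box):
    """Return (width, height) of an (xmin, ymin, xmax, ymax) box."""
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin, ymax - ymin)


# Hypothetical example: a 20x20 box in a 512x512 image resized to 1024x1024.
scaled = scale_box((100, 100, 120, 120), (512, 512), (1024, 1024))
print(box_size(scaled))  # (40.0, 40.0)
```

Running the same check over all labels would show how many boxes actually land inside the target range after resizing.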
I have trained for approximately 800 epochs in stages. First, I trained 100 epochs starting from the bvlc_googlenet.caffemodel weights. Then I trained 300 more epochs using the resulting model as the pre-trained model. Finally, I trained 450 more epochs, again using the previous model as the pre-trained model. With this approach I got the following mAP: Class1 = 53.75%, Class2 = 24.42%, Class3 = 0.91%.
After this, I tried to create a balanced dataset by reducing the number of images for Class1 and Class2 to match the number of images for Class3. The total number of images for this attempt was 1100. I trained 500 epochs and got the following mAP: Class1 = 25.6%, Class2 = 24.22%, and Class3 = 5.55%. This increased the mAP for Class3, but Class1 dropped by roughly half and Class2 stayed about the same.
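The balancing step can be sketched roughly as follows. This is only an illustration under a simplifying assumption: each image is treated as belonging to exactly one class, which may not hold for detection data where one image can contain several classes. The function name, file names, and toy counts are hypothetical.

```python
import random
from collections import defaultdict


def undersample(image_labels, seed=0):
    """Keep the same number of images per class (the size of the rarest class).

    image_labels maps image name -> class name (one class per image assumed).
    """
    by_class = defaultdict(list)
    for name, cls in sorted(image_labels.items()):
        by_class[cls].append(name)
    target = min(len(names) for names in by_class.values())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    kept = {}
    for cls, names in by_class.items():
        for name in rng.sample(names, target):
            kept[name] = cls
    return kept


# Toy counts (hypothetical), scaled down from an unbalanced dataset.
labels = {}
for cls, count in (("Class1", 40), ("Class2", 15), ("Class3", 4)):
    for i in range(count):
        labels[f"{cls}_{i:03d}.png"] = cls

balanced = undersample(labels)
print(len(balanced))  # 12
```

The trade-off is visible in the numbers above: undersampling throws away most of the Class1 and Class2 images, which is consistent with the drop in Class1 mAP.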
Later, I tried offline data augmentation for Class3. I generated 6 new images for each original image by applying: horizontal flip, shear, rotation by 60 degrees, rotation by 10 degrees, and translation. This increased my dataset size to 6k images. It is still training, but after 45 epochs the preliminary results are: mAP: Class1 = 30.98%, Class2 = 20.73%, and Class3 = 4.28%. Below is the chart from DIGITS.
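For offline augmentation, the labels have to be transformed together with the images. The sketch below is an assumption about how that could be done, not the tool used above. It covers two of the listed transforms for (xmin, ymin, xmax, ymax) boxes in continuous pixel coordinates; the function names and the 512x512 image size are hypothetical. Note that the axis-aligned box enclosing a rotated box is larger than the original, so labels get looser as the rotation angle grows.

```python
import math


def hflip_box(box, image_width):
    """Mirror an (xmin, ymin, xmax, ymax) box about the vertical centre line."""
    xmin, ymin, xmax, ymax = box
    return (image_width - xmax, ymin, image_width - xmin, ymax)


def rotate_box(box, angle_deg, center):
    """Rotate the four corners about `center`; return the enclosing axis-aligned box."""
    xmin, ymin, xmax, ymax = box
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    xs, ys = [], []
    for x, y in ((xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)):
        dx, dy = x - cx, y - cy
        xs.append(cx + dx * cos_t - dy * sin_t)
        ys.append(cy + dx * sin_t + dy * cos_t)
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 20x20 box in a 512x512 image.
box = (100, 200, 120, 220)
print(hflip_box(box, 512))  # (392, 200, 412, 220)

for angle in (10, 60):
    x0, y0, x1, y1 = rotate_box(box, angle, (256, 256))
    # Enclosing width and height of the rotated box, in pixels.
    print(angle, round(x1 - x0, 1), round(y1 - y0, 1))
```

For a square box of side w rotated by an angle t, the enclosing box has side w * (|cos t| + |sin t|), so the 60 degree rotation inflates the label noticeably more than the 10 degree one.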

Based on your experience, what could be the problem? What approach could I use to increase the mAP for Class3?
Thanks in advance!