
Commit 4a34243

Add MS COCO utils
1 parent fcb68f0

File tree

4 files changed: +299 additions, -7 deletions

README.md

Lines changed: 14 additions & 7 deletions
@@ -299,23 +299,30 @@ with tf.Session() as sess:
   * `TinyYOLOv2VOC`: `TinyYOLOv2(inputs, TinyDarknet19)`,
   * `FasterRCNN_ZF_VOC`: `FasterRCNN(inputs, ZF)`,
   * `FasterRCNN_VGG16_VOC`: `FasterRCNN(inputs, VGG16, stem_out='conv5/3')`.
-- The mAPs were obtained with TensorNets on **PASCAL VOC2007 test set** and may slightly differ from the original ones.
+- The mAPs were obtained with TensorNets and may slightly differ from the original ones.
 - The test input sizes were the numbers reported as the best in the papers:
   * `YOLOv3`, `YOLOv2`: 416x416
   * `FasterRCNN`: min\_shorter\_side=600, max\_longer\_side=1000
 - The sizes stand for the rounded numbers of parameters.
 - The computation times were measured on NVIDIA Tesla P100 (3584 cores, 16 GB global memory) with cuDNN 6.0 and CUDA 8.0.
-  * Speed: milliseconds only for network inferences of a 416x416 single image
+  * Speed: milliseconds only for network inferences of a 416x416 or 608x608 single image
   * FPS: 1000 / speed

-| | mAP | Size | Speed | FPS | References |
+| PASCAL VOC2007 test | mAP | Size | Speed | FPS | References |
 |------------------------------------------------------------------------|--------|--------|-------|-------|------------|
-| [YOLOv3VOC](tensornets/references/yolos.py#L175) | 0.7423 | 62M | 24.09 | 41.51 | [[paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[darknet]](https://pjreddie.com/darknet/yolo/) [[darkflow]](https://github.com/thtrieu/darkflow) |
-| [YOLOv2VOC](tensornets/references/yolos.py#L195) | 0.7320 | 51M | 14.75 | 67.80 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
-| [TinyYOLOv2VOC](tensornets/references/yolos.py#L205) | 0.5303 | 16M | 6.534 | 153.0 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [YOLOv3VOC(416)](tensornets/references/yolos.py#L175) | 0.7423 | 62M | 24.09 | 41.51 | [[paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[darknet]](https://pjreddie.com/darknet/yolo/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [YOLOv2VOC(416)](tensornets/references/yolos.py#L195) | 0.7320 | 51M | 14.75 | 67.80 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [TinyYOLOv2VOC(416)](tensornets/references/yolos.py#L205) | 0.5303 | 16M | 6.534 | 153.0 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
 | [FasterRCNN\_ZF\_VOC](tensornets/references/rcnns.py#L151) | 0.4466 | 59M | 241.4 | 3.325 | [[paper]](https://arxiv.org/abs/1506.01497) [[caffe]](https://github.com/rbgirshick/py-faster-rcnn) [[roi-pooling]](https://github.com/deepsense-ai/roi-pooling) |
 | [FasterRCNN\_VGG16\_VOC](tensornets/references/rcnns.py#L187) | 0.6872 | 137M | 300.7 | 4.143 | [[paper]](https://arxiv.org/abs/1506.01497) [[caffe]](https://github.com/rbgirshick/py-faster-rcnn) [[roi-pooling]](https://github.com/deepsense-ai/roi-pooling) |

+| MS COCO val2014 | mAP | Size | Speed | FPS | References |
+|------------------------------------------------------------------------|--------|--------|-------|-------|------------|
+| [YOLOv3COCO(608)](tensornets/references/yolos.py#L167) | 0.6016 | 62M | 60.66 | 16.49 | [[paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[darknet]](https://pjreddie.com/darknet/yolo/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [YOLOv3COCO(416)](tensornets/references/yolos.py#L167) | 0.6028 | 62M | 40.23 | 24.85 | [[paper]](https://pjreddie.com/media/files/papers/YOLOv3.pdf) [[darknet]](https://pjreddie.com/darknet/yolo/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [YOLOv2COCO(608)](tensornets/references/yolos.py#L187) | 0.5189 | 51M | 45.88 | 21.80 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+| [YOLOv2COCO(416)](tensornets/references/yolos.py#L187) | 0.4922 | 51M | 21.66 | 46.17 | [[paper]](https://arxiv.org/abs/1612.08242) [[darknet]](https://pjreddie.com/darknet/yolov2/) [[darkflow]](https://github.com/thtrieu/darkflow) |
+
 ## News 📰

 - PNASNetlarge is released, [12 May 2018](https://github.com/taehoonlee/tensornets/commit/e2e0f0f7791731d3b7dfa989cae569c15a22cdd6).

@@ -329,6 +336,6 @@ with tf.Session() as sess:
 - Add image classification models (PolyNet).
 - Add object detection models (MaskRCNN, SSD).
 - Add image segmentation models (FCN, UNet).
-- Add image datasets (COCO, OpenImages).
+- Add image datasets (OpenImages).
 - Add style transfer examples which can be coupled with any network in TensorNets.
 - Add speech and language models with representative datasets (WaveNet, ByteNet).
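The FasterRCNN test-size rule noted above (scale the shorter side to 600 unless that would push the longer side past 1000) is the same scaling implemented by `load` in the new `tensornets/datasets/coco.py` below. A minimal illustration of the rule; `rescale_factor` is a hypothetical helper written for this example, not part of the commit:

```python
import numpy as np

def rescale_factor(height, width, min_shorter_side=600, max_longer_side=1000):
    # Scale the shorter side up to `min_shorter_side` ...
    scale = float(min_shorter_side) / np.min((height, width))
    # ... but cap the scale so the longer side stays within `max_longer_side`.
    if round(scale * np.max((height, width))) > max_longer_side:
        scale = float(max_longer_side) / np.max((height, width))
    return scale

print(rescale_factor(375, 500))   # 1.6    (500 * 1.6 = 800 <= 1000)
print(rescale_factor(600, 1200))  # 0.8333 (the longer-side cap kicks in)
```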

tensornets/datasets/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 from __future__ import absolute_import

+from . import coco
 from . import imagenet
 from . import voc

tensornets/datasets/coco.names

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
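These 80 names keep the darknet-style spellings (e.g. `motorbike`, `aeroplane`, `sofa`, `tvmonitor`) rather than the official COCO display names, and their order defines the class indices used throughout `coco.py`. A quick sketch, assuming the package is importable as `tensornets.datasets`:

```python
from tensornets.datasets import coco

print(len(coco.classnames))         # 80
print(coco.classidx('person'))      # 0
print(coco.classidx('toothbrush'))  # 79
```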

tensornets/datasets/coco.py

Lines changed: 204 additions & 0 deletions
@@ -0,0 +1,204 @@
"""Collection of MS COCO utils

The code was adapted from [py-faster-rcnn](https://github.com/
rbgirshick/py-faster-rcnn/blob/master/lib/datasets/voc_eval.py).
"""
from __future__ import division

import os
import numpy as np

try:
    import cv2
except ImportError:
    cv2 = None

try:
    from pycocotools.coco import COCO
except ImportError:
    COCO = None

try:
    xrange  # Python 2
except NameError:
    xrange = range  # Python 3


# Cache of `pycocotools.coco.COCO` handles, keyed by data_name.
metas = {}

with open(os.path.join(os.path.dirname(__file__), 'coco.names'), 'r') as f:
    classnames = [line.rstrip() for line in f.readlines()]


def classidx(classname):
    return dict((k, i) for (i, k) in enumerate(classnames))[classname]


def area(box):
    # Box area in the inclusive-pixel convention (+1 on each side).
    if box.ndim == 1:
        return (box[2] - box[0] + 1.) * (box[3] - box[1] + 1.)
    else:
        return (box[:, 2] - box[:, 0] + 1.) * (box[:, 3] - box[:, 1] + 1.)


def get_files(data_dir, data_name, total_num=None):
    assert COCO is not None, '`datasets.coco` requires `pycocotools`.'
    if data_name not in metas:
        metas[data_name] = COCO("%s/annotations/instances_%s.json" %
                                (data_dir, data_name))
    images = metas[data_name].imgs
    fileids = list(images.keys())  # list() so that slicing works on Python 3
    if total_num is not None:
        fileids = fileids[:total_num]
    files = [images[i]['file_name'] for i in fileids]
    return fileids, files


def get_annotations(data_dir, data_name, ids):
    assert COCO is not None, '`datasets.coco` requires `pycocotools`.'
    if data_name not in metas:
        metas[data_name] = COCO("%s/annotations/instances_%s.json" %
                                (data_dir, data_name))
    # Map the sparse COCO category ids to contiguous indices in [0, 80).
    cmap = dict([(b, a) for (a, b) in enumerate(metas[data_name].getCatIds())])
    annotations = {}
    for i in ids:
        annids = metas[data_name].getAnnIds(imgIds=i, iscrowd=None)
        objs = metas[data_name].loadAnns(annids)
        annotations[i] = [[] for _ in range(80)]
        width = metas[data_name].imgs[i]['width']
        height = metas[data_name].imgs[i]['height']
        for obj in objs:
            # Clip each box to the image boundary and keep only
            # non-degenerate ones.
            x1 = np.max((0, obj['bbox'][0]))
            y1 = np.max((0, obj['bbox'][1]))
            x2 = np.min((width - 1, x1 + np.max((0, obj['bbox'][2] - 1))))
            y2 = np.min((height - 1, y1 + np.max((0, obj['bbox'][3] - 1))))
            if obj['area'] > 0 and x2 >= x1 and y2 >= y1:
                obj_struct = {'bbox': [x1, y1, x2, y2]}
                cidx = cmap[obj['category_id']]
                annotations[i][cidx].append(obj_struct)
    return annotations


def load(data_dir, data_name, min_shorter_side=None, max_longer_side=1000,
         batch_size=1, total_num=None):
    assert cv2 is not None, '`load` requires `cv2`.'
    _, files = get_files(data_dir, data_name, total_num)
    total_num = len(files)

    # Note: images are yielded one at a time as [1, H, W, 3] arrays;
    # `batch_size` only controls the stride over the file list.
    for batch_start in range(0, total_num, batch_size):
        x = cv2.imread("%s/%s/%s" % (data_dir, data_name, files[batch_start]))
        # Scale the shorter side to `min_shorter_side`, but cap the scale
        # so that the longer side never exceeds `max_longer_side`.
        if min_shorter_side is not None:
            scale = float(min_shorter_side) / np.min(x.shape[:2])
        else:
            scale = 1.0
        if round(scale * np.max(x.shape[:2])) > max_longer_side:
            scale = float(max_longer_side) / np.max(x.shape[:2])
        x = cv2.resize(x, None, None, fx=scale, fy=scale,
                       interpolation=cv2.INTER_LINEAR)
        x = np.array([x], dtype=np.float32)
        scale = np.array([scale], dtype=np.float32)
        yield x, scale
        del x


def evaluate_class(ids, scores, boxes, annotations, files, ovthresh):
    """Computes a PASCAL-VOC-style 11-point interpolated AP for one class."""
    if scores.shape[0] == 0:
        return 0.0, np.zeros(len(ids)), np.zeros(len(ids))

    # Extract gt objects for this class. COCO has no difficulty flags,
    # so every ground-truth box counts (all marked 0).
    diff = [np.array([0 for obj in annotations[filename]])
            for filename in files]
    total = sum([sum(x == 0) for x in diff])
    detected = dict(zip(files, [[False] * len(x) for x in diff]))

    # Sort detections by decreasing confidence.
    sorted_ind = np.argsort(-scores)
    ids = ids[sorted_ind]
    boxes = boxes[sorted_ind, :]

    # Go down dets and mark TPs and FPs.
    tp_list = []
    fp_list = []
    for d in range(len(ids)):
        actual = np.array([x['bbox'] for x in annotations[ids[d]]])
        difficult = np.array([0 for x in annotations[ids[d]]])

        if actual.size > 0:
            # IoU between this detection and every gt box in its image.
            iw = np.maximum(np.minimum(actual[:, 2], boxes[d, 2]) -
                            np.maximum(actual[:, 0], boxes[d, 0]) + 1, 0)
            ih = np.maximum(np.minimum(actual[:, 3], boxes[d, 3]) -
                            np.maximum(actual[:, 1], boxes[d, 1]) + 1, 0)
            inters = iw * ih
            overlaps = inters / (area(actual) + area(boxes[d, :]) - inters)
            jmax = np.argmax(overlaps)
            ovmax = overlaps[jmax]
        else:
            ovmax = -np.inf

        # A detection is a true positive only the first time it matches
        # a not-yet-detected gt box with IoU above the threshold;
        # duplicates and low-overlap detections are false positives.
        tp = 0.
        fp = 0.
        if ovmax > ovthresh:
            if difficult[jmax] == 0:
                if not detected[ids[d]][jmax]:
                    tp = 1.
                    detected[ids[d]][jmax] = True
                else:
                    fp = 1.
        else:
            fp = 1.
        tp_list.append(tp)
        fp_list.append(fp)

    tp = np.cumsum(tp_list)
    fp = np.cumsum(fp_list)
    recall = tp / float(total)
    precision = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    # 11-point interpolation: mean of the maximum precision over the
    # recall levels 0.0, 0.1, ..., 1.0.
    ap = np.mean([0 if np.sum(recall >= t) == 0
                  else np.max(precision[recall >= t])
                  for t in np.linspace(0, 1, 11)])

    return ap, precision, recall


def evaluate(results, data_dir, data_name, ovthresh=0.5, verbose=True):
    fileids, _ = get_files(data_dir, data_name)
    fileids = fileids[:len(results)]
    annotations = get_annotations(data_dir, data_name, fileids)
    aps = []

    for c in range(80):
        # Flatten the per-image detections of class `c` into parallel
        # arrays of image ids, confidences, and boxes.
        ids = []
        scores = []
        boxes = []
        for (i, fileid) in enumerate(fileids):
            pred = results[i][c]
            if pred.shape[0] > 0:
                for k in xrange(pred.shape[0]):
                    ids.append(fileid)
                    scores.append(pred[k, -1])
                    boxes.append(pred[k, :4] + 1)
        ids = np.array(ids)
        scores = np.array(scores)
        boxes = np.array(boxes)
        _annotations = dict((k, v[c]) for (k, v) in annotations.items())
        ap, _, _ = evaluate_class(ids, scores, boxes, _annotations,
                                  fileids, ovthresh)
        aps += [ap]

    # Render the per-class APs and their mean as a Markdown table.
    strs = ''
    for c in range(80):
        strs += "| %6s " % classnames[c][:6]
    strs += '|\n'

    for ap in aps:
        strs += '|--------'
    strs += '|\n'

    for ap in aps:
        strs += "| %.4f " % ap
    strs += '|\n'

    strs += "Mean = %.4f" % np.mean(aps)
    return strs
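Taken together, a hedged end-to-end sketch of how these utilities compose. `detect` is a hypothetical stand-in for any detector that returns, for each of the 80 classes, an `(N, 5)` array of `[x1, y1, x2, y2, score]` rows (the shape `evaluate` consumes); the directory layout is the standard MS COCO one assumed by `get_files` and `load`:

```python
from tensornets.datasets import coco

data_dir = '/path/to/mscoco'  # expects {data_dir}/val2014/ and {data_dir}/annotations/
results = []

# `load` yields one image at a time as a [1, H, W, 3] float32 array
# together with the resize scale that was applied to it.
for (img, scale) in coco.load(data_dir, 'val2014', total_num=10):
    preds = detect(img, scale)  # hypothetical: list of 80 arrays of shape (N, 5)
    results.append(preds)

# `evaluate` pairs `results` with the first len(results) image ids and
# returns a Markdown table of per-class APs plus their mean.
print(coco.evaluate(results, data_dir, 'val2014', ovthresh=0.5))
```

Note that `evaluate_class` computes a PASCAL-VOC-style 11-point interpolated AP at a single IoU threshold (0.5 by default), so the resulting numbers are not directly comparable to the official COCO metric averaged over IoU 0.50:0.95.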
