Closed
Description
FasterRCNN supports training with multiple objects per image, according to its documentation (quoted below):
During training, the model expects both the input tensors, as well as a targets (list of dictionary), containing:
- boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with values of x between 0 and W and values of y between 0 and H
- labels (Int64Tensor[N]): the class label for each ground-truth box
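As a rough illustration of the format quoted above, the sketch below checks a single two-object target against those constraints. It uses plain Python lists so that it runs without PyTorch; in actual training the values would be tensors. `check_target` and its parameters are hypothetical names, not part of torchvision or detecto.

```python
def check_target(boxes, labels, width, height):
    """Return True if boxes/labels satisfy the quoted constraints.

    boxes:  list of N [x1, y1, x2, y2] lists (stand-in for FloatTensor[N, 4])
    labels: list of N ints (stand-in for Int64Tensor[N])
    """
    # There must be exactly one label per box
    if len(boxes) != len(labels):
        return False
    for box in boxes:
        if len(box) != 4:
            return False
        x1, y1, x2, y2 = box
        # x values lie between 0 and W, y values between 0 and H
        if not (0 <= x1 <= width and 0 <= x2 <= width):
            return False
        if not (0 <= y1 <= height and 0 <= y2 <= height):
            return False
    return all(isinstance(label, int) for label in labels)


# One image (640 x 480) containing two objects
target = {
    "boxes": [[10.0, 20.0, 200.0, 220.0], [300.0, 50.0, 620.0, 400.0]],
    "labels": [1, 2],
}
ok = check_target(target["boxes"], target["labels"], width=640, height=480)
```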
Is this supported by detecto? I have only been able to pass in {"boxes": Tensor(1, 4), "labels": str}
Would you be open to a pull request fixing this? It looks like the only place that needs to change is core.Model:
class Model:
    ...

    # Converts all string labels in a list of target dicts to
    # their corresponding int mappings
    def _convert_to_int_labels(self, targets):
        for target in targets:
            # Convert string labels to integer mapping.
            # _is_iterable must return False for a plain str, since
            # strings are themselves iterable.
            if _is_iterable(target["labels"]):
                # torch.tensor needs a list here, not a generator expression
                target["labels"] = torch.tensor([self._int_mapping[label] for label in target["labels"]])
            else:
                target["labels"] = torch.tensor(self._int_mapping[target["labels"]]).view(1)
This would now accept either target["labels"] = "one_object" or target["labels"] = ["obj1_class", "obj2_class"], and in both cases would return a tensor whose length equals the number of objects.
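The label normalization proposed above can be sketched without PyTorch as follows. This is only an illustration under assumptions: `labels_to_ints` and the example `int_mapping` are hypothetical names and values, not detecto's actual code. The point it shows is that a single string has to be checked before any iterability test.

```python
def labels_to_ints(labels, int_mapping):
    """Map one label or a sequence of labels to a list of ints."""
    # A str is iterable, so it has to be checked first; otherwise
    # "dog" would be treated as the three labels "d", "o", "g".
    if isinstance(labels, str):
        labels = [labels]
    return [int_mapping[label] for label in labels]


# Hypothetical class-name mapping, for illustration only
int_mapping = {"obj1_class": 1, "obj2_class": 2}

single = labels_to_ints("obj1_class", int_mapping)
multiple = labels_to_ints(["obj1_class", "obj2_class"], int_mapping)
print(single)    # [1]
print(multiple)  # [1, 2]
```

Wrapping the returned list in torch.tensor(..., dtype=torch.int64) would then give a one-dimensional tensor with one entry per object, matching the Int64Tensor[N] described in the quoted documentation.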