Merged
Changes from 20 commits
4 changes: 3 additions & 1 deletion docs/apis/data_cn.md
@@ -57,13 +57,14 @@
|-------|----|--------|-----|
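|`data_dir`|`str`|Directory where the dataset is stored.||
|`image_dir`|`str`|Directory where the input images are stored.||
|`ann_path`|`str`|Path to the annotation file in [COCO format](https://cocodataset.org/#home).||
|`anno_path`|`str`|Path to the annotation file in [COCO format](https://cocodataset.org/#home).||
|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to the input data.||
|`label_list`|`str` \| `None`|Label list file. The label list is a text file in which each line contains the name of one class.|`None`|
|`num_workers`|`int` \| `str`|Number of auxiliary processes used when loading data. If set to `'auto'`, the number of processes is determined as follows: if the number of CPU cores is greater than 16, 8 auxiliary data-reading processes are used; otherwise, the number of auxiliary processes is half the number of CPU cores.|`'auto'`|
|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
|`empty_ratio`|`float`|Proportion of negative samples. Takes effect only when `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to 1, all generated negative samples are retained.|`1.0`|
|`batch_transforms`|`paddlers.transforms.BatchCompose`|Batch data transformation operators applied to the input data.||

### VOC Format Object Detection Dataset `VOCDetDataset`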

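@@ -81,6 +82,7 @@
|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
|`empty_ratio`|`float`|Proportion of negative samples. Takes effect only when `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to 1, all generated negative samples are retained.|`1.0`|
|`batch_transforms`|`paddlers.transforms.BatchCompose`|Batch data transformation operators applied to the input data.||

`VOCDetDataset` has the following requirements for the file list: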

4 changes: 3 additions & 1 deletion docs/apis/data_en.md
@@ -57,13 +57,14 @@ The initialization parameter list is as follows:
|-------|----|--------|-----|
|`data_dir`|`str`|Directory that stores the dataset.||
|`image_dir`|`str`|Directory of input images.||
|`ann_path`|`str`|[COCO Format](https://cocodataset.org/#home) label file path.||
|`anno_path`|`str`|[COCO Format](https://cocodataset.org/#home) label file path.||
|`transforms`|`paddlers.transforms.Compose`|Data transformation operators applied to input data.||
|`label_list`|`str` \| `None`|Label list path. The label list is a text file in which each line contains the name of one class.|`None`|
|`num_workers`|`int` \| `str`|Number of auxiliary processes used when loading data. If it is set to `'auto'`, the number of processes is determined as follows: when the number of CPU cores is greater than 16, 8 auxiliary data-reading processes are used; otherwise, the number of auxiliary processes is set to half the number of CPU cores.|`'auto'`|
|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
|`empty_ratio`|`float`|Negative sample ratio. Takes effect only if `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to `1`, all negative samples generated are retained.|`1.0`|
|`batch_transforms`|`paddlers.transforms.BatchCompose`|Data batch transformation operators applied to input data.||
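
For illustration, the `'auto'` rule documented for `num_workers` can be sketched as below. This only restates the rule from the table and is not the library's code: `resolve_num_workers` is a hypothetical helper, and rounding half the core count down is an assumption.

```python
import os


def resolve_num_workers(num_workers, cpu_count=None):
    """Hypothetical helper mirroring the documented 'auto' rule."""
    if num_workers != 'auto':
        # An explicit integer is used as given.
        return int(num_workers)
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if cpu_count > 16:
        # More than 16 cores: use 8 auxiliary processes.
        return 8
    # Otherwise: half the number of cores (rounded down, an assumption).
    return cpu_count // 2


print(resolve_num_workers('auto', cpu_count=32))  # 8
print(resolve_num_workers('auto', cpu_count=12))  # 6
```

Passing `cpu_count` explicitly keeps the sketch deterministic; by default it falls back to `os.cpu_count()`.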

### VOC Format Object Detection Dataset `VOCDetDataset`

@@ -81,6 +82,7 @@ The initialization parameter list is as follows:
|`shuffle`|`bool`|Whether to randomly shuffle the samples in the dataset.|`False`|
|`allow_empty`|`bool`|Whether to add negative samples to the dataset.|`False`|
|`empty_ratio`|`float`|Negative sample ratio. Takes effect only if `allow_empty` is `True`. If `empty_ratio` is negative or greater than or equal to `1`, all negative samples generated will be retained.|`1.0`|
|`batch_transforms`|`paddlers.transforms.BatchCompose`|Data batch transformation operators applied to input data.||
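
As a rough sketch of how `allow_empty` and `empty_ratio` might interact: the table only states that a negative value, or a value greater than or equal to 1, keeps all negative samples. The subsampling formula below (treating `empty_ratio` as the fraction of negatives in the final dataset) is an assumption made for illustration, and `sample_negatives` is a hypothetical helper, not a PaddleRS function.

```python
import random


def sample_negatives(num_positive, negatives, empty_ratio, seed=0):
    """Hypothetical helper: choose which negative samples to keep."""
    negatives = list(negatives)
    if empty_ratio < 0 or empty_ratio >= 1:
        # Documented behaviour: keep every generated negative sample.
        return negatives
    # Assumption: empty_ratio = kept_negatives / (positives + kept_negatives).
    wanted = int(num_positive * empty_ratio / (1 - empty_ratio))
    wanted = min(wanted, len(negatives))
    return random.Random(seed).sample(negatives, wanted)


kept = sample_negatives(num_positive=4, negatives=range(10), empty_ratio=0.5)
print(len(kept))  # 4
```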

The requirements of `VOCDetDataset` for the file list are as follows:

4 changes: 3 additions & 1 deletion docs/apis/train_cn.md
@@ -166,6 +166,7 @@ def train(self,
warmup_start_lr=0.0,
lr_decay_epochs=(216, 243),
lr_decay_gamma=0.1,
cosine_decay_num_epochs=1000,
metric=None,
use_ema=False,
early_stop=False,
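@@ -196,7 +197,8 @@ def train(self,
|`warmup_start_lr`|`int`|Initial learning rate used by the default optimizer during the warm-up phase.|`0`|
|`lr_decay_epochs`|`list` \| `tuple`|Milestones for learning rate decay of the default optimizer, in epochs; that is, the epochs at which the learning rate is decayed.|`(216, 243)`|
|`lr_decay_gamma`|`float`|Learning rate decay factor, applicable to the default optimizer.|`0.1`|
|`metric`|`str` \| `None`|Evaluation metric. Can be `'VOC'`, `COCO`, or `None`. If `None`, the evaluation metric is determined automatically from the dataset format.|`None`|
|`cosine_decay_num_epochs`|`int`|Parameter used to compute the annealing period when the cosine annealing learning rate scheduler is used.|`1000`|
|`metric`|`str` \| `None`|Evaluation metric. Can be `'VOC'`, `'COCO'`, `'RBOX'`, or `None`. If `None`, the evaluation metric is determined automatically from the dataset format.|`None`|
|`use_ema`|`bool`|Whether to enable the [exponential moving average strategy](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/models/ppdet/optimizer.py) to update model weights.|`False`|
|`early_stop`|`bool`|Whether to enable the early stopping strategy during training.|`False`|
|`early_stop_patience`|`int`|The `patience` parameter used when early stopping is enabled (see [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py)).|`5`|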
4 changes: 3 additions & 1 deletion docs/apis/train_en.md
@@ -166,6 +166,7 @@ def train(self,
warmup_start_lr=0.0,
lr_decay_epochs=(216, 243),
lr_decay_gamma=0.1,
cosine_decay_num_epochs=1000,
metric=None,
use_ema=False,
early_stop=False,
@@ -196,7 +197,8 @@ The meaning of each parameter is as follows:
|`warmup_start_lr`|`int`|Initial learning rate used by the default optimizer during the warm-up phase.|`0`|
|`lr_decay_epochs`|`list` \| `tuple`|Milestones for learning rate decay of the default optimizer, in epochs; that is, the epochs at which the learning rate is decayed.|`(216, 243)`|
|`lr_decay_gamma`|`float`|Learning rate decay factor for the default optimizer.|`0.1`|
|`metric`|`str` \| `None`|Evaluation metrics, which can be `'VOC'`, `COCO`, or `None`. If `None`, the evaluation metrics will be automatically determined according to the format of the dataset.|`None`|
|`cosine_decay_num_epochs`|`int`|Parameter to determine the annealing cycle when a cosine annealing learning rate scheduler is used.|`1000`|
|`metric`|`str` \| `None`|Evaluation metrics, which can be `'VOC'`, `'COCO'`, `'RBOX'`, or `None`. If `None`, the evaluation metrics will be automatically determined according to the format of the dataset.|`None`|
|`use_ema`|`bool`|Whether to enable [exponential moving average strategy](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/models/ppdet/optimizer.py) to update model weights.|`False`|
|`early_stop`|`bool`|Whether to enable the early stopping policy during training.|`False`|
|`early_stop_patience`|`int`|`patience` parameter when the early stopping policy is enabled. Please refer to [`EarlyStop`](https://github.com/PaddlePaddle/PaddleRS/blob/develop/paddlers/utils/utils.py) for more details.|`5`|
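
For readers unfamiliar with cosine annealing, the standard schedule can be sketched as below. Treating `cosine_decay_num_epochs` as the length of the annealing period in epochs is an assumption here; the diff does not show how the trainer passes it to the scheduler. `cosine_annealing_lr` and `min_lr` are hypothetical names used only for this illustration.

```python
import math


def cosine_annealing_lr(base_lr, epoch, num_epochs, min_lr=0.0):
    """Textbook cosine annealing: decay base_lr to min_lr over num_epochs."""
    # Clamp so the rate stays at min_lr after the period ends (a choice
    # made for this sketch, not necessarily the scheduler's behaviour).
    progress = min(epoch, num_epochs) / num_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 250, 500, 1000):
    print(epoch, cosine_annealing_lr(0.1, epoch, num_epochs=1000))
```

With a larger period the learning rate falls more slowly, which is the knob the new parameter appears to expose.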
5 changes: 3 additions & 2 deletions docs/intro/data_prep_cn.md
@@ -9,5 +9,6 @@
| Change Detection | LEVIR-CD | https://justchenhao.github.io/LEVIR/ | [prepare_levircd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_levircd.py) |
| Change Detection | Season-varying | https://paperswithcode.com/dataset/cdd-dataset-season-varying | [prepare_svcd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_svcd.py) |
| Scene Classification | UC Merced | http://weegee.vision.ucmerced.edu/datasets/landuse.html | [prepare_ucmerced.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_ucmerced.py) |
| Object Detection | RSOD | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | [prepare_rsod](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_rsod.py) |
| Image Segmentation | iSAID | https://captain-whu.github.io/iSAID/ | [prepare_isaid](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_isaid.py) |
| Object Detection | DOTA | https://captain-whu.github.io/DOTA/ | [prepare_dota.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_dota.py) |
| Object Detection | RSOD | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | [prepare_rsod.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_rsod.py) |
| Image Segmentation | iSAID | https://captain-whu.github.io/iSAID/ | [prepare_isaid.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_isaid.py) |
5 changes: 3 additions & 2 deletions docs/intro/data_prep_en.md
@@ -9,5 +9,6 @@
| Change Detection | LEVIR-CD | https://justchenhao.github.io/LEVIR/ | [prepare_levircd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_levircd.py) |
| Change Detection | Season-varying | https://paperswithcode.com/dataset/cdd-dataset-season-varying | [prepare_svcd.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_svcd.py) |
| Scene Classification | UC Merced | http://weegee.vision.ucmerced.edu/datasets/landuse.html | [prepare_ucmerced.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_ucmerced.py) |
| Object Detection | RSOD | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | [prepare_rsod](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_rsod.py) |
| Image Segmentation | iSAID | https://captain-whu.github.io/iSAID/ | [prepare_isaid](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_isaid.py) |
| Object Detection | DOTA | https://captain-whu.github.io/DOTA/ | [prepare_dota.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_dota.py) |
| Object Detection | RSOD | https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset- | [prepare_rsod.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_rsod.py) |
| Image Segmentation | iSAID | https://captain-whu.github.io/iSAID/ | [prepare_isaid.py](https://github.com/PaddlePaddle/PaddleRS/blob/develop/tools/prepare_dataset/prepare_isaid.py) |
1 change: 1 addition & 0 deletions docs/intro/model_cons_params_cn.md
@@ -449,6 +449,7 @@

| Parameter Name | Description | Default Value |
| --- |-------------------------------| --- |
| `rotate (bool)` | Whether to perform rotated object detection | `False` |
| `num_classes (int)` | Number of target classes | `80` |
| `backbone (str)` | Name of the backbone network | `'MobileNetV1'` |
| `anchors (list[list[int]])` | Sizes of predefined anchor boxes | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]]` |
1 change: 1 addition & 0 deletions docs/intro/model_cons_params_en.md
@@ -443,6 +443,7 @@ The YOLOv3 implementation based on PaddlePaddle.

| Parameter Name | Description | Default Value |
| --- |-----------------------------------------------------------------------------------------------------------------------------| --- |
| `rotate (bool)` | If True, the model performs rotated object detection | `False` |
| `num_classes (int)` | Number of target classes | `80` |
| `backbone (str)` | Backbone network to use | `'MobileNetV1'` |
| `anchors (list[list[int]])` | Sizes of predefined anchor boxes | `[[10, 13], [16, 30], [33, 23], [30, 61], [62, 45 ], [59, 119], [116, 90], [156, 198], [373, 326]]` |
1 change: 1 addition & 0 deletions docs/intro/model_zoo_cn.md
@@ -33,6 +33,7 @@ All models currently supported by PaddleRS are listed below (those marked \* are models dedicated to remote sensing
| Image Restoration | NAFNet | Yes |
| Image Restoration | SwinIR | Yes |
| Object Detection | Faster R-CNN | No |
| Object Detection | FCOSR | No |
| Object Detection | PP-YOLO | No |
| Object Detection | PP-YOLO Tiny | No |
| Object Detection | PP-YOLOv2 | No |
1 change: 1 addition & 0 deletions docs/intro/model_zoo_en.md
@@ -33,6 +33,7 @@ All models currently supported by PaddleRS are listed below (those marked \* are
| Image Restoration | SwinIR | Yes |
| Image Restoration | NAFNet | Yes |
| Object Detection | Faster R-CNN | No |
| Object Detection | FCOSR | No |
| Object Detection | PP-YOLO | No |
| Object Detection | PP-YOLO Tiny | No |
| Object Detection | PP-YOLOv2 | No |
36 changes: 31 additions & 5 deletions paddlers/datasets/base.py
@@ -18,15 +18,22 @@
 from paddle.fluid.dataloader.collate import default_collate_fn

 from paddlers.utils import get_num_workers
-from paddlers.transforms import construct_sample_from_dict, Compose
+import paddlers.utils.logging as logging
+from paddlers.transforms import construct_sample_from_dict, Compose, BatchCompose


 class BaseDataset(Dataset):
     _KEYS_TO_KEEP = None
     _KEYS_TO_DISCARD = None
     _collate_trans_info = False

-    def __init__(self, data_dir, label_list, transforms, num_workers, shuffle):
+    def __init__(self,
+                 data_dir,
+                 label_list,
+                 transforms,
+                 num_workers,
+                 shuffle,
+                 batch_transforms=None):
         super(BaseDataset, self).__init__()

         self.data_dir = data_dir
@@ -37,6 +44,8 @@ def __init__(self, data_dir, label_list, transforms, num_workers, shuffle):

         self.num_workers = get_num_workers(num_workers)
         self.shuffle = shuffle
+        self.batch_transforms = None
+        self.build_collate_fn(batch_transforms)

     def __getitem__(self, idx):
         sample = construct_sample_from_dict(self.file_list[idx])
@@ -59,8 +68,25 @@ def collate_fn(self, batch):
             for key in self._KEYS_TO_DISCARD:
                 for s, _ in batch:
                     s.pop(key, None)
+
+        samples = [s[0] for s in batch]
+
+        if self.batch_transforms:
+            samples = self.batch_transforms(samples)
+
         if self._collate_trans_info:
-            return default_collate_fn(
-                [s[0] for s in batch]), [s[1] for s in batch]
+            return default_collate_fn(samples), [s[1] for s in batch]
         else:
-            return default_collate_fn([s[0] for s in batch])
+            return default_collate_fn(samples)
+
+    def build_collate_fn(self, batch_transforms, collate_fn_constructor=None):
+        if self.batch_transforms is not None and batch_transforms:
+            logging.warning(
+                "The initial `batch_transforms` will be overwritten.")
+        if batch_transforms is not None:
+            batch_transforms = copy.deepcopy(batch_transforms)
+            if isinstance(batch_transforms, list):
+                batch_transforms = BatchCompose(batch_transforms)
+        self.batch_transforms = batch_transforms
+        if collate_fn_constructor:
+            self.collate_fn = collate_fn_constructor(self)
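
To make the new flow easier to follow, here is a standalone sketch of what `build_collate_fn` and `collate_fn` appear to do with `batch_transforms`: deep-copy the argument, wrap a plain list in a compose object, and apply the result to the list of samples before collation. It does not import PaddleRS; `ToyBatchCompose`, `normalize_batch_transforms`, `collate`, and `pad_to_longest` are hypothetical stand-ins, and the real `BatchCompose` may behave differently.

```python
import copy


class ToyBatchCompose:
    """Stand-in for paddlers.transforms.BatchCompose (illustrative only)."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, samples):
        # Apply each batch-level transform in order to the whole batch.
        for transform in self.transforms:
            samples = transform(samples)
        return samples


def normalize_batch_transforms(batch_transforms):
    """Deep-copy the argument and wrap a plain list in a compose object."""
    if batch_transforms is None:
        return None
    batch_transforms = copy.deepcopy(batch_transforms)
    if isinstance(batch_transforms, list):
        batch_transforms = ToyBatchCompose(batch_transforms)
    return batch_transforms


def collate(batch, batch_transforms=None):
    """Run batch-level transforms on the samples before they are stacked."""
    samples = [s[0] for s in batch]
    if batch_transforms:
        samples = batch_transforms(samples)
    # The code in the diff passes `samples` on to default_collate_fn here.
    return samples


def pad_to_longest(samples):
    """Toy batch transform: right-pad every list with zeros."""
    longest = max(len(s) for s in samples)
    return [s + [0] * (longest - len(s)) for s in samples]


bt = normalize_batch_transforms([pad_to_longest])
print(collate([([1, 2, 3], None), ([4], None)], bt))  # [[1, 2, 3], [4, 0, 0]]
```

Padding to a common size is a typical reason to run a transform per batch rather than per sample, since the target size depends on the other samples in the batch.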
5 changes: 3 additions & 2 deletions paddlers/datasets/cd_dataset.py
@@ -55,9 +55,10 @@ def __init__(self,
                  num_workers='auto',
                  shuffle=False,
                  with_seg_labels=False,
-                 binarize_labels=False):
+                 binarize_labels=False,
+                 batch_transforms=None):
         super(CDDataset, self).__init__(data_dir, label_list, transforms,
-                                        num_workers, shuffle)
+                                        num_workers, shuffle, batch_transforms)

         DELIMETER = ' '

6 changes: 4 additions & 2 deletions paddlers/datasets/clas_dataset.py
@@ -42,9 +42,11 @@ def __init__(self,
                  transforms,
                  label_list=None,
                  num_workers='auto',
-                 shuffle=False):
+                 shuffle=False,
+                 batch_transforms=None):
         super(ClasDataset, self).__init__(data_dir, label_list, transforms,
-                                          num_workers, shuffle)
+                                          num_workers, shuffle,
+                                          batch_transforms)
         self.file_list = list()
         self.labels = list()
