
Commit e63f269

Make subtopic for Export timeout in Community Edition (#2606)
docs: DEV-2797 Add subtopic in export file
1 parent bbd10db commit e63f269

File tree

2 files changed: +87 −75 lines changed


docs/source/guide/export.md

Lines changed: 85 additions & 73 deletions
@@ -9,70 +9,105 @@ meta_description: Label Studio documentation for exporting data labeling annotat
  At any point in your labeling project, you can export the annotations from Label Studio.

- Label Studio stores your annotations in a raw JSON format in the SQLite database backend, PostGreSQL database backend, or whichever cloud or database storage you specify as target storage. Cloud storage buckets contain one file per labeled task named as `task_id.json`. See [Cloud storage setup](storage.html) for more details about syncing target storage.
+ Label Studio stores your annotations in a raw JSON format in the SQLite database backend, PostgreSQL database backend, or whichever cloud or database storage you specify as target storage. Cloud storage buckets contain one file per labeled task named `task_id.json`. For more information about syncing target storage, see [Cloud storage setup](storage.html).
+
+ Image annotations exported in JSON format use percentages of overall image size, not pixels, to describe the size and location of the bounding boxes. For more information, see [how to convert the image annotation units](#Units-of-image-annotations).

- Image annotations exported in JSON format use percentages of overall image size, not pixels, to describe the size and location of the bounding boxes. See [how to convert the image annotation units](#Units-of-image-annotations).
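The percentage-to-pixel conversion mentioned above can be sketched in a few lines. This is a minimal illustration, assuming a bounding-box `value` dict with `x`, `y`, `width`, and `height` keys plus the task's original image dimensions; it is not the full converter:

```python
def percent_to_pixels(value, original_width, original_height):
    """Convert a percent-based bounding box `value` to pixel units.

    `x` and `width` scale with the image width; `y` and `height` with its height.
    """
    return {
        "x": value["x"] / 100.0 * original_width,
        "y": value["y"] / 100.0 * original_height,
        "width": value["width"] / 100.0 * original_width,
        "height": value["height"] / 100.0 * original_height,
    }

# A 10% x 20% box anchored at (50%, 25%) inside a 200x100 image:
print(percent_to_pixels({"x": 50, "y": 25, "width": 10, "height": 20}, 200, 100))
# → {'x': 100.0, 'y': 25.0, 'width': 20.0, 'height': 20.0}
```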

  ## Export data from Label Studio

  Export your completed annotations from Label Studio.

- > Some export formats export only the annotations and not the data from the task. See the [export formats supported by Label Studio](#Export-formats-supported-by-Label-Studio).
+ !!! note
+     Some export formats export only the annotations and not the data from the task. For more information, see the [export formats supported by Label Studio](#Export-formats-supported-by-Label-Studio).

  ### Export using the UI in Community Edition of Label Studio

- You can export data and annotations from the Label Studio UI.
+ Use the following steps to export data and annotations from the Label Studio UI.

  1. For a project, click **Export**.
  2. Select an available export format.
  3. Click **Export** to export your data.

- #### Notes
- * Export will always include the annotated tasks, regardless of filters set on the tab.
- * Cancelled annotated tasks will be included in the exported result too.
- * If you want to apply tab filters to the export, try to use [export snapshots using the SDK](https://labelstud.io/sdk/project.html#label_studio_sdk.project.Project.export_snapshot_create) or [API](#Export-snapshots-using-the-API).
- * If the export times out, see how to [export snapshots using the SDK](https://labelstud.io/sdk/project.html#label_studio_sdk.project.Project.export_snapshot_create) or [API](#Export-snapshots-using-the-API).
+ !!! note
+     1. The export always includes the annotated tasks, regardless of the filters set on the tab.
+     2. Cancelled annotated tasks are also included in the exported result.
+     3. If you want to apply tab filters to the export, use [export snapshots using the SDK](https://labelstud.io/sdk/project.html#label_studio_sdk.project.Project.export_snapshot_create) or the [API](#Export-snapshots-using-the-API).
+
+ #### Export timeout in Community Edition
+
+ If the export times out, see how to [export snapshots using the SDK](https://labelstud.io/sdk/project.html#label_studio_sdk.project.Project.export_snapshot_create) or the [API](#Export-snapshots-using-the-API).
  ### <i class='ent'></i> Export snapshots using the UI

  In Label Studio Enterprise, create a snapshot of your data and annotations. A snapshot lets you export exactly what you want from your data labeling project. This delayed export method makes it easier to export large labeling projects from the Label Studio UI.

  1. Within a project in the Label Studio UI, click **Export**.
  2. Click **Create New Snapshot**.
- 3. For **Export from...**, select the option for **All Tasks**.
- 4. For **Include in the Snapshot...**, choose which type of data you want to include in the snapshot. Select **All tasks**, **Only annotated** tasks, or **Only reviewed** tasks.
- 5. For **Annotations**, enable the types of annotations that you want to export. You can specify **Annotations**, **Ground Truth** annotations, and **Skipped** annotations. By default, only annotations are exported.
- 6. (Optional) Add a **Snapshot Name** to make it easier to find in the future. By default, export snapshots are named `PROJECT-NAME-at-YEAR-MM-DD-HH-MM`, where the time is in UTC.
- 7. For **Drafts**, choose whether to export the complete draft annotations for tasks, or only the IDs of draft annotations, to indicate that drafts exist.
- 8. For **Predictions**, choose whether to export the complete predictions for tasks, or only the IDs of predictions to indicate that the tasks had predictions.
- 9. (Optional) Enable the option to remove annotator emails to anonymize your result dataset.
+ 3. **Apply filters from tab ...**: Select **Default** from the drop-down list.
+ 4. (Optional) **Snapshot Name**: Enter a snapshot name to make the snapshot easier to find in the future. By default, export snapshots are named `PROJECT-NAME-at-YEAR-MM-DD-HH-MM`, where the time is in UTC.
+ 5. **Include in the Snapshot…**: Choose which type of data you want to include in the snapshot. Select **All tasks**, **Only annotated** tasks, or **Only reviewed** tasks.
+ 6. **Drafts**: Choose whether to export the complete draft annotations (**Complete drafts**) for tasks, or only the IDs (**Only IDs**) of draft annotations, to indicate that drafts exist.
+ 7. **Predictions**: Choose whether to export the complete predictions (**Complete predictions**) for tasks, or only the IDs (**Only IDs**) of predictions, to indicate that the tasks had predictions.
+ 8. **Annotations**: Enable the types of annotations that you want to export. You can specify **Annotations**, **Ground Truth** annotations, and **Skipped** annotations. By default, only annotations are exported.
+ 9. (Optional) Enable the **Remove user details** option to anonymize your result dataset.
  10. Click **Create a Snapshot** to start the export process.
- 11. You see the list of snapshots available to download, with details about what is included in the snapshot and when and by whom it was created.
- 12. Click **Download** and select the export format that you want to use. The snapshot file downloads to your computer.
+ 11. You see the list of snapshots available to download, with details about what is included in the snapshot, when it was created, and who created it.
+ 12. Click **Download** and select the export format that you want to use. The snapshot file downloads to your computer.

### Export using the API

You can call the Label Studio API to export annotations. For a small labeling project, call the [export endpoint](/api#operation/api_projects_export_read) directly.
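As a rough sketch, the direct export call can be made with Python's standard library. The URL path and the `exportType` query parameter are assumptions based on the endpoint referenced above, and the token header is a placeholder for your API key:

```python
import urllib.request

def export_url(base_url, project_id, export_type="JSON"):
    """Build the direct-export URL for a project (assumed path layout)."""
    return f"{base_url}/api/projects/{project_id}/export?exportType={export_type}"

def export_annotations(base_url, project_id, token, export_type="JSON"):
    """Fetch the exported annotations for a small project as raw bytes."""
    request = urllib.request.Request(
        export_url(base_url, project_id, export_type),
        headers={"Authorization": f"Token {token}"},  # token auth is assumed
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```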

### Export snapshots using the API

For a large labeling project with hundreds of thousands of tasks, do the following:

1. Make a POST request to [create a new export file or snapshot](/api#operation/api_projects_exports_create). The response includes an `id` for the created file.
- 2. [Check the status of the export file creation](/api#operation/api_projects_exports_read) using the `id` as the `export_pk`.
+ 2. [Check the status of the created export file](/api#operation/api_projects_exports_read) using the `id` as the `export_pk`.
3. Using the `id` from the created snapshot as the export primary key, or `export_pk`, make a GET request to [download the export file](/api#operation/api_projects_exports_download_read).
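The three steps above can be sketched as a small polling workflow. This is a hedged illustration: the endpoint paths, the `"completed"` status value, and the token-based auth header are assumptions inferred from the operations referenced above, not copied from the API reference:

```python
import json
import time
import urllib.request

def export_urls(base_url, project_id, export_pk):
    """Assumed endpoint layout for snapshot create/status/download."""
    root = f"{base_url}/api/projects/{project_id}/exports/"
    return {
        "create": root,
        "status": f"{root}{export_pk}",
        "download": f"{root}{export_pk}/download?exportType=JSON",
    }

def _request(url, token, method="GET"):
    """Issue a request with an assumed token auth header; return the body bytes."""
    req = urllib.request.Request(
        url, method=method, headers={"Authorization": f"Token {token}"}
    )
    with urllib.request.urlopen(req) as response:
        return response.read()

def export_snapshot(base_url, project_id, token, poll_seconds=5):
    """Create a snapshot, wait until it is ready, and return the export bytes."""
    # 1. Create a new export file (snapshot); the response includes its `id`.
    created = json.loads(
        _request(export_urls(base_url, project_id, "")["create"], token, "POST")
    )
    urls = export_urls(base_url, project_id, created["id"])
    # 2. Poll the status of the created file, using `id` as `export_pk`.
    while json.loads(_request(urls["status"], token)).get("status") != "completed":
        time.sleep(poll_seconds)
    # 3. Download the export file with a GET request.
    return _request(urls["download"], token)
```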

## Manually convert JSON annotations to another format

- You can run the [Label Studio converter tool](https://github.com/heartexlabs/label-studio-converter) on a directory or file of completed JSON annotations using the command line or Python to convert the completed annotations from Label Studio JSON format into another format. If you use versions of Label Studio earlier than 1.0.0, this is the only way to convert your Label Studio JSON format annotations into another labeling format.
+ You can run the [Label Studio converter tool](https://github.com/heartexlabs/label-studio-converter) on a directory or file of completed JSON annotations, using the command line or Python, to convert the completed annotations from Label Studio JSON format into another format.
+
+ !!! note
+     If you use a version of Label Studio earlier than 1.0.0, this is the only way to convert your Label Studio JSON format annotations into another labeling format.

## Export formats supported by Label Studio

- Label Studio supports many common and standard formats for exporting completed labeling tasks. If you don't see a format that works for you, you can contribute one. See the [GitHub repository for the Label Studio Converter tool](https://github.com/heartexlabs/label-studio-converter).
+ Label Studio supports many common and standard formats for exporting completed labeling tasks. If you don't see a format that works for you, you can contribute one. For more information, see the [GitHub repository for the Label Studio Converter tool](https://github.com/heartexlabs/label-studio-converter).
+
+ ### ASR_MANIFEST
+
+ Export audio transcription labels for automatic speech recognition as the JSON manifest format expected by [NVIDIA NeMo models](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v0.11.0/collections/nemo_asr.html). Supports audio transcription labeling projects that use the `Audio` or `AudioPlus` tags with the `TextArea` tag.
+
+ ```json
+ {"audio_filepath": "/path/to/audio.wav", "text": "the transcription", "offset": 301.75, "duration": 0.82, "utt": "utterance_id", "ctm_utt": "en_4156", "side": "A"}
+ ```
+
+ ### Brush labels to NumPy and PNG
+
+ Export your brush mask labels as NumPy 2d arrays and PNG images. Each label outputs as one image. Supports brush labeling image projects that use the `BrushLabels` tag.
+
+ ### COCO
+
+ A popular machine learning format used by the [COCO dataset](http://cocodataset.org/#home) for object detection and image segmentation tasks. Supports bounding box and polygon image labeling projects that use the `RectangleLabels` or `PolygonLabels` tags.
+
+ ### CoNLL2003
+
+ A popular format used for the [CoNLL-2003 named entity recognition challenge](https://www.clips.uantwerpen.be/conll2003/ner/). Supports text labeling projects that use the `Text` and `Labels` tags.
+
+ ### CSV
+
+ Results are stored as comma-separated values with the column names specified by the values of the `"from_name"` and `"to_name"` fields in the labeling configuration. Supports all project types.

  ### JSON

  List of items in [raw JSON format](#Label-Studio-JSON-format-of-annotated-tasks) stored in one JSON file. Use this format to export both the data and the annotations for a dataset. Supports all project types.

  ### JSON_MIN

  List of items where only the `"from_name"` and `"to_name"` values from the [raw JSON format](#Label-Studio-JSON-format-of-annotated-tasks) are exported. Use this format to export the annotations and the data for a dataset, without Label-Studio-specific fields. Supports all project types.

  For example:
  ```json
@@ -91,69 +126,45 @@ For example:
  }
  ```

- ### CSV
-
- Results are stored as comma-separated values with the column names specified by the values of the `"from_name"` and `"to_name"` fields in the labeling configuration. Supports all project types.
-
- ### TSV
-
- Results are stored in a tab-separated tabular file with column names specified by `"from_name"` and `"to_name"` values in the labeling configuration. Supports all project types.
-
- ### CONLL2003
-
- Popular format used for the [CoNLL-2003 named entity recognition challenge](https://www.clips.uantwerpen.be/conll2003/ner/). Supports text labeling projects that use the `Text` and `Labels` tags.
-
- ### COCO
-
- Popular machine learning format used by the [COCO dataset](http://cocodataset.org/#home) for object detection and image segmentation tasks. Supports bounding box and polygon image labeling projects that use the `RectangleLabels` or `PolygonLabels` tags.

  ### Pascal VOC XML

- Popular XML-formatted task data used for object detection and image segmentation tasks. Supports bounding box image labeling projects that use the `RectangleLabels` tag.
-
- ### Brush labels to NumPy & PNG
-
- Export your brush mask labels as NumPy 2d arrays and PNG images. Each label outputs as one image. Supports brush labeling image projects that use the `BrushLabels` tag.
-
- ### ASR_MANIFEST
-
- Export audio transcription labels for automatic speech recognition as the JSON manifest format expected by [NVIDIA NeMo models](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/v0.11.0/collections/nemo_asr.html). Supports audio transcription labeling projects that use the `Audio` or `AudioPlus` tags with the `TextArea` tag.
-
- ```json
- {"audio_filepath": "/path/to/audio.wav", "text": "the transcription", "offset": 301.75, "duration": 0.82, "utt": "utterance_id", "ctm_utt": "en_4156", "side": "A"}
- ```
-
- ### YOLO
-
- Export object detection annotations in the YOLOv3 and YOLOv4 format. Supports object detection labeling projects that use the `RectangleLabels` tag.
+ A popular XML-formatted task data format used for object detection and image segmentation tasks. Supports bounding box image labeling projects that use the `RectangleLabels` tag.

  ### spaCy

- Label Studio doesn't support exporting directly to spaCy binary format, but you can convert annotations exported from Label Studio to a format compatible with spaCy. You must have the spacy python package installed to perform this conversion.
+ Label Studio does not support exporting directly to spaCy binary format, but you can convert annotations exported from Label Studio to a format compatible with spaCy. You must have the `spacy` Python package installed to perform this conversion.

  To transform Label Studio annotations into spaCy binary format, do the following:

  1. Export your annotations to CONLL2003 format.
- 2. Open the downloaded file and update the first line of the exported file to add a O on the first line:
+ 2. Open the downloaded file and add `O` to its first line:
     ```
     -DOCSTART- -X- O O
     ```
- 3. From the command line, run spacy convert to convert the CONLL-formatted annotations to spaCy binary format, replacing `/path/to/<filename>` with the path and file name of your annotations:
+ 3. From the command line, run `spacy convert` to convert the CoNLL-formatted annotations to spaCy binary format, replacing `/path/to/<filename>` with the path and file name of your annotations:

     spaCy version 2:
     ```shell
     spacy convert /path/to/<filename>.conll -c ner
     ```
     spaCy version 3:
     ```shell
     spacy convert /path/to/<filename>.conll -c conll .
     ```

+ For more information, see the spaCy documentation on [Converting existing corpora and annotations](https://spacy.io/usage/training#data-convert).
+
+ ### TSV
+
+ Results are stored in a tab-separated tabular file with column names specified by `"from_name"` and `"to_name"` values in the labeling configuration. Supports all project types.
+
+ ### YOLO
+
+ Export object detection annotations in the YOLOv3 and YOLOv4 format. Supports object detection labeling projects that use the `RectangleLabels` tag.

- See the spaCy documentation on [Converting existing corpora and annotations](https://spacy.io/usage/training#data-convert) for more details on running spacy convert.

  ## Label Studio JSON format of annotated tasks

- When you annotate data, Label Studio stores the output in JSON format. The raw JSON structure of each completed task follows this example:
+ When you annotate data, Label Studio stores the output in JSON format. The raw JSON structure of each completed task is shown in the following example:

  ```json
  {
@@ -250,6 +261,7 @@ When you annotate data, Label Studio stores the output in JSON format. The raw J
  ```

  ### Relevant JSON property descriptions

  Review the full list of JSON properties in the [API documentation](api.html).

  | JSON property name | Description |
@@ -265,13 +277,13 @@ Review the full list of JSON properties in the [API documentation](api.html).
  | result.from_name | Name of the tag used to label the region. See [control tags](/tags). |
  | result.to_name | Name of the object tag that provided the region to be labeled. See [object tags](/tags). |
  | result.type | Type of tag used to annotate the task. |
- | result.value | Tag-specific value that includes details of the result of labeling the task. The value structure depends on the tag for the label. [Explore each tag](/tags) for more details. |
+ | result.value | Tag-specific value that includes details of the result of labeling the task. The value structure depends on the tag for the label. For more information, [explore each tag](/tags). |
  | annotations.completed_by | User ID of the user that created the annotation. Matches the list order of users on the People page in the Label Studio UI. |
  | annotations.was_cancelled | Boolean. Whether the annotation was skipped or cancelled. |
- | annotations.reviews | Enterprise only. Array containing the details of reviews for this annotation. |
+ | annotations.reviews | <i class='ent'></i> Array containing the details of reviews for this annotation. |
  | reviews.id | Enterprise only. ID of the specific annotation review. |
- | reviews.created_by | Enterprise only. Dictionary containing the user ID, email, first name, and last name of the user performing the review. |
- | reviews.accepted | Enterprise only. Boolean. Whether the reviewer accepted the annotation as part of their review. |
+ | reviews.created_by | <i class='ent'></i> Dictionary containing the user ID, email, first name, and last name of the user performing the review. |
+ | reviews.accepted | <i class='ent'></i> Boolean. Whether the reviewer accepted the annotation as part of their review. |
  | drafts | Array of draft annotations. Follows a similar format as the annotations array. Included only for tasks exported as a snapshot [from the UI](#Export-snapshots-using-the-UI) or [using the API](#Export-snapshots-using-the-API). |
  | predictions | Array of machine learning predictions. Follows the same format as the annotations array, with one additional parameter. |
  | predictions.score | The overall score of the result, based on the probabilistic output, confidence level, or other factors. |

docs/source/includes/annotation_ids.md

Lines changed: 2 additions & 2 deletions
@@ -2,9 +2,9 @@
  Each annotation that you create when you label a task contains regions and results.

- - **Regions** refer to the selected area of the data, whether a text span, image area, audio segment, or something else.
+ - **Regions** refer to the selected area of the data, whether a text span, image area, audio segment, or another entity.
  - **Results** refer to the labels assigned to the region.

  Each region has a unique ID for each annotation, formed as a string with the characters `A-Za-z0-9_-`. Each result ID is the same as the region ID that it applies to.

- When a prediction is used to create an annotation, the result IDs stay the same in the annotation field. This lets you track the regions generated by your machine learning model and compare them directly to the human-created and reviewed annotations.
+ When a prediction is used to create an annotation, the result IDs stay the same in the annotation field. This allows you to track the regions generated by your machine learning model and compare them directly to the human-created and reviewed annotations.

0 commit comments