Yolo_predict_tutorial_deepsea #6258

Open
wants to merge 35 commits into
base: main
Changes from 7 commits
Commits
01b2f0e
add yolo prediction tutorial based on deepsea dataset.
mjoudy Jul 16, 2025
37d1a8b
Gemfile?!
mjoudy Jul 16, 2025
c40eabb
Merge remote-tracking branch 'upstream/main' into yolo_predict_tutori…
mjoudy Jul 16, 2025
868b9dc
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
a95ce4d
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
3547180
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
498fe12
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
c9083d0
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
95c963b
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
52c1afa
Update tutorial.md with GTN compliant elements (I hope ;) )
yvanlebras Jul 18, 2025
f31775f
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
1b574c7
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
yvanlebras Jul 18, 2025
29cd75c
Update tutorial.md to homogeneize parameters presentation
yvanlebras Jul 18, 2025
82bd6f6
fix some typo and add a step to create class name file
yvanlebras Jul 18, 2025
7654df9
fix typo and add SEANOE dataset reference
yvanlebras Jul 22, 2025
1598632
Gemfile checkout
mjoudy Jul 23, 2025
498bdc9
comments resolved. GTN build testing error fixed.
mjoudy Jul 24, 2025
c492ec3
references added.
mjoudy Jul 25, 2025
ac664b7
Merge branch 'main' into yolo_predict_tutorial_deepsea
anuprulez Jul 29, 2025
f236611
comments resolved. except parts for history search and segmntation.
mjoudy Jul 29, 2025
b175b99
Apply suggestions from code review
mjoudy Jul 29, 2025
7e9fc1b
note for ? char added.
mjoudy Jul 29, 2025
a2b025e
Merge branch 'yolo_predict_tutorial_deepsea' of https://github.com/mj…
mjoudy Jul 29, 2025
589d0eb
Merge branch 'main' into yolo_predict_tutorial_deepsea
anuprulez Jul 30, 2025
3b22986
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
mjoudy Jul 31, 2025
931ca71
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
mjoudy Jul 31, 2025
bc8252d
Merge branch 'main' into yolo_predict_tutorial_deepsea
bgruening Aug 11, 2025
b0f73b0
Update tutorial.md
bgruening Aug 11, 2025
6d0ef7a
Apply suggestion from @kostrykin
kostrykin Aug 11, 2025
3badbb6
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
kostrykin Aug 11, 2025
d7efb00
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
kostrykin Aug 11, 2025
b2b838e
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
kostrykin Aug 11, 2025
de8b832
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
kostrykin Aug 11, 2025
0ad22ca
Update topics/imaging/tutorials/yolo_prediction/tutorial.md
kostrykin Aug 14, 2025
596b41b
Apply suggestions from code review
kostrykin Aug 14, 2025
134 changes: 80 additions & 54 deletions topics/imaging/tutorials/yolo_prediction/tutorial.md
@@ -27,6 +27,7 @@ priority: 5
contributions:
authorship:
- mjoudy
- yvanlebras

follow_up_training:
- type: "internal"
@@ -38,7 +39,7 @@ follow_up_training:

# Introduction

YOLO (You Only Look Once) is a fast, deep learning-based algorithm for real-time object detection. It predicts object classes and bounding boxes in a single pass over the image. YOLOv8 is one of its version, offering improved accuracy and speed.
YOLO (You Only Look Once) is a fast, deep learning-based algorithm for real-time object detection. It predicts object classes and bounding boxes in a single pass over the image. YOLOv8 is a specific version, offering improved accuracy and speed.

In this tutorial, you will use Galaxy to run two types of YOLOv8 models:

@@ -59,62 +60,88 @@ We will use selected images from the SEANOE dataset:

The SEANOE #101899 collection features real underwater images captured by deep‑sea observatories as part of a citizen science initiative called Deep Sea Spy. These non‑destructive imaging stations continuously monitor marine ecosystems and provide snapshots of various fauna. In this dataset, multiple annotators, including trained scientists and enthusiastic citizen scientists, have manually labeled images with polygons, lines, or points highlighting marine organisms. These annotations are then cleaned and converted into bounding boxes to create a training-ready dataset for object detection with YOLOv8. Though the exact species vary, images often include deep-sea fish species, making this dataset well-suited for practicing detection tasks.

## Get data

> <hands-on-title> Data Upload </hands-on-title>
>
> 1. Create a new history for this tutorial and give it a name (example: “Ecoregionalization workflow”) for you to find it again later if needed.
>
> {% snippet faqs/galaxy/histories_create_new.md %}
>
> {% snippet faqs/galaxy/histories_rename.md %}
>
> 2. Import the image data files and models from the [SEANOE marine data warehouse](https://www.seanoe.org/data/00907/101899/)
>
> DeepSeaSpy images data files and models as a zip file
Member

@yvanlebras for this tutorial, should we use one or two images showcasing species detection from this collection (https://www.seanoe.org/data/00907/101899/) or the full collection is required? thanks!

> ```
> https://www.seanoe.org/data/00907/101899/data/115473.zip
> ```
>
> {% snippet faqs/galaxy/datasets_import_via_link.md %}

>
> 3. Use {% tool [Unzip a file](toolshed.g2.bx.psu.edu/repos/imgteam/unzip/unzip/6.0+galaxy0) %} to create a data collection in your history where all archive files will be unzipped
>
> 5. Unhide the models data files
>
> History search `name:detection deleted:false visible:any` then unhidde the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".
Member

Suggested change
> History search `name:detection deleted:false visible:any` then unhidde the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".
> History search `name:detection deleted:false visible:any` then show the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".

Collaborator

@kostrykin kostrykin Aug 12, 2025

It's actually called "unhide" in Galaxy:

Image
Suggested change
> History search `name:detection deleted:false visible:any` then unhidde the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".
> History search `name:detection deleted:false visible:any` then unhide the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".

Collaborator

@kostrykin kostrykin Jul 25, 2025

The file names are quite long, it might be worth putting them in separate lines. Also, I might be missing something, but I never heard the term "History search", do you maybe mean "Search the history"?

Suggested change
> History search `name:detection deleted:false visible:any` then unhidde the 2 model files "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".
> Search the history for `name:detection deleted:false visible:any`, then unhide the 2 model files
> - "dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection" and
> - "dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection".

>
>
> {% snippet faqs/galaxy/datasets_unhidden.md %}
>
> {% snippet faqs/galaxy/datasets_change_datatype.md datatype="tabular" %}
>
> 6. Unhide images files and create a dedicated data collection
>
> Search the history for `extension:jpg deleted:false visible:any`, then click "select all" and "build dataset list", select 100 files, and name the data collection, for example "DeepSeaSpy 100 images sample". Tip: to select only the last 100 files, enter `extension:jpg deleted:false hid>XXXX visible:any` in the search bar, where XXXX is the ID of the last image dataset minus 100 (for example, `extension:jpg deleted:false hid>47659 visible:any` if your images run up to history dataset ID 47759).
>
>
{: .hands_on}
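
The `hid>XXXX` arithmetic from the image-selection tip can be sketched in a few lines of Python. This is only an illustration: `last_n_images_query` is a hypothetical helper name, and the filter keywords are the ones quoted in the step above. It assumes the image datasets occupy consecutive history IDs.

```python
def last_n_images_query(last_hid: int, n: int = 100, extension: str = "jpg") -> str:
    """Build a history search string that keeps datasets with hid > last_hid - n."""
    if n <= 0 or last_hid < n:
        raise ValueError("n must be positive and no larger than last_hid")
    threshold = last_hid - n  # e.g. 47759 - 100 = 47659
    return f"extension:{extension} deleted:false hid>{threshold} visible:any"


# Reproduces the example from the tip above.
print(last_n_images_query(47759))  # extension:jpg deleted:false hid>47659 visible:any
```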

## 📦 Model

This dataset provides two pretrained YOLOv8 detection models tailored for the marine species found in SEANOE #101899. One model detects Buccinidae (a family of sea snails), and the other targets Bythograeidae (a family of deep-sea crabs). These models were trained on cleaned annotation sets that contain thousands of examples—for instance, the Buccinidae set includes over 14,900 annotations in total. For this tutorial, you’ll find two model files—`*.pt` files—each accompanied by the appropriate class_names.txt file. You can upload either or both to Galaxy to run detection experiments on your underwater images.

## ⚙️ Run YOLOv8 in detect mode


To perform object detection in Galaxy, use the tool "**Perform YOLO image labeling** with ultralytics (Galaxy Version 8.3.0+galaxy2)". Here’s how to set it up:

- `Input images:` Select the .jpg underwater images you downloaded from the SEANOE dataset.

- `Class names file:` This is a plain text file (.txt) that lists the names of the classes the model can detect.
For example, for detecting Bythograeidae species, the file should look like this:

```
Bythograeid crab
Buccinid snail

```
and for Buccinidae species, the class names file could look like this:

```
Autre poisson
Couverture de moules
Couverture microbienne
Couverture vers tubicole
Crabe araignée
Crabe bythograeidé
Crevette alvinocarididae
Escargot buccinidé
Ophiure
Poisson Cataetyx
Poisson chimère
Poisson zoarcidé
Pycnogonide
Ver polynoidé
Vers polynoidés
```

Each class name must be on its own line, in the same order used during model training. So the class ID 0 corresponds to Buccinidae, and 1 to Bythograeidae.
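
To make the line-order rule concrete, here is a minimal sketch of how a class names file maps to class IDs. `load_class_names` is a hypothetical helper, not the Galaxy tool's own code, and skipping blank lines is an assumption of this sketch. The sample names are the first three entries of the longer list above.

```python
def load_class_names(text: str) -> dict[int, str]:
    """Map zero-based class IDs to names, one name per non-empty line."""
    # Assumption: blank lines are ignored rather than counted as classes.
    names = [line.strip() for line in text.splitlines() if line.strip()]
    return dict(enumerate(names))


example = """Autre poisson
Couverture de moules
Couverture microbienne
"""

mapping = load_class_names(example)
print(mapping[0])  # Autre poisson
print(mapping[2])  # Couverture microbienne
```

Reordering the lines changes which name is attached to each ID, which is why the order must match the one used during training.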

- `Model`: Upload and choose from the dataset either `YOLOv8-weights-for-Buccinidae-detection.pt` or `YOLOv8-weights-for-Bythograeidae-detection.pt`, or test both on different runs.

- `Prediction mode`: Select detect. This tells YOLO to output bounding boxes around detected objects.

- `Image size`: Use 1000 (or a smaller number like 640 if processing speed is important). This controls how much the image is resized before prediction. Smaller values = faster but possibly less accurate.

- `Confidence threshold`: Set to 0.25 (25%). This controls how confident the model must be to report a detection. If you increase this value (e.g., 0.5), you’ll get fewer detections, but they’ll be more confident. If you lower it (e.g., 0.1), you may get more results, but possibly more false positives.

- `IoU threshold`: Set to 0.45. This is used for Non-Maximum Suppression (NMS), which removes overlapping detections. A higher IoU value (e.g., 0.7) keeps more overlapping boxes. A lower IoU (e.g., 0.3) removes more overlaps, which may help clean up crowded images.

- `Max detections`: Set a reasonable cap like 300. This limits the number of objects detected per image.
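
Detectors of this family usually expect the inference size to be a multiple of the model stride, so a requested `Image size` may be adjusted before prediction. A minimal sketch, assuming a stride of 32 and rounding up; both are assumptions to check against the installed ultralytics version, and `round_up_to_stride` is a hypothetical helper rather than part of the Galaxy tool.

```python
import math


def round_up_to_stride(size: int, stride: int = 32) -> int:
    """Round a requested image size up to the next multiple of the stride."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    return math.ceil(size / stride) * stride


for requested in (500, 640, 1000):
    print(requested, "->", round_up_to_stride(requested))
# 500 -> 512
# 640 -> 640
# 1000 -> 1024
```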

💡 Tip: Try changing the confidence and IoU thresholds to see how detection results vary. It helps you find a good balance between sensitivity and accuracy.

> 💡 **Note**: These models are trained only for detection, not segmentation.
> <hands-on-title> Detect Buccinid snails on images </hands-on-title>
>
> 1. {% tool [Perform YOLO image labeling](toolshed.g2.bx.psu.edu/repos/bgruening/yolo_predict/yolo_predict/8.3.0+galaxy2) %} with the following parameters:
> - {% icon param-file %} *"Input images"*: `DeepSeaSpy 100 images sample` (Input images dataset collection)
> - *"Class names file"*: `Buccinide` (Input plain text file (.txt) that lists the names of the classes the model can detect)
> - *"Model"*: `dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection` (Input pt file)
> - *"Prediction mode"*: `Detect`
> - *"Image size"*: `1000`
> - *"Confidence"*: `0.25`
> - *"IoU"*: `0.45`
> - *"Max detections"*: `300`
>
> > <warning-title> Models type </warning-title>
> >
> > The model is trained only for detection, not segmentation
> >
> {: .warning}
>
> > <tip-title>IoU threshold parameter</tip-title>
> >
> > Try changing the confidence and IoU thresholds to see how detection results vary. It helps you find a good balance between sensitivity and accuracy.
> >
> >
> {: .tip}
>
> > <comment-title>Additional information on class names file and parameters</comment-title>
> >
> > Concerning the class names file: Each class name must be on its own line, in the same order used during model training. So the class ID 0 corresponds to Buccinidae, and 1 to Bythograeidae.
> > Concerning tool parameters:
> > - `Image size`: Use 1000 (or a smaller number like 640 if processing speed is important). This controls how much the image is resized before prediction. Smaller values = faster but possibly less accurate.
> > - `Confidence threshold`: Set to 0.25 (25%). This controls how confident the model must be to report a detection. If you increase this value (e.g., 0.5), you’ll get fewer detections, but they’ll be more confident. If you lower it (e.g., 0.1), you may get more results, but possibly more false positives.
> > - `IoU threshold`: Set to 0.45. This is used for Non-Maximum Suppression (NMS), which removes overlapping detections. A higher IoU value (e.g., 0.7) keeps more overlapping boxes. A lower IoU (e.g., 0.3) removes more overlaps, which may help clean up crowded images.
> > - `Max detections`: Set a reasonable cap like 300. This limits the number of objects detected per image.
Collaborator

@kostrykin kostrykin Jul 18, 2025

Please consider adapting the formatting of the parameters to the format used in lines 113–118.

Contributor

@sunyi000 sunyi000 Jul 18, 2025

On the image size: if we make a prediction on a 512x512 image, setting the image size to 1000 will probably not work? I'm not too sure, so it would be better to test it.

Contributor

@sunyi000 sunyi000 Jul 18, 2025

Also, I think YOLO automatically adjusts the size: if the input is 1000, YOLO will change it to 1024, and 500 will become 512.
Maybe we need to add this to the image size explanation? We need to test it to be safe. I remember I had some issues with setting a size bigger than the image.

Author

Good point @sunyi000 I think its correct. I will test it and add this point to the tutorial.

Author

@sunyi000 I tested this and found no problem. My images are 1920x1080, and I ran the tool with image sizes of 640, 1000, 2000, and 3000. All runs produced correct outputs.

> >
> {: .comment}
>
{: .hands_on}
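
The interplay between the confidence threshold, the IoU threshold, and the detection cap described above can be sketched in plain Python. This is a simplified, class-agnostic illustration of greedy Non-Maximum Suppression, not the implementation used by the tool; the function names and the toy boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], conf: float = 0.25,
        iou_thr: float = 0.45, max_det: int = 300) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # 1. Drop detections below the confidence threshold.
    order = sorted((i for i, s in enumerate(scores) if s >= conf),
                   key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # 2. Keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
            # 3. Stop once the detection cap is reached.
            if len(keep) == max_det:
                break
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.6, 0.1]

print(nms(boxes, scores))               # [0, 2]
print(nms(boxes, scores, iou_thr=0.7))  # [0, 1, 2]
```

The first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is suppressed at the 0.45 threshold but survives at 0.7. The last box is removed by the confidence threshold before suppression starts.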

## 🧾 Explore the Outputs

@@ -141,7 +168,7 @@ which means:

These are your original images with colored boxes drawn around detected species. Each box also includes:
The class name and the confidence score. For example, you might see a box labeled:
Crabe bythograeid 0.54
`Crabe bythograeid 0.54`

![buccinid](../../images/yolo/buccinid.jpeg)
![bythog](../../images/yolo/bythog2.jpeg)
@@ -150,8 +177,7 @@ Crabe bythograeid 0.54

These images are useful for visually checking whether detections are correct or if something was missed.

⚠️ No masks or segmentation files
Since we used detect mode, this tool will not generate segmentation masks (like .tiff or polygon files). Those are only available in segment mode, which we'll cover next.
⚠️ **No masks or segmentation files:** Since we used detect mode, this tool will not generate segmentation masks (like .tiff or polygon files). Those are only available in segment mode, which we'll cover next.



@@ -210,7 +236,7 @@ This output contains class IDs, bounding box coordinates, confidence scores, and
|------------------|--------------------------|----------------------------------|
| Mode | `detect` | `segment` |
| Output overlays | Bounding boxes only | Boxes + masks |
| `.tiff` masks | ❌ | ✅ |
| .tiff masks | ❌ | ✅ |
| Use case | Objects presence/count | Object shape, size, morphology |
| Performance | Faster | Slightly slower, more detailed |
