Yolo_predict_tutorial_deepsea #6258
tutorial.bib (new file):

@online{lebeaud2024deepsea,
  author    = {Lebeaud, Antoine and Tosello, Vanessa and Borremans, Catherine and Matabos, Marjolaine},
  title     = {Deep-sea observatories images labeled by citizen for object detection algorithms},
  year      = {2024},
  publisher = {SEANOE},
  doi       = {10.17882/101899},
  url       = {https://www.seanoe.org/data/00907/101899}
}
---
layout: tutorial_hands_on
title: Object detection and segmentation with YOLO

zenodo_link: ''

questions:
- How do object detection and segmentation differ in practice?
- How can I run YOLO models on marine images to detect species using Galaxy?
- Why does the choice of model type matter?

objectives:
- Detect marine species in underwater images using a pretrained YOLOv8 model
- Compare results between detection and segmentation modes
- Understand the requirements and output types of each model

time_estimation: "45m"

tags:
- object-detection
- image-segmentation
- deep-learning
- ecology

bibtex: tutorial.bib

priority: 5

contributions:
  authorship:
    - mjoudy
    - yvanlebras

follow_up_training:
  - type: "internal"
    topic_name: machine-learning
    tutorials:
      - ml-advanced-image

---
YOLO (You Only Look Once) is a fast, deep learning-based algorithm for real-time object detection. It predicts object classes and bounding boxes in a single pass over the image. YOLOv8 is a recent version of the YOLO family that offers improved accuracy and speed.

In this tutorial, you will use Galaxy to run two types of YOLOv8 models:

1. **Object Detection** using real underwater images from the SEANOE dataset.
2. **Segmentation** using an Ultralytics demo model on a standard image.

We'll compare both modes, look at the kind of output each one generates, and discuss why this matters in bioimage analysis.

---

# Part 1: Detection of marine species
## 🔗 Dataset

👉 We will use selected images from the SEANOE dataset {% cite lebeaud2024deepsea %}.

The [SEANOE](https://www.seanoe.org/data/00907/101899) collection features real underwater images captured by deep-sea observatories as part of a citizen science initiative called Deep Sea Spy. These non-destructive imaging stations continuously monitor marine ecosystems and provide snapshots of various fauna. In this dataset, multiple annotators (trained scientists as well as enthusiastic citizen scientists) manually labeled images with polygons, lines, or points highlighting marine organisms. These annotations were then cleaned and converted into bounding boxes to create a training-ready dataset for object detection with YOLOv8. Although the exact species vary, the images often include deep-sea fish and invertebrates such as snails and crabs, making this dataset well suited for practicing detection tasks.

<img src="../../images/yolo/CAM-TEMPO.jpg" style="width:40%; display:inline-block;" alt="sample buccinid data">
<img src="../../images/yolo/MOMAR.jpg" style="width:40%; display:inline-block;" alt="sample bythograeid data">
<img src="../../images/yolo/CAM-TEMPO2.jpg" style="width:40%; display:inline-block;" alt="sample buccinid2 data">
<img src="../../images/yolo/CAM-TEMPO3.jpg" style="width:40%; display:inline-block;" alt="sample buccinid3 data">
## Get data

> <hands-on-title> Data Upload </hands-on-title>
>
> 1. Create a new history for this tutorial and give it a name (for example "DeepSeaSpy YOLO tutorial") so that you can find it again later if needed.
>
>    {% snippet faqs/galaxy/histories_create_new.md %}
>
>    {% snippet faqs/galaxy/histories_rename.md %}
>
> 2. Import the image data files and models from the [SEANOE marine data warehouse](https://www.seanoe.org/data/00907/101899/).
>
>    The DeepSeaSpy image data files and models are provided as a single zip file:
>
>    ```
>    https://www.seanoe.org/data/00907/101899/data/115473.zip
>    ```
>
>    {% snippet faqs/galaxy/datasets_import_via_link.md %}
>
> 3. Use {% tool [Unzip](toolshed.g2.bx.psu.edu/repos/imgteam/unzip/unzip/6.0+galaxy0) %} to create a data collection in your history containing all the unzipped archive files.
>
> 4. Unhide the model files.
>
>    Search your history for `name:detection deleted:false visible:any`, then unhide the two model files:
>    - `dataset_seanoe_101899_YOLOv8-weights-for-Bythograeidae-detection`
>    - `dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection`
>
>    {% snippet faqs/galaxy/datasets_unhidden.md %}
>
>    {% snippet faqs/galaxy/datasets_change_datatype.md datatype="tabular" %}
>
> 5. Select a sample of 100 image files and create a dedicated data collection.
>
>    Search your history for `extension:jpg deleted:false visible:any`, then click "Select all" and "Auto-build list", select 100 files, and give the data collection a name, for example "DeepSeaSpy 100 images sample". Tip: to select only the last 100 files, you can search your history for `extension:jpg deleted:false hid>XXXX visible:any`, where XXXX is the ID of the last image dataset minus 100 (for example `extension:jpg deleted:false hid>3886 visible:any` if your images go up to history dataset ID 3986).
>
> 6. Create the class names file "Buccinide" by copying and pasting this content into the file uploader:
>
>    ```
>    Autre poisson
>    Couverture de moules
>    Couverture microbienne
>    Couverture vers tubicole
>    Crabe araignée
>    Crabe bythograeidé
>    Crevette alvinocarididae
>    Escargot buccinidé
>    Ophiure
>    Poisson Cataetyx
>    Poisson chimère
>    Poisson zoarcidé
>    Pycnogonide
>    Ver polynoidé
>    Vers polynoidés
>    ```
{: .hands_on}
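
The class names above come from the pretrained model file provided with the dataset. If you want to double-check which names a given weights file actually contains (outside Galaxy), here is a minimal sketch using the Ultralytics Python API, assuming the `ultralytics` package is installed; the weights file name below is illustrative:

```python
# Sketch: inspect the class names stored in a YOLOv8 .pt weights file.
# Assumes the `ultralytics` package; the weights file name is illustrative.
from ultralytics import YOLO

model = YOLO("YOLOv8-weights-for-Buccinidae-detection.pt")
for class_id, name in model.names.items():  # dict mapping class ID -> class name
    print(class_id, name)
```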

## 📦 Model

This dataset provides two pretrained YOLOv8 detection models tailored to the marine species found in [SEANOE](https://www.seanoe.org/data/00907/101899). One model detects Buccinidae (a family of sea snails), and the other targets Bythograeidae (a family of deep-sea crabs). These models were trained on cleaned annotation sets that contain thousands of examples; for instance, the Buccinidae set includes over 14,900 annotations in total. For this tutorial, you will find two model files (`*.pt`), each accompanied by the appropriate class_names.txt file. You can upload either or both to Galaxy to run detection experiments on your underwater images.

## ⚙️ Run YOLOv8 in detect mode

> <hands-on-title> Detect Buccinid snails on images </hands-on-title>
>
> 1. {% tool [Perform YOLO image labeling](toolshed.g2.bx.psu.edu/repos/bgruening/yolo_predict/yolo_predict/8.3.0+galaxy2) %} with the following parameters:
>    - {% icon param-file %} *"Input images"*: `DeepSeaSpy 100 images sample` (the dataset collection of input images)
>    - {% icon param-file %} *"Class names file"*: `Buccinide` (the plain text file (.txt) that lists the names of the classes the model can detect)
>    - *"Model"*: `dataset_seanoe_101899_YOLOv8-weights-for-Buccinidae-detection` (the input `.pt` file)
>    - *"Prediction mode"*: `Detect`
>    - *"Image size"*: `1000`
>    - *"Confidence"*: `0.25`
>    - *"IoU"*: `0.45`
>    - *"Max detections"*: `300`
>
> > <warning-title> Model type </warning-title>
> >
> > This model is trained only for detection, not segmentation.
> >
> {: .warning}
>
> > <tip-title>IoU threshold parameter</tip-title>
> >
> > Try changing the confidence and IoU thresholds to see how the detection results vary. This helps you find a good balance between sensitivity and accuracy.
> >
> {: .tip}
>
> > <comment-title>Additional information on the class names file and parameters</comment-title>
> >
> > Concerning the class names file: each class name must be on its own line, in the same order used during model training, so that the class IDs reported in the output (0, 1, 2, ...) map to the correct names.
> > Concerning the tool parameters:
> > - *"Image size"*: Use 1000 (or a smaller value such as 640 if processing speed is important). This controls how much the image is resized before prediction; smaller values are faster but possibly less accurate.
> > - *"Confidence"*: Set to 0.25 (25%). This controls how confident the model must be to report a detection. If you increase this value (e.g., 0.5), you'll get fewer detections, but they'll be more confident. If you lower it (e.g., 0.1), you may get more results, but possibly more false positives.
> > - *"IoU"*: Set to 0.45. This is used for Non-Maximum Suppression (NMS), which removes overlapping detections. A higher IoU value (e.g., 0.7) keeps more overlapping boxes; a lower IoU (e.g., 0.3) removes more overlaps, which may help clean up crowded images.
> > - *"Max detections"*: Set a reasonable cap such as 300. This limits the number of objects detected per image.
> >
> {: .comment}
>
{: .hands_on}
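
For orientation, the tool's parameters correspond closely to the Ultralytics YOLO Python API. A minimal sketch of a roughly equivalent local run (assuming the `ultralytics` package is installed and the `.pt` weights have been downloaded; file and folder names are illustrative) could look like this:

```python
# Sketch only: roughly what the Galaxy run does, expressed with the Ultralytics API.
# Assumes the `ultralytics` package; file and folder names are illustrative.
from ultralytics import YOLO

model = YOLO("YOLOv8-weights-for-Buccinidae-detection.pt")  # pretrained detection weights

results = model.predict(
    source="deepseaspy_sample/",  # folder with the 100 sampled JPG images
    imgsz=1000,                   # "Image size"
    conf=0.25,                    # "Confidence"
    iou=0.45,                     # "IoU" threshold for non-maximum suppression
    max_det=300,                  # "Max detections"
    save=True,                    # write overlay images
    save_txt=True,                # write one .txt annotation file per image
    save_conf=True,               # include confidence scores in the .txt files
)

for r in results:
    print(r.path, "->", len(r.boxes), "detections")
```

Within Galaxy you do not need any of this: the tool handles these calls for you and returns the overlay images and annotation files described below.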

## 🧾 Explore the Outputs

After running the tool, Galaxy will give you several output files for each image. Let's go through what each one means and how to use it:

📄 **Text files (*.txt):**

These are plain text files containing the detection results. Each line in a file shows:

```
<class_id> <confidence_score> <x_center> <y_center> <width> <height>
```

For example, `0 0.82 350 200 100 120` means:

- Class ID 0 (in our case, Buccinidae)
- Detected with 82% confidence
- The bounding box is centered at (350, 200) and has a width of 100 and a height of 120 (in pixels, relative to the image)

You can use this file for further analysis, such as counting species or tracking locations over time.
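
If you download these text files, a few lines of Python are enough to summarize them, for example to count detections per class. A minimal sketch (assuming the line format shown above, a local `labels/` folder with the downloaded `.txt` files, and a local copy of the class names file created earlier; all paths are illustrative):

```python
# Sketch: count detections per class across downloaded YOLO result files.
# Assumes each line is "<class_id> <confidence> <x_center> <y_center> <width> <height>".
from collections import Counter
from pathlib import Path

class_names = Path("Buccinide.txt").read_text(encoding="utf-8").splitlines()
counts = Counter()

for result_file in Path("labels").glob("*.txt"):   # one result file per image (illustrative path)
    for line in result_file.read_text().splitlines():
        fields = line.split()
        if not fields:
            continue
        class_id = int(fields[0])                  # the first field is the class ID
        label = class_names[class_id] if class_id < len(class_names) else str(class_id)
        counts[label] += 1

for label, n in counts.most_common():
    print(f"{label}: {n}")
```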

🖼️ **Overlay images (*.jpg):**

These are your original images with colored boxes drawn around the detected species. Each box is labeled with the class name and the confidence score. For example, you might see a box labeled:
`Crabe bythograeid 0.54`

 
 

These images are useful for visually checking whether detections are correct or whether something was missed.

⚠️ The `?` character you may see in some labels (for example in place of the "é" of "Crabe bythograeidé") is due to an incompatibility between the class names in the pre-trained model and the character encoding used on the system where the model was originally trained. This typically happens when non-UTF-8 characters are not properly handled during training or export.

⚠️ **No masks or segmentation files:** Since we used detect mode, the tool will not generate segmentation masks (such as `.tiff` or polygon files). Those are only available in segment mode, which we'll cover next.

## 💬 What to Look For

- Are species detected correctly?
- Any false positives or missed detections?
- What confidence levels do you observe?
- How many objects per image?

---

# 🧩 Part 2: Segmentation with an Ultralytics example

Since the marine models only support detection, we'll now demonstrate segmentation using a pretrained YOLOv8 model from Ultralytics.

### 📦 Model & Input Image

The YOLOv8n-seg model is a lightweight instance segmentation model trained on the COCO dataset (Common Objects in Context), which contains 80 everyday object classes such as person, bicycle, car, bus, dog, and more. These categories cover common scenes, making the model suitable for general-purpose detection and segmentation tasks. It is ideal for quick testing, tutorials, or deployment on resource-limited systems.

The class names file for this segmentation model must list the 80 COCO classes in the order used during training, which begins:

```
person
bicycle
car
motorcycle
airplane
bus
...
```
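
To run this in Galaxy, rerun the same **Perform YOLO image labeling** tool, this time with the `yolov8n-seg.pt` weights, the COCO class names file, and the prediction mode switched from `Detect` to `Segment`. For reference, a minimal local equivalent with the Ultralytics API (assuming the `ultralytics` package; the weights and the demo bus image are public Ultralytics assets) might look like this:

```python
# Sketch: run the Ultralytics YOLOv8n-seg demo model locally.
# Assumes the `ultralytics` package; the weights and demo image are public Ultralytics assets.
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # lightweight COCO-trained instance segmentation model

results = model.predict(
    source="https://ultralytics.com/images/bus.jpg",  # standard demo image
    imgsz=640,
    conf=0.25,
    save=True,       # overlay image with boxes and masks
    save_txt=True,   # polygon annotations in plain text
)

for r in results:
    for box in r.boxes:
        print(r.names[int(box.cls)], round(float(box.conf), 2))
```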
### 🧾 Explore the Segmentation Outputs

YOLOv8 in segment mode produces a more detailed output than detection:

🖼️ **Segmented overlay (*.jpg):**
This image shows both bounding boxes and colored masks indicating the exact shape of each object.

<img src="../../images/yolo/bus.jpg" style="width:40%;" alt="Input image of a bus used in segmentation example">

🗺️ **Mask file (*_mask.tiff):**
A grayscale image in which each detected object appears as a bright blob against a black background. This is ideal for pixel-level analysis or downstream processing.

<img src="../../images/yolo/bus_mask.png" style="width:40%;" alt="YOLOv8 predicted mask on the bus image">

📄 **Annotation file (*.txt):**
This output contains class IDs, bounding box coordinates, confidence scores, and detailed segmentation polygons, providing both the approximate location and the precise shape of each object in plain text format.
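
Because the mask is a plain raster image, pixel-level measurements are straightforward outside Galaxy as well. A minimal sketch (assuming the downloaded mask file and the `numpy` and `Pillow` packages; the file name is illustrative):

```python
# Sketch: measure how much of the image the predicted mask covers.
# Assumes the downloaded mask file; the file name is illustrative.
import numpy as np
from PIL import Image

mask = np.array(Image.open("bus_mask.tiff"))

foreground = mask > 0                      # non-zero pixels belong to detected objects
coverage = 100 * foreground.mean()         # percentage of the image covered by objects
print(f"Object pixels: {int(foreground.sum())} ({coverage:.1f}% of the image)")

# If instances are encoded with distinct gray values, count them:
print("Distinct non-zero values:", np.unique(mask[mask > 0]).size)
```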
---

### 🔍 Compare Detection vs Segmentation

| Feature         | Detection              | Segmentation                   |
|-----------------|------------------------|--------------------------------|
| Mode            | `detect`               | `segment`                      |
| Output overlays | Bounding boxes only    | Boxes + masks                  |
| .tiff masks     | ❌                     | ✅                             |
| Use case        | Object presence/count  | Object shape, size, morphology |
| Performance     | Faster                 | Slightly slower, more detailed |

---

### 💡 Final Notes

Galaxy simplifies running YOLO models in a user-friendly, reproducible way.

- Choose your **prediction mode** according to your model and task
- Always check that the **model type matches the mode** (`detect` vs `segment`)
- Use overlays and annotation files for further analysis or visualization

---

### ✅ Next Steps

Want to go further?

- Train your own model on SEANOE annotations using the [YOLO training tool in Galaxy](https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/bgruening/yolo_training/yolo_training/8.3.0+galaxy2); a minimal sketch of the equivalent Ultralytics training call is shown below
- Get more information from the [Ultralytics YOLOv8 documentation](https://docs.ultralytics.com)
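
For context, this is roughly what such a training run looks like with the Ultralytics Python API outside Galaxy (assuming the `ultralytics` package and a dataset configuration file, here called `deepseaspy.yaml`, that points to your training/validation images, labels, and class names; the file name is illustrative):

```python
# Sketch: train a YOLOv8 detection model on your own annotations with the Ultralytics API.
# Assumes a dataset config file describing train/val image paths and class names (illustrative name).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                 # start from small pretrained detection weights
model.train(data="deepseaspy.yaml", epochs=100, imgsz=640)

metrics = model.val()                      # evaluate on the validation split
print("mAP@0.5:", metrics.box.map50)
```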

---