
Yolo_predict_tutorial_deepsea #6258

Open · wants to merge 35 commits into main

Conversation


@mjoudy mjoudy commented Jul 16, 2025

Hi all,

This is a tutorial for the YOLO prediction tool, based on the deep-sea SEANOE dataset. Please kindly give your feedback.

@mjoudy mjoudy marked this pull request as ready for review July 16, 2025 19:42
@anuprulez anuprulez self-requested a review July 18, 2025 08:05
Collaborator

@kostrykin kostrykin left a comment

Thanks! Some comments inside.

Collaborator

What are the `?` characters in the image? Are these there on purpose?

Author

They are generated by the model, i.e. by the YOLO prediction tool.

Collaborator

But what do they mean? In case these are some artifacts, it's worth mentioning this in the text, so the reader isn't confused.

Member

It looks like the character "é" in "Crabe bythograeidé" is not getting displayed properly.
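
If the labels drawn on the image go through an ASCII-only rendering step (an assumption, not verified against the tool's code), the substitution would look like this:

```python
# Sketch of the suspected cause: forcing a non-ASCII class name through ASCII
# replaces "é" with "?", which would match the "?" seen in the predicted image.
text = "Crabe bythograeidé"
print(text.encode("ascii", errors="replace").decode("ascii"))
# Crabe bythograeid?
```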

@yvanlebras
Collaborator

Hi Mohammad. I added content from the older PR (sorry, I didn't see there were 2 identical PRs...), proposing a format more closely aligned with GTN material, with hands-on, comment, tip, and warning boxes. I also propose to be a co-author and to offer this tutorial in both the Imaging and Ecology GTN sections (if this seems ok to you @shiltemann @bebatut @hexylena).

Please, don't hesitate to modify / comment / revert!

Comment on lines 137 to 140
> > - `Image size`: Use 1000 (or a smaller number like 640 if processing speed is important). This controls how much the image is resized before prediction. Smaller values = faster but possibly less accurate.
> > - `Confidence threshold`: Set to 0.25 (25%). This controls how confident the model must be to report a detection. If you increase this value (e.g., 0.5), you’ll get fewer detections, but they’ll be more confident. If you lower it (e.g., 0.1), you may get more results, but possibly more false positives.
> > - `IoU threshold`: Set to 0.45. This is used for Non-Maximum Suppression (NMS), which removes overlapping detections. A higher IoU value (e.g., 0.7) keeps more overlapping boxes. A lower IoU (e.g., 0.3) removes more overlaps, which may help clean up crowded images.
> > - `Max detections`: Set a reasonable cap like 300. This limits the number of objects detected per image.
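
For reference, the four settings quoted above map roughly onto an Ultralytics-style prediction call. This is only a sketch, assuming the Galaxy tool wraps the Ultralytics Python API; the weight file and image folder names are placeholders, not files from the tutorial:

```python
# Sketch only: assumes the Galaxy YOLO prediction tool wraps the Ultralytics API;
# "deepsea_model.pt" and "images/" are placeholder paths.
from ultralytics import YOLO

model = YOLO("deepsea_model.pt")
results = model.predict(
    source="images/",  # folder of images to predict on
    imgsz=1000,        # Image size: resize applied before inference
    conf=0.25,         # Confidence threshold: minimum score to report a detection
    iou=0.45,          # IoU threshold used by Non-Maximum Suppression
    max_det=300,       # Max detections per image
)
```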
Collaborator

@kostrykin kostrykin Jul 18, 2025

Please consider adapting the formatting of the parameters to the format used in lines 113–118.

Contributor

@sunyi000 sunyi000 Jul 18, 2025

On the image size: if we make a prediction on a 512x512 image, setting the image size to 1000 will probably not work? I'm not too sure... better to test it.

Contributor

@sunyi000 sunyi000 Jul 18, 2025

Also, I think YOLO automatically adjusts the size: if the input is 1000, YOLO will change it to 1024, and 500 will become 512.
Maybe we need to add this to the image size explanation? We need to test it to be safe... I remember I had some issue with setting a size bigger than the image.
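
A small sketch of the rounding behaviour described above, assuming the usual Ultralytics stride check where the requested size is rounded up to a multiple of the model stride (32 is an assumption based on common YOLO architectures):

```python
# Sketch: round a requested image size up to the nearest multiple of the model stride.
import math

def adjust_imgsz(imgsz: int, stride: int = 32) -> int:
    return int(math.ceil(imgsz / stride) * stride)

print(adjust_imgsz(1000))  # 1024
print(adjust_imgsz(500))   # 512
print(adjust_imgsz(640))   # 640 (already a multiple of 32)
```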

Author

Good point @sunyi000, I think it's correct. I will test it and add this point to the tutorial.

Author

@sunyi000 I tested this issue and there is no problem. My images are 1920x1080, and in the tool I tested sizes of 640, 1000, 2000, and 3000. They all produced correct outputs.

@anuprulez
Member

@yvanlebras I agree. We can have this tutorial for both Ecology and BioImaging. Actually, I was talking to @sunyi000 two days back and he suggested the same. He plans to work on a new tutorial for the YOLO training tool. @mjoudy, can you use images from the BioImage Model Zoo (https://bioimage.io/#/models) to make predictions using YOLO models? Some sample images can be found in this tutorial: https://training.galaxyproject.org/training-material/topics/imaging/tutorials/process-image-bioimageio/tutorial.html

@anuprulez
Member

@yvanlebras @kostrykin, thank you very much for your work on this. @mjoudy, please address the issues mentioned.

@yvanlebras
Collaborator

For the SEANOE data part, I can also add some further steps that take all the txt files and produce a summary of the number of each species detected, if relevant / of interest.
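
As an illustration of what such a summary step could look like (a sketch assuming YOLO-format prediction `.txt` files with one detection per line and the class id in the first column; the folder name and class-name mapping below are hypothetical):

```python
# Sketch: count detections per class across all YOLO-format .txt prediction files.
# "predictions/" and the class_names mapping are hypothetical examples.
from collections import Counter
from pathlib import Path

class_names = {0: "Couverture microbienne", 3: "Crabe bythograeide"}  # example mapping

counts = Counter()
for txt in Path("predictions").glob("*.txt"):
    for line in txt.read_text().splitlines():
        if line.strip():
            counts[int(line.split()[0])] += 1  # first field is the class id

for class_id, n in counts.most_common():
    print(class_names.get(class_id, f"class {class_id}"), n)
```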

@mjoudy
Author

mjoudy commented Jul 23, 2025

> Hi Mohammad. I added content from the older PR (sorry, I didn't see there were 2 identical PRs...), proposing a format more closely aligned with GTN material, with hands-on, comment, tip, and warning boxes. I also propose to be a co-author and to offer this tutorial in both the Imaging and Ecology GTN sections (if this seems ok to you @shiltemann @bebatut @hexylena).
>
> Please, don't hesitate to modify / comment / revert!

Thanks Yvan. Sorry, I had made a mistake and opened two PRs; after a while I closed one of them.
Thank you for your suggestions and help.

@mjoudy
Author

mjoudy commented Jul 23, 2025

> @yvanlebras I agree. We can have this tutorial for both Ecology and BioImaging. Actually, I was talking to @sunyi000 two days back and he suggested the same. He plans to work on a new tutorial for the YOLO training tool. @mjoudy, can you use images from the BioImage Model Zoo (https://bioimage.io/#/models) to make predictions using YOLO models? Some sample images can be found in this tutorial: https://training.galaxyproject.org/training-material/topics/imaging/tutorials/process-image-bioimageio/tutorial.html

Thank you Anup for the suggestions. Actually, as far as I know, the images and models in the BioImage Model Zoo are not YOLO-based. Please let me know if there are any. Currently, I am looking for some bio-inspired YOLO pre-trained models to make a better example for the segmentation part.

Member

@anuprulez anuprulez left a comment

I have added a few comments. I am trying to use this tutorial and these are my observations.

We should make dataset access easier. For the tutorial, including a few images would be enough to show the usage of the tool. Currently we are pulling a large set of around 4000 images. We can point users to the original large repository. I think these sample images and trained model files should also come from Zenodo. In the current form, it is not easy to find the datasets used. Please let me know your opinions @yvanlebras @kostrykin @mjoudy. Thanks

@kostrykin
Collaborator

> I have added a few comments. I am trying to use this tutorial and these are my observations.
>
> We should make dataset access easier. For the tutorial, including a few images would be enough to show the usage of the tool. Currently we are pulling a large set of around 4000 images. We can point users to the original large repository. I think these sample images and trained model files should also come from Zenodo. In the current form, it is not easy to find the datasets used. Please let me know your opinions @yvanlebras @kostrykin @mjoudy. Thanks

Yes, I agree very much.

> Couverture microbienne
> Couverture vers tubicole
> Crabe araignée
> Crabe bythograeidé
Collaborator

Wouldn't it make sense to rather use

Suggested change
- Crabe bythograeidé
+ Crabe bythograeide

so that the reader isn't led to use characters that are not supported by the tool? (see this)

Author

I took these class names from the pre-trained model (.pt file) available in the dataset. I think if they are going to use this model, they have to provide these names.

Collaborator

Can you maybe give it a shot and see whether it works? My gut feeling is that it does…

Author

I think I had tested it, but I will do it again.

Collaborator

@mjoudy Any updates on this yet?

Collaborator

I wanted to try it myself, but I can't find the image used as input to obtain topics/imaging/images/yolo/bythog2.jpeg. Can you @mjoudy maybe please point me to the right file?

@kostrykin
Collaborator

Please also have a look at the complaints from the linter: https://github.com/galaxyproject/training-material/actions/runs/16617529809/job/47101135309?pr=6258#step:8

@mjoudy
Author

mjoudy commented Jul 31, 2025

> I have added a few comments. I am trying to use this tutorial and these are my observations.
>
> We should make dataset access easier. For the tutorial, including a few images would be enough to show the usage of the tool. Currently we are pulling a large set of around 4000 images. We can point users to the original large repository. I think these sample images and trained model files should also come from Zenodo. In the current form, it is not easy to find the datasets used. Please let me know your opinions @yvanlebras @kostrykin @mjoudy. Thanks

Thanks for the suggestion. I will make a sample and provide its Zenodo link.

@anuprulez
Member

Hi @yvanlebras, can we update the input datasets to include just a few images and take them from Zenodo? Then we will have the opportunity to greatly simplify the history operations that we have in the tutorial. Thanks a lot!

@galaxyproject galaxyproject deleted a comment from github-actions bot Aug 11, 2025
Collaborator

@kostrykin kostrykin left a comment

Regarding the linting issue…

Collaborator

@kostrykin kostrykin left a comment
