docs: update detection core with tips for using Gemini integration #1925

Open
wants to merge 2 commits into
base: develop
from

Conversation

tberends
@tberends tberends commented Aug 2, 2025

Description

Opened at the request of @SkalskiP in PR https://github.com/roboflow/notebooks/pull/384

This PR improves the documentation regarding the ordering of content in requests that combine images with text prompts. Following Google's Gemini API best practices, text prompts are now placed after image parts in the contents array when using a single image with text.

Type of change

  • This change requires a documentation update

How has this change been tested? Please provide a test case or example of how you tested the change.

According to the Gemini API documentation on image prompts, when using a single image with text, the recommended approach is to place the text prompt after the image part in the contents array. This ordering has been shown to produce significantly better results in practice.

In our testing of object detection on Process & Instrumentation Diagrams (P&IDs), this reordering drastically improved the accuracy of bounding box positions. The object labels were already accurate, but the spatial precision of the detected elements improved considerably with the optimized prompt ordering.
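
For reviewers, here is a minimal sketch of the ordering described above, written as a plain request payload rather than with any particular client library. The helper name `build_contents`, the placeholder image bytes, and the example prompt are all hypothetical and not part of this PR. The `inline_data` / `mime_type` / `data` field names follow the shape shown in the Gemini REST examples, so please check them against the current API reference before relying on them.

```python
import base64
import json


def build_contents(image_bytes: bytes, mime_type: str, prompt: str) -> list[dict]:
    """Build a single-turn `contents` list with the image part BEFORE the text part.

    Hypothetical helper for illustration only; it performs no network calls.
    """
    image_part = {
        "inline_data": {
            "mime_type": mime_type,
            # Inline image data is sent as a base64-encoded string.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        }
    }
    text_part = {"text": prompt}
    # Ordering is the point of this sketch: image first, prompt second.
    return [{"parts": [image_part, text_part]}]


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file (assumption).
    fake_image = b"placeholder image bytes"
    contents = build_contents(
        fake_image,
        "image/png",
        "Detect all valves and return their bounding boxes.",
    )
    print(json.dumps({"contents": contents}, indent=2))
```

The only design choice that matters here is the order of the two entries in `parts`. Swapping them gives the text-first layout that the updated tips advise against for single-image prompts.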

Docs

  • Docs updated? What were the changes: updated the tips for prompt engineering

@tberends tberends requested a review from SkalskiP as a code owner August 2, 2025 11:57
Labels
None yet
Projects
None yet
1 participant