@kouroshHakha (Contributor) commented Jul 3, 2025

As mentioned in this PR, there is a lot of overhead when each row is returned as its own batch (bsize=1) from an async generator UDF. This PR fixes it by returning a batch instead.

One could argue that we sacrifice throughput by making the items in a batch synchronous, but the overhead we have now from yielding at the row level is even bigger.
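
To make the trade-off concrete, here is a minimal sketch in plain `asyncio`. It is not the Ray Data implementation: `double_rows`, `emit_per_row`, and `emit_per_batch` are hypothetical names used only for illustration. It contrasts wrapping every yielded row in its own size-1 batch with collecting the rows and emitting one batch.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, List

Row = Dict[str, int]
Batch = List[Row]


async def double_rows(batch: Batch) -> AsyncIterator[Row]:
    """Hypothetical async generator UDF: yields one output row per input row."""
    for row in batch:
        await asyncio.sleep(0)  # stand-in for real async work (e.g. an RPC)
        yield {"x": row["x"] * 2}


async def emit_per_row(
    udf: Callable[[Batch], AsyncIterator[Row]], batch: Batch
) -> List[Batch]:
    """Row-level emission: every yielded row becomes its own batch of size 1."""
    return [[row] async for row in udf(batch)]


async def emit_per_batch(
    udf: Callable[[Batch], AsyncIterator[Row]], batch: Batch
) -> List[Batch]:
    """Batch-level emission: wait for all rows, then emit a single batch."""
    rows = [row async for row in udf(batch)]
    return [rows]


if __name__ == "__main__":
    data = [{"x": i} for i in range(4)]
    per_row = asyncio.run(emit_per_row(double_rows, data))
    per_batch = asyncio.run(emit_per_batch(double_rows, data))
    print(len(per_row), "batches vs", len(per_batch), "batch")
```

Both paths produce the same rows. The per-row path creates one downstream batch per row, so any fixed per-batch cost is paid once per row. The per-batch path pays that cost once, but no row is visible downstream until the whole batch has finished.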

[Screenshot, 2025-07-06, 3:02:36 PM]

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
@kouroshHakha (Contributor, Author)

Running release tests to ensure correctness: https://buildkite.com/ray-project/release/builds/47502

@alexeykudinkin alexeykudinkin added the go add ONLY when ready to merge, run all tests label Jul 3, 2025
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
@kouroshHakha kouroshHakha marked this pull request as ready for review July 7, 2025 01:58
@kouroshHakha kouroshHakha requested a review from a team as a code owner July 7, 2025 01:58
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
@kouroshHakha kouroshHakha merged commit 0803021 into ray-project:master Jul 7, 2025
5 checks passed
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Jul 9, 2025
ccmao1130 pushed a commit to ccmao1130/ray that referenced this pull request Jul 29, 2025