bveeramani
Member

Why are these changes needed?

Typically, a map task's output is about 128 MiB and each core has 4 GiB of memory. However, if num_cpus is small (e.g., 0.01) or target_max_block_size is large (e.g., 1GB), a task can OOM even if it uses only enough memory to produce a single output block. Setting the task's memory to the average output size mitigates this case.
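The arithmetic behind the description above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names (`implicit_memory_share`, `may_oom_without_request`, `memory_request`) are hypothetical, and it assumes that a task's expected share of memory scales linearly with its `num_cpus` at 4 GiB per core.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def implicit_memory_share(num_cpus: float, memory_per_core: int = 4 * GiB) -> float:
    """Memory a task can expect if memory is split in proportion to CPUs.

    Assumption for illustration: 4 GiB per core, scaled linearly by num_cpus.
    """
    return num_cpus * memory_per_core


def may_oom_without_request(
    num_cpus: float, output_bytes: int, memory_per_core: int = 4 * GiB
) -> bool:
    """True if producing one output block alone exceeds the implicit share."""
    return output_bytes > implicit_memory_share(num_cpus, memory_per_core)


def memory_request(average_output_bytes: int) -> int:
    """Hypothetical rule from the description: reserve the average output size."""
    return average_output_bytes


# Typical case: one full core and a 128 MiB block fits comfortably.
print(may_oom_without_request(num_cpus=1, output_bytes=128 * MiB))
# Small num_cpus: 0.01 of a core implies roughly 41 MiB, less than one block.
print(may_oom_without_request(num_cpus=0.01, output_bytes=128 * MiB))
# Large block: a 1 GiB block on 0.1 of a core exceeds the ~410 MiB share.
print(may_oom_without_request(num_cpus=0.1, output_bytes=1 * GiB))
```

In the two risky cases, reserving `memory_request(average_output_bytes)` for the task gives the scheduler an explicit lower bound instead of relying on the CPU-proportional share. Ray tasks accept a `memory` resource option for this kind of reservation; how this PR derives and applies the value is not shown here.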

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Balaji Veeramani <[email protected]>
@bveeramani bveeramani requested a review from a team as a code owner March 19, 2025 21:04
Signed-off-by: Balaji Veeramani <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
bveeramani and others added 4 commits March 19, 2025 16:54
Signed-off-by: Balaji Veeramani <[email protected]>
…ory.py

Co-authored-by: Hao Chen <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
Signed-off-by: Balaji Veeramani <[email protected]>
@bveeramani bveeramani enabled auto-merge (squash) March 20, 2025 01:52
@github-actions github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Mar 20, 2025
@github-actions github-actions bot disabled auto-merge March 20, 2025 03:45
@bveeramani bveeramani enabled auto-merge (squash) March 20, 2025 03:45
Signed-off-by: Balaji Veeramani <[email protected]>
@github-actions github-actions bot disabled auto-merge March 20, 2025 05:31
@bveeramani bveeramani merged commit 07cdfec into master Mar 20, 2025
5 checks passed
@bveeramani bveeramani deleted the configure-memory3 branch March 20, 2025 06:34
dhakshin32 pushed a commit to dhakshin32/ray that referenced this pull request Mar 27, 2025
Signed-off-by: Balaji Veeramani <[email protected]>
Co-authored-by: Hao Chen <[email protected]>
Signed-off-by: Dhakshin Suriakannu <[email protected]>
Labels
community-backlog, go (add ONLY when ready to merge, run all tests)
3 participants