chore: Simplify on-heap memory configuration #2599

andygrove · 2025-10-17T14:58:38Z

Which issue does this PR close?

N/A

Rationale for this change

Simplify the on-heap memory configuration, which is now intended for test use only.

The number of configs is reduced from 7 to 4. The remaining configs, which are now documented in a new Development & Testing Settings section in the configuration guide, are:

- COMET_ONHEAP_ENABLED
- COMET_ONHEAP_MEMORY_POOL_TYPE
- COMET_ONHEAP_MEMORY_OVERHEAD
- COMET_ONHEAP_SHUFFLE_MEMORY_FACTOR
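
For a test run, the four remaining settings might be passed to `spark-submit` roughly as sketched below. This is an illustration, not code from the PR: the keys are taken from the `configs.md` diff quoted later in this conversation, the pairing of each key with one of the constants above is an assumption, and `OnHeapTestConf` and `toSubmitArgs` are hypothetical names. The values are the documented defaults, except that on-heap mode is switched on, and the bare `1024` assumes the overhead setting is read in MiB.

```scala
// Sketch only: renders the on-heap test settings as spark-submit arguments.
object OnHeapTestConf {
  // Keys as they appear in the configs.md diff in this PR (assumed mapping to
  // the constants listed above). Values are the documented defaults, except
  // onHeap.enabled, which is turned on for testing.
  val settings: Seq[(String, String)] = Seq(
    "spark.comet.exec.onHeap.enabled" -> "true",
    "spark.comet.exec.onHeap.memoryPool" -> "greedy_task_shared",
    "spark.comet.memoryOverhead" -> "1024", // assumed to be interpreted as MiB
    "spark.comet.columnar.shuffle.memory.factor" -> "1.0"
  )

  // Emits two arguments per setting: "--conf" followed by "key=value".
  def toSubmitArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit =
    println(toSubmitArgs(settings).mkString(" "))
}
```

Keeping the settings in one sequence makes it easy to reuse the same values for a `SparkConf` or a test fixture instead of a command line.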

What changes are included in this PR?

Remove some internal configs:
- COMET_MEMORY_OVERHEAD_FACTOR
- COMET_MEMORY_OVERHEAD_MIN_MIB
- COMET_COLUMNAR_SHUFFLE_MEMORY_SIZE
Rename all memory config constants to contain ONHEAP or OFFHEAP

How are these changes tested?

CI

codecov-commenter · 2025-10-17T15:51:27Z

Codecov Report

❌ Patch coverage is 70.83333% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.17%. Comparing base (f09f8af) to head (f668822).
⚠️ Report is 623 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...src/main/scala/org/apache/comet/GenerateDocs.scala | 0.00% | 5 Missing ⚠️ |
| ...ain/scala/org/apache/comet/CometExecIterator.scala | 66.66% | 1 Missing ⚠️ |
| ...park/src/main/scala/org/apache/spark/Plugins.scala | 0.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2599      +/-   ##
============================================
+ Coverage     56.12%   59.17%   +3.04%     
- Complexity      976     1444     +468     
============================================
  Files           119      146      +27     
  Lines         11743    13719    +1976     
  Branches       2251     2353     +102     
============================================
+ Hits           6591     8118    +1527     
- Misses         4012     4379     +367     
- Partials       1140     1222      +82


comphead · 2025-10-18T17:32:01Z

common/src/main/scala/org/apache/comet/CometConf.scala

-        "This config is optional. If this is not specified, it will be set to " +
-        s"`spark.comet.memory.overhead.factor` * `spark.executor.memory`. $TUNING_GUIDE.")
-    .internal()
+        "when running Spark in on-heap mode.")


Should we add the default value in the doc?

The generated docs show the default value.

comphead · 2025-10-18T17:33:41Z

docs/source/user-guide/latest/configs.md

+| `spark.comet.columnar.shuffle.memory.factor` | Fraction of Comet memory to be allocated per executor process for columnar shuffle when running in on-heap mode. For more information, refer to the [Comet Tuning Guide](https://datafusion.apache.org/comet/user-guide/tuning.html). | 1.0 |
+| `spark.comet.exec.onHeap.enabled` | Whether to allow Comet to run in on-heap mode. Required for running Spark SQL tests. | false |
+| `spark.comet.exec.onHeap.memoryPool` | The type of memory pool to be used for Comet native execution when running Spark in on-heap mode. Available pool types are `greedy`, `fair_spill`, `greedy_task_shared`, `fair_spill_task_shared`, `greedy_global`, `fair_spill_global`, and `unbounded`. | greedy_task_shared |
+| `spark.comet.memoryOverhead` | The amount of additional memory to be allocated per executor process for Comet, in MiB, when running Spark in on-heap mode. | 1073741824b |


Should the default value be 1024?

Hmm.. we could update GenerateDocs to recognize byte configs and show in the unit they were defined as. I'll take a look.

.bytesConf(ByteUnit.MiB)

I pushed a commit. The default is now shown as 1024 MiB.
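
The idea discussed above, showing a byte-valued default in the unit the config was declared with, can be sketched as follows. This is not the actual `GenerateDocs` change: `ByteDefaults`, `SizeUnit`, and `render` are hypothetical names, and `SizeUnit` is a small stand-in for Spark's `ByteUnit` so that the example has no dependencies.

```scala
// Sketch only: formats a default stored in bytes using its declared unit.
object ByteDefaults {
  // Hypothetical stand-in for Spark's ByteUnit; only what the sketch needs.
  sealed abstract class SizeUnit(val label: String, val bytes: Long)
  case object Bytes extends SizeUnit("B", 1L)
  case object KiB extends SizeUnit("KiB", 1L << 10)
  case object MiB extends SizeUnit("MiB", 1L << 20)
  case object GiB extends SizeUnit("GiB", 1L << 30)

  // Renders the value in the declared unit when it divides evenly,
  // and falls back to raw bytes otherwise so no precision is lost.
  def render(defaultBytes: Long, declared: SizeUnit): String =
    if (defaultBytes % declared.bytes == 0)
      s"${defaultBytes / declared.bytes} ${declared.label}"
    else
      s"$defaultBytes B"
}
```

With these definitions, `ByteDefaults.render(1073741824L, ByteDefaults.MiB)` yields `1024 MiB`, matching the default that the regenerated docs now show in place of `1073741824b`.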

comphead

Thanks @andygrove, a couple of small nits.

andygrove added 3 commits October 17, 2025 08:56

Simplify on-heap memory configuration

0ff7524

fix

bca5869

conf

cbc0db5

fix regressions

9045090

andygrove changed the title ~~chore: Simplify on-heap memory configuration [WIP]~~ chore: Simplify on-heap memory configuration Oct 17, 2025

prep for review

f824495

andygrove marked this pull request as ready for review October 17, 2025 18:41

rename conf constants

a4629dd

comphead reviewed Oct 18, 2025


comphead approved these changes Oct 18, 2025


address feedback

f668822
