Skip to content

perf: optimize overflow check in RTS alloc_words, fix and update tests #3085

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 31, 2022

Conversation

crusso
Copy link
Contributor

@crusso crusso commented Jan 29, 2022

#3077 added an expensive, but necessary overflow check to alloc_words, to prevent allocating addresses outside the 32-bit heap.

  • This PR optimizes that check to avoid the conditional test by doing the page diff arithmetic in u64s,
    relying on grow_memory to trap when asked to grow beyond 2^16 pages.
  • fixes test run-drun/oom-update to actually use an update (not query) method as advertised.

I wonder why grow_memory is marked inline_never? Seems like we'd save a function call on each allocation...

@crusso crusso requested a review from ggreif January 29, 2022 19:26
@crusso crusso changed the title perf: optimize overflow check in RTS alloc_words, fix and update tests perf: (WIP) optimize overflow check in RTS alloc_words, fix and update tests Jan 29, 2022
@crusso crusso removed the request for review from ggreif January 29, 2022 19:40
@crusso crusso marked this pull request as draft January 29, 2022 19:41
@crusso crusso requested review from ggreif and nomeata January 29, 2022 19:47
@crusso crusso marked this pull request as ready for review January 29, 2022 19:47
@crusso crusso changed the title perf: (WIP) optimize overflow check in RTS alloc_words, fix and update tests perf: optimize overflow check in RTS alloc_words, fix and update tests Jan 29, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jan 29, 2022

Comparing from 316b193 to 02e277b:
In terms of gas, 3 tests improved and the mean change is -2.2%.
In terms of size, 3 tests regressed and the mean change is +0.4%.

@crusso
Copy link
Contributor Author

crusso commented Jan 29, 2022

Comparing from 5f678e9 to f6d70fb: In terms of gas, 3 tests improved and the mean change is -2.2%. In terms of size, 3 tests regressed and the mean change is +0.4%.

Seems worth it, maybe.

And reclaims the cycle regression here: #3077 (comment)

@ggreif ggreif changed the title perf: optimize overflow check in RTS alloc_words, fix and update tests perf: optimize overflow check in RTS alloc_words, fix and update tests Jan 30, 2022
@ggreif
Copy link
Contributor

ggreif commented Jan 30, 2022

See also #2998, where I was wondering about the no-inline too...

@crusso crusso requested a review from ulan January 31, 2022 10:45

Value::from_ptr(old_hp as usize)
}
}

/// Page allocation. Ensures that the memory up to, but excluding, the given pointer is allocated.
#[inline(never)]
unsafe fn grow_memory(ptr: usize) {
unsafe fn grow_memory(ptr: u64) {
debug_assert!(ptr <= 2 * u64::from(core::u32::MAX));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the same condition as in the assert in line 85: ptr <= u64::from(core::u32::MAX) (i.e. compare against 4GiB, not 8GiB)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm deliberately comparing against 8GB because I want to test the overflow detection later on (otherwise we will get different faults on debug and release builds).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Sounds good!

Copy link
Contributor

@ggreif ggreif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic didn't change, only the u64 conversions moved a bit, and the trapping is consolidated. LGTM!


// Grow memory if needed
grow_memory(new_hp as usize);
grow_memory(new_hp);
Copy link
Contributor

@ggreif ggreif Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulan commenting remark in the meeting just now. The fast path should be predicated on the fact that the allocation remains on the last page, i.e. old_hp >> 16 == new_hp >> 16.

And that condition could even be made faster by (old_hp ^ new_hp) >> 16. See #3090.

Copy link
Contributor Author

@crusso crusso Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I'll leave that to #3090.

@crusso crusso added the automerge-squash When ready, merge (using squash) label Jan 31, 2022
@mergify mergify bot merged commit d1fef02 into master Jan 31, 2022
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Jan 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants