Skip to content

Conversation

@robertnishihara
Copy link
Contributor

@pcmoritz
Copy link
Contributor

The reasoning for removing this is the following one: MAP_POPULATE will pre-populate the page entries when the mmap is executed. While this speeds up accessing the file from the client, it also induces a large latency penalty when a new memory mapped file is created. This latency can be on the order of several 10s of seconds for very large memory mapped files and we would like to avoid this behavior which can make the object store performance unpredictable.


int fd = create_buffer(size);
ARROW_CHECK(fd >= 0) << "Failed to create buffer during mmap";
#ifdef __linux__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get rid of this ifdef now; we can still have the comment that explains why we don't use mmap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, done

@atumanov
Copy link

the difference in performance behavior with and without MAP_POPULATE is the following: with MAP_POPULATE perf degradation occurs (and is tangible) only when we run out of space to place a new object and, thus, need to mmap a new file. Without MAP_POPULATE, we distribute degraded performance (due to page faulting on the server side) uniformly across most object puts. As stated, the advantage of the latter is throughput predictability.

It is important to note that, in steady state, when the object store has been fully warmed up, presence/absence of MAP_POPULATE doesn't matter. It only matters during the object store ramp up.

My favorite longer-term solution continues to be preallocating the configured capacity on object store startup as a single mmapped file and pre-faulting it in the background.

@pcmoritz
Copy link
Contributor

+1, the build failure in AppVeyor is unrelated

@asfgit asfgit closed this in a3607d2 Aug 25, 2017
@robertnishihara robertnishihara deleted the removemappopulate branch August 25, 2017 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants