Profiling shows that the sparse_mmap_array node location store is faster than the others for small and medium datasets, but it still takes a huge amount of processing time. This is probably due to the binary search over a large array of data, which totally defeats the CPU cache: basically every memory access is a cache miss.
This could probably be optimized. Here are some ideas:
- Store node IDs and locations not as pairs but in separate memory areas. Find the ID first, which gives us the offset into the array, then get the location with one lookup.
- Store node IDs not as full 64-bit values, but split them across several arrays, each holding a subset of the bits.
- Have some kind of compact lookup table with pointers into the main table, to narrow down the search space.
- Do a linear search instead of a binary search when we are "near" the ID we are looking for, for better cache performance.
- Remember the position of the last ID(s) we looked up in a mini-cache as the starting point for the next search. Because node IDs in ways are often close together, this could be a huge speedup.
And here is some background reading:
- https://geidav.wordpress.com/2013/12/29/optimizing-binary-search/
- http://www.cs.utexas.edu/~pingali/CS395T/2013fa/lectures/MemoryOptimizations_2013.pdf
- http://blog.teamleadnet.com/2014/06/beating-binary-search-algorithm.html
- http://bannalia.blogspot.de/2015/06/cache-friendly-binary-search.html
- http://cglab.ca/~morin/misc/arraylayout/
We have to try different ideas and benchmark them.