Improve coalesce performance #4
Before:

```
--------------------------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
CoalesceBench64/0             3366683 ns    3364669 ns   213 bytes_per_second=2.32192G/s items_per_second=311.643M/s length=1048.58k null%=1 num_args=2
CoalesceBench64/1             2851663 ns    2849846 ns   245 bytes_per_second=2.74138G/s items_per_second=367.941M/s length=1048.58k null%=1 num_args=4
CoalesceBench64/2             7499865 ns    7495813 ns    95 bytes_per_second=1067.26M/s items_per_second=139.888M/s length=1048.58k null%=25 num_args=2
CoalesceBench64/3            11773437 ns   11766272 ns    60 bytes_per_second=679.909M/s items_per_second=89.1171M/s length=1048.58k null%=25 num_args=4
CoalesceBench64/4             9636207 ns    9631169 ns    73 bytes_per_second=830.636M/s items_per_second=108.873M/s length=1048.58k null%=50 num_args=2
CoalesceBench64/5            19456855 ns   19445858 ns    36 bytes_per_second=411.399M/s items_per_second=53.9228M/s length=1048.58k null%=50 num_args=4
CoalesceBench64/6             3288217 ns    3286426 ns   214 bytes_per_second=2.3772G/s items_per_second=319.063M/s length=1048.58k null%=99 num_args=2
CoalesceBench64/7             7603232 ns    7599720 ns    92 bytes_per_second=1052.67M/s items_per_second=137.976M/s length=1048.58k null%=99 num_args=4
CoalesceScalarBench64/0        775260 ns     774797 ns   904 bytes_per_second=10.0833G/s items_per_second=1.35336G/s length=1048.58k null%=1 num_args=2
CoalesceScalarBench64/2       3500267 ns    3498388 ns   201 bytes_per_second=2.23317G/s items_per_second=299.731M/s length=1048.58k null%=25 num_args=2
CoalesceScalarBench64/4       4815186 ns    4812821 ns   146 bytes_per_second=1.62327G/s items_per_second=217.871M/s length=1048.58k null%=50 num_args=2
CoalesceScalarBench64/6        446897 ns     446783 ns  1541 bytes_per_second=17.4861G/s items_per_second=2.34695G/s length=1048.58k null%=99 num_args=2
CoalesceScalarStringBench/0  74138532 ns   74089097 ns    10 bytes_per_second=6.72872G/s items_per_second=14.1529M/s length=1048.58k null%=1 num_args=2
CoalesceScalarStringBench/2  58106933 ns   58064020 ns     9 bytes_per_second=6.52407G/s items_per_second=18.059M/s length=1048.58k null%=25 num_args=2
CoalesceScalarStringBench/4  52094990 ns   52064312 ns    10 bytes_per_second=4.88432G/s items_per_second=20.14M/s length=1048.58k null%=50 num_args=2
CoalesceScalarStringBench/6   5136540 ns    5133121 ns   138 bytes_per_second=1.7244G/s items_per_second=204.276M/s length=1048.58k null%=99 num_args=2
```

After:

```
--------------------------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
CoalesceBench64/0             1047061 ns    1046399 ns   661 bytes_per_second=7.46608G/s items_per_second=1002.08M/s length=1048.58k null%=1 num_args=2
CoalesceBench64/1             1377282 ns    1376405 ns   511 bytes_per_second=5.67602G/s items_per_second=761.822M/s length=1048.58k null%=1 num_args=4
CoalesceBench64/2             2804061 ns    2802178 ns   251 bytes_per_second=2.78801G/s items_per_second=374.2M/s length=1048.58k null%=25 num_args=2
CoalesceBench64/3             5234633 ns    5230898 ns   134 bytes_per_second=1.49353G/s items_per_second=200.458M/s length=1048.58k null%=25 num_args=4
CoalesceBench64/4             3700820 ns    3698116 ns   190 bytes_per_second=2.11256G/s items_per_second=283.543M/s length=1048.58k null%=50 num_args=2
CoalesceBench64/5             7731316 ns    7726379 ns    90 bytes_per_second=1035.41M/s items_per_second=135.714M/s length=1048.58k null%=50 num_args=4
CoalesceBench64/6             1004359 ns    1003745 ns   693 bytes_per_second=7.78335G/s items_per_second=1044.66M/s length=1048.58k null%=99 num_args=2
CoalesceBench64/7             4660379 ns    4658001 ns   151 bytes_per_second=1.67722G/s items_per_second=225.113M/s length=1048.58k null%=99 num_args=4
CoalesceScalarBench64/0        656265 ns     655870 ns  1067 bytes_per_second=11.9117G/s items_per_second=1.59876G/s length=1048.58k null%=1 num_args=2
CoalesceScalarBench64/2       2889294 ns    2887898 ns   242 bytes_per_second=2.70525G/s items_per_second=363.093M/s length=1048.58k null%=25 num_args=2
CoalesceScalarBench64/4       4015990 ns    4014054 ns   175 bytes_per_second=1.94629G/s items_per_second=261.226M/s length=1048.58k null%=50 num_args=2
CoalesceScalarBench64/6        390245 ns     390138 ns  1800 bytes_per_second=20.025G/s items_per_second=2.68771G/s length=1048.58k null%=99 num_args=2
CoalesceScalarStringBench/0  82277097 ns   82223643 ns     9 bytes_per_second=6.06303G/s items_per_second=12.7527M/s length=1048.58k null%=1 num_args=2
CoalesceScalarStringBench/2  70821126 ns   70771323 ns    10 bytes_per_second=5.35265G/s items_per_second=14.8164M/s length=1048.58k null%=25 num_args=2
CoalesceScalarStringBench/4  47119447 ns   47087724 ns    13 bytes_per_second=5.40053G/s items_per_second=22.2686M/s length=1048.58k null%=50 num_args=2
CoalesceScalarStringBench/6   4579486 ns    4576728 ns   150 bytes_per_second=1.93403G/s items_per_second=229.11M/s length=1048.58k null%=99 num_args=2
```
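To make the two runs easier to compare, here is a small Python sketch (not part of this PR) that divides each Before time by the matching After time. The `Time` values are copied from the benchmark output above; the helper name `speedups` and the dictionary layout are just for illustration.

```python
# Wall-clock "Time" column in ns, copied from the Before/After benchmark output.
before = {
    "CoalesceBench64/0": 3366683,
    "CoalesceBench64/1": 2851663,
    "CoalesceBench64/2": 7499865,
    "CoalesceBench64/3": 11773437,
    "CoalesceBench64/4": 9636207,
    "CoalesceBench64/5": 19456855,
    "CoalesceBench64/6": 3288217,
    "CoalesceBench64/7": 7603232,
    "CoalesceScalarBench64/0": 775260,
    "CoalesceScalarBench64/2": 3500267,
    "CoalesceScalarBench64/4": 4815186,
    "CoalesceScalarBench64/6": 446897,
    "CoalesceScalarStringBench/0": 74138532,
    "CoalesceScalarStringBench/2": 58106933,
    "CoalesceScalarStringBench/4": 52094990,
    "CoalesceScalarStringBench/6": 5136540,
}
after = {
    "CoalesceBench64/0": 1047061,
    "CoalesceBench64/1": 1377282,
    "CoalesceBench64/2": 2804061,
    "CoalesceBench64/3": 5234633,
    "CoalesceBench64/4": 3700820,
    "CoalesceBench64/5": 7731316,
    "CoalesceBench64/6": 1004359,
    "CoalesceBench64/7": 4660379,
    "CoalesceScalarBench64/0": 656265,
    "CoalesceScalarBench64/2": 2889294,
    "CoalesceScalarBench64/4": 4015990,
    "CoalesceScalarBench64/6": 390245,
    "CoalesceScalarStringBench/0": 82277097,
    "CoalesceScalarStringBench/2": 70821126,
    "CoalesceScalarStringBench/4": 47119447,
    "CoalesceScalarStringBench/6": 4579486,
}


def speedups(before, after):
    """Return before/after time ratios; a value above 1.0 means After is faster."""
    return {name: before[name] / after[name] for name in before}


if __name__ == "__main__":
    for name, ratio in speedups(before, after).items():
        print(f"{name:<28} {ratio:5.2f}x")
```

A ratio above 1 means the After run was faster. On these numbers every `CoalesceBench64` and `CoalesceScalarBench64` case comes out above 1, while `CoalesceScalarStringBench/0` and `/2` come out below 1. Each figure is from a single run, so small differences may be noise.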
@lidavidm You should probably enable GitHub Actions on this fork.

I re-enabled them (I keep them disabled usually because they generate so much email spam on every push 🙁)

Ah, I disabled those notifications (not sure how I did that, but I don't get emails from GitHub Actions).
lidavidm left a comment
Thanks for taking a look at this. This is quite the impressive improvement.
Before:
After: