Skip to content

Remove the cloning of scan_start & scan_end in longest_match_help #268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 6, 2025

Conversation

brian-pane
Copy link

Benchmark 1 (39 runs): ./compress-baseline 2 rs silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           130ms ± 3.34ms     128ms …  145ms          3 ( 8%)        0%
  peak_rss           25.0MB ± 71.9KB    24.8MB … 25.0MB          0 ( 0%)        0%
  cpu_cycles          513M  ± 8.90M      509M  …  560M           2 ( 5%)        0%
  instructions       1.11G  ±  266      1.11G  … 1.11G           1 ( 3%)        0%
  cache_references    379K  ± 7.46K      374K  …  422K           1 ( 3%)        0%
  cache_misses        290K  ± 4.86K      273K  …  298K           2 ( 5%)        0%
  branch_misses      6.85M  ± 6.61K     6.84M  … 6.87M           0 ( 0%)        0%
Benchmark 2 (39 runs): ./target/release/examples/compress 2 rs silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           129ms ±  939us     127ms …  132ms          1 ( 3%)          -  1.2% ±  0.8%
  peak_rss           25.0MB ± 66.2KB    24.9MB … 25.0MB          0 ( 0%)          +  0.0% ±  0.1%
  cpu_cycles          507M  ± 1.71M      504M  …  513M           1 ( 3%)          -  1.2% ±  0.6%
  instructions       1.06G  ±  361      1.06G  … 1.06G           0 ( 0%)        ⚡-  4.1% ±  0.0%
  cache_references    379K  ± 5.29K      374K  …  399K           1 ( 3%)          +  0.1% ±  0.8%
  cache_misses        291K  ± 4.47K      276K  …  299K           1 ( 3%)          +  0.4% ±  0.7%
  branch_misses      6.86M  ± 12.2K     6.84M  … 6.89M           0 ( 0%)          +  0.1% ±  0.1%

@brian-pane
Copy link
Author

This change seems to improve performance at compression levels 2 and higher on my test system. Maybe I'm missing something, though, because zlib-ng copies into scan_start and scan_end. Was the copying originally done as a performance optimization?

Copy link
Collaborator

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gives a consistent reduction in instruction count for me, for levels 2..=9, and simplifies the code, so I'm happy with this change (with the comment fixed).

I believe the idea was that using an array would put the value into a register, hopefully speeding up the comparisons.

@folkertdev folkertdev requested a review from bjorn3 December 20, 2024 12:22
@brian-pane
Copy link
Author

I also tried a version of this patch that read scan_start and scan_end into u64 to try to encourage the compiler to use registers, but this version outperformed it.

@brian-pane
Copy link
Author

Sorry about the extraneous merge commit. I clicked the button on github that I thought would rebase, and it generated a merge commit instead.

@folkertdev
Copy link
Collaborator

nothing some git-foo can't fix :)

I'm not sure why github does that. Anyway, if there is no conflict, there is no need to have github update that branch, and when there are conflicts, in practice I end up just resolving the locally with full editor support etc instead of on github where it is so easy to e.g. mess up formatting.

Copy link
Collaborator

@bjorn3 bjorn3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is ok, but this is a rather non-trivial function that I'm not all that familiar with.

@folkertdev folkertdev merged commit 5478f9d into trifectatechfoundation:main Jan 6, 2025
20 checks passed
@brian-pane brian-pane deleted the longest-match branch April 1, 2025 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants