Skip to content

moka's benchmarking issues #1

@tatsuya6502

Description

@tatsuya6502

Hi. First of all, thank you for trying out moka in early days in Apollo Router. I wish you all the best with deduplicate!

I was quite surprised with the moka results. The performance didn't seem to improve when I varied the size of the cache.
...
It's also possible that I've not written my benchmarking code correctly, but I've tried to be consistent between the two crates, so please let me know if you spot any errors I can correct.

Actually, I spotted an error in your benchmarking code. I will create a pull request with the fix and some other enhancements.

In the PR, I did the followings:

  1. In moka benchmark, avoid calling get_text() in every iterations.
    • For moka, get_text() is called in every iterations even though the key already exists.
    • This is the error causing moka's performance was not improved when the cache size was increased.
    • Moving the line let words = get_text(); into the async block fixes the error.
      • After the fix, moka's performance became competitive to deduplicate's performance.
  2. Call moka's cache.sync() in every iterations to process pending eviction.
    • Without doing so, moka will only do batch eviction of entries.
      • This means moka will hold more entries than the specified max capacity, which will improve the hit ratio.
    • This will cause the benchmark to be unfair to deduplicate. So pending evictions in moka should be processed in every iterations.
  3. Show the number of get calls in both benchmarks.
    • This is to check if the benchmark is fair.
      • Since caches are stateful, differences in the number of get calls will cause skews in the hit ratio.
      • For this benchmark, hit ratio will be the most important metric, because the time spent on cache misses is much greater than the time spent on cache hits.
    • To be a fair benchmark, the number of get calls should be the same for both caches.
      • However, Criterion will run the benchmark multiple times to get a stable result. So the number of get calls can be different for each run.
    • I have not found a way to address this issue.
  4. Show the hit ratio (hit_count / get_count) in moka benchmark.
    • This requires moka v0.10.0 or later. So I updated the dependency to moka v0.10.0.
    • I could not found a way to track the hit ratio in deduplicate benchmark.
  5. Add more cache sizes 64, 128 and 256 to reduce the hit ratio.
    • moka's hit ratio do not change much when the cache size is greater than 1024 (hit ratio is ~65% in my env).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions