Skip to content

Conversation

romange
Copy link
Collaborator

@romange romange commented Apr 9, 2025

This should improve situation around #4787 -
maybe not solve it completely but improve it significantly.

On my tests when doing snapshotting under read traffic with master (memtier_benchmark --ratio 0:1 -d 256 --test-time=400 --distinct-client-seed --key-maximum=2000000 -c 5 -t 2 --pipeline=3) I got drop from 250K qps to 8K qps during the full sync phase.

With this PR, the throughput went up to 45K qps - 5x more but still much lower than 250K qps.

@romange romange requested review from Copilot and kostasrim April 9, 2025 12:56
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Files not reviewed (1)
  • helio: Language not supported
Comments suppressed due to low confidence (2)

src/server/snapshot.cc:176

  • Removing the conditional PushSerialized calls and related yield logic may lead to less frequent flushing of the serialization buffer, potentially impacting memory usage and server responsiveness under heavy load; please verify that this change maintains the intended performance characteristics.
ThisFiber::Yield();

src/server/snapshot.cc:427

  • Ignoring the return value in the OnJournalEntry method might suppress error conditions that could be critical; ensure that this behavior is safe for all execution scenarios.
std::ignore = serializer_->WriteJournalEntry(item.data);

@romange romange force-pushed the TryBackground branch 3 times, most recently from 063cc01 to 271ec8d Compare April 9, 2025 13:10
kostasrim
kostasrim previously approved these changes Apr 9, 2025
@adiholden
Copy link
Contributor

why do you pull helio in this PR? lets make it in a seprate PR

@adiholden adiholden self-requested a review April 9, 2025 14:05
@adiholden
Copy link
Contributor

please dont merge yet, wait for my review

@romange
Copy link
Collaborator Author

romange commented Apr 9, 2025

I pulled helio because of cycle timer that I added recently for this.

adiholden
adiholden previously approved these changes Apr 10, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 6 out of 7 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • helio: Language not supported
Comments suppressed due to low confidence (1)

src/server/rdb_save.cc:913

  • Returning an empty buffer when mem_buf_.InputLen() is zero may lead to unexpected behavior if downstream code expects the original mem_buf_.InputBuffer() output. Consider reviewing the downstream usage and either preserving the original behavior or ensuring that returning an empty buffer is handled appropriately.
return {};

This should improve situation around #4787 -
maybe not solve it completely but improve it significantly.

On my tests when doing snapshotting under read traffic with master
(memtier_benchmark --ratio 0:1 -d 256  --test-time=400  --distinct-client-seed --key-maximum=2000000 -c 5 -t 2 --pipeline=3)
I got drop from 250K qps to 8K qps during the full sync phase.

With this PR, the throughput went up to 70-80K qps.

Signed-off-by: Roman Gershman <[email protected]>
@romange romange force-pushed the TryBackground branch 2 times, most recently from 2b69677 to caabc1d Compare April 11, 2025 16:45
Signed-off-by: Roman Gershman <[email protected]>
@romange
Copy link
Collaborator Author

romange commented Apr 13, 2025

@adiholden PTAL

@romange romange merged commit eb80d57 into main Apr 16, 2025
10 checks passed
@romange romange deleted the TryBackground branch April 16, 2025 04:48
kostasrim pushed a commit that referenced this pull request Apr 22, 2025
* chore: Make snapshotting more responsive

This should improve situation around #4787 -
maybe not solve it completely but improve it significantly.

On my tests when doing snapshotting under read traffic with master
(memtier_benchmark --ratio 0:1 -d 256  --test-time=400  --distinct-client-seed --key-maximum=2000000 -c 5 -t 2 --pipeline=3)
I got drop from 250K qps to 8K qps during the full sync phase.

With this PR, the throughput went up to 70-80K qps.

---------

Signed-off-by: Roman Gershman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants