Skip to content

Conversation

chakaz
Copy link
Contributor

@chakaz chakaz commented Dec 21, 2023

While at it:

  • Make PreUpdate() not decrease object size
  • Remove redundant leftover call to PreUpdate() outside DbSlice

While at it:
* Make `PreUpdate()` not decrease object size
* Remove redundant leftover call to `PreUpdate()` outside `DbSlice`
@chakaz chakaz requested a review from adiholden December 21, 2023 08:39
@adiholden
Copy link
Contributor

As this is the last PR for this change I think its worth adding a test for the memory accounting
I can think of a pytest that runs the seeder when seeder finished check the memory in info command than create a snapshot and than start the server again with loading from snapshot and check that the memory in info counted the same

@chakaz
Copy link
Contributor Author

chakaz commented Dec 21, 2023

I can think of a pytest that runs the seeder when seeder finished check the memory in info command than create a snapshot and than start the server again with loading from snapshot and check that the memory in info counted the same

Unfortunately it seems like it's not trivial to directly compare memory counters, as some values are slightly different after save/load:

>       assert memory_before == memory_after
E       AssertionError: assert equals failed
E         {                                                                                                  {                                                                                                 
E           'object_used_memory': '31515056',                                                                  'object_used_memory': '30997480',                                                               
E           'type_used_memory_HASH': '648960',                                                                 'type_used_memory_HASH': '750880',                                                              
E           'type_used_memory_JSON': '109040',                                                                 'type_used_memory_JSON': '108320',                                                              
E           'type_used_memory_LIST': '1877240',                                                                'type_used_memory_LIST': '1856400',                                                             
E           'type_used_memory_SET': '1775960',                                                                 'type_used_memory_SET': '1643720',                                                              
E           'type_used_memory_STRING': '989936',                                                               'type_used_memory_STRING': '986880',                                                            
E           'type_used_memory_ZSET': '26113920',                                                               'type_used_memory_ZSET': '25651280',                                                            
E         }                                                                                                  }

(note the scroll to the right - these are long lines)

I'll continue looking into this next week though

@chakaz
Copy link
Contributor Author

chakaz commented Dec 24, 2023

The best solution I can think of (which will keep tests robust and not flaky) is just to make sure that all of these counters are > 0 (or some other value).
I think that it's valid for counters to have different values, as memory usage may very be different depending on how certain containers are populated (order, no deletions when loading from file, etc).
Do you have a better suggestion?

@chakaz
Copy link
Contributor Author

chakaz commented Dec 24, 2023

I also thought about asserting that the values are at least 50% of the original values, I'm not sure if we'll have a failure very once in a while because of that though. wdyt?

@chakaz chakaz requested a review from adiholden December 24, 2023 11:52
@adiholden
Copy link
Contributor

checking that the values are at least %50 after reload sounds good.
How about adding a test that runs the seeder checks the values are > 0 than delete all the keys (not flush all), we will need to add a dedicated python logic for this, and than we can check that the values are 0. This way we check that all the insert / updates done in the test are updating correctly the counting

@chakaz
Copy link
Contributor Author

chakaz commented Dec 24, 2023

That's a great idea! Done.

for counter, value in memory_before.items():
assert memory_after[counter] >= 0.5 * value

# Delete all keys from all DBs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think we do this logic of delete and check not after we load from snapshot, because this way we dont check all the update commands accounting correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an even better idea!
(I added that on top of what I already had)

@chakaz chakaz requested a review from adiholden December 25, 2023 06:31
@@ -270,6 +311,13 @@ async def test_snapshot(self, df_seeder_factory, df_server):
await a_client.connection_pool.disconnect()

assert await seeder.compare(start_capture, port=df_server.port)
memory_after = await self._get_info_memory_fields(a_client)
for counter, value in memory_before.items():
assert memory_after[counter] >= 0.5 * value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment why we check > 0.5 *

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@chakaz chakaz requested a review from adiholden December 25, 2023 06:46
await self._delete_all_keys(a_client)
memory_counters = await self._get_info_memory_fields(a_client)
assert memory_counters == {"object_used_memory": 0}

@pytest.mark.asyncio
@pytest.mark.slow
async def test_snapshot(self, df_seeder_factory, df_server):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment what we check in this tests

  1. check that after reloading the snapshot file the data is the same
  2. check memory counting after loading from snapshot is similar to before creating a snapshot
  3. check memory counting after deleting all keys loaded by snapshot - this validates the memory counting when loading from snapshot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you did not push this commit as I dont see this changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, sorry!

@chakaz chakaz requested a review from adiholden December 25, 2023 06:56
@chakaz chakaz enabled auto-merge (squash) December 25, 2023 07:34
@chakaz chakaz merged commit a360b30 into main Dec 25, 2023
@chakaz chakaz deleted the post-update6 branch December 25, 2023 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants