
Conversation

@chakaz (Contributor) commented Jan 17, 2024

Before this change, OOM in shard callbacks could have led to data inconsistency between the master and the replica; for example, commands that mutate data on one shard but fail on another, like `LMOVE`.

After this change, callbacks that result in an OOM correctly replicate their work (none, partial, or complete) to replicas.

Note that `MSET` and `MSETNX` required special handling: they are the only commands that can _create_ multiple keys, so some of the keys they touch may be created while others fail.

Fixes #2381
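To make the failure mode concrete, here is a small self-contained simulation (plain Python, not Dragonfly code; all names are illustrative) of how a two-shard `LMOVE` that OOMs on the destination shard used to leave the replica out of sync when the failed transaction journaled nothing:

```python
# Toy model of the pre-fix LMOVE inconsistency. Nothing here is Dragonfly
# code; it only illustrates why a failed multi-shard command must still
# journal the mutations that did happen.
from collections import deque

master = {"src": deque(["a", "b"]), "dst": deque()}
replica = {"src": deque(["a", "b"]), "dst": deque()}
journal = []  # entries streamed from master to replica

def lmove_with_dst_oom(db):
    val = db["src"].pop()  # shard 1: succeeds and mutates the master
    # shard 2: simulated OOM, the push to dst never happens
    return val, "OUT_OF_MEMORY"

val, status = lmove_with_dst_oom(master)
if status == "OK":  # pre-fix behavior: a failed tx journaled nothing at all
    journal.append(("push_to_dst", val))

for _entry in journal:  # replica replays an empty journal
    pass

print(master["src"])   # deque(['a'])      -- master lost 'b' from src
print(replica["src"])  # deque(['a', 'b']) -- replica kept it: divergence
```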
@chakaz requested a review from adiholden, January 17, 2024 06:59
if (result.status != OpStatus::OK) {
  // We log NOOP even for NO_AUTOJOURNAL commands because the non-success status could have been
  // due to OOM in a single shard, while other shards succeeded.
  journal->RecordEntry(txid_, journal::Op::NOOP, db_index_, unique_shard_cnt_,
@adiholden (Contributor) commented:
Check out JournalStreamer::Start.
It does not replicate NOOP; instead, today we use it to trigger journal writes to the sink, so this will not reach the replica. Did you run your tests with the flag enable_multi_shard_sync set to true? I think this will fail.

@chakaz (Author) commented:

Nice catch.
How about I send a `PING`, but mark the opcode as `COMMAND` instead? Do you see any problem with that?
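For readers following along, a toy model (plain Python, not `JournalStreamer`'s actual code) of the behavior being discussed: NOOP entries only trigger a local write to the sink and are never forwarded, so a keep-alive must carry the `COMMAND` opcode to reach the replica:

```python
# Illustrative sketch only; opcode names mirror the discussion above,
# everything else is made up for the example.
NOOP, COMMAND = "NOOP", "COMMAND"

def stream_to_replica(journal_entries):
    sent = []
    for opcode, payload in journal_entries:
        if opcode == NOOP:
            continue  # consumed locally as a flush trigger, never forwarded
        sent.append(payload)
    return sent

assert stream_to_replica([(NOOP, None)]) == []             # NOOP is dropped
assert stream_to_replica([(COMMAND, "PING")]) == ["PING"]  # PING gets through
```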

@chakaz (Author) commented:

BTW, it does pass with `--df enable_multi_shard_sync=true`.

@adiholden (Contributor) commented:

I think PING is OK.
Oh, I see that the test is using proactor_threads=1, so there is only one shard.
Let's change the test to run on 2 shards. You will need to increase the max memory size as well.
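A sketch of what the reconfigured test setup might look like (the test name, parameter names, and values here are assumptions based on the discussion, not the final test code):

```python
import pytest

# Sketch only: run the scenario on 2 shards with a larger memory limit so
# that one shard can OOM while the other succeeds. Exact flags are assumed.
@pytest.mark.asyncio
async def test_oom_replication_two_shards(df_local_factory):
    master = df_local_factory.create(proactor_threads=2, maxmemory="512mb")
    replica = df_local_factory.create(proactor_threads=2)
    # ... start both instances, fill the master near its limit, run a
    # multi-key command that OOMs on one shard, then compare master/replica.
    ...
```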

@adiholden (Contributor) commented:

I still see the test with proactor_threads=1. Can you change it? See my comment above.

@chakaz (Author) commented:

Sorry, I forgot about it :(
Done now.



@pytest.mark.asyncio
async def test_policy_based_eviction_propagation(df_local_factory, df_seeder_factory):
@adiholden (Contributor) commented:

We already have this test above; I asked Yue on his PR to add it and mark it as skipped. You can remove his test.
Note that his test passes unsupported_types to the seeder because of another bug, which Kostas has since fixed.

@chakaz (Author) commented:

It messes up the diff, but the change is really only adding the test below.

  // journal [0, i)
  payload = make_pair("MSET", ArgSlice(&args[0], i));
}
tx->LogJournalOnShard(op_args.shard, std::move(payload), tx->GetUniqueShardCnt(), false, false);
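In other words, the shard callback journals only the prefix of `MSET` pairs that was actually written before the OOM. A toy model of that idea (plain Python, illustrative names only):

```python
# Toy model: apply MSET pairs until a simulated OOM, then journal only the
# successful prefix [0, i), mirroring what the ArgSlice above expresses.
def mset_with_oom(db, pairs, key_budget):
    applied = 0
    for key, value in pairs:
        if len(db) >= key_budget:  # simulated out-of-memory condition
            break
        db[key] = value
        applied += 1
    return pairs[:applied]  # the entries worth replicating

db = {}
journaled = mset_with_oom(db, [("k1", "v1"), ("k2", "v2"), ("k3", "v3")], 2)
print(journaled)  # [('k1', 'v1'), ('k2', 'v2')] -- only what the master applied
```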
@adiholden (Contributor) commented:

I see that you still don't use the RecordJournal function. Why?

@chakaz (Author) commented:

I left this for later and did it now :)

@chakaz requested a review from adiholden, January 18, 2024 08:24