Skip to content

Conversation

chakaz
Copy link
Contributor

@chakaz chakaz commented Dec 6, 2023

This is a draft of how we can modify DbSlice's API so that Pre and PostUpdate() are run automatically, unless explicitly needed.

At this point I'm almost done with converting Find(), which is the most common API. If this looks good, I'll proceed to finish with Find() and will then continue to other calls.

#2252

@chakaz chakaz requested a review from adiholden December 6, 2023 15:03
@dranikpg dranikpg self-requested a review December 6, 2023 15:39
src/core/dash.h Outdated
@@ -372,16 +372,15 @@ class DashTable<_Key, _Value, Policy>::Iterator {
return *this;
}

detail::IteratorPair<Key_t, Value_t> operator->() {
auto operator->() const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes here is that you return const Key_t if the operator is const?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to see 2 implementations for const and non const operator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes here is that you return const Key_t if the operator is const?

Yes, but more importantly const Value_t

I'd prefer to see 2 implementations for const and non const operator

There's a single const implementation, the non-const one (which is removed) was not in use..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if we use const operator that gives us ability to modify something it's a little incosistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's inconsistent, but it's the existing code and beyond the scope of my changes :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if we use const operator that gives us ability to modify something it's a little incosistent

I agree that it's inconsistent

Why? It really depends on how you treat the types, especially non owning or views, and what safety you want. I don't see anything wrong with it with iterators. In our case the iterator itself provides conditional mutability based on its own type

For example std::vector::iterator does the same:

vector<int> v = {1,2,3};
const auto it = v.begin();
*it = 11; // works even if `it` is "const", and cbegin() wouldn't even if mutable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on how you look at it, but I think a better example would be:

    vector<int> v = {1,2,3};
    const vector<int>& vr = v;
    auto it = vr.begin();
    *it = 1;  // Does not compile

As in, a const method returning a (const) iterator should not allow mutating data pointer to by the iterator...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but again, beyond the scope of this PR)

Copy link
Contributor

@dranikpg dranikpg Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we are already talking about the iterator function, don't we? The vector doesn't exist for us here, it's the dashtable itself, and it shouldn't allow getting a mutable iterator with a const function (i assume that's also impossible to implement?)

@romange
Copy link
Collaborator

romange commented Dec 6, 2023

a suggestion: maybe we can divide into two commits with with non-threatening changes first: dash table + FindReadOnly.
then we can review the second (smaller) commit of adding FindV2

}

return {{it, AutoPostUpdate(
{AutoPostUpdate::DestructorAction::kRun, this, cntx.db_index, it, key, true})}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that it should be always true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that means that the value exists. If it does not exist - we return before.
But it might be better to name this, as it might be confusing.

@BorysTheDev
Copy link
Contributor

I like these changes)

@chakaz
Copy link
Contributor Author

chakaz commented Dec 6, 2023

a suggestion: maybe we can divide into two commits with with non-threatening changes first: dash table + FindReadOnly. then we can review the second (smaller) commit of adding FindV2

I bet reviewing this is not nearly as fun as implementing it 🫤
This is currently a draft to see that I'm taking a direction that you think is reasonable. I guess, by your comments, that you're ok with it. As such, I'll finish converting all (possible) calls from Find() to FindReadOnly() and FindMutable(), and will then split if you'll think it'd be easier to review.

@chakaz chakaz marked this pull request as ready for review December 6, 2023 20:51
@chakaz
Copy link
Contributor Author

chakaz commented Dec 6, 2023

I converted this PR to non-draft, and am ready for review.
Roman, splitting the change to 2 makes sense, but it'll require some work. If you think it's significantly better or easier to review - I'm happy to do it (really, no problem). Otherwise, let's iterate on this as is.

src/core/dash.h Outdated
@@ -372,16 +372,15 @@ class DashTable<_Key, _Value, Policy>::Iterator {
return *this;
}

detail::IteratorPair<Key_t, Value_t> operator->() {
auto operator->() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if we use const operator that gives us ability to modify something it's a little incosistent

I agree that it's inconsistent

Why? It really depends on how you treat the types, especially non owning or views, and what safety you want. I don't see anything wrong with it with iterators. In our case the iterator itself provides conditional mutability based on its own type

For example std::vector::iterator does the same:

vector<int> v = {1,2,3};
const auto it = v.begin();
*it = 11; // works even if `it` is "const", and cbegin() wouldn't even if mutable

@@ -62,6 +62,42 @@ class DbSlice {
void operator=(const DbSlice&) = delete;

public:
class AutoPostUpdate {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoPostUpdate is confusing given it also calls PreUpdate 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is an implementation detail, not exposed to users of this class.
It could have been called even before this class was implemented.
Will you prefer another name? Maybe AutoUpdater? LMK.

Comment on lines 189 to 192
struct ItAndUpdater {
PrimeIterator it;
AutoPostUpdate post_updater;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed something, but can't we in the future just add it to the PrimeIterator itself? Or are there cases where this is undesirable? I mean the iterator can even track if it was mutably dereferenced

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very interesting proposal, that I thought about quite a lot. While I would have loved to do it, I think that it might be problematic because we copy PrimeIterator all over the place, and that might create "nested" calls such as PreUpdate(), PreUpdate(), PostUpdate(), PostUpdate().
We could maybe handle them (although PreUpdate() calls external callbacks, so it's harder to keep in mind), but perhaps the explicit nature of this PR has some advantage to it..

kRun,
};

// Wrap members in a struct to auto generate operator=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason you actually need this field is to clear it in the move constructor instead of dropping in an undefined state 🤔 Not handy, but I don't know any other solution instead of having a move-clearable flag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by a move-clearable flag?
(I don't see any issue with this implementation though..)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an interesting thought I had... So this example

struct PostUpdater {
  PostUpdater& operator=(PostUpdater&& o) = default;

  ~PostUpdater() { cout << a << " and " << uint64_t(b.get()) << endl; }

  int a;
  unique_ptr<int> b;
};

int main() {
  PostUpdater p1(1, make_unique<int>(11));
  PostUpdater p2(2, nullptr);

  p2 = std::move(p1);
}

will print

1 and 94585575722672
1 and 0

because moving an int is just copying it.... A move-clearable flag would behave like unique_ptr (but without the heap allocation). This way to don't need the nested struct, but can rely on default move constructor / operator to store "default" values in the moved-from object that can then be checked inside the destructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Well, it's simpler currently :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Well, it's simpler currently :)

Not arguing, just if we or absl had this type, it'd be one line

BorysTheDev
BorysTheDev previously approved these changes Dec 7, 2023
class AutoPostUpdate {
public:
AutoPostUpdate();
AutoPostUpdate(AutoPostUpdate&& o);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoPostUpdate(AutoPostUpdate&& o) = default does not work here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because we need to set o.action to kDoNothing, otherwise both will run...

Copy link
Contributor

@dranikpg dranikpg Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, it's literally my comment from above about move-clearable flags 🤣

@@ -1097,7 +1097,9 @@ void HSetFamily::HRandField(CmdArgList args, ConnectionContext* cntx) {
}

if (string_map->Empty()) {
db_slice.Del(db_context.db_index, *it_res);
auto it_mutable = db_slice.FindMutable(db_context, key, OBJ_HASH);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its a change in this PR bu I am confused here. This is readonly command, and it can remove key from db.. does this makes sense? we should replicate a del command here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the weirdness of our field expiration feature..
In Redis, an empty set is non-existing. Here, if we discover that all fields have expired, we remove the key. Yes, it could happen in a readonly command :(

We should do it in all commands I guess, or better yet, implement it as a similar mechanism to expiration. Otherwise, an attempt to, say, set a string to what was previously a hash key will fail, despite it being empty.

I'd say this is of low priority though (and also outside the scope here)

@@ -172,15 +169,16 @@ OpStatus UpdateEntry(const OpArgs& op_args, std::string_view key, std::string_vi
// Make sure that we don't have other internal issue with the operation
OpStatus res = verify_op(json_entry);
if (res == OpStatus::OK) {
db_slice.PostUpdate(db_index, entry_it, key);
it_res->post_updater.Run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the AutoPostUpdater runs the Run function in the destructor if we it was not canceled right? what happens, here if res != OpStatus::OK? Run will be executed in the destructor, but in the original flow it did not run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow your question. By calling post_updater.Run() you're running PostUpdate() now instead of in d'tor (it will not run in the d'tor after this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I wasn't even sure if we should explicitly Run() it here, as it would have run anyway right after AddDoc() below, but I didn't know if it's ok to run afterwards (although it probably is?)
@dranikpg can we run PostUpdate() after AddDoc()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow your question. By calling post_updater.Run() you're running PostUpdate() now instead of in d'tor (it will not run in the d'tor after this)

Adi means that you don't prevent it from running even if opstatus wasn't OK, which didn't happen before

@dranikpg can we run PostUpdate() after AddDoc()?

I assume so, because it only reads the value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adi means that you don't prevent it from running even if opstatus wasn't OK, which didn't happen before

This is a good thing though, in that it fixes a bug. If we call PreUpdate() and not call PostUpdate() we're doing wrong accounting..

@@ -274,13 +271,11 @@ OpResult<string> OpMoveSingleShard(const OpArgs& op_args, string_view src, strin
db_slice.PreUpdate(op_args.db_cntx.db_index, dest_it);
}

db_slice.PreUpdate(op_args.db_cntx.db_index, src_it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check out line 268. It could be that src_it was invalidated by the insertion of dest_it. This is bad. You will call post update on invalid it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good catch! Fixed.

@@ -274,13 +271,11 @@ OpResult<string> OpMoveSingleShard(const OpArgs& op_args, string_view src, strin
db_slice.PreUpdate(op_args.db_cntx.db_index, dest_it);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So incase dest_it->second.ObjType() != OBJ_LIST you just call the source pre and post update though nothing changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think that's a small price to pay.

@@ -265,7 +262,9 @@ OpResult<string> OpMoveSingleShard(const OpArgs& op_args, string_view src, strin
dest_it->second.ImportRObj(obj);

// Insertion of dest could invalidate src_it. Find it again.
src_it = db_slice.GetTables(op_args.db_cntx.db_index).first->Find(src);
src_res = db_slice.FindMutable(op_args.db_cntx, src, OBJ_LIST);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on this copy you will run the original src_res post update which might hold invalid iterator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're very much right!
I added a DCHECK() to make sure that the DB is not changed (based on its size).
I think that this is another point for this PR, as these kinds of checks help reduce bugs.

If you can think of other cases we should DCHECK() please let me know!

adiholden
adiholden previously approved these changes Dec 10, 2023
adiholden
adiholden previously approved these changes Dec 10, 2023
@chakaz chakaz merged commit 1ce3f98 into main Dec 11, 2023
@chakaz chakaz deleted the post-update branch December 11, 2023 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants