Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Conversation

erikjohnston
Copy link
Member

@erikjohnston erikjohnston commented Mar 29, 2022

This is a first step in dealing with #7721.

The idea is basically that rather than calculating the full set of users a device list update needs to be sent to up front, we instead simply record the rooms the user was in at the time of the change. This will allow a few things:

  1. we can defer calculating the set of remote servers that need to be poked about the change; and
  2. during /sync and /keys/changes we can avoid also avoid calculating users who share rooms with other users, and instead just look at the rooms that have changed.

However, care needs to be taken to correctly handle server downgrades. As such this PR writes to both device_lists_changes_in_room and the device_lists_outbound_pokes table synchronously. In a future release we can then bump the database schema compat version to 69 and then we can assume that the new device_lists_changes_in_room exists and is handled.

There is a temporary option to disable writing to device_lists_outbound_pokes synchronously, allowing us to test the new code path does work (and by implication upgrading to a future release and downgrading to this one will work correctly).

Note: Ideally we'd do the calculation of room to servers on a worker (e.g. the background worker), but currently only master can write to the device_list_outbound_pokes table.

Reviewable commit-by-commit.

@erikjohnston erikjohnston marked this pull request as ready for review March 29, 2022 10:41
@erikjohnston erikjohnston requested a review from a team as a code owner March 29, 2022 10:41
@erikjohnston erikjohnston force-pushed the erikj/device_list_perf branch from 78e2c0a to 7707cab Compare March 29, 2022 12:12
@erikjohnston erikjohnston force-pushed the erikj/device_list_perf branch from 7707cab to f8af30f Compare March 29, 2022 12:13
@erikjohnston erikjohnston force-pushed the erikj/device_list_perf branch from 0af131a to 7266580 Compare March 29, 2022 13:16
@richvdh richvdh self-assigned this Apr 1, 2022
await self.db_pool.runInteraction(
"add_device_change_to_stream",
self._add_device_change_to_stream_txn,
num_stream_ids = max(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be good to add a comment clarifying why this is a max rather than a total (vis, that we use the same set of stream ids for each of the three tables)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The max got replaced, but I've added a comment the equivalent code

"stream_id": stream_id,
"room_id": room_id,
},
updatevalues={"converted_to_destinations": True},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be a silly question, but why do we do it this way, instead of keeping track of the point we've processed up to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err, we could. I don't think I had much reasoning beyond not having to worry about extra tables or handling the duplicate stream IDs.

I've also been thinking about how you'd distribute this across multiple workers, but I'm not sure this helps that or not 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmhmm. Generally I prefer to keep our tables append-only where possible (it saves a world of caching problems later on), but fair enough.

joined_users = await self.store.get_users_in_room(room_id)
hosts.update(get_domain_from_id(u) for u in joined_users)
hosts: Optional[Set[str]] = None
if self.use_new_device_lists_changes_in_room:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the wrong way round? we should skip calculating hosts if use_new_device_lists_changes_in_room?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah, yes. And this is the only place that uses it directly ─ everything else infers it from hosts being None ─ so nothing broke.

I'm not sure if there is an easy way to test that this is the right way round?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might be able to block the _handle_new_device_update_async job in the tests (eg by setting _handle_new_device_update_is_processing), and check what actually gets written to the db?

Not sure it's worth it though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this code is going to be ripped out in the next release anyway

@richvdh richvdh removed their assignment Apr 1, 2022
@erikjohnston erikjohnston requested a review from richvdh April 4, 2022 10:03
@erikjohnston
Copy link
Member Author

(The last three commits are bug fixes, rather than responding to review comments)

"stream_id": stream_id,
"room_id": room_id,
},
updatevalues={"converted_to_destinations": True},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmhmm. Generally I prefer to keep our tables append-only where possible (it saves a world of caching problems later on), but fair enough.

# No changes to notify about, so this is a no-op.
return

users_who_share_room = await self.store.get_users_who_share_room_with_user(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FederationSenderDevicesTestCases still stubs this out. Is that still necessary?

joined_users = await self.store.get_users_in_room(room_id)
hosts.update(get_domain_from_id(u) for u in joined_users)
hosts: Optional[Set[str]] = None
if self.use_new_device_lists_changes_in_room:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might be able to block the _handle_new_device_update_async job in the tests (eg by setting _handle_new_device_update_is_processing), and check what actually gets written to the db?

Not sure it's worth it though.

Comment on lines 21 to 23
-- This initialy matches `device_lists_stream.stream_id`. Note that we
-- delete older values from `device_lists_stream`, so we can't use a foreign
-- constraint here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer true as of cf04f1a, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That commit is about device_list_outbound_pokes? Rather than device_lists_stream or device_lists_changes_in_room?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. confused. sorry.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!


async def get_uncoverted_outbound_room_pokes(
self, limit: int = 10
) -> List[Tuple[str, str, str, int, Optional[Dict[str, str]]]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the return type here is incorrect: the optional dict should be Optional[str]. (Presumably we json-decode the string elsewhere when needed.)

Xref #12485 (comment)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants