Open
Changes from 5 commits
Commits
Show all changes
54 commits
8cc34df
Simplified sliding sync
erikjohnston Aug 30, 2024
456db3b
Add example usage
erikjohnston Aug 30, 2024
6fe4ba7
Add security and unstable prefix
erikjohnston Aug 30, 2024
a5dc74b
Update proposals/4186-simplified-sliding-sync.md
erikjohnston Jan 8, 2025
4c47844
Some clarifications
erikjohnston Jan 8, 2025
3fc851a
Rewrite MSC to include full API description
erikjohnston Sep 9, 2025
a1bd6bf
Use full URLs
erikjohnston Sep 9, 2025
839eb35
Clarify $LAZY
erikjohnston Sep 9, 2025
a9141ab
Note that bump stamps can decrease
erikjohnston Sep 9, 2025
dbf593a
Add an open question for pre-flight CORS
erikjohnston Sep 9, 2025
a325def
Clarifications
erikjohnston Sep 9, 2025
992007a
Add support for `set_presence`
erikjohnston Sep 11, 2025
165aaff
Rename invite_state to stripped_state
erikjohnston Sep 11, 2025
495963f
Handle state deletion
erikjohnston Sep 11, 2025
c6cd26e
Rename unstable_expanded_timeline
erikjohnston Sep 11, 2025
486efe2
Add new lists field
erikjohnston Sep 11, 2025
97feb73
Add `membership` field in room response
erikjohnston Sep 11, 2025
1786eeb
Move unread threads to thread extension
erikjohnston Sep 11, 2025
d156cb9
Change the required_state request format
erikjohnston Sep 12, 2025
de0fe55
Move URL params to request body
erikjohnston Sep 12, 2025
b7b363e
Update proposals/4186-simplified-sliding-sync.md
erikjohnston Sep 22, 2025
2d2890d
Apply suggestions from code review
erikjohnston Sep 24, 2025
b405735
0-indexed
erikjohnston Sep 24, 2025
888e070
s/RoomFilter/SlidingRoomFilter/
erikjohnston Sep 24, 2025
d6d5edd
Make RequiredStateRequest.include optional
erikjohnston Sep 24, 2025
5b5e82b
Link to lazy-loaded membership
erikjohnston Sep 24, 2025
a2d3684
Link to tagging
erikjohnston Sep 24, 2025
4a38bb9
is_dm being omitted implies not a DM room
erikjohnston Sep 24, 2025
1465e25
Define 'just occured'
erikjohnston Sep 24, 2025
0fbd845
Note about resource consumption
erikjohnston Sep 24, 2025
ce6f0f3
Note about resource consumption
erikjohnston Sep 24, 2025
985e71f
Add note that servers can expire sync connections if response is too …
erikjohnston Sep 24, 2025
492db05
Remove ability to peek via room subscriptions
erikjohnston Sep 25, 2025
1058ad7
Change 'ranges' to 'range'
erikjohnston Sep 25, 2025
bade6bb
Move room configs section for clarity
erikjohnston Sep 25, 2025
474e681
Allow 'room_name_like' to be implementation dependent
erikjohnston Sep 25, 2025
6e96ddb
Clarify notifications are unthreaded
erikjohnston Sep 25, 2025
2db8940
Update proposals/4186-simplified-sliding-sync.md
erikjohnston Sep 25, 2025
cc759db
Remove misleading sentence
erikjohnston Sep 25, 2025
98307d1
Update proposals/4186-simplified-sliding-sync.md
erikjohnston Sep 25, 2025
ca38ae8
Apply suggestions from code review
erikjohnston Sep 25, 2025
e5b047a
Explain why we don't include more fields in invited rooms
erikjohnston Sep 25, 2025
a48eb1b
Remove 'room_name_like' as not implemented or used
erikjohnston Sep 25, 2025
25479be
Review comment
erikjohnston Sep 25, 2025
ac356a7
Point to existing sync v2 heroes spec
erikjohnston Sep 25, 2025
31d18c1
Reword heroes section
erikjohnston Sep 25, 2025
bbc82d2
Note in lazy loading section state resolution
erikjohnston Sep 25, 2025
49bbe41
Note why caching members is useful
erikjohnston Sep 25, 2025
c4bf57d
Mention `expanded_timeline`
erikjohnston Sep 25, 2025
74e8e76
Make the 'lists' request field sticky
erikjohnston Sep 26, 2025
2cf4783
Clear up situation with rejected invites
erikjohnston Sep 29, 2025
f5f6f83
Add note about 'last activity' being server dependent
erikjohnston Sep 29, 2025
2e8be0a
Remove sticky lists
erikjohnston Oct 1, 2025
ef197a4
Add double initial sync as an alternative
erikjohnston Oct 3, 2025
392 changes: 392 additions & 0 deletions proposals/4186-simplified-sliding-sync.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation requirements:

  • Client (ideally multiple)
  • Server

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current implementations to my knowledge:

  • Client implementation: matrix-rust-sdk (PR)
  • Server implementation: has been implemented in Synapse across 10s of PRs. (Perhaps @MadLittleMods, the implementer, can give better link(s) here.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conduwuit has implemented simplified sliding sync in https://github.com/girlbossceo/conduwuit/pull/666

Original file line number Diff line number Diff line change
@@ -0,0 +1,392 @@
# MSC4186: Simplified Sliding Sync

The current `/sync` endpoint scales badly as the number of rooms on an account increases. It scales badly because all
rooms are returned to the client, incremental syncs are unbounded and slow down based on how long the user has been
offline, and clients cannot opt-out of a large amount of extraneous data such as receipts. On large accounts with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't data be removed with filters? (See RoomFilter)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, I believe you can opt-out of receipts with RoomFilter.ephemeral.not_types = [ "m.read.*" ] or similar. When I was working on filtering support in grapevine, we had a fast path for filters like this that would skip reading any of the receipt data from the db server-side. That work hasn't been finished yet, and I have no idea if synapse has a similar fast-path.

One of the difficulties with v3 sync filtering in general is that performance can vary wildly depending on server implementation details. The spec doesn't indicate which fast-paths should exist, and whether or not you hit a fast-path can also affect how many events are returned in the sync window (matrix-org/matrix-spec#1887).

thousands of rooms, the initial sync operation can take tens of minutes to perform. This significantly delays the
initial login to Matrix clients, and also makes incremental sync very heavy when resuming after any significant pause in
usage.

Note: this is a “simplified” version of the sliding sync API proposed in
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575), pared back based on real-world use cases
and usage.


# Goals

This improved `/sync` mechanism has a number of goals:

- Sync time should be independent of the number of rooms you are in.
- Time from opening of the app (when already logged in) to confident usability should be as low as possible.
- Time from login on existing accounts to usability should be as low as possible.
- Bandwidth should be minimized.
- Support lazy-loading of things like read receipts (and avoid sending unnecessary data to the client)
- Support informing the client when room state changes from under it, due to state resolution.
- Clients should be able to work correctly without ever syncing in the full set of rooms they’re in.
- Don’t incremental sync rooms you don’t care about.
- Servers should not need to store all past since tokens. If a since token has been discarded we should gracefully
degrade to initial sync.

These goals shaped the design of this proposal.


# Proposal

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does SSS interact with the ignored user list? In /v3/sync, the server omits events sent by ignored users, requiring the client to perform a second initial sync if they want to retroactively see messages when unignoring. The choice to do server-side ignored user filtering in sync also doesn't really simplify client implementation, because you still need client-side filtering for events received before the user was ignored.

My preference would be to drop this requirement entirely for SSS, now that we have a chance to do it in a backwards-compatible way.


The core differences between sync v2 and simplified sliding sync are:

- The server initially only sends the most recent N rooms to the client (where N is specified by the client), which then
can paginate in older rooms in subsequent requests
- The client can configure which information the server will return for different sets of rooms (e.g. a smaller timeline
limit for older rooms).
- The client can filter what rooms it is interested in
- The client can maintain multiple sync loops (with some caveats)
  - This is useful for e.g. iOS clients which have a separate process to deal with notifications, as well as allowing
    the app to split handling of things like encryption entirely from room data.

The basic operation is similar between sync v2 and simplified sliding sync: both use long-polling with tokens to fetch
updates from the server. I.e., the basic operation of both APIs is to do an “initial” request and then repeatedly call
the API supplying the token returned in the previous response in the subsequent “incremental” request.
Copy link

@dead-claudia dead-claudia May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using server-sent events to expose this instead?

  • An option (like "persist": false or similar) could be accepted to tell the server to only send the initial sync, for the few clients who don't want to watch for more events.
  • Clients don't need multiple sync loops, just multiple persistent connections they're watching.
  • Event IDs and restarting is built into the protocol. You don't need to design for it yourself. And you also don't need to store the event IDs any longer than it takes to send them.
  • The streaming nature of the protocol makes it naturally adaptive to client bandwidth. If necessary, the server can handle database pagination.
  • Response compression can drive down bandwidth significantly for messages, provided you flush the stream after sending each batch of messages.
  • Fewer requests means much less compute load on the server. And HTTP state can be discarded after receiving the full request. This lets it support more connections. And given filters can get quite large, this is significant.
  • Database requests need to occur only for initial sync. Once the initial sync occurs, all subsequent events can just be directly broadcast via message passing, reducing database load and especially cache load.
  • High-latency networks like geosync satellite (500ms-800ms one-way) would see message receipt latency cut in half by not having to send a request to receive subsequent events. Note that geosync satellite physically can't get below about 500ms of latency due to the speed of light.

You can still use POST for the request, since while native browser APIs don't support it, userland libraries do.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to follow up with this in a later proposal, to not block this MSC (as it's already de-facto standard).



## Lists and room subscriptions

The core component of a sliding sync request is “lists”, which specify what information to return about which rooms.
Each list specifies some filters on rooms (e.g. ignore spaces), the range of filtered rooms to select (e.g. the most
recent 20 filtered rooms), and the config for the data to return for those rooms (e.g. the required state, timeline
limit, etc). Rooms are always ordered by when the server received the most recent event for the room.

The client can also specify config for specific rooms if it has their room ID; these are known as room subscriptions.

Multiple lists and subscriptions can be specified in a request. If a room matches multiple lists/subscriptions then the
config is “combined” to be the superset of all configs (e.g. take the maximum timeline limit). See below for the exact
algorithm.
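As a rough illustration of the "superset" idea, the sketch below combines configs by taking the maximum `timeline_limit` and the union of `required_state` entries. The `RoomConfig` type and `combine_configs` helper are hypothetical names, and the sketch ignores any special handling of wildcard entries; it is not the exact algorithm.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoomConfig:
    # Hypothetical container for the per-room config of a list or subscription.
    timeline_limit: int
    # Set of (event_type, state_key) tuples.
    required_state: frozenset = field(default_factory=frozenset)


def combine_configs(configs):
    """Combine configs into a superset: largest timeline limit, union of required state."""
    configs = list(configs)
    if not configs:
        raise ValueError("at least one config is required")
    return RoomConfig(
        timeline_limit=max(c.timeline_limit for c in configs),
        required_state=frozenset().union(*(c.required_state for c in configs)),
    )


# A room matching both a list (small limit) and a subscription (larger limit).
list_config = RoomConfig(1, frozenset({("m.room.name", "")}))
subscription_config = RoomConfig(20, frozenset({("m.room.topic", "")}))
combined = combine_configs([list_config, subscription_config])
```

Here `combined` ends up with a `timeline_limit` of 20 and both state entries.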

The server tracks what data has been sent to the client in which rooms. If a room matches a list or subscription that
hasn’t been sent down before, then the server will respond with the full metadata about the room indicated by
`initial: true`. If a room stops matching a list (i.e. it falls out of range) then no further updates will be sent until it starts
matching a list again, at which point the missing updates (limited by the `timeline_limit`) will be sent down. However,
as clients are now expected to paginate all rooms in the room list in the background (in order to correctly order and
search them), the act of a room falling out of range is a temporary edge-case.


## Pagination

Pagination is achieved by the client increasing the ranges of one (or more) lists.

For example an initial request might have a list called `all_rooms` specifying a range of `0..20` in the initial
request, and the server will respond with the top 20 rooms (by most recently updated). On the second request the client
may change the range to `0..100`, at which point the server will respond with the top 100 rooms that either a) weren’t
sent down in the first request, or b) have updates since the first request.
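The selection described above can be sketched as follows. This is only an illustration of the idea, assuming the server keeps a per-connection set of rooms it has already sent; `rooms_to_send` and its parameters are hypothetical names, and the range is modelled simply as "the top N rooms".

```python
def rooms_to_send(ordered_room_ids, window_size, already_sent, updated):
    """Rooms in the top `window_size` that are new to this connection or have updates."""
    window = ordered_room_ids[:window_size]
    return [
        room_id
        for room_id in window
        if room_id not in already_sent or room_id in updated
    ]


# 150 rooms, already ordered by most recent activity.
all_rooms = [f"!r{i}:example.com" for i in range(150)]

# First request: top 20 rooms, nothing sent yet.
first = rooms_to_send(all_rooms, 20, already_sent=set(), updated=set())

# Second request: top 100 rooms; one of the first 20 has had an update since.
second = rooms_to_send(
    all_rooms, 100, already_sent=set(first), updated={"!r3:example.com"}
)
```

In this example the second response contains the 80 rooms not sent before plus the one room with updates.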

Clients can increase and decrease the ranges as they see fit. A common approach would be to start with a small window
and grow that until the range covers all the rooms. After some threshold of the app being offline it may reduce the
range back down and incrementally grow it again. This allows for ensuring that a limited amount of data is requested at
once, to improve response times.


## Connections

Clients can have multiple “connections” (i.e. sync loops) with the server, so long as each connection has a different
`conn_id` set in the request.

Clients must only have a single request in-flight at any time per connection (clients can have multiple connections by
specifying a unique `conn_id`). If a client needs to send another request before receiving a response to an in-flight
request (e.g. for retries or to change parameters) the client *must* cancel the in-flight request (at the HTTP level)
and *not* process any response it receives for it.

In particular, a client must use the returned `pos` value in a response as the `since` param in exactly one request that
the client will process the response for. Clients must be careful to ensure that when processing a response any new
requests use the new `pos`, and any in-flight requests using an old `pos` are canceled.

The server cannot assume that a client has received a response until it receives a new request with the `since` token
set to the `pos` it returned in the response. The server must ensure that any per-connection state it tracks correctly
handles receiving multiple requests with the same `since` token (e.g. the client retries the request or decides to
cancel and resend a request with different parameters).
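A minimal client-side sketch of these rules is shown below, assuming the HTTP layer is handled elsewhere. The `SyncConnection` class and its method names are hypothetical; the point is that only the response to the most recently started request is processed, and `pos` only advances when it is.

```python
class SyncConnection:
    """Tracks `pos` and the single in-flight request for one connection (sketch)."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.pos = None         # no token yet, so the next request is an initial one
        self._in_flight = None  # id of the only request whose response will be processed
        self._next_id = 0

    def start_request(self):
        """Start a request; any earlier in-flight request is considered cancelled."""
        self._next_id += 1
        self._in_flight = self._next_id
        # The caller must also cancel the earlier request at the HTTP level.
        return self._in_flight, self.pos

    def handle_response(self, request_id, response):
        """Process a response only if it belongs to the current in-flight request."""
        if request_id != self._in_flight:
            return False  # response to a cancelled request: must not be processed
        self.pos = response["pos"]
        self._in_flight = None
        return True


conn = SyncConnection("main")
first_id, first_since = conn.start_request()
# The client changes parameters before the first response arrives.
second_id, second_since = conn.start_request()
stale_processed = conn.handle_response(first_id, {"pos": "a"})
fresh_processed = conn.handle_response(second_id, {"pos": "b"})
```

Both requests carry the same `since` value, which is the case the server is required to handle.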

A server may decide to “expire” connections, either to free resources or because the server thinks it would be faster
for the client to start from scratch (e.g. because there are many updates to send down). This is done by responding with
a 400 HTTP status and an error code of `M_UNKNOWN_POS`.
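One way a client might react to an expired connection is sketched below. The `next_since` helper is a hypothetical name; `errcode` is the standard Matrix error field, while retrying other errors with the same token is an assumption of this sketch rather than something this proposal specifies.

```python
def next_since(status_code, body, current_pos):
    """Choose the `since` token for the next request based on the last response."""
    if status_code == 400 and body.get("errcode") == "M_UNKNOWN_POS":
        # The connection was expired: start again with an initial request.
        return None
    if status_code == 200:
        return body["pos"]
    # Assumption: any other failure is retried with the token we already had.
    return current_pos


after_success = next_since(200, {"pos": "s2"}, "s1")
after_expiry = next_since(400, {"errcode": "M_UNKNOWN_POS"}, "s2")
after_other_error = next_since(500, {}, "s2")
```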


## List configuration

**TODO**, these are the same as in [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575):

- Required state format
- The filters
- Lazy loading of members
- Combining room config


## Room config changes

When a room comes in and out of different lists or subscriptions, the effective `timeline_limit` and `required_state`
parameters may change. This section outlines how the server should handle these cases.

If the `timeline_limit` *increases* then the server *may* choose to send down more historic data. This is to support the
ability to get more history for certain rooms, e.g. when subscribing to the currently visible rooms in the list to
precache their history. This is done by setting `unstable_expanded_timeline` to true and sending down the last N events
(this may include events that have already been sent down). The server may choose not to do this if it believes it has
already sent down the appropriate number of events.

If new entries are added to `required_state` then the server must send down matching current state events.
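The two rules above could be sketched on the server side roughly as follows. `plan_room_update` is a hypothetical helper, and using a count of events already sent to decide whether to expand the timeline is an assumption; servers are free to make that decision differently.

```python
def plan_room_update(old_config, new_config, events_already_sent):
    """Decide what extra data to send when a room's effective config changes.

    Configs are dicts with `timeline_limit` (int) and `required_state`
    (a set of (event_type, state_key) tuples).
    """
    limit_increased = new_config["timeline_limit"] > old_config["timeline_limit"]
    # Assumption: skip the expansion if enough events were already sent.
    expand = limit_increased and events_already_sent < new_config["timeline_limit"]
    added_state = new_config["required_state"] - old_config["required_state"]
    return {
        "unstable_expanded_timeline": expand,
        "new_required_state": added_state,
    }


old = {"timeline_limit": 1, "required_state": {("m.room.name", "")}}
new = {
    "timeline_limit": 20,
    "required_state": {("m.room.name", ""), ("m.room.topic", "")},
}
plan = plan_room_update(old, new, events_already_sent=1)
```

Here the plan asks for an expanded timeline and for the current `m.room.topic` state event to be sent down.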


## Extensions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this MSC doesn't have any support for timeline filtering. The SS MSC called this out explicitly, and it would probably be good to do that here as well. Timeline filtering on /v3/sync is used heavily by some bot and tool clients.


We anticipate that as more features land in Matrix, different kinds of data will also want to be synced to clients. Sync
v2 did not have any first-class support to opt-in to new data. Sliding Sync does have support for this via "extensions".
Extensions also allow this proposal to be broken up into more manageable sections. Extensions are requested by the
client in a dedicated extensions block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how should a client be aware of whether a given extension is supported by the server or not?

It seems Synapse currently omits extensions in the response object, if there is no new data for that extension.

But this makes it impossible for a client to distinguish between no data and an unsupported extension.


In an effort to reduce the size of this proposal, extensions will be done in separate MSCs. There will be extensions
for:

- To Device Messaging \- MSC3885
- End-to-End Encryption \- MSC3884
- Typing Notifications \- MSC3961
- Receipts \- MSC3960
- Presence \- presence in sync v2: spec
- Account Data \- account\_data in sync v2: MSC3959
- Threads

**TODO** explain how these interact with the room lists, this is the same as in
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this is inlined here, I'd like to remove the concept of "*" for lists/rooms in extension requests. The behaviour of MSC3575 ["*"] would be achieved by not specifying the parameters at all, and for rooms, there can be an additional boolean field like include_global_room_subscriptions to get the behaviour of MSC3575 ["*", "!foo:bar.baz"].


## Request format

```javascript
{
    "conn_id": "<conn_id>", // Client chosen ID of the connection, c.f. "Connections"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSC3575 limits the conn_id to 16 chars. Does that restriction still apply? If the omission was intentional it should be noted. As a preference I favor the limitation, or at least some limitation, or simply using an integer for this.


    // The set of room lists
    "lists": {
        // An arbitrary string which the client is using to refer to this list for this connection.
        "<list-name>": {

            // Sliding window ranges, c.f. Lists and room subscriptions
            "ranges": [[0, 10]],

            // Filters to apply to the list.
            "filters": {
                // Flag which only returns rooms present (or not) in the DM section of account data.
                // If unset, both DM rooms and non-DM rooms are returned. If false, only non-DM rooms
                // are returned. If true, only DM rooms are returned.
                "is_dm": true|false|null,

                // Flag which only returns rooms which have an `m.room.encryption` state event. If unset,
                // both encrypted and unencrypted rooms are returned. If false, only unencrypted rooms
                // are returned. If true, only encrypted rooms are returned.
                "is_encrypted": true|false|null,

                // Flag which only returns rooms the user is currently invited to. If unset, both invited
                // and joined rooms are returned. If false, no invited rooms are returned. If true, only
                // invited rooms are returned.
                "is_invite": true|false|null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will knocks and left rooms be handled? Maybe this should be a string and be the current membership of the user?


                // If specified, only rooms where the `m.room.create` event has a `type` matching one
                // of the strings in this array will be returned. If this field is unset, all rooms are
                // returned regardless of type. This can be used to get the initial set of spaces for an account.
                // For rooms which do not have a room type, use 'null' to include them.
                "room_types": [ ... ],

                // Same as "room_types" but inverted. This can be used to filter out spaces from the room list.
                // If a type is in both room_types and not_room_types, then not_room_types wins and they are
                // not included in the result.
                "not_room_types": [ ... ],
            },

            // The maximum number of timeline events to return per response.
            "timeline_limit": 10,

            // Required state for each room returned. An array of event type and state key tuples.
            // Elements in this array are ORd together to produce the final set of state events
            // to return. One unique exception is when you request all state events via ["*", "*"]. When used,
            // all state events are returned by default, and additional entries FILTER OUT the returned set
            // of state events. These additional entries cannot use '*' themselves.
            // For example, ["*", "*"], ["m.room.member", "@alice:example.com"] will _exclude_ every m.room.member
            // event _except_ for @alice:example.com, and include every other state event.
            // In addition, ["*", "*"], ["m.space.child", "*"] is an error, the m.space.child filter is not
            // required as it would have been returned anyway.
            "required_state": [ ... ],
        }
    },

    // The set of room subscriptions
    "room_subscriptions": {
        // The key is the room to subscribe to.
        "!foo:example.com": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-adding this discussion from the other MSC3575:

What should happen if you try to subscribe to a room you are not part of (or not allowed)?

The easy option would be to ignore it but then clients have no feedback about what they did wrong.

Perhaps entries in the rooms map could also have an errcode/error field. We could also just blow up the whole request with an error and point out which room_id is wrong although that isn't very machine-readable.


Can you subscribe to a public world_readable room that you're not part of?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related language added to the MSC:

The server MUST ensure that user has permission to see any information the server returns. However, the user need not be
in the room, e.g. clients can specify room IDs for world-readable rooms and they would be returned.

This doesn't completely address the other questions though.

Copy link
Contributor

@MadLittleMods MadLittleMods Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The world_readable part of the MSC was just removed as well

Everything needs clarification again.

            // These have the same meaning as in `lists` section
            "timeline_limit": 10,
            "required_state": [ ... ],
        }
    },

    // c.f. "Extensions"
    "extensions": {
    }
}
```
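The `required_state` matching rules described in the comments above can be sketched as below. `wants_state_event` is a hypothetical helper, and allowing `*` as a wildcard in either position of an ordinary entry is an assumption read from the examples rather than something stated explicitly.

```python
WILDCARD = "*"


def wants_state_event(required_state, event_type, state_key):
    """Return True if a state event should be returned for this `required_state`."""
    entries = [tuple(entry) for entry in required_state]

    if (WILDCARD, WILDCARD) in entries:
        # All state is returned by default; the other entries filter it down.
        extras = [e for e in entries if e != (WILDCARD, WILDCARD)]
        for etype, skey in extras:
            if WILDCARD in (etype, skey):
                raise ValueError('entries alongside ["*", "*"] cannot use "*"')
        filtered_types = {etype for etype, _ in extras}
        if event_type not in filtered_types:
            return True
        return (event_type, state_key) in extras

    # Ordinary case: entries are ORed together.
    return any(
        etype in (WILDCARD, event_type) and skey in (WILDCARD, state_key)
        for etype, skey in entries
    )


everything_but_members = [["*", "*"], ["m.room.member", "@alice:example.com"]]
name_and_members = [["m.room.name", ""], ["m.room.member", "*"]]

alice_included = wants_state_event(
    everything_but_members, "m.room.member", "@alice:example.com"
)
bob_included = wants_state_event(
    everything_but_members, "m.room.member", "@bob:example.com"
)
```

With `everything_but_members`, Alice's membership event is returned and Bob's is not, while every non-member state event is still included.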


## Response format

```javascript
{
    // The position to use as the `since` token in the next sliding sync request.
    // c.f. Connections.
    "pos": "<opaque string>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was an effort to standardize pagination terminology?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently SCT is leaning towards: from and next_batch instead of pos everywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having thought about it a little more, I'm not sure that next_batch as a name really fits with how the API works. It does match pagination APIs, which is a plus, but not totally convinced about it.


    // Information about the lists supplied in the request.
    "lists": {
        // Matches the list name supplied by the client in the request
        "<list-name>": {
            // The total number of rooms that match the list's filter. Note that rooms can be in
            // multiple lists, so may be double counted.
            "count": 1234,
        }
    },

    // Aggregated rooms from lists and room subscriptions. There will be one entry per room, even if
    // the room appears in multiple lists and/or room subscriptions.
    "rooms": {
        "!foo:example.com": {
            // The room name (as specified by any `m.room.name` event), if one exists. Only sent initially
            // and when it changes.
            "name": str|null,
            // The room avatar, if one exists. Only sent initially and when it changes.
            "avatar_url": str|null,
            // The "heroes" for the room, if there is no room name. Only sent initially and when it changes.
            "heroes": [
                {"user_id":"@alice:example.com","displayname":"Alice","avatar_url":"mxc://..."},
            ],
Copy link
Contributor

@MadLittleMods MadLittleMods Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the heroes membership state events need to be included in required_state response when requesting required_state: ["m.room.member", "$LAZY"] (lazy-loading room members)?

Sync v2 says this:

When lazy-loading room members is enabled, the membership events for the heroes MUST be included in the state, unless they are redundant. When the list of users changes, the server notifies the client by sending a fresh list of heroes. If there are no changes since the last sync, this field may be omitted.

But I think that's because m.heroes in Sync v2 is only a list of user ID's. Whereas, in the sliding sync response here, we already have all of the info necessary in these stripped events.

One alternative is to match Sync v2 and only list the user ID's in heroes and include the membership events in required_state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One weird corner with the current spec: If the room doesn't have a name set and the user is invited to (or has knocked on) the room, we don't have any heroes to provide here. This is because heroes membership isn't included in the stripped state that the server receives when they are invited/knocked over federation.

Related to matrix-org/matrix-spec#380

This was solved for partial-joins via MSC3943

Not necessarily a blocker but just calling out something that won't work completely until that spec issue is solved.


            // Flag which is set when this is the first time the server is sending this data on this connection.
            // When set the client must replace any stored metadata for the room with the new data. In
            // particular, the state must be replaced with the state in `required_state`.
            "initial": true|null,

            // Same as in sync v2. Indicates whether there are more events to fetch than those in the timeline.
            "limited": true|null,
            // Indicates if we have "expanded" the timeline due to the timeline_limit changing, c.f. Room config
            // changes above.
            "unstable_expanded_timeline": true|null,
            // The list of events, sorted least to most recent.
            "timeline": [ ... ],
            // The current state of the room as a list of events. This is the full state if `initial`
            // is set, otherwise it is a delta from the previous sync.
            "required_state": [ ... ],
            // The number of timeline events which have just occurred and are not historical.
            // The last N events are 'live' and should be treated as such.
            "num_live": 1,
            // Same as sync v2, passed to `/messages` to fetch more past events.
            "prev_batch": "...",

            // For invites this is the stripped state of the room at the time of invite
            "invite_state": [ .. ],

            // For knocks this is the stripped state of the room at time of knock
            "knock_state": [ .. ],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should also be a knock_servers field, copying #4233


            // Whether the room is a DM room.
            "is_dm": true|null,

            // An opaque integer that can be used to sort the rooms by "Bump Stamp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// An opaque integer that can be used to sort the rooms by "Bump Stamp"
// An opaque unsigned 64-bits integer that can be used to sort the rooms by "Bump Stamp"

We may also want to specify that bump_stamp has a total order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u64 would be a problem, or at least an inconvenience, for JavaScript clients. Should this be an unsigned 53-bit integer instead?

"bump_stamp": 1,

            // These are the same as sync v2.
            "joined_count": 1,
            "invited_count": 1,
            "notification_count": 1,
            "highlight_count": 1,
Copy link
Contributor

@MadLittleMods MadLittleMods Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should unread and highlight count even be here? Notification/highlight count can't even be done correctly for encrypted rooms and it's just extra work for us to calculate if clients ignore our flawed values anyway.

We've just left them as dummy values in Synapse, see https://github.com/element-hq/synapse/blob/0e3c0aeee833e52121b3167de486dff34018ab27/synapse/handlers/sliding_sync/__init__.py#L1332-L1336

Related discussion: element-hq/synapse#17546 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But they can be done in unencrypted rooms so would save a bunch of processing time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is misleading to keep these values here. A client has to compute these values for encrypted rooms anyway. There is no extra exotic costs to do it for unencrypted rooms client-side. It would save computations for the server though, which is a good thing.

Consistent responses. No misleading values that can be misused. I think removing these from the server is a good thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversation also continued at #4186 (comment)

        }
    },

    "extensions": {
    },
}
```
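As an illustration of how a client might apply one entry of `rooms`, the sketch below replaces stored data when `initial` is set and otherwise treats `required_state` as a delta. `apply_room_update` and the shape of the local `store` are hypothetical, and clearing the stored timeline on `initial` is an assumption; handling of `limited` and `unstable_expanded_timeline` is left out.

```python
def apply_room_update(store, room_id, update):
    """Apply one room entry from a response to a local store; returns the live events."""
    room = store.setdefault(room_id, {"state": {}, "timeline": [], "meta": {}})

    if update.get("initial"):
        # Stored metadata and state must be replaced, not merged.
        room["state"].clear()
        room["meta"].clear()
        room["timeline"].clear()  # assumption: also start the timeline afresh

    for event in update.get("required_state", []):
        room["state"][(event["type"], event["state_key"])] = event

    # These fields are only sent initially and when they change.
    for key in ("name", "avatar_url", "heroes"):
        if key in update:
            room["meta"][key] = update[key]

    timeline = update.get("timeline", [])
    num_live = update.get("num_live", 0)
    # The last `num_live` events are live; anything before them is historical.
    live = timeline[len(timeline) - num_live:] if num_live else []
    room["timeline"].extend(timeline)
    return live


store = {}
name_a = {"type": "m.room.name", "state_key": "", "content": {"name": "A"}}
topic = {"type": "m.room.topic", "state_key": "", "content": {"topic": "T"}}
live_events = apply_room_update(
    store,
    "!foo:example.com",
    {
        "initial": True,
        "name": "A",
        "required_state": [name_a, topic],
        "timeline": [{"event_id": "$1"}, {"event_id": "$2"}],
        "num_live": 1,
    },
)
```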

# Example usage

This section gives an example of how a client can use this API (roughly based on how Element X currently uses the API).

When the app starts up it configures a single list with a range of `[0, 19]` (to get the top 20 rooms) and a
`timeline_limit` of 1. This returns quickly with the top 20 rooms (or just the changes in the top 20 rooms if a token
was specified).

The client then increases the range (in the next request) to `[0, 99]`, which will return the next 80 rooms. The client
may sort the rooms differently than they are returned by the server (e.g. it may ignore reactions for sorting
purposes). Note: the range here matches 100 rooms, however the server only sends the 80 rooms that it didn't send down
in the previous request.

The client can use room subscriptions, with a `timeline_limit` of 20, to preload history for the top rooms. This means
that if the user clicks on one of the top rooms the app can immediately display a screen's worth of history. (An
alternative would be to have a second list with a static range of `[0, 19]` and a `timeline_limit` of 20. The downside
is that clients may use a different order for the room list, so always fetching extra events for the top 20 rooms
may return more data than required.)

The client can keep increasing the list range in increments to pull in the full list of rooms. The client uses the
returned `count` for the list to know when to stop expanding the list.
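
A minimal sketch of that expansion logic, assuming 0-indexed inclusive ranges as in the `[0, 19]` example. The helper `next_range` and the step size of 80 are illustrative choices, not part of the proposal.

```python
def next_range(current_end, count, step=80):
    """Return the next [0, end] range to request, or None when done.

    current_end: inclusive end index of the range requested last time.
    count: total number of rooms in the list, as returned by the server.
    step: how many extra rooms to ask for per request (arbitrary here).
    """
    last_index = count - 1
    if current_end >= last_index:
        # The current range already covers every room in the list.
        return None
    return [0, min(current_end + step, last_index)]


# With 250 rooms in the list the client would request, in turn:
current_end = 19
while (rng := next_range(current_end, count=250)) is not None:
    print(rng)
    current_end = rng[1]
```

Clamping the end of the range to `count - 1` is a choice made for this sketch only.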

The client *may* decide to reduce the range back to `[0, 19]` and then incrementally expand it again; this is
permitted.

When the client is expecting a fast response (e.g. while expanding the lists), it should set the `timeout` parameter to
0 to ensure the server doesn't block waiting for new data. A slow response can easily happen when the app starts and
sends its first request with a `since` parameter: if the client shows a spinner but doesn't set a timeout, the request
may take a long time to return (if there were no updates to return).
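
The rule can be sketched as below. The helper `pick_timeout`, the 30000 default and the assumption that the value is in milliseconds (as in sync v2) are illustrative and not taken from this section.

```python
def pick_timeout(expecting_fast_response, long_poll_timeout=30000):
    """Return the `timeout` value to send with the next request.

    expecting_fast_response: True while the client is still expanding its
        lists or showing a spinner, i.e. when it must not block.
    long_poll_timeout: value used for steady-state long polling; the
        default and its unit (milliseconds) are assumptions.
    """
    if expecting_fast_response:
        # Ask the server to reply immediately, even with no new data.
        return 0
    return long_poll_timeout


print(pick_timeout(expecting_fast_response=True))
print(pick_timeout(expecting_fast_response=False))
```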

Contributor

We should also describe how to handle the scenario where you're requesting rooms in the range 0..20 for an incremental sync and there are more than 20 rooms with updates:

Clients can tell whether there is more to paginate if they compare the bump_stamp of the last room in the window to what they previously saw. They would need to expand the window until they see an unchanged room or room they haven't seen before. Instead of using bump_stamp, clients could use the last event_id in the timeline to compare if they want to make sure they've seen all events and not just the bumped things.

Contributor

Related info added to the MSC:

The client can keep increasing the list range in increments to pull in the full list of rooms. The client uses the
returned `count` for the list to know when to stop expanding the list.


# Alternatives / changes

There are a number of potential changes that we could make.

## Pagination

In practice, having the client specify the ranges to use for the lists is often sub-optimal. The client generally wants
to have the sync request return as quickly as possible, but it doesn't know how much data the server has to return and
so whether to increase or decrease the range.

An alternative is for the client to specify a `page_size`, where the server sends down at most `page_size` rooms. If
there are more rooms to send to the client (beyond `page_size`), then the client can request to "paginate" in these
missed updates in subsequent requests.

Since this would require client side changes, this should be explored in a separate MSC.
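
To make the `page_size` idea concrete, here is a hypothetical sketch of how a server might cap each response and signal that more remains. The function `paginate_updates` and the integer cursor it returns are invented for illustration; a separate MSC would define the real mechanism.

```python
def paginate_updates(updated_room_ids, page_size, offset=0):
    """Return at most `page_size` rooms plus a cursor for the remainder.

    updated_room_ids: every room with updates for this client, in the
        order the server would send them.
    offset: hypothetical cursor echoed back by the client.
    Returns (page, next_offset); next_offset is None when nothing is left.
    """
    page = updated_room_ids[offset:offset + page_size]
    next_offset = offset + len(page)
    if next_offset >= len(updated_room_ids):
        next_offset = None
    return page, next_offset


rooms = [f"!room{i}:example.org" for i in range(5)]
offset = 0
while offset is not None:
    page, offset = paginate_updates(rooms, page_size=2, offset=offset)
    print(page)
```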

## Timeline event trickling

If the `timeline_limit` is increased then the server will send down historic data (c.f. "Room config changes"), which
allows the clients to easily preload more history in recent rooms.

This mechanism is fiddly to implement, and ends up resending events that were previously sent to the client.

A simpler alternative is to use `/messages` to fetch the history. This has two main problems: 1) clients generally want
to preload history for multiple rooms at once, and 2) `/messages` can be slow if it tries to backfill over federation.
Comment on lines +602 to +603
Member

This sounds a lot simpler. Why do clients want to backfill multiple rooms at once?

Would adding a flag to /messages to not backfill be useful?

Member Author

I think this is a UX perf thing: when a user opens the app they will likely click one of the top 10 or 20 rooms and so having those 10 or 20 rooms have pre-fetched history as quickly as possible reduces the number of times that a user will see a blank room when opened.

It's hard for e.g. web to do multiple concurrent /messages requests due to max in-flight request limits. Though certainly adding a flag to /messages to make it return quicker would help a bunch (and I think is being proposed in an MSC somewhere already)

Contributor
@MadLittleMods, Sep 24, 2025

👍

Previous discussion: #4186 (comment)


Would adding a flag to /messages to not backfill be useful?

Yes, although may be slightly unrelated to Sliding Sync itself. I think the original point of this comment is just trying to hint at being able to get more messages without delay (from backfilling events over federation).

👍 This also plays nicely with other ideas being explored in MSC3871 (gaps) and MSC4282 (conditional backfill in /messages), see #4282 (comment)

-- @MadLittleMods, #4186 (comment)

which points to this idea ->

For the hack day, I'll be tackling the backfill problem with this strategy:

  • Default to fast responses with gaps: As a default, we can always respond quickly and indicate gaps (MSC3871) for clients to paginate at their leisure.
  • Fast back-pagination: Clients back-paginate with /messages?dir=b&backfill=false, and Synapse skips backfilling entirely, returning only local history with gaps as necessary.
  • Explicit gap filling: To fill in gaps, clients use /messages?dir=b&backfill=true which works just like today to do a best effort backfill.

This allows the client to back-paginate the history we already have without delay. And can fill in the gaps as they see fit.

Gaps can be represented in the UI with a message like "We failed to get some messages in this gap, try again 🗘.", giving users clear feedback. Regardless of clients trying to fill in gaps automatically, I would still suggest to display gaps so people can tell what's happening.

This is basically a simplified version of MSC4282 leveraging MSC3871: Gappy timelines to get proper client feedback to indicate where the gaps are so we can skip backfill without worrying. For reference, skipping backfill without letting clients know where the gaps are just means they won't ever know that they are missing messages.

-- @MadLittleMods, https://github.com/element-hq/how-hard-can-it-be-2025/issues/47#issuecomment-3234339497

Member

Connection exhaustion makes sense, and I guess having a separate API that is bulk /messages is essentially this API so...

Contributor

Also see #4186 (comment) that describes another alternative for removing expanded_timeline. We don't even need to add a new bulk /messages endpoint. The client can just do another initial Sliding Sync request with the increased timeline_limit to get all of the messages that they want.


We could implement a bulk `/messages` endpoint, where the client would specify multiple rooms and `prev_batch` tokens.

Contributor
@MadLittleMods, Sep 26, 2025

As described in element-hq/synapse#17579 (comment), another alternative would be to just use another initial sync request with an increased timeline_limit. The linked comment describes more details for client implementations as well.

Cross-link to the main discussion about moving away from expanded_timeline -> #4186 (comment)

Member

This sounds like it needs to be incorporated into the MSC? What are the downsides of this? It seems fairly elegant.

Member Author

I've added it as an alternative. I think the main downsides are: a) duplication of data (e.g state), and b) only supports the one use case.

Contributor
@MadLittleMods, Oct 3, 2025

a) duplication of data (e.g state)

The goal is to get timeline events so you would just make an initial Sliding Sync request without any required_state:

Example initial sync request

Request body:

{
    "conn_id": "fetch_timeline",

    "lists": {
        "foo-list": {
            "range": [0, 99],
            "required_state": {},
            "timeline_limit": 20,
            "filters": {
                "is_dm": true
            }
        }
    },
    "room_subscriptions": {
        "!sub1:bar": {
            "required_state": {},
            "timeline_limit": 20
        }
    }
}

Contributor

only supports the one use case.

What are the other use cases not covered?

expanded_timeline's goal is to get more timeline messages

Contributor
@MadLittleMods, Oct 3, 2025

Looks like this is the use case (from ef197a4):

The approach also doesn't support the use case for having two lists so that rooms that bubble to the
top of the room list automatically get expanded timelines.

You would just make another initial Sliding Sync request to get the messages again.

If it's only a few rooms, then it would be fine to use /messages as normal to fetch more timeline individually. But the whole crux of the problem being solved is fetching timeline for a bulk number of rooms so if you see a whole bunch of rooms bubble up, doing another initial Sliding Sync request covers this perfectly (same as what expanded_timeline is trying to do).

We can also add a flag to disable attempting to backfill over pagination (to match the behaviour of the sync timeline).
Comment on lines 595 to 606
Contributor
@MadLittleMods, Sep 19, 2024

Please kill unstable_expanded_timeline with one of these alternatives. It was implemented in Simplified Sliding Sync solely to match what the Sliding Sync proxy did. And the proxy behavior was a bug not a feature (self-described)

Previous conversation:

Contributor

Cross-link to another alternative about using another initial sync request with an increased timeline_limit -> #4186 (comment)

Contributor
@MadLittleMods, Sep 24, 2025

We can also add a flag to disable attempting to backfill over pagination (to match the behaviour of the sync timeline).

👍 This also plays nicely with other ideas being explored in MSC3871 (gaps) and MSC4282 (conditional backfill in /messages), see #4282 (comment)


Related conversation: #4186 (comment)


## `required_state` response format

The format of returned state in `required_state` is a list of events. This does not allow the server to indicate if a
"state reset" has happened which removed an entry from the state entirely (rather than it being replaced with another
event).

This is particularly problematic if the user gets "state reset" out of the room, where the server has no mechanism to
indicate to the client that the user has effectively left the room (the server has no leave event to return).

We may want to allow special entries in the `required_state` list of the form
`{"type": .., "state_key": .., "content": null}` to indicate that the state entry has been removed.
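
If such entries were adopted, a client might apply them roughly as follows. This is only a sketch of the suggested format: `apply_required_state` is a hypothetical helper, and storing state in a dict keyed by `(type, state_key)` is one possible client design.

```python
def apply_required_state(current_state, required_state):
    """Apply a `required_state` list to a client-side state map.

    current_state: dict mapping (type, state_key) to the latest event.
    required_state: list of entries from the response; an entry whose
        "content" is null (None in Python) marks that state as removed.
    Returns a new dict and leaves `current_state` untouched.
    """
    new_state = dict(current_state)
    for entry in required_state:
        key = (entry["type"], entry["state_key"])
        if "content" in entry and entry["content"] is None:
            # Tombstone: the entry was removed, e.g. by a state reset.
            new_state.pop(key, None)
        else:
            new_state[key] = entry
    return new_state


state = {
    ("m.room.topic", ""): {"type": "m.room.topic", "state_key": "", "content": {"topic": "Old"}},
    ("m.room.name", ""): {"type": "m.room.name", "state_key": "", "content": {"name": "Old"}},
}
updates = [
    {"type": "m.room.topic", "state_key": "", "content": None},
    {"type": "m.room.name", "state_key": "", "content": {"name": "New"}},
]
result = apply_required_state(state, updates)
print(sorted(result))
```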


# Security considerations

Care must be taken, as with sync v2, to ensure that only the data that the user is authorized to see is returned in the
response.
Comment on lines +615 to +616
Contributor

This should be expanded to explain what this means.

In terms of state:

  • join: Access to all of the room state.
  • invite/knock: Can only derive state from the stripped state events in the event itself (invite_room_state, knock_room_state)
  • leave/ban: Depends on the previous membership.
    • If the previous membership was join, access to the state at the time of the leave/ban (historical state)
    • If the previous membership was invite, fallback to the stripped state on the invite event
    • If the previous membership was leave/ban, repeat this logic
    • If no previous membership, no access to anything.

In terms of timeline events, this probably depends solely on m.room.history_visibility

Member Author

I think this information is in "Room results" section.



# Unstable prefix

The unstable URL for simplified sliding sync is `/org.matrix.simplified_msc3575/sync`. The flag in `/versions` is
`org.matrix.simplified_msc3575`.
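
A client could check for support before using the unstable endpoint, along these lines. The sketch assumes the flag is advertised under the `unstable_features` map of the `/versions` response, as is usual for unstable Matrix features; the helper name is hypothetical and the HTTP request itself is left out.

```python
FLAG = "org.matrix.simplified_msc3575"
UNSTABLE_SYNC_PATH = "/org.matrix.simplified_msc3575/sync"


def supports_simplified_sliding_sync(versions_response):
    """Return True if a parsed `/versions` response advertises the flag.

    versions_response: the JSON body of `/versions`, already decoded into
        a dict. Looking under `unstable_features` is an assumption here.
    """
    features = versions_response.get("unstable_features", {})
    return features.get(FLAG, False) is True


example = {"versions": ["v1.11"], "unstable_features": {FLAG: True}}
print(supports_simplified_sliding_sync(example))
```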