-
-
Notifications
You must be signed in to change notification settings - Fork 121
Description
This is a mess so strap yourself in.
Background
Knocking was introduced in MSC2403 and got merged in March 2021. It introduced a new membership state "knock" along with a set of membership transitions to and from it. A puzzling question revolved around the transition from "invite" to "knock". Generally, this makes no sense. A user knocks on a room to receive an invite. If the inviting user wishes to disinvite the user, there is an existing transition from "invite" to "leave" to reflect this. Adding extra transitions increases complexity of the already complex membership state transition diagram. Extra complexity increases the risk of implementation errors or protocol errors.
Timeline
MSC2043, at the point it is accepted in March 2021, does not permit a transition from invite to knock in the auth rules: https://github.com/matrix-org/matrix-spec-proposals/blob/8f304ca9e567ca5b5630992025453b9a21a81b60/proposals/2403-knock.md#auth-rules.
Likewise, the April 2021 spec PR which adopted MSC2043 into the spec, forbade invite->knock
transitions in both the auth rules and the state transition table.
The Synapse implementation also did not allow this transition.
Now, this is what happened next..
- Feb 2022: the membership state transition diagram in the spec is updated to allow
invite->knock
: Fix membership state table and diagram matrix-spec-proposals#3730. The auth rules are not changed at this point. - June 2022: an issue is filed
invite -> knock
membership change is actually illegal #1142 stating that this transition is invalid by reading the auth rules. It is claimed that this is an error in the auth rules and (wrongly) that the Synapse implementation allows this transition. Due to the implementation apparently allowing it, consensus forms around letting the transition be allowed, despite it being "a fairly silly thing to do". - July 2022: the spec is further updated to retrospectively change the auth rules in room versions 7, 8 and 10 to allow the transition: Remove declared-invalid
invite->knock
restriction from auth rules #1175.
At the same time, MSC2043 is retrospectively updated to claim that invite->knock transitions are legal (despite the auth rules specified elsewhere in the MSC still not allowing it).
Current situation
Now Dendrite is looking to implement this, and I see this transition and think this is silly. I look for the rationale why this was allowed and uncover a series of unfortunate events:
- As of this writing, Synapse still does not allow this transition, and never has. https://github.com/element-hq/synapse/blob/69637f8bac0959386108edeea11ba456b9334486/synapse/event_auth.py#L689-L706
- Questions were raised about the transition in Nov 2020 whilst the implementation was being done. https://github.com/matrix-org/matrix-spec-proposals/pull/2403/files/6c75b83b9906e5edbcf312d39faefb9609abf95e#r531145850 - as a result of this, the transition was not allowed, see matrix-org/matrix-spec-proposals@7dcff8f
The end result is that we have a specification which was clarified incorrectly, meanwhile the Synapse implementation has always disallowed the transition since the MSC was merged.
Next steps
The immediate issue is to urgently clarify the clarification to disallow the state transition before more of the ecosystem allows it. Broad consensus is that is "silly" and really not needed, so we should let sanity prevail and not allow the transition.
There are core underlying issues here though which need to be mentioned. A critical part of the specification was changed (event auth rules) without [adequately] checking what the original implementation did, but yet the implementation was used as a rationale for the change in the first place. Clearly this should never have been permitted and the spec process should have caught this. We need more oversight on this. Accidents happen, and we should not be relying on human memory for critical parts of the specification. In this particular case, multiple people incorrectly claimed the transition was allowed in the implementation. I propose any change to event auth rules, or other access-control-like sections of the specification MUST have external validation in code, to prove out any changes. In practice, I would propose this means having Complement tests (or some other server-agnostic test suite) whenever we wish to change the specification due to the implementation having a certain behaviour.