Commit 6db29bb

lepsa, fisx, battermann, smatting, supersven authored
WPB-240: Generate and fan out events about stopping to federate (#3397)
* Breadcrumbs.
* Better breadcrumbs. We don't know which release this will land in yet, PR is more future-proof.
* Polish docs.
* afterthought.
* Update services/brig/test/integration/API/Federation.hs
* More tests.
* Remove a lying comment.
* Refactor: keep the semantics of `_runSettings` intact. (This is part of the `Opts` data structure read from a file. it's very confusing to start updating part of that on the fly, and ignoring whatever comes from the config file.)
* Remove a lying comment.
* Fixup / WIP
* Fixup
* I take it back!
* more docs.
* rm bogus TODO, add a non-bogus one :)
* revert changes of status code in integration tests. (Double-check if we can't keep the prod code behavior intact to avoid upgrade / fed-to-fed api / client api issues.)
* Various updates, mainly moving some common code into wire-api and minimising types that are used in several common paths.
* FS-1115: More common code for federation domain updates
* Use tinylog instead of `print`.
* sanitize-pr
* Use retry instead of threadDelay-loop.
* FS-1115: Updating design docs on how update intervals are supplied
* Clarify source comment.
* FS-1115: Cleaning up un-needed changes to options
* update docs.wire.com
* docs.
* docs.
* Fix: `/i/user/meta-info` (stern) (#3281)
* Handle race conditions in /integration (#3278)
* Update docs.wire.com (#3284)
* Deprecate start-services-only.sh
* Fix outdated docs on how to run swagger docs locally.
* Restore deleted scripts (#3287)
* Refactor federation domain configuration.
  - move strategy config from federator to brig.
  - remove domain list from strategy type.
  - add 'none' strategy (more explicit that 'list' with empty list, but both still work)
* Fixup
* Complete CRUD api for federator remotes (Update is missing) [WIP].
* Complete CRUD api for federator remotes (Update is missing): tests.
* docs.
* Cleanup
* FS-1179: Setting up a location in galley for processing deleted domains
  NOTE: Broken! Currently not compiling
  Adding a new query to cassandra for getting remote members and conversation IDs that they are in, setting up loops for removing these members and potentially conversations using existing APIs.
* FS-1179: Fixing the polysemy error. Picking an error type and documenting what might need to be done for better error mapping.
* FS-1179: Initial processing for deleting remote domains. Setting up Galley to remove remote users from local conversations, local users from remote conversations, and to delete all user connections for the federation domain that is being dropped.
* FS-1179: Fixing an error http reponse code.
* wip
* Remove dead code.
* nit-pick.
* Typo.
* Fine-tune logging.
* Fix haddocks.
* Implement put.
* Clariy updateFrequency everywhere.
* nit-pick.
* Fixup
* Cleanup
* Implement put. (For real this time.)
* Move integration tests to /integration.
* FS-1179: Work in progress. Tests are broken due to timing issues
* FS-1179: Fixing an errant delete
* FS-1179: Commiting and switching to FS-1115
* ...
* wip
* ...
* FS-1115: Setting up calls to Brig from integration tests. JSON, calling, etc.
* ...
* ...
* ...
* ...
* ...
* ...
* Fix
* ...
* WIP: Chasing down cache issues between some bug fixes
* ...
* ...
* ...
* ...
* Tests are passing!
* process leif's feedback.
* ...
* Fixing compile issues after a merge
* Updating templates
* Removing a JSON roundtrip, better using aeson.
* Moving more of the federation domain update code into wire-api. Tests are broken with these changes.
* wip
* sanitize-pr
* FS-1115: Updating brig integration config to help tests
* s/AllowList/AllowDynamic/g
* rm trailing whitespace.
* docs
* Updating with PR feedback
* Have federation domain tests use MakesValue more
* FS-1115: Removing more FedConn type specific code from federation tests
* More code leaning on typeclasses
* Pre-emptive rework before merging
* Mark flaky test case.
* docs.
* Formatting
* FS-1179: Setting up the receiving RabbitMQ loop. Setup a reading loop for RabbitMQ to get the domains that need to be deleted. We're using a queue here so that we have a non-volatile store for these domains that can automatically handle galley instances dying or being scaled down.
* wip
* FS-1179: Refactoring the RabbitMQ code
  Refactoring RabbitMQ code so that there is a single channel for both production and consumption of messages. There is also a function for "simple" processing where a message queuing system isn't available, but it still needs to have persistence and task co-ordination defined for it. This may not be a problem if we decide to mandate the use of RabbitMQ, or a similar queue.
* Code formatting
* FS-1179: Fixing tests using the wrong domain. Using the wrong domain in the galley code, not the test directly, was causing errors that were annoying to debug.
* docs.
* Changelog.
* docs.
* sanitize-pr
* Fixup
* FS-1179: Moving a delete call into Brig, and calling it from Galley. Moving the deletion of one-on-one conversations into Brig, as Galley can't access the tables where that info is stored. This added a new internal API endpoint. Integration tests are currently failing on trying to delete one-on-one connections in Brig.
* FS-1179: Fixing tests by telling cassandra to run queries anyway
* Update docs/src/understand/configure-federation.md
  Co-authored-by: Sven Tennie <[email protected]>
* Update docs/src/developer/developer/federation-design-aspects.md
  Co-authored-by: Sven Tennie <[email protected]>
* FS-1179: Coping code from other tests to setup remote conversations. Setitng up remote conversations, using code from another test. Querying this database status requires messing around with Cql and polysemy rather than being able to look at API requests and reponses. The test is currently broken and I need to dig into why, and it is possible that this is a timing issue between brig and galley, but I'm not certain.
* Fix cannon config map
* whitespace
* FS-1179: Improving tests in galley, and adding a new test to brig
* FS-1179: PR formatting
* Adding a round-trip test for rabbitmq. Using the pub and sub functions from the main galley code
* hi ci
* docs.
* docs.
* FS-1179: Add an internal endpoint for deleting a federation domain
* FS-1179: Adding a defederation worker to background worker
  Adding a new worker on background worker that reads from a queue and calls Galley to delete a federation domain.
* FS-1179: Some timeout config for the background worker.
* Keeping brig and backend worker in sync
* WIP: New tests and various changes to background worker
* hi ci
* Tweak docs.
* nit-pick.
* Mark test case as flaky.
* Tweak log levels.
* Tweak log msg.
* wip
* Update libs/wire-api/src/Wire/API/Routes/Internal/Brig.hs
  Co-authored-by: fisx <[email protected]>
* Update docs/src/understand/configure-federation.md
  Co-authored-by: fisx <[email protected]>
* Update services/brig/src/Brig/API/Internal.hs
  Co-authored-by: fisx <[email protected]>
* Update services/brig/src/Brig/API/Internal.hs
  Co-authored-by: fisx <[email protected]>
* Update services/brig/src/Brig/API/Internal.hs
  Co-authored-by: fisx <[email protected]>
* Update services/federator/src/Federator/Run.hs
  Co-authored-by: fisx <[email protected]>
* Update services/galley/src/Galley/App.hs
  Co-authored-by: fisx <[email protected]>
* FS-1115: PR notes
* try fix helm charts
* WIP
* hi ci
* changed federation strategy in CI from allowDynamic to allowAll
* correct error in case of allowDynamic
* default case made explicit
* FS-1179: Modifications to tests to help stop hangups
* Cleaning up comments
* Create use domain names in integration tests. (this avoids hypothetical concurrency problems.)
* Better errors in /integration.
* nit-picks.
* Better errors in /integration.
* revert commit noise.
* Fix `make list-flaky-tests`.
* nit-picks, renames, minor refactorings.
* s/type/newtype/
* Polish Wire.API.FederationUpdate (names and declaration order).
* Fix: Always cancel `syncFedDomainUpdateThread`.
* hi ci
* Fixing things post-merge
* Adding a changelog entry
* FS-1179: Brig now deletes the notification queue for deleted domains
* FS-1179: Removing and errant import
* FS-1179: PR sanitisation.
* FS-1179: Updating a TODO
* FS-1179: Writing cleanup code so that we don't have dangling threads
* FS-1179: PR formatting
* PR formatting
* More comments and exception handling
* Fixing tests now that they have moved to hspec
* Formatting
* Fix background-worker integration tests.
* Updating the internal notes
* WPB-240: Initial implementation of notifications sending for defederation
* Removing a redundant language extension
* Removing a redundant language extension
* WPB-240: Matching event fanout count to the limit in options. Adding tests for event fanout.
* WPB-240: Comments and code in tests for quickly filling the DB
* FS-1179: Reworking after a discussion with Akshay about how to use AMQP. Workers are going to have their own connection management to RabbitMQ which will keep things simpler and help reduce potential blocks in processing. It also helps us not step on each other's toes when writing this service.
* PR formatting
* Hi CI
* FS-1179: Adding galley and brig to background-worker's configmap
* FS-1179: Removing the confusing "integration" tests for background-worker
  They were never actual integration tests, I was just confused about why `make ci` wasn't picking up the existing tests, so I made it run them and that was a mistake and caused confusion.
* PR notes
* `make sanitize-pr`.
* Fixup
* Changing the format of federation.delete.
* Sanitizing code
* FS-1179: Removing dead code and bumping schema migrations
* FS-1179: Removing more dead code
* PR formatting
* Hi CI
* WPB-240: Formatting code
* WPB-240: Changelog entries
* WPB-240: Removing dead code
* WPB-240: Moving notification code into an effect, as mentioned in PR.
---------

Co-authored-by: Matthias Fischmann <[email protected]>
Co-authored-by: Leif Battermann <[email protected]>
Co-authored-by: Stefan Matting <[email protected]>
Co-authored-by: Sven Tennie <[email protected]>
1 parent df4c0b9 commit 6db29bb

18 files changed: +272 -10 lines changed

changelog.d/1-api-changes/WPB-240

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Added a new notification event type, "federation.delete".
+This event contains a single domain for a remote server that the local server is de-federating from.
+This notification is sent twice during de-federation. Once before and once after cleaning up and removing references to the remote server from the local database.

changelog.d/6-federation/WPB-240

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+De-federating from a remote server sends a pair of notifications to clients, announcing which server will no longer be federated with.
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+module Wire.API.Event.Federation
+  ( Event (..),
+    EventType (..),
+  )
+where
+
+import Data.Aeson (FromJSON, ToJSON)
+import qualified Data.Aeson as A
+import qualified Data.Aeson.KeyMap as KeyMap
+import Data.Domain
+import Data.Json.Util (ToJSONObject (toJSONObject))
+import Data.Schema
+import qualified Data.Swagger as S
+import Imports
+import Wire.Arbitrary
+
+data Event = Event
+  { _eventType :: EventType,
+    _eventDomain :: Domain
+  }
+  deriving (Eq, Show, Ord, Generic)
+
+instance Arbitrary Event where
+  arbitrary =
+    Event
+      <$> arbitrary
+      <*> arbitrary
+
+data EventType
+  = FederationDelete
+  deriving (Eq, Show, Ord, Generic)
+  deriving (Arbitrary) via (GenericUniform EventType)
+  deriving (A.FromJSON, A.ToJSON, S.ToSchema) via Schema EventType
+
+instance ToSchema EventType where
+  schema =
+    enum @Text "EventType" $
+      mconcat
+        [ element "federation.delete" FederationDelete
+        ]
+
+eventObjectSchema :: ObjectSchema SwaggerDoc Event
+eventObjectSchema =
+  Event
+    <$> _eventType .= field "type" schema
+    <*> _eventDomain .= field "domain" schema
+
+instance ToSchema Event where
+  schema = object "Event" eventObjectSchema
+
+instance ToJSONObject Event where
+  toJSONObject =
+    KeyMap.fromList
+      . fromMaybe []
+      . schemaOut eventObjectSchema
+
+instance S.ToSchema Event where
+  declareNamedSchema = schemaToSwagger
+
+instance FromJSON Event where
+  parseJSON = schemaParseJSON
+
+instance ToJSON Event where
+  toJSON = schemaToJSON
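
To make the wire format of the new event easier to picture, here is a minimal sketch of the same two-field JSON object (`type` and `domain`) written against plain aeson instead of the project's `Data.Schema` combinators. The module name, the simplified `Domain` newtype (a bare `Text` wrapper with no validation) and the `roundTrip` helper are assumptions made for illustration; they are not part of the commit.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Illustrative sketch only: plain aeson instances for a stand-in Event type.
module FederationEventSketch where

import Data.Aeson
  ( FromJSON (..),
    ToJSON (..),
    Value (String),
    decode,
    encode,
    object,
    withObject,
    withText,
    (.:),
    (.=),
  )
import Data.Text (Text)

-- Hypothetical stand-in for the real Data.Domain type.
newtype Domain = Domain Text
  deriving (Eq, Show)

data EventType = FederationDelete
  deriving (Eq, Show)

data Event = Event
  { eventType :: EventType,
    eventDomain :: Domain
  }
  deriving (Eq, Show)

-- The only event type is rendered as the string "federation.delete".
instance ToJSON EventType where
  toJSON FederationDelete = String "federation.delete"

instance FromJSON EventType where
  parseJSON = withText "EventType" $ \t -> case t of
    "federation.delete" -> pure FederationDelete
    other -> fail ("unknown event type: " <> show other)

-- An event is an object with a "type" field and a "domain" field.
instance ToJSON Event where
  toJSON (Event t (Domain d)) = object ["type" .= t, "domain" .= d]

instance FromJSON Event where
  parseJSON = withObject "Event" $ \o ->
    Event <$> o .: "type" <*> (Domain <$> o .: "domain")

-- Encode to JSON and decode again; Nothing means the parse failed.
roundTrip :: Event -> Maybe Event
roundTrip = decode . encode
```

Under these assumptions, an event for a placeholder domain such as `remote.example.com` serializes to an object whose `type` is `"federation.delete"` and whose `domain` is that domain string.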

libs/wire-api/wire-api.cabal

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ library
     Wire.API.Error.Gundeck
     Wire.API.Event.Conversation
     Wire.API.Event.FeatureConfig
+    Wire.API.Event.Federation
     Wire.API.Event.Team
     Wire.API.FederationStatus
     Wire.API.FederationUpdate

services/galley/galley.cabal

Lines changed: 1 addition & 0 deletions
@@ -99,6 +99,7 @@ library
     Galley.Effects.CodeStore
     Galley.Effects.ConversationStore
     Galley.Effects.CustomBackendStore
+    Galley.Effects.DefederationNotifications
     Galley.Effects.ExternalAccess
     Galley.Effects.FederatorAccess
     Galley.Effects.FireAndForget

services/galley/src/Galley/API/Internal.hs

Lines changed: 10 additions & 3 deletions
@@ -65,6 +65,7 @@ import Galley.Effects
 import Galley.Effects.BackendNotificationQueueAccess
 import Galley.Effects.ClientStore
 import Galley.Effects.ConversationStore
+import Galley.Effects.DefederationNotifications (DefederationNotifications, sendDefederationNotifications)
 import Galley.Effects.FederatorAccess
 import Galley.Effects.GundeckAccess
 import Galley.Effects.LegalHoldStore as LegalHoldStore
@@ -85,8 +86,8 @@ import Imports hiding (head)
 import qualified Network.AMQP as Q
 import Network.HTTP.Types
 import Network.Wai
-import Network.Wai.Predicate hiding (Error, err, setStatus)
-import qualified Network.Wai.Predicate as Predicate
+import Network.Wai.Predicate hiding (Error, err, result, setStatus)
+import qualified Network.Wai.Predicate as Predicate hiding (result)
 import Network.Wai.Routing hiding (App, route, toList)
 import Network.Wai.Utilities hiding (Error)
 import Network.Wai.Utilities.ZAuth
@@ -538,12 +539,18 @@ internalDeleteFederationDomainH ::
     Member TeamStore r,
     Member BrigAccess r,
     Member GundeckAccess r,
-    Member ExternalAccess r
+    Member ExternalAccess r,
+    Member DefederationNotifications r
   ) =>
   Domain ::: JSON ->
   Sem r Response
 internalDeleteFederationDomainH (domain ::: _) = do
+  -- We have to send the same event twice.
+  -- Once before and once after defederation work.
+  -- https://wearezeta.atlassian.net/wiki/spaces/ENGINEERIN/pages/809238539/Use+case+Stopping+to+federate+with+a+domain
+  sendDefederationNotifications domain
   deleteFederationDomain domain
+  sendDefederationNotifications domain
   pure (empty & setStatus status200)
 
 -- Remove remote members from local conversations
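
The handler change above boils down to an ordering rule: notify, clean up, notify again. The following sketch models only that ordering in plain `IO`, with the notification and cleanup actions passed in as arguments. `Step`, `defederate` and `traceDefederation` are hypothetical names chosen for this illustration, and `Domain` is a simplified stand-in; the real handler runs in `Sem r` and uses `sendDefederationNotifications` and `deleteFederationDomain`.

```haskell
-- Illustrative sketch only: the notify / clean up / notify ordering in plain IO.
module DefederationOrderSketch where

import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the real Data.Domain type.
newtype Domain = Domain String
  deriving (Eq, Show)

-- What happened, in the order it happened.
data Step
  = Notified Domain
  | CleanedUp Domain
  deriving (Eq, Show)

-- Send the notification, do the defederation work, then send the same
-- notification a second time.
defederate :: (Domain -> IO ()) -> (Domain -> IO ()) -> Domain -> IO ()
defederate notify cleanup domain = do
  notify domain
  cleanup domain
  notify domain

-- Run 'defederate' with recording actions and return the observed steps.
traceDefederation :: Domain -> IO [Step]
traceDefederation domain = do
  ref <- newIORef []
  let record step = modifyIORef' ref (step :)
  defederate (record . Notified) (record . CleanedUp) domain
  reverse <$> readIORef ref
```

Passing the two actions as parameters keeps the ordering testable without a database or notification backend, which is roughly what an effect interpreter provides in the real code.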

services/galley/src/Galley/App.hs

Lines changed: 1 addition & 0 deletions
@@ -272,6 +272,7 @@ evalGalley e =
     . interpretFederatorAccess
     . interpretExternalAccess
     . interpretGundeckAccess
+    . interpretDefederationNotifications
     . interpretSparAccess
     . interpretBrigAccess
   where

services/galley/src/Galley/Cassandra/Conversation/Members.hs

Lines changed: 8 additions & 0 deletions
@@ -18,6 +18,8 @@
 module Galley.Cassandra.Conversation.Members
   ( addMembers,
     members,
+    allMembers,
+    toMember,
     lookupRemoteMembers,
     removeMembersFromLocalConv,
     toMemberStatus,
@@ -122,6 +124,11 @@ members conv =
   fmap (mapMaybe toMember) . retry x1 $
     query Cql.selectMembers (params LocalQuorum (Identity conv))
 
+allMembers :: Client [LocalMember]
+allMembers =
+  fmap (mapMaybe toMember) . retry x1 $
+    query Cql.selectAllMembers (params LocalQuorum ())
+
 toMemberStatus ::
   ( -- otr muted
     Maybe MutedStatus,
@@ -386,6 +393,7 @@ interpretMemberStoreToCassandra = interpret $ \case
   CreateBotMember sr bid cid -> embedClient $ addBotMember sr bid cid
   GetLocalMember cid uid -> embedClient $ member cid uid
   GetLocalMembers cid -> embedClient $ members cid
+  GetAllLocalMembers -> embedClient allMembers
   GetRemoteMember cid uid -> embedClient $ lookupRemoteMember cid (tDomain uid) (tUnqualified uid)
   GetRemoteMembers rcid -> embedClient $ lookupRemoteMembers rcid
   CheckLocalMemberRemoteConv uid rcnv -> fmap (not . null) $ embedClient $ lookupLocalMemberRemoteConv uid rcnv
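
`allMembers` reads every row of the `member` table, so a user who is in several conversations appears once per conversation. The commit message also mentions matching the event fan-out count to a limit from the options. The `DefederationNotifications` interpreter itself is not shown in this excerpt, so the following is only a guess at the general shape of such a step: deduplicate the recipients, then split them into batches no larger than the limit. `dedupe`, `chunksOf` and `fanoutBatches` are hypothetical helpers written for this sketch.

```haskell
-- Illustrative sketch only: batching recipients under a fan-out limit.
module FanoutSketch where

import Data.List (sort)
import qualified Data.List.NonEmpty as NE

-- Remove duplicates by sorting and keeping one element per run of equals.
dedupe :: Ord a => [a] -> [a]
dedupe = map NE.head . NE.group . sort

-- Split a list into consecutive chunks of at most n elements.
-- Assumes n >= 1; 'fanoutBatches' below enforces that.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs =
  let (batch, rest) = splitAt n xs
   in batch : chunksOf n rest

-- Deduplicate recipients and group them into batches of at most 'limit'.
-- A non-positive limit is clamped to 1 so the function always terminates.
fanoutBatches :: Ord user => Int -> [user] -> [[user]]
fanoutBatches limit = chunksOf (max 1 limit) . dedupe
```

With a limit of 2 and recipients `["a", "b", "a", "c"]`, this yields the batches `[["a", "b"], ["c"]]`. Sorting changes the order of recipients, which is acceptable here only because the sketch treats notification order across users as irrelevant.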

services/galley/src/Galley/Cassandra/Queries.hs

Lines changed: 3 additions & 0 deletions
@@ -339,6 +339,9 @@ selectMember = "select user, service, provider, status, otr_muted_status, otr_mu
 selectMembers :: PrepQuery R (Identity ConvId) (UserId, Maybe ServiceId, Maybe ProviderId, Maybe MemberStatus, Maybe MutedStatus, Maybe Text, Maybe Bool, Maybe Text, Maybe Bool, Maybe Text, Maybe RoleName)
 selectMembers = "select user, service, provider, status, otr_muted_status, otr_muted_ref, otr_archived, otr_archived_ref, hidden, hidden_ref, conversation_role from member where conv = ?"
 
+selectAllMembers :: PrepQuery R () (UserId, Maybe ServiceId, Maybe ProviderId, Maybe MemberStatus, Maybe MutedStatus, Maybe Text, Maybe Bool, Maybe Text, Maybe Bool, Maybe Text, Maybe RoleName)
+selectAllMembers = "select user, service, provider, status, otr_muted_status, otr_muted_ref, otr_archived, otr_archived_ref, hidden, hidden_ref, conversation_role from member"
+
 insertMember :: PrepQuery W (ConvId, UserId, Maybe ServiceId, Maybe ProviderId, RoleName) ()
 insertMember = "insert into member (conv, user, service, provider, status, conversation_role) values (?, ?, ?, ?, 0, ?)"
 

services/galley/src/Galley/Effects.hs

Lines changed: 2 additions & 0 deletions
@@ -69,6 +69,7 @@ import Galley.Effects.ClientStore
 import Galley.Effects.CodeStore
 import Galley.Effects.ConversationStore
 import Galley.Effects.CustomBackendStore
+import Galley.Effects.DefederationNotifications
 import Galley.Effects.ExternalAccess
 import Galley.Effects.FederatorAccess
 import Galley.Effects.FireAndForget
@@ -99,6 +100,7 @@ import Wire.Sem.Paging.Cassandra
 type GalleyEffects1 =
   '[ BrigAccess,
      SparAccess,
+     DefederationNotifications,
      GundeckAccess,
      ExternalAccess,
      FederatorAccess,
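
The `Galley.Effects.DefederationNotifications` module is registered in the effect stack above but its contents are not part of this excerpt. As a rough idea of what a first-order Polysemy effect with one action and one interpreter can look like, here is a self-contained sketch. The constructor, the `recordNotifications` interpreter (which only appends to an `IORef` instead of sending anything) and the simplified `Domain` type are assumptions for illustration, not the actual module.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

-- Illustrative sketch only: a one-action Polysemy effect and a test interpreter.
module DefederationEffectSketch where

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Polysemy (Embed, Member, Sem, embed, interpret, runM, send)

-- Hypothetical stand-in for the real Data.Domain type.
newtype Domain = Domain String
  deriving (Eq, Show)

-- The effect: a single action that takes the domain being dropped.
data DefederationNotifications m a where
  SendDefederationNotifications :: Domain -> DefederationNotifications m ()

-- Smart constructor, written by hand instead of using makeSem.
sendDefederationNotifications ::
  Member DefederationNotifications r => Domain -> Sem r ()
sendDefederationNotifications = send . SendDefederationNotifications

-- Interpreter that records each requested notification in an IORef.
-- A production interpreter would fan events out to clients instead.
recordNotifications ::
  Member (Embed IO) r =>
  IORef [Domain] ->
  Sem (DefederationNotifications ': r) a ->
  Sem r a
recordNotifications ref = interpret $ \case
  SendDefederationNotifications d -> embed (modifyIORef' ref (d :))

-- Send the notification twice and return what the interpreter recorded.
runSketch :: IO [Domain]
runSketch = do
  ref <- newIORef []
  runM . recordNotifications ref $ do
    sendDefederationNotifications (Domain "remote.example.com")
    sendDefederationNotifications (Domain "remote.example.com")
  reverse <$> readIORef ref
```

Keeping the notification logic behind an effect like this lets handlers state only that they need `Member DefederationNotifications r`, while the choice of interpreter stays in one place such as `evalGalley`.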
