Final piece to remove for generic_workers blanket use. #69

@realtyem

Make media_repo be a generic_worker

Quoted from the modernize configure workers script PR:

  1. For Details 1: synapse.app.media_repository is a synapse.app.generic_worker, however because of this line, enable_media_repo won't work with synapse.app.generic_worker. As such, this won't be changed now. I feel like this logic should be in the workers.py file with similar type options(like run_background_tasks_on, notify_appservices_from_worker and update_user_directory_from_worker), and possibly adapted to use _should_this_worker_perform_duty(). Ideally, this should all be on a map instance that hashes by worker_name in the shared config.

Having looked into this a bit more, it's slightly more involved than a simple removal.
Things we know:

  1. In:

    # whether to enable the media repository endpoints. This should be set
    # to false if the media repository is running as a separate endpoint;
    # doing so ensures that we will not run cache cleanup jobs on the
    # master, potentially causing inconsistency.
    self.enable_media_repo = config.get("enable_media_repo", True)

    Setup looks for enable_media_repo in the yaml, defaults it to True, and adds it to the server section of the ConfigObject.

  2. It is then passed into /config/repository.py at:

    # Only enable the media repo if either the media repo is enabled or the
    # current worker app is the media repo.
    if (
        self.root.server.enable_media_repo is False
        and config.get("worker_app") != "synapse.app.media_repository"
    ):
        self.can_load_media_repo = False
        return
    else:
        self.can_load_media_repo = True

    where can_load_media_repo is set for that single instance (in this case a worker). If set to True, it enables the endpoint processing for media bits. According to the comments in the source (included in "Thing 1" above), setting it to False also prevents cache cleanup from running inappropriately on master. It is then added to the media section of the ConfigObject.

  3. can_load_media_repo (specifically, (hs.config.|self.config.|config.)media.can_load_media_repo) is then used in potentially four other files. I think that's far enough for the purpose of this issue; this will be the last place to search for "True" or "False".

  4. Further, there is the option at this line that looks for media_instance_running_background_jobs and assigns that worker_name to handle those tasks (which I believe is only URL previews presently?).
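To make Things 1 and 2 easier to reason about, here is a minimal standalone sketch of the current decision. It is not the Synapse code itself: the function name can_load_media_repo() and the use of one flat dict in place of the merged config are assumptions for illustration only.

```python
from typing import Any, Mapping

LEGACY_MEDIA_APP = "synapse.app.media_repository"


def can_load_media_repo(config: Mapping[str, Any]) -> bool:
    """Standalone mirror of the check quoted in Thing 2 (illustrative only)."""
    # Thing 1: enable_media_repo defaults to True when absent from the yaml.
    enable_media_repo = config.get("enable_media_repo", True)

    # Thing 2: media is disabled only when the flag is explicitly False AND
    # this process is not the legacy media repository app.
    if enable_media_repo is False and config.get("worker_app") != LEGACY_MEDIA_APP:
        return False
    return True


# Master with media explicitly disabled.
print(can_load_media_repo({"enable_media_repo": False}))  # False

# Legacy media worker: the worker_app name overrides the disabled flag.
print(can_load_media_repo(
    {"enable_media_repo": False, "worker_app": LEGACY_MEDIA_APP}
))  # True

# Generic worker with the same flag: nothing overrides it.
print(can_load_media_repo(
    {"enable_media_repo": False, "worker_app": "synapse.app.generic_worker"}
))  # False
```

The last case is the one this issue is about: the hard-coded worker_app comparison is the only thing that lets a process serve media when enable_media_repo is False.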

After pondering this for a few days, I've come to the conclusion that there are two possible paths forward to remove synapse.app.media_repository from the source in "Thing 2" above. Either way, a yaml setting of some kind will still be needed so that the worker knows it is responsible for handling the media repo.

  1. A similar setting already exists that can be placed in the worker yaml to handle this: enable_media_repo. Right now, it is used to explicitly disable the media functions on master (and other workers). I think putting it into the worker yaml fragment would be appropriate. Logic will have to be added to look for it; I recommend doing that in repository.py, next to the existing logic, for backwards compatibility.
  2. Build a new instance-style map, similar to pusher_instances and federation_sender_instances. A quick mockup would look like:

    media_repo_instances:
      - media_repository1

A bonus of this approach is that there would be no need for an enable_media_repo setting in potentially more than two other yaml files; the map could live in either homeserver.yaml or shared.yaml. There is also an opportunity to fold the media_instance_running_background_jobs function into this at the same time and kill two birds with one stone.
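As a rough sketch of how option 2 could be checked, assuming the proposed media_repo_instances key from the mockup above: the function name should_load_media_repo() and the fallback behaviour are hypothetical, not existing Synapse code. The instance name "master" for the main process is likewise an assumption of this example.

```python
from typing import Any, Mapping

LEGACY_MEDIA_APP = "synapse.app.media_repository"


def should_load_media_repo(config: Mapping[str, Any], instance_name: str) -> bool:
    """Hypothetical check for the proposed media_repo_instances option."""
    instances = config.get("media_repo_instances")
    if instances is not None:
        # New style: only the instances named in the shared config serve media.
        return instance_name in instances

    # Backwards compatibility: when the new option is absent, fall back to
    # the existing enable_media_repo / worker_app behaviour.
    if config.get("enable_media_repo", True) is False:
        return config.get("worker_app") == LEGACY_MEDIA_APP
    return True


# Shared config as in the mockup above.
shared = {"media_repo_instances": ["media_repository1"]}

print(should_load_media_repo(shared, "media_repository1"))  # True
print(should_load_media_repo(shared, "master"))  # False
```

Because the decision keys off the instance name rather than worker_app, a synapse.app.generic_worker named media_repository1 would be treated the same as the legacy app, and the setting only has to appear once in the shared config.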

Additional questions:

  1. Why is enable_media_repo in the server config section? Would it be more consistent to have it in workers?
