Scale up of statefulsets in a rollout group to follow leader doesn't write back the replica change to reference resource #255

@gmanghna

Description

Scenario:
Consider Mimir ingester StatefulSets for zone A, zone B, and zone C. The StatefulSet in each zone mirrors its own reference resource.
Zone C follows zone B, and zone B follows zone A.

Suppose the StatefulSets for zone A, zone B, and zone C are each configured with 2 replicas.
If the reference resource (ReplicaTemplate) for zone A is updated to 3 replicas, the zone A StatefulSet is scaled up to 3 replicas:

level=info ts=2025-08-03T11:26:37.302959256Z msg="scaling up statefulset to match replicas in the reference resource" group=ingester name=mimir-ingester-zone-a currentReplicas=2 referenceResourceDesiredReplicas=3 computedDesiredReplicas=3 referenceResource=replicatemplates/mimir-ingester-zone-a-replicas

On subsequent reconciles (triggered by the StatefulSet update), the zone B and zone C StatefulSets are scaled up from 2 to 3 to match their leaders:

level=info ts=2025-08-03T11:26:42.430600104Z msg="scaling up statefulset to match leader" group=ingester name=mimir-ingester-zone-b replicas=3
level=debug ts=2025-08-03T11:26:42.465978344Z msg="observed StatefulSet updated" name=mimir-ingester-zone-b namespace=xxx old_replicas=2 new_replicas=3 old_generation=132 new_generation=133

This update to the StatefulSet doesn't trigger an update to its ReplicaTemplate (the reference resource), so on a later reconcile the StatefulSet is scaled back down to 2 to mirror the reference resource:

level=info ts=2025-08-03T11:28:18.131736055Z msg="scaling down statefulset to computed desired replicas, based on replicas in the reference resource and elapsed downscale delays" group=ingester name=mimir-ingester-zone-b currentReplicas=3 referenceResourceDesiredReplicas=2 computedDesiredReplicas=2 referenceResource=replicatemplates/mimir-ingester-zone-b-replicas

If the respective reference resource were also updated when a StatefulSet is scaled up to match its leader, the downscale above could be avoided.
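
To make the sequence easier to reason about, here is a minimal Python sketch of the behaviour described above. It is a toy model, not rollout-operator code: `Zone`, `reconcile`, `simulate`, and the `write_back` flag are hypothetical names. It also ignores downscale delays and reconcile timing, so it applies changes in fewer passes than the logs show. It only illustrates why a follower that is scaled up to match its leader gets scaled back down while its own reference resource still says 2.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Zone:
    """Hypothetical stand-in for one ingester StatefulSet and its ReplicaTemplate."""
    name: str
    replicas: int              # current StatefulSet replicas
    reference_replicas: int    # replicas in the zone's reference resource
    leader: Optional[str] = None


def reconcile(zones: Dict[str, Zone], write_back: bool) -> List[str]:
    """Run one simplified reconcile pass over all zones and return the events."""
    events = []
    for zone in zones.values():
        leader = zones.get(zone.leader) if zone.leader else None
        if leader is not None and leader.replicas > zone.replicas:
            # Follower scales up to match its leader.
            zone.replicas = leader.replicas
            if write_back:
                # Assumed fix: also record the new count in the reference resource.
                zone.reference_replicas = zone.replicas
            events.append(f"{zone.name}: scaled up to {zone.replicas} to match leader")
        elif zone.reference_replicas != zone.replicas:
            # Otherwise the StatefulSet mirrors its own reference resource.
            zone.replicas = zone.reference_replicas
            events.append(f"{zone.name}: scaled to {zone.replicas} to match reference")
    return events


def simulate(write_back: bool, passes: int = 2) -> Dict[str, int]:
    """Start from the scenario above: zone A's reference was just raised to 3."""
    zones = {
        "zone-a": Zone("zone-a", replicas=2, reference_replicas=3),
        "zone-b": Zone("zone-b", replicas=2, reference_replicas=2, leader="zone-a"),
        "zone-c": Zone("zone-c", replicas=2, reference_replicas=2, leader="zone-b"),
    }
    for _ in range(passes):
        reconcile(zones, write_back)
    return {name: zone.replicas for name, zone in zones.items()}


if __name__ == "__main__":
    print(simulate(write_back=False))  # {'zone-a': 3, 'zone-b': 2, 'zone-c': 2}
    print(simulate(write_back=True))   # {'zone-a': 3, 'zone-b': 3, 'zone-c': 3}
```

Without write-back, the second pass finds zone B and zone C at 3 replicas while their reference resources still say 2, so both are scaled back down. With write-back, the second pass has nothing to change. Whether writing to the reference resource is the right fix for the operator itself is a separate question that this sketch does not answer.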
