Description
/kind bug
What steps did you take and what happened:
@mdbooth your input is treasured as per usual...
When I delete a cluster (properly this time: you can prevent Argo from deleting anything and let CAPI manage the whole process), I'm seeing the control plane machines hang.
From what I can tell there is a race where:
- CAPI adds an owner reference on the infrastructure
- I delete the Cluster
- CAPI should delete... the MDs, the KCP, then the infrastructure
- However, the owner reference triggers the infrastructure delete early...
What happens is CAPO keeps trying to delete the infra and the CP machines at the same time.
As soon as the ports are detached from the network, that completes and the infrastructure gets deleted.
However the CP machines haven't been fully deleted yet, and cannot be deleted because they need to see the infrastructure resource in order to determine whether there's a loadbalancer that needs reconciling.
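To make the ordering problem concrete, here is a toy model of the race as I understand it. It is not CAPI or CAPO code: `ToyAPIServer`, `reconcile_machine_delete` and `simulate` are made-up names, and the only behaviour modelled is "an object marked for deletion disappears once its last finalizer is removed".

```python
class ToyAPIServer:
    """Hypothetical stand-in for the API server: objects with finalizers."""

    def __init__(self):
        self.objects = {}

    def create(self, name, finalizer):
        self.objects[name] = {"finalizers": {finalizer}, "deleting": False}

    def delete(self, name):
        # Like setting deletionTimestamp: the object stays until finalizers clear.
        self.objects[name]["deleting"] = True

    def remove_finalizer(self, name, finalizer):
        obj = self.objects[name]
        obj["finalizers"].discard(finalizer)
        if obj["deleting"] and not obj["finalizers"]:
            del self.objects[name]


def reconcile_machine_delete(api, machine, infra):
    """Machine cleanup needs the infra object to decide about the load balancer."""
    if infra not in api.objects:
        return False  # infra is gone, so the machine cannot finish deleting
    api.remove_finalizer(machine, "machine")
    return True


def simulate(infra_first):
    """Return the names of objects left behind after both deletes are issued."""
    api = ToyAPIServer()
    api.create("infra", "infra")
    api.create("cp-0", "machine")
    # Both deletes are in flight at the same time.
    api.delete("infra")
    api.delete("cp-0")
    if infra_first:
        api.remove_finalizer("infra", "infra")
        reconcile_machine_delete(api, "cp-0", "infra")
    else:
        reconcile_machine_delete(api, "cp-0", "infra")
        api.remove_finalizer("infra", "infra")
    return sorted(api.objects)
```

In this model, `simulate(infra_first=True)` leaves `cp-0` behind, which matches the hang, while `simulate(infra_first=False)` leaves nothing.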
I expect either:
- CAPI needs to stop messing with owner references and let its internal ordering take precedence
- CAPO needs to cache infrastructure configuration in the machines so it doesn't need to refer to the OSC resource
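A rough sketch of what the second option could look like, purely as an illustration. None of these names exist in CAPO: `InfraSnapshot`, `Machine.cached_infra`, `on_machine_create` and `plan_machine_delete` are hypothetical, and the infrastructure is reduced to a plain dict with two assumed keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InfraSnapshot:
    """Hypothetical copy of the infra fields a machine needs at delete time."""
    load_balancer_enabled: bool
    load_balancer_id: Optional[str]


@dataclass
class Machine:
    name: str
    cached_infra: Optional[InfraSnapshot] = None


def on_machine_create(machine, infra):
    # Record what the delete path will need while the infra still exists.
    machine.cached_infra = InfraSnapshot(
        load_balancer_enabled=infra["load_balancer_enabled"],
        load_balancer_id=infra.get("load_balancer_id"),
    )


def plan_machine_delete(machine, live_infra):
    """Return the cleanup steps, preferring live infra and falling back to the cache."""
    if live_infra is not None:
        enabled = live_infra["load_balancer_enabled"]
        lb_id = live_infra.get("load_balancer_id")
    elif machine.cached_infra is not None:
        enabled = machine.cached_infra.load_balancer_enabled
        lb_id = machine.cached_infra.load_balancer_id
    else:
        raise RuntimeError("no infrastructure information available")
    steps = []
    if enabled:
        steps.append(f"remove {machine.name} from load balancer {lb_id}")
    steps.append(f"delete server {machine.name}")
    return steps
```

With a snapshot taken at creation, `plan_machine_delete(machine, None)` still produces a plan after the infrastructure object has gone. The obvious trade-off is that the cached copy can go stale if the infrastructure changes after the machine is created.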
What did you expect to happen:
You can actually delete things without a hang.
Anything else you would like to add:
Environment:
- Cluster API Provider OpenStack version (or git rev-parse HEAD if manually built): 0.7.1
- Cluster-API version: 1.3.2
- OpenStack version: Zed
- Minikube/KIND version:
- Kubernetes version (use kubectl version): 1.27
- OS (e.g. from /etc/os-release):