Closed
Description
/kind bug
What steps did you take and what happened:
Created an experiment of any kind and ran it.
When I deleted one of the trials created by the Katib controller, it was never recreated.
Because the experiment controller reconciles trial assignments only by comparing the number of running trials with the number of desired trials, deleted trials cannot be recovered.
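To illustrate the failure mode, here is a minimal toy model, not Katib's actual controller code. It assumes the count being compared is a recorded tally of handed-out trials that a deletion does not decrement; the `Experiment` class and both `reconcile_*` functions are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Toy stand-in for an experiment; NOT Katib's real data model."""
    desired: int                                   # trials the experiment should run
    assigned: list = field(default_factory=list)   # trial names already handed out
    existing: set = field(default_factory=set)     # trial objects currently present


def reconcile_by_count(exp):
    """Count-only reconcile: create trials until the tally reaches `desired`."""
    created = []
    while len(exp.assigned) < exp.desired:
        name = f"trial-{len(exp.assigned)}"
        exp.assigned.append(name)
        exp.existing.add(name)
        created.append(name)
    return created


def reconcile_by_name(exp):
    """Identity-aware reconcile: also recreate assigned trials whose objects are gone."""
    created = reconcile_by_count(exp)
    for name in exp.assigned:
        if name not in exp.existing:
            exp.existing.add(name)
            created.append(name)
    return created


exp = Experiment(desired=3)
reconcile_by_count(exp)              # creates trial-0, trial-1, trial-2
exp.existing.discard("trial-1")      # simulate a user deleting one trial
print(reconcile_by_count(exp))       # the tally still says 3, so nothing is recreated
print(reconcile_by_name(exp))        # comparing names notices the missing trial
```

Under these assumptions, the count-only pass sees no difference after the deletion, while the name-based pass detects the gap and recreates the missing trial.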
What did you expect to happen:
I expected deleted trials to be recreated and run to completion, even when I delete them at random.
Anything else you would like to add:
Uploaded #1831 to fix the problem. Please take a look, thank you.
Environment:
- Katib version (check the Katib controller image version): v0.12
- Kubernetes version (`kubectl version`): v1.19.9
- OS (`uname -a`): 3.10.0-1160.31.1.el7.x86_64
Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍