Commit c81818d

epa095 authored and k8s-ci-robot committed
MetricController: Run only a single job per task (#660)
This changes the `spec.concurrencyPolicy` of the metric collector CronJob from "Allow" (the default) to "Forbid". The CronJob used to create a new job even if the previous job had not finished. On high-load clusters this could lead to a large number of jobs that never finished. This fixes #659.
1 parent 702703b commit c81818d

2 files changed: 2 additions and 0 deletions


manifests/v1alpha1/studyjobcontroller/metricsControllerConfigMap.yaml

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ data:
   schedule: "*/1 * * * *"
   successfulJobsHistoryLimit: 0
   failedJobsHistoryLimit: 1
+  concurrencyPolicy: Forbid
   jobTemplate:
     spec:
       backoffLimit: 0

manifests/v1alpha2/katib-controller/metricsControllerConfigMap.yaml

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ data:
   schedule: "*/1 * * * *"
   successfulJobsHistoryLimit: 0
   failedJobsHistoryLimit: 1
+  concurrencyPolicy: Forbid
   jobTemplate:
     spec:
       backoffLimit: 0
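
For readers unfamiliar with the field, the following is a minimal sketch of where `concurrencyPolicy` sits in a complete CronJob, not the Katib template itself. The resource name, container name, image, and command are hypothetical placeholders, and the `apiVersion` assumes a recent cluster; the manifests changed in this commit may use an older API version.

```yaml
# Illustrative standalone CronJob. The name, image, and command below are
# placeholders and are NOT taken from the Katib manifests.
apiVersion: batch/v1          # assumption: Kubernetes 1.21+; older clusters used batch/v1beta1
kind: CronJob
metadata:
  name: example-metrics-collector
spec:
  schedule: "*/1 * * * *"     # run once per minute, as in the diff
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  # Allow (default): start a new job even if the previous run is still active.
  # Forbid: skip the new run while the previous run is still active.
  # Replace: cancel the active run and start the new one in its place.
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 0         # do not retry a failed job
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: collector
            image: busybox
            command: ["sh", "-c", "echo collecting metrics"]
```

With a one-minute schedule, any run that takes longer than a minute overlaps the next tick. Under "Allow" those runs pile up, while under "Forbid" the overlapping tick is skipped, which matches the behaviour described in the commit message.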

0 commit comments