Description
Feature request
Through some mechanism, provide the scheduler enough information so that the workspace PV(s) are provisioned somewhere the pipeline can complete. Nice to have: if a node scale-up is required, allow it to happen while other, less resource-intensive tasks are running.
Use case
As a cluster admin I would like to run pipelines that are hard to schedule due to resource constraints in the cluster.
Consider the following pipeline:
(git-clone) -> (build) -> (test) -> (copy-assets)
The test task in my pipeline is hard to schedule: it requires 75% of the CPU and memory of a node.
I have a cluster divided across 3 availability zones, and each PV is only available in a single availability zone. I have a Machine Autoscaler for each availability zone that can scale up a new node, but some of the autoscalers may be at their limit and will not create any additional nodes.
When the git-clone task is scheduled, it doesn't give the scheduler enough "hints" for the PV to be created in an availability zone where the test task can be completed. The "affinity-assistant" tends to make matters worse.
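For context, this behaviour follows from topology-aware volume binding: with a WaitForFirstConsumer StorageClass (an assumption about the cluster's storage setup; the class name and provisioner below are illustrative), the PV's zone is fixed by whichever pod consumes the PVC first, based only on that pod's requests.

```yaml
# Sketch of a typical topology-aware StorageClass (illustrative, not from this issue).
# With WaitForFirstConsumer the PV's availability zone is chosen when the FIRST
# consuming pod (git-clone, or the affinity-assistant pod) is scheduled; the much
# larger test task that runs later is never taken into account.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-wffc
provisioner: kubernetes.io/aws-ebs
volumeBindingMode: WaitForFirstConsumer
```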
I can currently work around this limitation (assuming the affinity-assistant is disabled) by adding a task whose resource requests match those of the test task, so my pipeline looks like this:
(preallocate-resources) -> (git-clone) -> (build) -> (test) -> (copy-assets)
This is an ugly hack, especially because I can no longer parameterize the resource requests due to this validation failure (for reference, I am including my task__preallocate-resources.yaml at the end of this issue):
Error from server (BadRequest): error when creating "task__preallocate-resources.yaml": admission webhook "webhook.pipeline.tekton.dev" denied the request: mutation failed: cannot decode incoming new object: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'
Additionally, the affinity-assistant feature breaks this workaround because the affinity-assistant pod is scheduled before the first task pod.
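For illustration, this is roughly how the workaround is wired into the pipeline (a sketch only; the pipeline name, task names, and the workspace name "source" are assumptions, and the remaining tasks are elided):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: build-and-test
spec:
  workspaces:
    - name: source
  tasks:
    # Runs first on the shared workspace, so its large resource requests decide
    # which node/zone the workspace PV lands in.
    - name: preallocate-resources
      taskRef:
        name: preallocate-resources
      workspaces:
        - name: source
          workspace: source
    - name: git-clone
      runAfter: ["preallocate-resources"]
      taskRef:
        name: git-clone
      workspaces:
        - name: source
          workspace: source
    # ... build, test, and copy-assets follow, chained with runAfter
```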
Even before implementing the preallocate-resources workaround described above, I set a priorityClassName on the test task that is slightly higher than the default and allows for preemption. This lets the scheduler evict pods from a node to make room for the test task pod. I'm not sure exactly why, but when the affinity-assistant feature is enabled this workaround no longer appears to work.
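A minimal sketch of that priority workaround, assuming the class is applied to the test task through the PipelineRun's per-task pod template (the class name, value, and PipelineRun below are illustrative, not from this issue):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: pipeline-test-priority
value: 1000                             # slightly above the default of 0
preemptionPolicy: PreemptLowerPriority  # allows evicting lower-priority pods
globalDefault: false
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: build-and-test-run
spec:
  pipelineRef:
    name: build-and-test
  taskRunSpecs:
    - pipelineTaskName: test
      taskPodTemplate:
        priorityClassName: pipeline-test-priority
```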
I know the documentation states in many places to use a LimitRange to automatically provide resource requests for the pods. I feel that LimitRanges are a poor solution to this problem: every pod scheduled in the namespace picks up the LimitRange defaults, so in my use case the test task pod would never schedule because the affinity-assistant pod would carry such large resource requests.
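To make the objection concrete, here is a sketch of the kind of LimitRange the docs suggest, with illustrative values sized for the test task; because the defaults apply to every container in the namespace that omits its own requests (the affinity-assistant pod included), node capacity is consumed before the test pod ever schedules:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: pipeline-defaults
spec:
  limits:
    - type: Container
      defaultRequest:   # applied to any container without explicit requests
        cpu: "6"
        memory: 24Gi
      default:          # applied as limits when none are set
        cpu: "6"
        memory: 24Gi
```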
For reference, task__preallocate-resources.yaml, which fails validation:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: preallocate-resources
spec:
  description: >-
    Use this task first in a pipeline to make sure that the workspace PV is
    allocated somewhere that the pipeline can complete.
  params:
    - name: cpu
      default: '1000m'
      description: cpu required for largest pipeline task
      type: string
    - name: memory
      default: '4Gi'
      description: memory required for largest pipeline task
      type: string
    - name: ephemeral-storage
      default: '5Gi'
      description: ephemeral-storage required for largest pipeline task
      type: string
  steps:
    - name: nop
      image: registry.redhat.io/openshift-pipelines-tech-preview/pipelines-git-init-rhel8@7e18e13a94c9c82e369274984500401e27684d3565393cf6ed2bad55f2d751bc
      command: [echo]
      args: ["Startup Complete"]
      resources:
        requests:
          cpu: $(params.cpu)
          ephemeral-storage: $(params.ephemeral-storage)
          memory: $(params.memory)
  workspaces:
    - name: source