Katib doesn't support mpijob #1181

@YuxiJin-tobeyjin

/kind bug

What steps did you take and what happened:
I deployed Katib and mpi-operator in my local Kubernetes cluster:

kubectl get po -n kubeflow
NAME                                   READY   STATUS    RESTARTS   AGE
katib-controller-b6dc87fcb-2lrtj       1/1     Running   0          26h
katib-db-manager-79fd46648b-scxx8      1/1     Running   0          2d3h
katib-mysql-7f8bc6956f-fxkgl           1/1     Running   0          13d
katib-ui-74bcbd8b75-bwppw              1/1     Running   0          13d

I then used kubectl to create an Experiment whose trial template is an MPIJob. Creation failed with the following error:

Error from server: error when creating "tt-katib.yaml": admission webhook "validating.experiment.katib.kubeflow.org" denied the request: Invalid spec.trialTemplate: Job type kubeflow.org/v1alpha2, Kind=MPIJob not supported.
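To make the failure mode concrete, here is a minimal sketch of the kind of check that would produce this message. It is not Katib's actual source: the function name `validate_trial_template` and the `SUPPORTED_KINDS` set are hypothetical, and the set is simply assumed from the three job kinds this report lists as supported.

```python
# Hypothetical illustration only; this is NOT Katib's real validation code.
# SUPPORTED_KINDS is assumed from the kinds this issue names as supported.
SUPPORTED_KINDS = {"Job", "TFJob", "PyTorchJob"}


def validate_trial_template(api_version: str, kind: str) -> None:
    """Reject a trial template whose kind is not in the supported set."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(
            f"Invalid spec.trialTemplate: Job type {api_version}, "
            f"Kind={kind} not supported."
        )


if __name__ == "__main__":
    try:
        validate_trial_template("kubeflow.org/v1alpha2", "MPIJob")
    except ValueError as err:
        print(err)
```

Under this assumption, accepting MPIJob would mean adding the kind to the supported set and teaching the controller how to watch MPIJob status, rather than only relaxing the webhook.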

What did you expect to happen:
The Experiment is created successfully, and its Trials and MPIJobs run properly.

Anything else you would like to add:
Currently only Job, TFJob, and PyTorchJob are supported. Please consider supporting MPIJob (mpi-operator) as well.

Environment:

  • Kubernetes version (from kubectl version): 1.14.1
  • OS (from /etc/os-release): Ubuntu 16.04.4
