Closed
After deploying Katib by following the getting started guide, I see several pods stuck in CrashLoopBackOff:
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                      READY   STATUS             RESTARTS   AGE
katib         dlk-manager-698ccb5fdc-hb7xc              0/1     CrashLoopBackOff   6          13m
katib         modeldb-backend-6855d95fb4-2sxw9          1/1     Running            0          14m
katib         modeldb-db-6cf5bb764-5s65f                1/1     Running            0          14m
katib         modeldb-frontend-5868bffc64-rhrr7         1/1     Running            0          14m
katib         vizier-core-86c5566c88-kvsp9              0/1     CrashLoopBackOff   6          13m
katib         vizier-db-64557596dc-mpgh4                1/1     Running            0          13m
katib         vizier-suggestion-random-6b4d6db6-m8l94   0/1     CrashLoopBackOff   6          13m
kube-system   kube-dns-5c6c5b55b-qmd9l                  3/3     Running            0          16m
I've managed to get it running; it turns out the container command in the manifests is not correct. For example, I had to change this:
spec:
  serviceAccountName: vizier-core
  containers:
  - name: vizier-core
    image: katib/vizier-core
    args:
    - "-w"
    - "dlk"
    ports:
    - name: api
      containerPort: 6789
to
spec:
  serviceAccountName: vizier-core
  containers:
  - name: vizier-core
    image: katib/vizier-core
    args:
    - ./vizier-manager  # <-- add this line
    - "-w"
    - "dlk"
    ports:
    - name: api
      containerPort: 6789
However, according to the Dockerfile for vizier-core, vizier-manager is already set as the entrypoint:
FROM golang:alpine AS build-env
# The GOPATH in the image is /go.
ADD . /go/src/github.com/kubeflow/hp-tuning
WORKDIR /go/src/github.com/kubeflow/hp-tuning/manager
RUN go build -o vizier-manager
FROM alpine:3.7
WORKDIR /app
COPY --from=build-env /go/src/github.com/kubeflow/hp-tuning/manager/vizier-manager /app/
COPY --from=build-env /go/src/github.com/kubeflow/hp-tuning/manager/visualise /
ENTRYPOINT ["./vizier-manager"]
CMD ["-w", "dlk"]
Is there anything wrong with the above 👆 setup?
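
For reference, Kubernetes documents that a container's `command` field overrides the image ENTRYPOINT and its `args` field overrides the image CMD. Below is a minimal Python sketch of that rule, to show what each manifest would run if the deployed image matched the Dockerfile above. The function name `effective_command` is hypothetical (it is not part of any Kubernetes API), and whether the published `katib/vizier-core` image really has this ENTRYPOINT is an assumption.

```python
from typing import List, Optional


def effective_command(
    entrypoint: List[str],
    cmd: List[str],
    command: Optional[List[str]] = None,
    args: Optional[List[str]] = None,
) -> List[str]:
    """Combine image ENTRYPOINT/CMD with a pod's command/args (hypothetical helper)."""
    if command:
        # `command` replaces ENTRYPOINT, and the image CMD is ignored.
        return list(command) + list(args or [])
    if args:
        # `args` alone replaces CMD but keeps the image ENTRYPOINT.
        return list(entrypoint) + list(args)
    # Neither field set: the image defaults apply.
    return list(entrypoint) + list(cmd)


# Values taken from the Dockerfile above (assumed to match the deployed image).
ENTRYPOINT = ["./vizier-manager"]
CMD = ["-w", "dlk"]

original = effective_command(ENTRYPOINT, CMD, args=["-w", "dlk"])
workaround = effective_command(
    ENTRYPOINT, CMD, args=["./vizier-manager", "-w", "dlk"]
)

if __name__ == "__main__":
    print(original)    # ['./vizier-manager', '-w', 'dlk']
    print(workaround)  # ['./vizier-manager', './vizier-manager', '-w', 'dlk']
```

Under that assumption, the original manifest would already run `./vizier-manager -w dlk`, and the workaround would pass `./vizier-manager` to the binary as an extra argument. The workaround succeeding therefore suggests the image actually being pulled may not have been built from this Dockerfile. Running `kubectl logs` on a crashing pod, or `docker inspect` on the image, would show which is the case.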