Suggestions pod does not fail when exception is raised #1120

@StefanoFioravanzo

/kind bug

What steps did you take and what happened:
I created a Katib experiment using grid search and two int parameters. The CRD looks like this:

apiVersion: kubeflow.org/v1alpha3
kind: Experiment
metadata:
  labels:
    controller-tools.k8s.io: '1.0'
  name: katib-simple-trial
spec:
  algorithm:
    algorithmName: grid
  parallelTrialCount: 1
  maxFailedTrialCount: 6
  maxTrialCount: 12
  objective:
    additionalMetricNames:
    goal: 100
    objectiveMetricName: result
    type: maximize
  parameters:
  - feasibleSpace:
      max: '50'
      min: '1'
      step: '10'
    name: a
    parameterType: int
  - feasibleSpace:
      max: '1'
      min: '50'
      step: '9'
    name: b
    parameterType: int
  trialTemplate:
    goTemplate:
      rawTemplate: |
        apiVersion: batch/v1
        kind: Job
        metadata:
          name: {{.Trial}}
          namespace: {{.NameSpace}}
        spec:
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
            spec:
              restartPolicy: Never
              containers:
                - name: {{.Trial}}
                  image: <myimage>
                  command:
                    - python3 -u -c "<some_command>"

What did you expect to happen:
Since parameter b has min=50 and max=1, I expected the submission of the CRD to fail.

What actually happens is that the suggestions pod is created and continuously produces the following error:

ERROR:grpc._server:Exception calling application: Low must be lower than high
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/grpc/_server.py", line 434, in _call_behavior
    response_or_iterator = behavior(argument, context)
  File "/usr/src/app/github.com/kubeflow/katib/pkg/suggestion/v1alpha3/chocolate_service.py", line 39, in GetSuggestions
    search_space, trials, request.request_number)
  File "/usr/src/app/github.com/kubeflow/katib/pkg/suggestion/v1alpha3/chocolate/base_chocolate_service.py", line 33, in getSuggestions
    int(param.min), int(param.max), 1)
  File "/usr/local/lib/python3.6/site-packages/chocolate/space.py", line 140, in __init__
    assert low < high, "Low must be lower than high"
AssertionError: Low must be lower than high

So the Katib experiment runs indefinitely, never failing and never producing any trials. The controller logs don't help either, and no events are generated; only the suggestions pod reports these errors.
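
To illustrate the kind of up-front check that would reject this experiment, here is a minimal sketch in Python. It is not Katib code: the function name `find_invalid_int_ranges` and the plain-dict representation of the parameters are assumptions made for illustration. The condition mirrors the `low < high` assertion from chocolate's space.py shown in the traceback.

```python
# Hypothetical pre-flight check; not part of the Katib code base.
from typing import Dict, List


def find_invalid_int_ranges(parameters: List[Dict]) -> List[str]:
    """Return one message per int parameter whose min is not below its max."""
    problems = []
    for param in parameters:
        if param.get("parameterType") != "int":
            continue
        space = param["feasibleSpace"]
        # The experiment spec stores the bounds as strings, e.g. '50'.
        low, high = int(space["min"]), int(space["max"])
        # Same condition that chocolate asserts: low must be strictly below high.
        if low >= high:
            problems.append(
                f"parameter {param['name']!r}: "
                f"min ({low}) must be lower than max ({high})"
            )
    return problems


# Parameters copied from the experiment in this report.
parameters = [
    {"name": "a", "parameterType": "int",
     "feasibleSpace": {"min": "1", "max": "50", "step": "10"}},
    {"name": "b", "parameterType": "int",
     "feasibleSpace": {"min": "50", "max": "1", "step": "9"}},
]

for message in find_invalid_int_ranges(parameters):
    print(message)
```

With the parameters above, only b is flagged. Running a check like this when the experiment is submitted would surface the problem immediately instead of leaving it to the suggestions pod.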

Environment:

  • Kubeflow version: 1.0
  • Minikube version: 1.2.0 (MiniKF latest)
  • Kubernetes version: 1.14
