experiment controller is not showing any events when fails to reconcile all trials #1663

@henrysecond1

/kind bug

What steps did you take and what happened:

The experiment controller does not emit any events when it fails to reconcile all trials.

For example, suppose a trial parameter reference is misconfigured as shown below. The parameter is named num-layers, but its reference in trialParameters is mistyped as num-layer, so every trial fails to be created.

parameters:
  - feasibleSpace:
    ...
    name: num-layers
    parameterType: int
trialTemplate:
    ...
    trialParameters:
    - name: numberLayers
      reference: num-layer # typo
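
The mismatch in the spec above can be illustrated with a small standalone check. This is only a sketch, not Katib's actual validation code: the `trialParameter` type and the `unknownReferences` helper are hypothetical names introduced for illustration, and the parameter names are taken from the controller log quoted in this issue.

```go
package main

import "fmt"

// trialParameter is a hypothetical type mirroring the two fields
// (name, reference) used under trialParameters in the spec above.
type trialParameter struct {
	Name      string
	Reference string
}

// unknownReferences returns every reference that does not match
// the name of a parameter defined in the search space.
func unknownReferences(paramNames []string, trialParams []trialParameter) []string {
	known := make(map[string]bool, len(paramNames))
	for _, n := range paramNames {
		known[n] = true
	}
	var missing []string
	for _, tp := range trialParams {
		if !known[tp.Reference] {
			missing = append(missing, tp.Reference)
		}
	}
	return missing
}

func main() {
	// Parameter names as they appear in the controller log.
	params := []string{"lr", "num-layers", "optimizer"}
	trialParams := []trialParameter{{Name: "numberLayers", Reference: "num-layer"}}

	for _, ref := range unknownReferences(params, trialParams) {
		fmt.Printf("unable to find parameter %q in the search space\n", ref)
	}
}
```

With the typo in place, the helper reports num-layer as unknown, which corresponds to the error the controller logs.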

The reason for the failure is visible in the controller log. However, users who are not authorized to access the controller cannot find out why their trials are not created, because the experiment controller emits no events.

$ kubectl describe experiment random-experiment -n user
...
Status:
  Completion Time:  <nil>
  Conditions:
    Message:               Experiment is created
    Reason:                ExperimentCreated
    Status:                True
    Type:                  Created
  Current Optimal Trial:
    Observation:
Events:              <none>

What did you expect to happen:

The experiment controller emits events when it fails to reconcile all trials.
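
A rough sketch of the expected behavior, under stated assumptions: the `eventRecorder` interface below is a simplified, hypothetical stand-in modeled on client-go's `record.EventRecorder` (whose `Eventf` takes a `runtime.Object` rather than a string), and `reconcileTrials`, `memoryRecorder`, `errMissingParam`, and the `ReconcileFailed` reason are illustrative names, not Katib's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingParam reproduces the failure from the controller log.
var errMissingParam = errors.New("unable to find parameter: num-layer")

// eventRecorder is a hypothetical, simplified stand-in for an event
// recorder; the object is a plain string here to keep the sketch
// free of Kubernetes dependencies.
type eventRecorder interface {
	Eventf(object, eventType, reason, messageFmt string, args ...interface{})
}

// memoryRecorder stores formatted events so the sketch runs without a cluster.
type memoryRecorder struct {
	events []string
}

func (m *memoryRecorder) Eventf(object, eventType, reason, messageFmt string, args ...interface{}) {
	msg := fmt.Sprintf(messageFmt, args...)
	m.events = append(m.events, fmt.Sprintf("%s %s %s: %s", object, eventType, reason, msg))
}

// reconcileTrials runs createTrial and, on failure, records a Warning
// event on the experiment before returning the error to the caller.
func reconcileTrials(rec eventRecorder, experiment string, createTrial func() error) error {
	if err := createTrial(); err != nil {
		rec.Eventf(experiment, "Warning", "ReconcileFailed", "Failed to reconcile trials: %v", err)
		return err
	}
	return nil
}

func main() {
	rec := &memoryRecorder{}
	_ = reconcileTrials(rec, "user/random-experiment", func() error {
		return errMissingParam
	})
	for _, e := range rec.events {
		fmt.Println(e)
	}
}
```

With an event like this attached to the experiment, the failure reason would appear under Events in kubectl describe, so users would not need access to the controller log.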

Anything else you would like to add:

Relevant logs in Katib controller

Fail to get RunSpec from experiment","Experiment":"user/random-experiment","error":"Unable to find parameter: num-layer in parameter assignment map[lr:0.026271422193467404 num-layers:5 optimizer:sgd

Environment:

  • Kubeflow version (kfctl version): v1.3
  • Kubernetes version: (use kubectl version): v1.18.10
  • OS (e.g. from /etc/os-release): CentOS 7.9

If it's okay, I'd like to contribute to solving the issue
