Skip to content

Conversation

@elenzio9
Copy link
Contributor

In this PR:

  • Create a distinct LOGS tab, which displays the trial's logs in the Trial details page.
  • Don't show the backend's error popup for logs, but show the message error in the admonition.
  • Update the message the backend sends to not just expose that logs are not there because retain might not be set, but also because the cluster was scaled down.

Screenshots

Show the trial's logs

image

If the user is in the logs tab and the logs request is not 200, then show the message error in the admonition

image

Related issue: #1764
Related PR: #2039

@johnugeorge
Copy link
Member

Thanks @elenzio9 for this

/assign @kimwnasptd

@tenzen-y
Copy link
Member

/assign @orfeas-k

return "", errors.New(`Logs for the trial could not be found.
Was 'retain: true' specified in the Experiment definition?
An example can be found here: https://github.com/kubeflow/katib/blob/7bf39225f7235ee4ba6cf285ecc2c455c6471234/examples/v1beta1/argo/argo-workflow.yaml#L33`)
An example can be found here: https://github.com/kubeflow/katib/blob/7bf39225f7235ee4ba6cf285ecc2c455c6471234/examples/v1beta1/argo/argo-workflow.yaml#L33.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one seems to be verbose for a data scientist. May be just log message instead of showing in UI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnugeorge Do you mean to remove the example completely?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this might be too verbose for end users. I'd suggest to change this text to something like:

Failed to find logs for this Trial. Make sure you've set "spec.trialTemplate.retain" field to "true" in the Experiment definition. If this error persists then the Pod's logs are not currently persisted in the cluster.```

@andreyvelich
Copy link
Member

Thank you very much for this amazing feature, many users wait for this so long!
cc @d-gol

@tenzen-y
Copy link
Member

@elenzio9 Can you rebase and resolve conflicts?

@kimwnasptd
Copy link
Member

@andreyvelich @johnugeorge @tenzen-y I won't be able to review this today, to make it to the RC0. But we can markthis as one of the PRs to merge through the feature freeze. cc @DomFleischmann

This is a bigger one and I wouldn't want to rush it today

@tenzen-y
Copy link
Member

tenzen-y commented Jan 26, 2023

@andreyvelich @johnugeorge @tenzen-y I won't be able to review this today, to make it to the RC0. But we can markthis as one of the PRs to merge through the feature freeze. cc @DomFleischmann

This is a bigger one and I wouldn't want to rush it today

@kimwnasptd
Thanks for your effort. Does that mean you would like to include this after the first RC release (e.g., RC1)?

@andreyvelich
Copy link
Member

@kimwnasptd I think, it should be fine to merge this PR after the RC0. cc @johnugeorge

@tenzen-y
Copy link
Member

I'm ok with not including this in RC0.

@elenzio9 elenzio9 force-pushed the feature-elena-kwa-trials-logs-tab branch from 1f8185b to cbc8644 Compare January 26, 2023 14:39
@elenzio9
Copy link
Contributor Author

@elenzio9 Can you rebase and resolve conflicts?

@tenzen-y Just rebased and pushed again!

Copy link
Contributor

@orfeas-k orfeas-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @elenzio9 for the PR! It looks great, here are some minor comments.

Regarding the error from the backend, I 'm not really sure what would be better to show to the user so I 'll also @kimwnasptd comment on this.

this.backendService.getTrialLogs(this.trialName, this.namespace).subscribe(
logs => {
this.trialLogs = logs;
this.logsRequestError = '';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit but I think it would make more sense semantically pass null to logsRequestError since there is no error.

return;
}

let arrayOfLogs = trialLogs.split('\n');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a const instead of let.


return this.http.get(url).pipe(
catchError(error => this.handleError(error, false)),
map((resp: any) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .map doesn't serve a purpose here. We could change this to

    return this.http
      .get(url)
      .pipe(catchError(error => this.handleError(error, false)));

WDYT @elenzio9 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree! @orfeas-k Thanks for your comments! I'll make the changes and push the branch again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 'm good then, regarding the frontend: /lgtm

@elenzio9 elenzio9 force-pushed the feature-elena-kwa-trials-logs-tab branch 2 times, most recently from ed25489 to 8c65da2 Compare February 1, 2023 10:39
@andreyvelich
Copy link
Member

@orfeas-k @kimwnasptd @elenzio9 Are we ready to merge this PR ?

@andreyvelich
Copy link
Member

@kimwnasptd Please review this PR today if you can.

@elenzio9 elenzio9 force-pushed the feature-elena-kwa-trials-logs-tab branch from 8c65da2 to 7ae152c Compare February 14, 2023 11:18
Copy link
Member

@kimwnasptd kimwnasptd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a pass, had some small comments but the code LGTM! Also deployed it in my cluster and everything works as expected

return "", errors.New(`Logs for the trial could not be found.
Was 'retain: true' specified in the Experiment definition?
An example can be found here: https://github.com/kubeflow/katib/blob/7bf39225f7235ee4ba6cf285ecc2c455c6471234/examples/v1beta1/argo/argo-workflow.yaml#L33`)
An example can be found here: https://github.com/kubeflow/katib/blob/7bf39225f7235ee4ba6cf285ecc2c455c6471234/examples/v1beta1/argo/argo-workflow.yaml#L33.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this might be too verbose for end users. I'd suggest to change this text to something like:

Failed to find logs for this Trial. Make sure you've set "spec.trialTemplate.retain" field to "true" in the Experiment definition. If this error persists then the Pod's logs are not currently persisted in the cluster.```


if (error instanceof HttpErrorResponse) {
if (this.getBackendErrorLog(error) !== undefined) {
return `[${error.status}] ${this.getBackendErrorLog(error)}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this line is the only one that differs from the inherited function. I'd propose to not show the error in this case and not override this function, and just keep the changes from getBackendError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the error status and now the message looks like:

image

Comment on lines 19 to 20
const arrayOfLogs = trialLogs.split('\n');
this.logs = arrayOfLogs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we combine these 2 in one line? I think there's no need to create the arrayOfLogs var

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sure!

<!--if no logs are present at all then show a warning message-->
<ng-container *ngIf="logsRequestError">
<lib-panel icon="error" color="warn">
Error: {{ logsRequestError }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just keep this as {{ logsRequestError }}. The icon in this panel and the text should make it clear that something unexpected has happened

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kimwnasptd Thanks for the above comments! I force pushed the branch with the changes!

@elenzio9 elenzio9 force-pushed the feature-elena-kwa-trials-logs-tab branch from 7ae152c to 1449b9b Compare February 14, 2023 12:16
@andreyvelich
Copy link
Member

It seems that Charmed actions are failing, please can you take a look @DomFleischmann @knkski @ca-scribner?

@kimwnasptd
Copy link
Member

All the changes look good from my side, thanks for the work on this feature @elenzio9 @d-gol!

/lgtm
/approve

@tenzen-y
Copy link
Member

@elenzio9 Can you rebase? Since I fixed CI in #2114.

@ca-scribner
Copy link
Contributor

@tenzen-y beat me to it, but that's what I would have done as well. Seems like black's defaults changed between v22.x and v23.x.

@tenzen-y
Copy link
Member

@tenzen-y beat me to it, but that's what I would have done as well. Seems like black's defaults changed between v22.x and v23.x.

@ca-scribner Thanks for letting me know. We may want to pin the black version.

* Update the message the backend sends to not just expose that logs are
  not there because 'retain' might not be set, but also because the
  cluster was scaled down.

Signed-off-by: Elena Zioga <[email protected]>
In this commit:

* Create a distinct LOGS tab, which displays the trial's logs in the
  Trial details page.
* Don't show the backend's error popup for logs, but show the message
  error in the admonition.

Signed-off-by: Elena Zioga <[email protected]>
@elenzio9 elenzio9 force-pushed the feature-elena-kwa-trials-logs-tab branch from 1449b9b to 340add3 Compare February 14, 2023 18:05
@google-oss-prow google-oss-prow bot removed the lgtm label Feb 14, 2023
@elenzio9
Copy link
Contributor Author

@elenzio9 Can you rebase? Since I fixed CI in #2114.

@tenzen-y Just rebased!

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elenzio9 Greate contribution! Many users are looking forward to this feature!

/lgtm
/approve

@kubeflow/wg-automl-leads Can you restart the flaky tests?

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elenzio9, kimwnasptd, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 1429d61 into kubeflow:master Feb 14, 2023
@tenzen-y
Copy link
Member

@elenzio9 IIRC, we plan to include this feature in RC.1 release. So can you create a PR for cherry-picking of this PR like #2115?

elenzio9 added a commit to arrikto/katib that referenced this pull request Feb 15, 2023
…s details page in KWA

Signed-off-by: Elena Zioga <[email protected]>

---------

Signed-off-by: Elena Zioga <[email protected]>
elenzio9 added a commit to arrikto/katib that referenced this pull request Feb 15, 2023
@elenzio9
Copy link
Contributor Author

@elenzio9 IIRC, we plan to include this feature in RC.1 release. So can you create a PR for cherry-picking of this PR like #2115?

@tenzen-y Just sent the PR: #2117.

johnugeorge pushed a commit to johnugeorge/katib that referenced this pull request Feb 15, 2023
…ubeflow#2101)

* backend: Update error message when no logs could be found

* Update the message the backend sends to not just expose that logs are
  not there because 'retain' might not be set, but also because the
  cluster was scaled down.

Signed-off-by: Elena Zioga <[email protected]>

* frontend: Add LOGS tab in Trial details page

In this commit:

* Create a distinct LOGS tab, which displays the trial's logs in the
  Trial details page.
* Don't show the backend's error popup for logs, but show the message
  error in the admonition.

Signed-off-by: Elena Zioga <[email protected]>

---------

Signed-off-by: Elena Zioga <[email protected]>
@johnugeorge johnugeorge mentioned this pull request Feb 15, 2023
google-oss-prow bot pushed a commit that referenced this pull request Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants