
Conversation

@JamesNK (Member) commented Aug 20, 2024

Description

Fixes #4893

This PR fixes console log order changing and line numbers increasing. Note that it doesn't address anything related to logs from ILogger inside the app host; that can be split out into its own issue and fixed later.

Changes:

  • Turns on timestamps in executable logs
  • Host parses the date from incoming log lines
  • Inserts log lines into the backlog in date-sorted order (see the sketch after this list)
  • Adds some logging unit tests
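
For illustration, a minimal sketch of the parse-and-sorted-insert approach, assuming each incoming line is prefixed with an RFC 3339 timestamp. The type and member names here are hypothetical, not the PR's actual API:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical names for illustration only.
record LogEntry(DateTimeOffset Timestamp, string Content);

sealed class LogBacklog
{
    private readonly List<LogEntry> _entries = new();

    public void Add(string rawLine)
    {
        // e.g. "2024-08-20T03:59:00.1234567Z Application started." -> timestamp + content.
        var spaceIndex = rawLine.IndexOf(' ');
        var entry = spaceIndex > 0
            && DateTimeOffset.TryParse(rawLine.AsSpan(0, spaceIndex), CultureInfo.InvariantCulture,
                   DateTimeStyles.AssumeUniversal, out var timestamp)
            ? new LogEntry(timestamp, rawLine[(spaceIndex + 1)..])
            : new LogEntry(DateTimeOffset.UtcNow, rawLine); // fall back if there is no prefix

        // Lines usually arrive nearly in order, so scan backwards for the insertion
        // point; this also keeps lines with equal timestamps in arrival order.
        var index = _entries.Count;
        while (index > 0 && _entries[index - 1].Timestamp > entry.Timestamp)
        {
            index--;
        }
        _entries.Insert(index, entry);
    }
}
```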

[Attachment: console-logs-line-number-fix]

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
      • Link to aspire-docs issue:
    • No

@JamesNK added the area-app-model label (Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication) Aug 20, 2024
@JamesNK requested a review from mitchdenny as a code owner August 20, 2024 03:59
@JamesNK force-pushed the jamesnk/console-logs-fix branch from cece5a8 to 947a905 August 20, 2024 07:20
@JamesNK merged commit 2416287 into main Aug 20, 2024
@JamesNK deleted the jamesnk/console-logs-fix branch August 20, 2024 14:08
Review thread on:

```csharp
var startupStderrStream = await kubernetesService.GetLogStreamAsync(resource, Logs.StreamTypeStartupStdErr, follow: true, timestamps: timestamps, cancellationToken).ConfigureAwait(false);
var startupStdoutStream = await kubernetesService.GetLogStreamAsync(resource, Logs.StreamTypeStartupStdOut, follow: true, timestamps: timestamps, cancellationToken).ConfigureAwait(false);
var startupStderrStream = await kubernetesService.GetLogStreamAsync(resource, Logs.StreamTypeStartupStdErr, follow: true, timestamps: true, cancellationToken).ConfigureAwait(false);
```
Member

It looks like we only ever call GetLogStreamAsync with timestamps: true. So we can remove that parameter to make the code easier to read.
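
If the parameter were removed, the call sites would reduce to something like this (a sketch based on the diff above; the actual internal method shape may differ):

```csharp
// Sketch only: timestamps is no longer a parameter because the service always requests them.
var startupStderrStream = await kubernetesService.GetLogStreamAsync(resource, Logs.StreamTypeStartupStdErr, follow: true, cancellationToken).ConfigureAwait(false);
var startupStdoutStream = await kubernetesService.GetLogStreamAsync(resource, Logs.StreamTypeStartupStdOut, follow: true, cancellationToken).ConfigureAwait(false);
```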

Member

I'd prefer to keep the timestamps parameter on the KubernetesService class in case we find a use case for logs without timestamps in the future.

Member

We can always add it back once we find a need. It's internal.

Comment on lines +313 to +319

```csharp
var delay = TimeSpan.FromMilliseconds(100);

// There could be an initial burst of logs as they're replayed. Give them the opportunity to be loaded
// into the backlog in the correct order and returned before streaming logs as they arrive.
for (var i = 0; i < 3; i++)
{
    await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
```
Member

This seems fragile. Is there a way to use concurrency controls (reset events, semaphores, etc.) to ensure the ordering happens correctly instead of delaying based on time?

Member Author

Yeah, I agree that's not the best.

The problem is that ordering values in the backlog doesn't mean much once you start to stream log items as they're received. Sending items as they're received is fine after the logs have been replayed from DCP.

I think what is needed from DCP is an indication of how many lines are being replayed. Then we can tell the logger service that the backlog is complete (and all the log items in it are sorted correctly), and that it can send the backlog and then send new log lines in real time.
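
For illustration only, the kind of signal-based gate that would enable, assuming (hypothetically) that DCP reported the replay line count up front. None of these names exist in the codebase today:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: a gate that completes once the announced number of
// replayed lines has been inserted into the sorted backlog.
sealed class ReplayGate
{
    private readonly TaskCompletionSource _replayComplete =
        new(TaskCreationOptions.RunContinuationsAsynchronously);
    private int _expectedLines;
    private int _receivedLines;

    // Called once, if/when DCP tells us how many lines it will replay.
    public void BeginReplay(int expectedLines)
    {
        _expectedLines = expectedLines;
        if (expectedLines == 0)
        {
            _replayComplete.TrySetResult(); // nothing to replay
        }
    }

    // Called for each replayed line after it is inserted into the backlog.
    public void OnLineInserted()
    {
        if (Interlocked.Increment(ref _receivedLines) == _expectedLines)
        {
            _replayComplete.TrySetResult(); // backlog is complete and sorted
        }
    }

    // Watchers await this instead of a fixed Task.Delay before streaming live logs.
    public Task ReplayCompleteAsync() => _replayComplete.Task;
}
```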

@eerhardt (Member) commented Aug 20, 2024

I don't understand why the existing flushBacklogSync doesn't work. We delay writing to the channel until we've grabbed the backlog snapshot.

And then we check to ensure the same message wasn't already in the backlog.

Member Author

That's not the problem this change is fixing.

See #4893 (comment).

Member

Why do we need to "Add a small delay to ensure the backlog is replayed from DCP and ordered correctly"? Why isn't the order of operations:

  1. Grab the backlog snapshot
  2. The following can happen concurrently:
    a. After the snapshot has been made, as new logs get written, write them to the channel
    b. Write the snapshot to the listener
  3. After the snapshot has been written, write the messages from the channel
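
For reference, a minimal sketch of that order of operations using System.Threading.Channels; the class and its wiring are illustrative, not Aspire's actual logger service, and it assumes a single watcher:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

sealed class LogWatcher
{
    private readonly object _lock = new();
    private readonly List<string> _backlog = new();
    private Action<string>? _onNewLine; // wired to the log source in a real service

    // Called whenever a new log line arrives from the source.
    public void OnLine(string line)
    {
        lock (_lock)
        {
            _backlog.Add(line);
            _onNewLine?.Invoke(line);
        }
    }

    public async IAsyncEnumerable<string> WatchAsync()
    {
        var channel = Channel.CreateUnbounded<string>();
        string[] snapshot;
        lock (_lock)
        {
            // 1. Grab the backlog snapshot. 2a. Lines arriving after this
            //    point are routed to the channel instead.
            snapshot = _backlog.ToArray();
            _onNewLine = line => channel.Writer.TryWrite(line);
        }

        // 2b. Write the snapshot to the listener.
        foreach (var line in snapshot)
        {
            yield return line;
        }

        // 3. After the snapshot has been written, stream messages from the channel.
        await foreach (var line in channel.Reader.ReadAllAsync())
        {
            yield return line;
        }
    }
}
```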

Member Author

> Grab the backlog snapshot

This doesn't work because it's empty. When you view the console logs page and the request goes to the host to display logs, the snapshot has nothing in it. We can't immediately start streaming all the initial logs that DCP returns to us because we receive them out of order.

The delay gives DCP the opportunity to send us the initial logs and for them to be put in order in the backlog; the backlog is then returned to the UI, and new logs are streamed as they arrive.

@davidfowl (Member)

I don't fully understand this change tbh. Why is there a random delay? Does that happen for custom resources or is this only for DCP-based logs? Why isn't the date parsing in that logic and why is it in the core of the resource logger?

