Skip to content

Conversation

fhanau
Copy link
Contributor

@fhanau fhanau commented Jun 26, 2025

Change how we drain the TailStreamCustomEventImpl incoming request to how it was before #4030 (fixing stream cancellation warning): This caused drain() to be invoked too late, resulting in "failed to invoke drain()" warnings and stream cancellation issues.

===============
Unfortunately I do not remember why I made this change – it seemed important, but testing suggests that tests still work fine without it. This may have caused the "failed to invoke drain()" warnings we've seen in testing (I believe that getting those is still possible if there is very bad timing where a custonEvent gets canceled right after run() is called, but it should be less common).
As to why this was only noticed after the seemingly unrelated #4354 and may have caused cancellation issues: I don't have any conclusive explanation, but it seems possible that this started happening because the streaming tail worker needs to do more work (3 more events from the top-level "worker"), making it more likely that it gets canceled when running under load (I was able to reproduce the issue locally when testing under load with CPU time limits set to near-zero). When there is a cancellation before drain() is called, we get a warning about it and may also see more cancellation issues since draining the IncomingRequest is needed for promises that are attached to the IoContext to not get cancelled.

@fhanau fhanau requested review from jasnell and mar-cf June 26, 2025 18:32
@fhanau fhanau requested review from a team as code owners June 26, 2025 18:32
@mar-cf
Copy link
Contributor

mar-cf commented Jun 26, 2025

Add tests to validate the effects of this change.

@fhanau
Copy link
Contributor Author

fhanau commented Jun 27, 2025

Add tests to validate the effects of this change.

How? The only observable effect of this change is drain() happening when a request gets canceled since drain is called earlier, and the "failed to invoke drain()" warning as well as cancellation warnings not happening. Requests being canceled generally causes tests to fail, and we have no "check if a log is absent from a failing test" mechanism.
As a corollary, we don't know for sure if this fixes the issues we've seen in testing (I would like to validate that too), but I expect that the given change will resolve at least one source of failed to invoke drain and stream/promise cancellations. There may be more so cases, but those can't always be avoided (if an entire request gets canceled) and we may want to limit logging for them at a later time. @jasnell should weigh in here too, but I think this change can largely address the issues in testing, but we can't really know until we've tried landed it.

Change how we drain the TailStreamCustomEventImpl incoming request to how it was
before #4030 (fixing stream cancellation warning): This caused drain() to be
invoked too late, resulting in "failed to invoke drain()" warnings and stream
cancellation issues.
@fhanau fhanau force-pushed the felix/062625-STW-drain branch from 56cfe60 to 50d514e Compare July 2, 2025 16:00
@mar-cf mar-cf changed the title [o11y] Revert STW event drain() change [o11y] Drain incoming request io context when an js exception is thrown in tail stream custom event handler. Jul 7, 2025
@fhanau fhanau merged commit 3b20700 into main Jul 7, 2025
19 checks passed
@fhanau fhanau deleted the felix/062625-STW-drain branch July 7, 2025 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants