[o11y] Drain incoming request io context when a JS exception is thrown in tail stream custom event handler. #4415
Revert how we drain the TailStreamCustomEventImpl incoming request to how it was before #4030 (fixing stream cancellation warning): the change made in #4030 caused drain() to be invoked too late, resulting in "failed to invoke drain()" warnings and stream cancellation issues.
===============
Unfortunately, I do not remember why I made this change. It seemed important, but testing suggests that the tests still pass without it. This may have caused the "failed to invoke drain()" warnings we've seen in testing (I believe that getting those is still possible with very bad timing, where a customEvent gets canceled right after run() is called, but it should be less common).
As to why this was only noticed after the seemingly unrelated #4354 and may have caused cancellation issues: I don't have a conclusive explanation. It seems possible that this started happening because the streaming tail worker needs to do more work (3 more events from the top-level "worker"), making it more likely that it gets canceled when running under load (I was able to reproduce the issue locally when testing under load with CPU time limits set to near-zero). When there is a cancellation before drain() is called, we get a warning about it and may also see more cancellation issues, since draining the IncomingRequest is needed to keep promises attached to the IoContext from being canceled.
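To make the timing argument concrete, here is a toy model of the ordering described above. It is only a sketch: `ToyIncomingRequest`, `CancelAt`, `runToyEvent` and `warns` are hypothetical names invented for illustration and do not mirror the actual workerd classes or their behavior beyond the "drain before destruction" rule discussed in this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not workerd's real IncomingRequest.
const std::string kWarning = "warning: failed to invoke drain()";

struct ToyIncomingRequest {
  std::vector<std::string>& events;
  bool drained = false;

  explicit ToyIncomingRequest(std::vector<std::string>& events) : events(events) {}

  // Stands in for drain(): records that draining was arranged.
  void drain() {
    assert(!drained && "drain() should be called at most once");
    drained = true;
    events.push_back("drained");
  }

  // Destroying the request without draining it produces the warning.
  ~ToyIncomingRequest() {
    if (!drained) events.push_back(kWarning);
  }
};

// Where (if anywhere) the hypothetical cancellation hits.
enum class CancelAt { Never, RightAfterRun, DuringLaterWork };

// An early return models cancellation: the request is destroyed on the spot.
void runToyEvent(std::vector<std::string>& events, bool drainEarly, CancelAt cancelAt) {
  ToyIncomingRequest request(events);
  events.push_back("run");
  if (cancelAt == CancelAt::RightAfterRun) return;  // too early for either ordering
  if (drainEarly) request.drain();                  // ordering restored by this PR
  events.push_back("later work");
  if (cancelAt == CancelAt::DuringLaterWork) return;
  if (!drainEarly) request.drain();                 // late ordering
}

// True if the run ended with the "failed to invoke drain()" warning.
bool warns(bool drainEarly, CancelAt cancelAt) {
  std::vector<std::string> events;
  runToyEvent(events, drainEarly, cancelAt);
  return !events.empty() && events.back() == kWarning;
}
```

In this model, `warns(false, CancelAt::DuringLaterWork)` is true while `warns(true, CancelAt::DuringLaterWork)` is false, and a cancellation right after run() still warns with either ordering, which matches the "less common, but still possible" expectation above.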