-
Notifications
You must be signed in to change notification settings - Fork 814
Fix race between deadline and Stream.ReadAsync #1550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
please give us clear example about this issue, i am using Grpc.Net.Client 2.56.0 but exception occur and stopped the service from working i did not find any solution. System.InvalidOperationException: 'Can't write the message because the previous write is in progress.' This exception was originally thrown at this call stack: |
…on tests. (#7457) ## Summary of changes [This error](https://app.datadoghq.com/error-tracking?query=source%3Adotnet%20-%2AStackOverflowException%2A%20-%2Adebugger%2A%20-%2Aappsec%2A&et-side=data&et-viz=events&order=total_count&refresh_mode=sliding&source=all&sp=%5B%7B%22p%22%3A%7B%22issueId%22%3A%225b4a9f2a-3ae6-11f0-9077-da7ad0900002%22%7D%2C%22i%22%3A%22error-tracking-issue%22%7D%5D&view=spans&from_ts=1756801478434&to_ts=1756887878434&live=true) was logged: ``` Error : Exception occurred when calling the CallTarget integration continuation. System.TypeInitializationException ---> Datadog.Trace.ClrProfiler.CallTarget.CallTargetInvokerException ---> System.ArgumentException at Datadog.Trace.Util.ThrowHelper.ThrowArgumentException(String message) at Datadog.Trace.ClrProfiler.CallTarget.Handlers.IntegrationMapper.CreateBeginMethodDelegate(Type integrationType, Type targetType, Type[] argumentsTypes) at Datadog.Trace.ClrProfiler.CallTarget.Handlers.BeginMethodHandler`6..cctor() ``` This error would only happen when method Grpc.AspNetCore.Server.Internal.GrpcProtocolHelpers.BuildHttpErrorResponse is called, which happens when there is an error during the Grpc call. In order to fix it, the signature of OnMethodBegin<TTarget, TGrpcStatusCode> in GrpcProtocolHelpersBuildHttpErrorResponseIntegration.cs had to be changed by adding generic types. To test this change, a new case has been added to the GRPC sample application. By adding this header to the request (context.Request.Headers["content-type"] = "application/json";), we can test this instrumentation by causing an error during the GRPC call: Content-Type 'application/json' is not supported Additionally, some fixes have been done in the test app. In the following code, we were always throwing the exception because the delay set in theVerySlow method was 300 msecs, lower than 600, so the method would finish on time, which is not the expected outcome : ``` try { _logger.LogInformation("Sending very slow request to self"); await client.VerySlowAsync(new HelloRequest { Name = "GreeterClient" }, deadline: DateTime.UtcNow.AddMilliseconds(600)); throw new Exception("Received reply, when should have exceeded deadline"); } ``` By setting a lower deadline, we make sure that the deadline is exceeded. We were not detecting this unexpected behavior because in grpctests.cs, we were capturing the exception ExitCodeException: ``` catch (ExitCodeException) { // There is a race condition in GRPC version < v2.43.0 that can cause ObjectDisposedException // when a deadline is exceeded. Skip the test if we hit it: grpc/grpc-dotnet#1550 if (!string.IsNullOrEmpty(packageVersion) && new Version(packageVersion) < new Version("2.43.0") && processResult is not null && processResult.StandardError.Contains("ObjectDisposedException")) { throw new SkipException("Hit race condition in GRPC deadline exceeded"); } } ``` After capturing the exception, we would not launch an SkipException but we would end the test without errors. A new throw statement has been added to control that we only skip the tests in the case of ObjectDisposedException. This change in the GRPC tests has also effectively enabled the GRPC legacy tests, that have a different instrumentation but were suffering the same issue (throw exception and skip). The snapshots of the legacy tests were not matching exactly the result obtained after enabling these tests. The instrumentation of the legacy methods or the legacy sample app have not changed significantly, so we would probably need to evaluate if the results that we are getting are the expected ones. Anyway, this work would probably be out of the scope of this PR. ## Reason for change ## Implementation details ## Test coverage ## Other details <!-- Fixes #{issue} --> <!--⚠️ Note: Where possible, please obtain 2 approvals prior to merging. Unless CODEOWNERS specifies otherwise, for external teams it is typically best to have one review from a team member, and one review from apm-dotnet. Trivial changes do not require 2 reviews. MergeQueue is NOT enabled in this repository. If you have write access to the repo, the PR has 1-2 approvals (see above), and all of the required checks have passed, you can use the Squash and Merge button to merge the PR. If you don't have write access, or you need help, reach out in the #apm-dotnet channel in Slack. -->
Fixes #1547
Http2ReadStream.ReadAsyncchecks for dispose before cancellation. If cancellation + dispose happens on another threadin between ReadAsync calls, e.g. deadline timer, thenObjectDisposedExceptioncan be thrown. This PR ensuresDeadlineExceededstatus is correctly reported.