-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Fix deadlock when creating threads from ModuleInitializer in a shared library #118928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… library The problem is that the first time we reverse p/invoke into an API in the shared library, we need to do two things: initialize runtime and run module initializers. Running module initializers is done as part of runtime initialization in the shared library case because it was convenient to do it there (in the EXE case, we run them after runtime initialization, right before Main). The problem with running it as part of the runtime initialization is that the arbitrary user code can deadlock us (see the issues for stack). This separates runtime initialization and running module initializers. Doing just that doesn't fix the problem though because the deadlock we saw is part of thread creation. So we'd just deadlock on the module initializer runner. This introduces a new kind of reverse p/invoke enter routine that doesn't wait for module initializers. Instead we wait for module initializers before we are ready to run user code on the thread (and the thread is initialized just enough that ThreadStart can make progress). This also fixes a potential issue in the EXE case where threads created from module initializers could start running before we're finished running module initializers. Instead this introduces a new opportunity for deadlocking :(. I don't actually like this fix. But we need to do something about it because EventSource does exactly this (dotnet#118773). Fixes dotnet#118773. Fixes dotnet#107699.
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a deadlock issue that occurs when threads are created from ModuleInitializer methods in shared libraries. The problem was that runtime initialization and module initializer execution were tightly coupled, causing deadlocks when module initializers created threads that needed reverse p/invoke transitions.
Key changes:
- Separates runtime initialization from module initializer execution
- Introduces a "lite" reverse p/invoke transition that doesn't wait for module initializers
- Adds manual synchronization to ensure module initializers complete before user code runs on new threads
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
SharedLibrary.cs | Adds regression test that creates a thread in ModuleInitializer |
NativeLibraryStartupMethod.cs | Removes module initializer execution from runtime initialization |
CorInfoImpl.cs | Special-cases ThreadEntryPoint to use lite reverse p/invoke transition |
Thread.NativeAot.cs | Adds manual wait for module initializers before running user code |
Thread.NativeAot.Windows.cs | Documents special compiler handling of ThreadEntryPoint |
Thread.NativeAot.Unix.cs | Documents special compiler handling of ThreadEntryPoint |
thread.h | Adds EnsureModuleInitializersExecuted method and parameter to existing method |
thread.cpp | Implements separated runtime and module initializer logic |
StartupCodeHelpers.cs | Adds volatile flag and C-callable wrapper for module initializers |
main.cpp | Updates callback registration to handle both runtime and module initialization |
src/coreclr/nativeaot/System.Private.CoreLib/src/System/Threading/Thread.NativeAot.cs
Outdated
Show resolved
Hide resolved
I think also if a module initializer drops a finalizable object, finalizer thread may run without waiting. That is because we pre-attach the finalizer thread. Perhaps we should not. I do not think it is necessary to pre-attach. |
src/coreclr/nativeaot/System.Private.CoreLib/src/System/Threading/Thread.NativeAot.cs
Outdated
Show resolved
Hide resolved
I kind of feel the same. We may yet find some other ways the scenario can get us in trouble. The fix does seem to solve the issue at hand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise. Thanks!
Co-authored-by: Jan Kotas <[email protected]>
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-nativeaot-outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
It is fine with me to take it for .NET 10. |
/backport to release/10.0 |
Started backporting to release/10.0: https://github.com/dotnet/runtime/actions/runs/17228890766 |
The problem is that the first time we reverse p/invoke into an API in the shared library, we need to do two things: initialize runtime and run module initializers. Running module initializers is done as part of runtime initialization in the shared library case because it was convenient to do it there (in the EXE case, we run them after runtime initialization, right before Main).
The problem with running it as part of the runtime initialization is that the arbitrary user code can deadlock us (see the issues for stack).
This fixes the problem by attaching the newly created thread to the runtime before we do the reverse p/invoke. Since thread is already attached, we don't wait for runtime initialization. Waiting is unnecessary because the runtime needs to be really far along the initialization path if we got to a managed
Thread.Start
call.Fixes #118773. Fixes #107699.
Cc @dotnet/ilc-contrib