[release/9.0-staging] Fix race condition in cleanup of collectible thread static variables #111275

Open
wants to merge 5 commits into
base: release/9.0-staging

Contributor

@github-actions github-actions bot commented Jan 10, 2025

Backport of #111257 to release/9.0-staging

/cc @davidwrighton

Customer Impact

  • [ ] Customer reported
  • [x] Found internally

This race condition causes an access violation (a null dereference) in the EE when a thread that used a collectible TLS static is terminated while the collectible assembly is only partially collected. Found by dnceng in a CI environment.

Regression

  • [x] Yes
  • [ ] No

This was introduced with the statics rewrite.

Testing

A new stress test was written to verify the fix. A cut-down variant of that stress test has been added as part of the fix.

Risk

Low. The fix is effectively a null check that skips the problematic operation.

IMPORTANT: If this backport is for a servicing release, please verify that:

  • The PR target branch is release/X.0-staging, not release/X.0.

Package authoring no longer needed in .NET 9

IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.

davidwrighton and others added 5 commits January 10, 2025 15:16
There was a race condition in which all of the managed state of a LoaderAllocator could already have been collected while cleanup of the actual LoaderAllocator object in native code had not yet started. If a thread that had a TLS variable defined in code associated with a collectible loader allocator was terminated at that point, the runtime would crash.

The fix is to check whether the LoaderAllocator's managed state is still alive and, if it is not, to skip the cleanup that would otherwise touch it.
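The guard described in this commit message can be illustrated with a minimal, self-contained sketch. This is not the runtime's code: `ManagedState`, `LoaderAllocatorSketch`, `CollectibleTlsEntry`, and `CleanupThreadStatics` are hypothetical names, and a `std::weak_ptr` stands in for whatever handle the runtime uses to reach the managed side. It only shows the shape of a "check liveness, otherwise skip" cleanup at thread termination.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for the managed half of a collectible loader allocator.
struct ManagedState {
    int freedSlots = 0;  // counts cleanup operations applied to this state
};

// Hypothetical stand-in for the native loader allocator object. It can outlive
// its managed state for a short window while a collectible assembly is being
// collected, which is the window the race condition lives in.
struct LoaderAllocatorSketch {
    std::weak_ptr<ManagedState> managed;  // assumption: models a weak handle
};

// Hypothetical per-thread record of one collectible thread static.
struct CollectibleTlsEntry {
    LoaderAllocatorSketch* allocator;
};

// Runs when a thread terminates. Returns how many entries were cleaned up.
int CleanupThreadStatics(const std::vector<CollectibleTlsEntry>& entries) {
    int cleaned = 0;
    for (const CollectibleTlsEntry& entry : entries) {
        // lock() yields an empty pointer once the managed state is gone.
        std::shared_ptr<ManagedState> state = entry.allocator->managed.lock();
        if (!state) {
            // Managed state already collected, native object not yet torn
            // down: skip instead of dereferencing a null pointer.
            continue;
        }
        state->freedSlots++;  // placeholder for the real per-entry cleanup
        cleaned++;
    }
    return cleaned;
}
```

In this sketch, an entry whose allocator has already lost its managed state is simply skipped, on the assumption that the allocator's own teardown will release whatever remains.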
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 10, 2025
@jeffschwMSFT jeffschwMSFT added the Servicing-consider Issue for next servicing release review label Jan 10, 2025
@jeffschwMSFT jeffschwMSFT modified the milestones: 9.0.2, 9.0.x Jan 10, 2025
Member

@jeffschwMSFT jeffschwMSFT left a comment

lgtm. we will take for consideration in 9.0.x

@carlossanlop
Member

@davidwrighton @jeffschwMSFT friendly reminder that today's code complete for the Feb 2025 Release. Please merge this change by 4pm PT if you'd like it included in that release version. Otherwise, it will have to wait until next month.

@davidwrighton davidwrighton self-assigned this Jan 14, 2025
@jeffhandley jeffhandley added area-VM-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 21, 2025
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Labels
area-VM-coreclr Servicing-consider Issue for next servicing release review
5 participants