Reduce excessive allocations in BannedSymbols analyzer. #6568
    return ParseDeclaredId(id, ref index);
}

private static (string ParentName, string SymbolName)? ParseDeclaredId(string id, ref int index)
copied from https://github.com/dotnet/roslyn/blob/45128e6b4cf49d63c734877c9552d6764831b9b7/src/Compilers/Core/Portable/DocumentationCommentId.cs#L691 with everything extraneous cut out.
{
    foreach (var bannedFileEntry in entries)
    {
        foreach (var bannedSymbol in bannedFileEntry.Symbols)
We defer creating the 'Symbols' for an entry until the point it is actually needed. This is the majority of the perf win in here.
@@ -325,6 +361,9 @@ public BanFileEntry(string text, TextSpan span, SourceText sourceText, string pa
    Span = span;
    SourceText = sourceText;
    Path = path;

    _lazySymbols = new Lazy<ImmutableArray<ISymbol>>(
        () => DocumentationCommentId.GetSymbolsForDeclarationId(DeclarationId, compilation));
This calls into the real (expensive) Roslyn API again to find the symbols.
string parentName = "";

// process dotted names
while (true)
Honestly, this could be a lot better. However, I don't care much since the logic is:
- trivial
- the same as the version in Roslyn
- there are usually only a couple hundred banned API items, so the allocations here are not a concern. The thing we care about fixing is avoiding the massively expensive symbol walk.
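For readers who have not seen the Roslyn parser, the dotted-name loop discussed above can be sketched roughly as follows. This is a simplified illustration, not the PR's `ParseDeclaredId`: the `BannedIdSketch` class and `ParseNames` method are hypothetical names, and the sketch ignores generic arity suffixes, nested parameter lists, and the other details the real parser handles.

```csharp
using System;

static class BannedIdSketch
{
    // Hypothetical helper: returns the last two dotted name segments of a
    // documentation comment ID, e.g. ("Console", "WriteLine") for
    // "M:System.Console.WriteLine(System.String)".
    public static (string ParentName, string SymbolName)? ParseNames(string id)
    {
        if (string.IsNullOrEmpty(id))
            return null;

        // Skip the kind prefix ("T:", "M:", "P:", ...) when present.
        int start = id.Length > 1 && id[1] == ':' ? 2 : 0;

        // Ignore a trailing parameter list, which may itself contain dots.
        int end = id.IndexOf('(', start);
        if (end < 0)
            end = id.Length;

        string parentName = "";
        int index = start;

        // Process dotted names: each segment before the last becomes the
        // candidate parent; the final segment is the symbol name.
        while (true)
        {
            int dot = id.IndexOf('.', index, end - index);
            if (dot < 0)
            {
                string symbolName = id.Substring(index, end - index);
                if (symbolName.Length == 0)
                    return null;
                return (parentName, symbolName);
            }

            parentName = id.Substring(index, dot - index);
            index = dot + 1;
        }
    }
}
```

Each intermediate segment allocates a substring, which matches the point made above: with only a couple hundred entries, that cost is negligible next to the symbol walk.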
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6568      +/-   ##
==========================================
- Coverage   96.43%   96.43%   -0.01%
==========================================
  Files        1372     1373       +1
  Lines      320265   320341      +76
  Branches    10293    10309      +16
==========================================
+ Hits       308844   308908      +64
- Misses       8967     8977      +10
- Partials     2454     2456       +2
src/Microsoft.CodeAnalysis.BannedApiAnalyzers/Core/SymbolIsBannedAnalyzerBase.cs
LGTM, signing off for what it's worth.
The core problem here is that the BannedSymbols analyzer starts with a top-down walk that attempts to find the corresponding ISymbol for each banned entry in the symbol tree. This can be quite expensive, as the compiler has to realize all the symbols downwards (in order to look up those names), including walking into symbols that might be enormous (consider large namespaces, or types with tens of thousands of members).
This PR flips the general idea of the banned-symbol analyzer around. Instead of computing the list of banned symbols up front, we look for the names (and container names) specified in BannedSymbols.txt. Then, as we're analyzing, only if we hit a real symbol with those names do we attempt to map the banned-symbol line back to actual ISymbols, which we then compare against the symbol currently being analyzed.
This means that if the code being analyzed never actually refers to the potentially banned symbol, we don't pay any price for looking it up.
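A minimal sketch of that name-first, resolve-later flow, under stated assumptions: `BanEntrySketch` and `BannedLookupSketch` are hypothetical stand-ins rather than the PR's types, symbols are modeled as plain strings, and the `resolve` delegate stands in for the expensive call to `DocumentationCommentId.GetSymbolsForDeclarationId`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a ban-file entry; symbols are plain strings here.
sealed class BanEntrySketch
{
    private readonly Lazy<IReadOnlyList<string>> _lazySymbols;

    public BanEntrySketch(string declarationId, Func<string, IReadOnlyList<string>> resolve)
    {
        DeclarationId = declarationId;
        // The expensive lookup runs only the first time Symbols is read.
        _lazySymbols = new Lazy<IReadOnlyList<string>>(() => resolve(declarationId));
    }

    public string DeclarationId { get; }
    public IReadOnlyList<string> Symbols => _lazySymbols.Value;
}

// Hypothetical index keyed by (container name, symbol name).
sealed class BannedLookupSketch
{
    private readonly Dictionary<(string Parent, string Name), List<BanEntrySketch>> _byName
        = new Dictionary<(string Parent, string Name), List<BanEntrySketch>>();

    public void Add(string parentName, string symbolName, BanEntrySketch entry)
    {
        var key = (parentName, symbolName);
        if (!_byName.TryGetValue(key, out var entries))
        {
            entries = new List<BanEntrySketch>();
            _byName.Add(key, entries);
        }
        entries.Add(entry);
    }

    public bool IsBanned(string parentName, string symbolName, string symbol)
    {
        // Cheap name check first: most symbols never get past this line,
        // so their entries are never resolved.
        if (!_byName.TryGetValue((parentName, symbolName), out var entries))
            return false;

        // Only on a name match do we pay for resolving the entry.
        foreach (var entry in entries)
        {
            foreach (var banned in entry.Symbols)
            {
                if (banned == symbol)
                    return true;
            }
        }

        return false;
    }
}
```

In this sketch, an entry whose names never appear in the analyzed code never invokes `resolve`, which is the saving described above. A name match is only a candidate: the resolved symbols still decide whether the reference is actually banned.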
Addresses 600 MB of allocations in a simple typing/lightbulb scenario: