Cache the ApiData values we produce for an AdditionalText file. #6556
@@ -147,83 +169,110 @@ static void RegisterImplActions(CompilationStartAnalysisContext compilationConte | |||
} | |||
} | |||
|
|||
private static ApiData ReadApiData(List<(string path, SourceText sourceText)> data, bool isShippedApi) | |||
private static ApiData ReadApiData(string path, SourceText sourceText, bool isShippedApi) |
This now computes the ApiData for a single file. Handling of multiple files is taken care of by the caller.
@@ -20,21 +20,24 @@ namespace Microsoft.CodeAnalysis.PublicApiAnalyzers | |||
{ | |||
public partial class DeclarePublicApiAnalyzer : DiagnosticAnalyzer | |||
{ | |||
private sealed record AdditionalFileInfo(string Path, SourceText SourceText, bool IsShippedApi); |
common data shared among all ApiLine instances.
private sealed class ApiLine | ||
private sealed record AdditionalFileInfo(string Path, SourceText SourceText, bool IsShippedApi); | ||
|
||
private readonly record struct ApiLine |
Made into a simple struct, which also has a pointer to the common data stored across all lines from the same file.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #6556 +/- ##
=======================================
Coverage 96.43% 96.43%
=======================================
Files 1372 1372
Lines 320278 320264 -14
Branches 10295 10293 -2
=======================================
- Hits 308848 308839 -9
+ Misses 8975 8971 -4
+ Partials 2455 2454 -1 |
} | ||
public SourceText SourceText => FileInfo.SourceText; | ||
public string Path => FileInfo.Path; | ||
public bool IsShippedApi => FileInfo?.IsShippedApi is true; |
It's a struct, and used in a fashion where it can be in the default state.
So should the other ones be checked then too?
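A minimal sketch of the concern in this thread, assuming hypothetical, trimmed-down versions of the types (the real ApiLine has more members, and the positional constructor used here is an assumption, not taken from the PR). On a default(ApiLine) the guarded accessor returns false, while an unguarded accessor throws:

```csharp
#nullable enable
using System;

// Hypothetical, simplified stand-ins for the types in this diff.
sealed record AdditionalFileInfo(string Path, bool IsShippedApi);

readonly record struct ApiLine(string Text, AdditionalFileInfo? FileInfo)
{
    // Guarded: safe on default(ApiLine), where FileInfo is null.
    public bool IsShippedApi => FileInfo?.IsShippedApi is true;

    // Unguarded: throws NullReferenceException on default(ApiLine).
    public string Path => FileInfo!.Path;
}
```

Whether the unguarded accessors matter in practice depends on whether callers ever read them from a default instance, which is the question being asked here.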
maxNullableRank = Math.Max(rank, maxNullableRank); | ||
continue; | ||
} | ||
lineNumber++; |
Yup yup. I'm preserving semantics here.
Maybe @sharwell can verify it's the intended behavior, it definitely doesn't match what the variable name indicates to me.
my preference is to not change this. this was the behavior from before. If we change it, we risk breaking things for some customer who does have blank lines and then the nullable directive :)
foreach (var (path, sourceText) in data) | ||
{ | ||
int rank = -1; | ||
// current line we're on. Note: we ignore whitespace lines when computing this. |
{ | ||
if (!TryGetApiText(analyzerOptions.AdditionalFiles, isPublic, cancellationToken, out var shippedText, out var unshippedText)) | ||
using var allShippedData = ArrayBuilder<ApiData>.GetInstance(); |
The Roslyn one is disposable as well, just a slightly different pattern (due to compiler team concerns)
Weird, I'm not seeing the Roslyn ones as IDisposable from code inspection, but I'm probably missing it
We have a struct disposable wrapper around ours. Look for using var _ = ArrayBuilder... in roslyn.sln :)
/// Cache from additional text instance to the api data we have read out for that specific file. We only store | ||
/// data for additional texts that explicitly match the public/internal api file names we expect. | ||
/// </summary> | ||
private static readonly ConditionalWeakTable<AdditionalText, ApiData> s_additionalTextToApiData = new(); |
Please don’t use a CWT here. We already have an API on AnalysisContext to allow caching per-file data across compilations: https://github.com/dotnet/roslyn/blob/3a7a7407ea3c831630ecf2754092c33df3a6e452/src/Compilers/Core/Portable/DiagnosticAnalyzer/DiagnosticAnalysisContext.cs#L234
Internally, we also use a CWT, so the API already provides a simple way to achieve caching:
Note
Example test analyzer for this API: https://github.com/dotnet/roslyn/blob/3a7a7407ea3c831630ecf2754092c33df3a6e452/src/Compilers/Test/Core/Diagnostics/CommonDiagnosticAnalyzers.cs#L1605
Sure. I can do this.
So I don't seem to be able to use that helper. It is keyed off of SourceText. But avoiding the source-text is the point here. I need to key off of AdditionalFile. Thoughts @mavasani?
But avoiding the source-text is the point here
I am not sure I understand. SourceText is already strongly held by the AdditionalText: https://github.com/dotnet/roslyn/blob/3a7a7407ea3c831630ecf2754092c33df3a6e452/src/Compilers/Core/Portable/AdditionalTextFile.cs#L49-L53
Isn't the key point here avoiding repeated compute of the data associated with each AdditionalText file?
Ok. Went deep down the rabbit hole of trying to make this keyed off SourceText. It's painful. Effectively, the data we compute off the source text holds onto data outside of the source text (specifically, things like the path and other information about the original AdditionalText). I went down the path of trying to extract this out (e.g. RawApiData, ApiData, RawApiLine, ApiLine) and have the map point from the SourceText to the raw data. But then there was so much wrapping I needed to do to make working with things palatable (and not have additional copies/allocations happening).
If we make it so that the context objects allow for storing arbitrary K/V and not just Text/Value or Tree/Value, then this will be trivial to do.
I discussed this with Manish, and we decided that would take too much time. So we're going to accept that this is a CWT. However, I'm making it non-static so it has the same lifetime as the analyzer (and whatever owns it).
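For readers following along, a minimal sketch of the shape described above: a ConditionalWeakTable held in an instance field and keyed off the file object, so each file's data is computed once per analyzer instance. FakeAdditionalText, FakeApiData, and FakeAnalyzer are hypothetical stand-ins, not the real Roslyn or analyzer types, and the compute step is a placeholder for the actual parsing:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for AdditionalText.
sealed class FakeAdditionalText
{
    public FakeAdditionalText(string path) => Path = path;
    public string Path { get; }
}

// Hypothetical stand-in for ApiData; note it keeps the path, not just the text.
sealed class FakeApiData
{
    public FakeApiData(string path) => Path = path;
    public string Path { get; }
}

sealed class FakeAnalyzer
{
    // Instance field (not static): the cache lives only as long as this analyzer.
    // Entries are also dropped once the key object is no longer reachable.
    private readonly ConditionalWeakTable<FakeAdditionalText, FakeApiData> _cache = new();

    // Counts how often the expensive compute step actually runs.
    public int ComputeCount { get; private set; }

    public FakeApiData GetApiData(FakeAdditionalText file)
        => _cache.GetValue(file, f =>
        {
            ComputeCount++;                  // placeholder for re-reading and parsing the file
            return new FakeApiData(f.Path);
        });
}
```

Repeated calls with the same file instance return the cached value, and a second analyzer instance starts with an empty cache.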
LGTM.
I'd like a signoff from @ToddGrun before merge.
LGTM. I am curious why we reach this code so often (I believe it's multiple times per keystroke). I thought Cyrus mentioned that we had a cache that should hit before this point.
@mavasani it would def be good to see if there's something causing caching of diagnostic info to not work properly. Or if the diagnostic subsystem itself is calling this stuff multiple times over different codepaths.
Typing in roslyn shows this to be the highest allocation hitter. This is because diagnostics will end up calling into this analyzer, which then rereads the SourceText for these additional files, over and over again.
For example, this was from typing for just a few seconds: