Skip to content

Conversation

ccastanedaucf
Copy link
Contributor

@ccastanedaucf ccastanedaucf commented Aug 12, 2025

Fixes

Reduces a significant amount of CPU and allocations in Lookup related to ItemDictionary overhead.

Before:
image

After (-2.8s of CPU - most of the remaining time is related to metadata Modifies, PropertyDictionary, and the outer scope which still uses ItemDictionary)
image

Allocations are harder to 1:1 granularly, but you can see on the left a total ~870MB reduction.

Before:
image

After (-870MB):
image

Context

Lookup uses a stack of Scope objects which each track multiple optional tables of ProjectItemInstance (implemented via ItemDictionary).

internal class Scope
{
      internal ItemDictionary<ProjectItemInstance> Items { get; set; }
      internal ItemDictionary<ProjectItemInstance> Adds { get; set; }
      internal ItemDictionary<ProjectItemInstance> Removes { get; set; }

      // More properties not relevant to this PR
}

For the context of the PR, here's a quick overview of how this currently works:

  • When a scope is "popped", each table except for Items is merged into the next scope. After this point, the popped scope is effectively discarded (see MergeScopeIntoNotLastScope()).
  • When the outer scope is reached, the set of item adds/removes + any metadata modifications are applied to the base scope's Items (see MergeScopeIntoLastScope()). Unlike other scopes, this base table is externally provided at the construction of the first Lookup.
    • The outer scope only exists to track the base collections passed into the first scope - as such, its adds/removes/modification tables are always empty.
  • GetItems() returns a non-destructive merge down the scope stack, stopping if either the outer scope is reached, or a scope is found with its Items table set.
    • Note that this currently creates an additional ItemDictionary, even though we only return a single item type list.
    • Intermediate Items tables generally only contain one item type with items, with others containing "empty markers". These are used to mask the base item dictionary, primarily in ItemBucket where each bucket (and therefore scope) holds a single type.

Currently, each of these intermediate tables are implemented via ItemDictionary, which internally uses a Dictionary<string, LinkedList> to represent item lists and a Dictionary<ProjectItemInstance, LinkedList> to provide O(1) access to indices. Empty markers are implemented by creating empty entries. Every method also takes a lock, since ItemDictionary is designed for use in other areas of MSBuild - whereas Lookup is only used in single-threaded contexts.

internal sealed class ItemDictionary<T> : IItemDictionary<T>
    where T : class, IKeyed, IItem
{
    private readonly Dictionary<string, LinkedList<T>> _itemLists;
    private readonly Dictionary<T, LinkedListNode<T>> _nodes;
}

This all adds overhead that can be avoided by replacing ItemDicitonary with basic collections for all inner scopes.

Changes Made

  • Introduce a lighter-weight ItemDictionarySlim to replace ItemDictionary inside of Lookup, while still providing similar helper methods.
internal class ItemDictionarySlim : IEnumerable<KeyValuePair<string, List<ProjectItemInstance>>>
{
    private readonly Dictionary<string, List<ProjectItemInstance>> _itemLists;

   // helper methods...
}
  • Implement empty markers as a reuseable frozen set - ItemTypesToTruncateAtThisScope
internal class Scope
{
      internal FrozenSet<string> ItemTypesToTruncateAtThisScope { get; set; }

      // Other properties...
}
  • Track removes as a simple list concatenation, instead of deduplicating at each scope
  • Avoid a host of intermediate allocations during GetItems

@Copilot Copilot AI review requested due to automatic review settings August 12, 2025 04:00
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes performance by replacing the ItemDictionary implementation in Lookup.Scope with a lighter-weight ItemDictionarySlim, reducing CPU usage and memory allocations. The changes eliminate locking overhead and simplify data structures used for tracking item additions, removals, and empty markers within lookup scopes.

  • Introduces ItemDictionarySlim to replace ItemDictionary for scope-specific item tracking
  • Replaces empty markers with a frozen set-based truncation approach using ItemTypesToTruncateAtThisScope
  • Optimizes item merging and removal logic to avoid duplicate checking and intermediate allocations

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
Lookup.cs Core performance optimization replacing ItemDictionary with ItemDictionarySlim and implementing new truncation mechanism
ItemDictionary.cs Removes unused methods (AddRange, Replace, AddEmptyMarker, HasEmptyMarker) to simplify API
IItemDictionary.cs Updates interface by removing unused method signatures
ImmutableItemDictionary.cs Removes overridden implementations of deleted interface methods
ItemBucket.cs Updates to use new truncation API instead of empty markers
BatchingEngine.cs Changes to use FrozenSet for item names in bucket creation
TargetEntry.cs Adds explicit truncation calls for incremental build scenarios
TargetUpToDateChecker.cs Replaces AddEmptyMarker calls with ImportItemsOfType
ItemGroupIntrinsicTask.cs Updates RemoveItems call signature to include item type parameter
Test files Updates test code to work with new APIs and data structures

/// </remarks>
internal void TruncateLookupsForItemTypes(ICollection<string> itemTypes)
{
ErrorUtilities.VerifyThrow(_lookupScopes.ItemTypesToTruncateAtThisScope == null, "Cannot add an itemgroup of this type.");
Copy link
Preview

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message "Cannot add an itemgroup of this type." is misleading and doesn't accurately describe the validation being performed. The check is verifying that truncation hasn't been set up yet, not that an item group can't be added.

Suggested change
ErrorUtilities.VerifyThrow(_lookupScopes.ItemTypesToTruncateAtThisScope == null, "Cannot add an itemgroup of this type.");
ErrorUtilities.VerifyThrow(_lookupScopes.ItemTypesToTruncateAtThisScope == null, "Item type truncation has already been set up for this scope.");

Copilot uses AI. Check for mistakes.

// Start with the item spec length as a fast filter for false matches.
if (evaluatedInclude.Length == removeAsTaskItem.EvaluatedIncludeEscaped.Length
&& StringComparer.Ordinal.Equals(evaluatedInclude, removeAsTaskItem.EvaluatedIncludeEscaped)
&& itemAsTaskItem == removeAsTaskItem)
Copy link
Preview

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference equality check itemAsTaskItem == removeAsTaskItem should come before the string comparison for better performance. Reference equality is much faster than string comparison, so checking it first can short-circuit the more expensive operations.

Suggested change
&& itemAsTaskItem == removeAsTaskItem)
if (itemAsTaskItem == removeAsTaskItem
|| (evaluatedInclude.Length == removeAsTaskItem.EvaluatedIncludeEscaped.Length
&& StringComparer.Ordinal.Equals(evaluatedInclude, removeAsTaskItem.EvaluatedIncludeEscaped)))

Copilot uses AI. Check for mistakes.

@@ -1025,7 +1082,7 @@ private void MustNotBeInAnyTables(ProjectItemInstance item)
/// </summary>
private void MustNotBeOuterScope()
{
ErrorUtilities.VerifyThrow(_lookupScopes.Count > 1, "Operation in outer scope not supported");
ErrorUtilities.VerifyThrow(_lookupScopes.Parent != null, "Operation in outer scope not supported");
Copy link
Preview

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition _lookupScopes.Parent != null seems inverted. Based on the error message "Operation in outer scope not supported" and the previous implementation that checked _lookupScopes.Count > 1, this should likely be checking that we're NOT in the outer scope, which would be _lookupScopes.Parent == null.

Suggested change
ErrorUtilities.VerifyThrow(_lookupScopes.Parent != null, "Operation in outer scope not supported");
ErrorUtilities.VerifyThrow(_lookupScopes.Parent == null, "Operation in outer scope not supported");

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants