@andrewlock andrewlock commented Sep 16, 2025

Summary of changes

  • Extract a MutableSettings type from TracerSettings
  • Extract a Raw type from ExporterSettings

Reason for change

We're working on refactoring how we handle dynamic, remote, and config-in-code settings, i.e. settings which can change at runtime. As a first step, this PR extracts those settings to their own type, called MutableSettings because they are mutable during the lifetime of the app.

Feel free to suggest other names for this type, or we can alternatively bikeshed it later.

Additionally, extracted the "raw" settings from ExporterSettings. These are the values which are read from config sources. The actual values of ExporterSettings are set based on these values, using a highly convoluted (backward compatible) series of methods, but the idea is: if the Raw settings haven't changed, the ExporterSettings haven't changed.

This isn't strictly true due to the File.Exists call we have, but I believe this is good enough for our purposes.
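
To illustrate the "compare the raw inputs" idea, here is a minimal sketch. The names `ExporterSettingsSketch`, `AgentUrl`, `AgentHost`, `AgentPort` and `HasSameInputsAs` are hypothetical and not taken from the PR, and the derivation logic is a trivial placeholder for the real, backward-compatible one. A record is used purely for brevity; the PR does not say how `Raw` implements equality.

```csharp
#nullable enable
using System;

public sealed class ExporterSettingsSketch
{
    // Hypothetical snapshot of the values exactly as read from config sources.
    // A record gives value equality for free because every member is a simple value.
    public sealed record Raw(string? AgentUrl, string? AgentHost, int? AgentPort);

    public ExporterSettingsSketch(Raw raw)
    {
        RawSettings = raw;

        // Placeholder for the real, far more involved, derivation logic.
        var host = raw.AgentHost ?? "localhost";
        var port = raw.AgentPort ?? 8126;
        AgentUri = new Uri(raw.AgentUrl ?? $"http://{host}:{port}");
    }

    public Raw RawSettings { get; }

    public Uri AgentUri { get; }

    // Equal raw inputs are treated as "the derived settings have not changed".
    public bool HasSameInputsAs(Raw other) => RawSettings.Equals(other);
}
```

Anything the derivation reads from outside the raw values (such as the `File.Exists` call mentioned above) is invisible to this comparison, which is the caveat noted in the description.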

Implementation details

  • Create MutableSettings and move all the properties from TracerSettings that can change to it.
  • Update TracerSettings to create an instance of MutableSettings, and simply pass-through properties to it.
    • This should mean existing functionality is unaffected by this PR
  • Implement IEquatable<MutableSettings> for future comparisons between MutableSettings instances
    • Unfortunately, can't use a record here or auto-gen the implementation, because we need to handle equivalence of the dictionaries.
  • Extract the "raw" setting reading to an ExporterSettings nested-type
    • Opted for nested here, because unlike MutableSettings (which will eventually live separately from TracerSettings) we won't expose Raw to consumers - they'll still use ExporterSettings
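
The reason a record doesn't help is that its generated equality compares dictionary members by reference, so two settings objects with identical tags would compare as different. A minimal sketch of a hand-written implementation follows; `MutableSettingsSketch` and its three properties are illustrative assumptions, not the actual shape of `MutableSettings`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public sealed class MutableSettingsSketch : IEquatable<MutableSettingsSketch>
{
    public MutableSettingsSketch(string? serviceName, bool traceEnabled, IDictionary<string, string> globalTags)
    {
        ServiceName = serviceName;
        TraceEnabled = traceEnabled;
        GlobalTags = new ReadOnlyDictionary<string, string>(new Dictionary<string, string>(globalTags));
    }

    public string? ServiceName { get; }

    public bool TraceEnabled { get; }

    public ReadOnlyDictionary<string, string> GlobalTags { get; }

    public bool Equals(MutableSettingsSketch? other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;

        return ServiceName == other.ServiceName
            && TraceEnabled == other.TraceEnabled
            && HaveSameEntries(GlobalTags, other.GlobalTags);
    }

    public override bool Equals(object? obj) => Equals(obj as MutableSettingsSketch);

    // Only the dictionary's count feeds the hash: equal objects still hash
    // equally, at the cost of a few more collisions.
    public override int GetHashCode() => HashCode.Combine(ServiceName, TraceEnabled, GlobalTags.Count);

    // Compares dictionary contents rather than references; order is irrelevant.
    private static bool HaveSameEntries(ReadOnlyDictionary<string, string> a, ReadOnlyDictionary<string, string> b)
    {
        if (a.Count != b.Count) return false;

        foreach (var pair in a)
        {
            if (!b.TryGetValue(pair.Key, out var value) || !string.Equals(pair.Value, value, StringComparison.Ordinal))
            {
                return false;
            }
        }

        return true;
    }
}
```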

Test coverage

This is just a refactoring, so it's covered by existing tests.

Additionally, I added a test for the IEquatable implementation (which is similar to the test we have for ImmutableDynamicSettings) to ensure the implementation is updated if we add more properties.
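
One way such a guard could work (not necessarily how the test in this PR does it) is to keep one "changed" instance per property and use reflection to check that every public property has one. Adding a property without extending the table, or without updating `Equals`, then fails the check. `SettingsSketch` and `EqualityGuard` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SettingsSketch : IEquatable<SettingsSketch>
{
    public string ServiceName { get; init; } = "svc";

    public bool TraceEnabled { get; init; } = true;

    public bool Equals(SettingsSketch other) =>
        other is not null && ServiceName == other.ServiceName && TraceEnabled == other.TraceEnabled;

    public override bool Equals(object obj) => Equals(obj as SettingsSketch);

    public override int GetHashCode() => HashCode.Combine(ServiceName, TraceEnabled);
}

public static class EqualityGuard
{
    // One instance per property, differing from the defaults in that property only.
    private static readonly Dictionary<string, SettingsSketch> Variants = new()
    {
        [nameof(SettingsSketch.ServiceName)] = new SettingsSketch { ServiceName = "other" },
        [nameof(SettingsSketch.TraceEnabled)] = new SettingsSketch { TraceEnabled = false },
    };

    public static void Check()
    {
        var properties = typeof(SettingsSketch).GetProperties()
            .Select(p => p.Name)
            .OrderBy(n => n, StringComparer.Ordinal);
        var covered = Variants.Keys.OrderBy(n => n, StringComparer.Ordinal);

        // Fails as soon as a public property is added without a matching variant.
        if (!properties.SequenceEqual(covered))
        {
            throw new InvalidOperationException("Every public property needs an equality variant.");
        }

        var baseline = new SettingsSketch();
        foreach (var (name, variant) in Variants)
        {
            // Fails if Equals does not take the property into account.
            if (baseline.Equals(variant))
            {
                throw new InvalidOperationException($"Equals ignores {name}.");
            }
        }
    }
}
```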

Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

@andrewlock andrewlock added area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) type:refactor labels Sep 16, 2025
dd-trace-dotnet-ci-bot bot commented Sep 16, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (72ms)  : 71, 73
     .   : milestone, 72,
    master - mean (72ms)  : 71, 73
     .   : milestone, 72,

    section Baseline
    This PR (7522) - mean (68ms)  : 66, 70
     .   : milestone, 68,
    master - mean (68ms)  : 66, 70
     .   : milestone, 68,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (1,006ms)  : 984, 1027
     .   : milestone, 1006,
    master - mean (1,002ms)  : 968, 1036
     .   : milestone, 1002,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (106ms)  : 105, 108
     .   : milestone, 106,
    master - mean (106ms)  : 105, 107
     .   : milestone, 106,

    section Baseline
    This PR (7522) - mean (106ms)  : 103, 108
     .   : milestone, 106,
    master - mean (105ms)  : 103, 108
     .   : milestone, 105,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (707ms)  : 691, 723
     .   : milestone, 707,
    master - mean (713ms)  : 695, 731
     .   : milestone, 713,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (94ms)  : 93, 95
     .   : milestone, 94,
    master - mean (94ms)  : 93, 95
     .   : milestone, 94,

    section Baseline
    This PR (7522) - mean (93ms)  : 92, 95
     .   : milestone, 93,
    master - mean (93ms)  : 91, 95
     .   : milestone, 93,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (670ms)  : 650, 690
     .   : milestone, 670,
    master - mean (668ms)  : 649, 687
     .   : milestone, 668,

gantt
    title Execution time (ms) FakeDbCommand (.NET 8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (93ms)  : 92, 94
     .   : milestone, 93,
    master - mean (93ms)  : 91, 94
     .   : milestone, 93,

    section Baseline
    This PR (7522) - mean (92ms)  : 90, 94
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (596ms)  : 581, 612
     .   : milestone, 596,
    master - mean (601ms)  : 589, 612
     .   : milestone, 601,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (193ms)  : 189, 197
     .   : milestone, 193,
    master - mean (194ms)  : 190, 198
     .   : milestone, 194,

    section Baseline
    This PR (7522) - mean (190ms)  : 183, 197
     .   : milestone, 190,
    master - mean (191ms)  : 187, 196
     .   : milestone, 191,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (1,098ms)  : 1065, 1132
     .   : milestone, 1098,
    master - mean (1,101ms)  : 1072, 1130
     .   : milestone, 1101,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (273ms)  : 267, 280
     .   : milestone, 273,
    master - mean (275ms)  : 268, 281
     .   : milestone, 275,

    section Baseline
    This PR (7522) - mean (272ms)  : 264, 279
     .   : milestone, 272,
    master - mean (273ms)  : 267, 279
     .   : milestone, 273,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (887ms)  : 848, 925
     .   : milestone, 887,
    master - mean (895ms)  : 860, 930
     .   : milestone, 895,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (267ms)  : 261, 272
     .   : milestone, 267,
    master - mean (267ms)  : 260, 274
     .   : milestone, 267,

    section Baseline
    This PR (7522) - mean (267ms)  : 261, 273
     .   : milestone, 267,
    master - mean (267ms)  : 262, 272
     .   : milestone, 267,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (876ms)  : 839, 913
     .   : milestone, 876,
    master - mean (872ms)  : 839, 905
     .   : milestone, 872,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 8) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Bailout
    This PR (7522) - mean (264ms)  : 257, 272
     .   : milestone, 264,
    master - mean (264ms)  : 257, 271
     .   : milestone, 264,

    section Baseline
    This PR (7522) - mean (265ms)  : 259, 271
     .   : milestone, 265,
    master - mean (264ms)  : 256, 273
     .   : milestone, 264,

    section CallTarget+Inlining+NGEN
    This PR (7522) - mean (783ms)  : 763, 804
     .   : milestone, 783,
    master - mean (789ms)  : 766, 812
     .   : milestone, 789,



@andrewlock andrewlock force-pushed the andrew/settings/3a-mutable-settings branch 3 times, most recently from 4c49a69 to 94f1261 Compare September 16, 2025 16:10
@andrewlock (Member, Author) commented:

@codex review

@chatgpt-codex-connector commented:

To use Codex here, create a Codex account and connect to github.

@andrewlock (Member, Author) commented:

@codex review

@chatgpt-codex-connector bot left a comment

Codex Review: Here are some suggestions.


@andrewlock andrewlock force-pushed the andrew/settings/3a-mutable-settings branch from 94f1261 to c4d1f08 Compare September 17, 2025 09:08
@andrewlock andrewlock marked this pull request as ready for review September 17, 2025 14:54
@andrewlock andrewlock requested review from a team as code owners September 17, 2025 14:54
@bouwkast (Collaborator) left a comment

LGTM

I've assumed most of the code was just a move with the equality / Raw stuff being new which looked fine to me (and the new tests)

How often does the new big equality check run? I don't think it is very often right? Thinking in terms of performance that you mentioned

@andrewlock (Member, Author) commented:

> LGTM
>
> I've assumed most of the code was just a move with the equality / Raw stuff being new which looked fine to me (and the new tests)

Thanks, yes, I meant to add comments to the PR about that and totally forgot, sorry 🤦‍♂️

> How often does the new big equality check run? I don't think it is very often right? Thinking in terms of performance that you mentioned

As of this PR it doesn't really ever run, but in subsequent PRs it runs every time you do configuration in code, or every time there's a dynamic/remote config update. Overall, that should be relatively rare, and there's a lot else going on, so it's not hot path. Obviously we should optimise what we can, but it's not critical path 🙂

…t runtime.

This is just the first part of a settings overhaul to more gracefully handle the fact that these settings can change at runtime. For now, there's no practical change, this is just a minimal first step to keep the changes atomic and reviewable.
Required as part of the ExporterSettings refactoring
- Make internal properties public
- Add Creation via static method
@andrewlock andrewlock force-pushed the andrew/settings/3a-mutable-settings branch from c4d1f08 to 6e8443e Compare September 19, 2025 10:22

ValidationWarnings = new List<string>();

source ??= NullConfigurationSource.Instance;
@andrewlock (Member, Author) commented:

This config is all just moved as-is to the Raw nested type

AzureAppServiceMetadata = new ImmutableAzureAppServiceSettings(source, _telemetry);
}

var otelTags = config
@andrewlock (Member, Author) commented:

This is all moved as-is to MutableSettings

@andrewlock andrewlock merged commit 79da0d6 into master Sep 23, 2025
153 checks passed
@andrewlock andrewlock deleted the andrew/settings/3a-mutable-settings branch September 23, 2025 14:04
@github-actions github-actions bot added this to the vNext-v3 milestone Sep 23, 2025
andrewlock added a commit that referenced this pull request Oct 15, 2025
## Summary of changes

Rebuild and re-assign `MutableSettings` when dynamic config (remote or
config in code) changes.

## Reason for change

This is part of a general stack to extract the "mutable" configuration
from static config that is fixed for the lifetime of the app. In the
[previous PR](#7522) we
moved mutable settings to their own type, but otherwise left things
unchanged and just rebuilt everything whenever anything changes.

In this PR we move towards combining the dynamic/code configuration,
handling changes by _only_ rebuilding the `MutableSettings` (not
`TracerSettings`) and handling all the fallout that causes for
telemetry.

This is very much still a "stop gap"; we still rebuild everything
(_except_ the `TracerSettings` object) when these settings change.

## Implementation details

- Create "global" config sources for dynamic settings and code settings
  - There's actually a bug today where we clobber dynamic settings if we
    change things in code because we're not storing the sources globally.
- When dynamic config or config in code changes
  - Create a new `MutableSettings` object based only on dynamic sources
    (with a fallback for the "static" values)
  - For config in code, we also check for changes to `ExporterSettings`
  - If there are no discernible changes, bail out - no need to tear down
    the world
  - If there _are_ changes, mutate `TracerSettings`, and do the normal
    "reconfigure everything"
- Remove the (now unused) `ImmutableDynamicSettingsTests`
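
The "rebuild, compare, bail out" flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `MutableSnapshot`, `SettingsHolder` and `OnConfigChanged` are hypothetical names, a record stands in for the hand-written equality of the real type, and thread safety is ignored.

```csharp
using System;

// Hypothetical stand-in for the settings that can change at runtime.
public sealed record MutableSnapshot(string ServiceName, bool TraceEnabled);

public sealed class SettingsHolder
{
    private MutableSnapshot _current;

    public SettingsHolder(MutableSnapshot initial) => _current = initial;

    public MutableSnapshot Current => _current;

    // Counts how often the expensive path ran; a placeholder for "reconfigure everything".
    public int ReconfigureCount { get; private set; }

    // Returns true when the rebuilt settings differ and a reconfigure was triggered.
    public bool OnConfigChanged(MutableSnapshot rebuilt)
    {
        if (rebuilt.Equals(_current))
        {
            // No discernible change: bail out without tearing anything down.
            return false;
        }

        _current = rebuilt;
        ReconfigureCount++;
        return true;
    }
}
```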

Note that there are technically some behaviour changes in this PR:
- `useDefaultSources: false` only ignores env vars etc for values that
can be set through code. Other settings will always use the default
sources.
- `StatsComputationEnabled` can not be _set_ via code.

## Test coverage

This is still all technically a "refactoring", so should be covered by
existing tests 🤞

## Other details


https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack
- #7522
- #7652
- #7525 👈
- #7530
- #7532
- #7543
- #7544
andrewlock added a commit that referenced this pull request Oct 15, 2025
## Summary of changes

- Expose `DefaultServiceName` on `MutableSettings` instead of on
`TracerManager`
- Expose `MutableSettings` on `PerTraceSettings`

## Reason for change

The `DefaultServiceName` depends on `ServiceName`, which can change at
runtime, so `MutableSettings` seems like the best place for it
(eventually we will only have a single `TracerManager` and
`TracerSettings` per lifetime).

> Note that this also solves an existing edge-case bug when customers do
config in code and already-finished traces are serialized with the
"incorrect" default service name.

## Implementation details

Mostly commit-by-commit but:

- Move the "fallback application name" calculation to a helper class
- Expose the fallback application name on `TracerSettings`
  - Created as a `Lazy<>` because this calculation can be kind of
    expensive, and isn't necessary if the customer specifies a service name.
  - Exposed on `TracerSettings` because it doesn't change
- Expose `DefaultServiceName` on `MutableSettings` instead of
  `TracerManager`.
- Update usages to point to the new location
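
As a rough sketch of the `Lazy<>` arrangement: the fixed, potentially expensive fallback lives on the static settings, and the mutable settings only touch it when no service name is set. `StaticSettingsSketch` and `MutableSketch` are hypothetical names, and the rule "use `ServiceName` if set, otherwise the fallback" is an assumption about how `DefaultServiceName` is derived.

```csharp
#nullable enable
using System;
using System.Threading;

public sealed class StaticSettingsSketch
{
    private readonly Lazy<string> _fallbackApplicationName;

    public StaticSettingsSketch(Func<string> computeFallbackName)
    {
        // Computed at most once, and only when first requested.
        _fallbackApplicationName = new Lazy<string>(computeFallbackName, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public string FallbackApplicationName => _fallbackApplicationName.Value;
}

public sealed class MutableSketch
{
    private readonly StaticSettingsSketch _staticSettings;

    public MutableSketch(string? serviceName, StaticSettingsSketch staticSettings)
    {
        ServiceName = serviceName;
        _staticSettings = staticSettings;
    }

    public string? ServiceName { get; }

    // Assumed rule: an explicit service name wins, otherwise use the fallback.
    public string DefaultServiceName =>
        string.IsNullOrEmpty(ServiceName) ? _staticSettings.FallbackApplicationName : ServiceName!;
}
```

With this shape, a customer who sets a service name never pays for the fallback calculation, and each new mutable instance picks up the same cached value.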

Additionally, there are some places where I believe we were
"incorrectly" using the `DD_SERVICE` value, and ignoring the real
fallback name.
- [x] Service Discovery's `StoreTracerMetadata` - @anna-git can you
confirm if I'm correct that this _should_ be using the "calculated"
service name as a fallback?
- [x] `TraceExporterConfiguration` - @ganeshnj, same question, can you
confirm that we should be passing the "calculated" service name, not
just the "explicitly set" service name?

One additional aspect I think we should consider: 
- Currently we're calling `NormalizeService(serviceName)` in a few
  places. Is there any reason we shouldn't be _always_ normalizing the
  service name?
  - e.g. why don't we normalize `DefaultServiceName` automatically?

## Test coverage

Covered by existing tests generally

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530 👈
- #7532
- #7543
- #7544
andrewlock added a commit that referenced this pull request Oct 16, 2025
## Summary of changes

Don't pass a `Tracer` instance to `PerTraceSettings.GetServiceName()`
(as it's not required)

## Reason for change

Cleanup/simplify the code.

#7530 exposes `MutableSettings` and `DefaultServiceName` on
`PerTraceSettings`. Reading `DefaultServiceName` was the only reason
`GetServiceName()` took a `Tracer`, so this is now unnecessary.

## Implementation details

- Use `MutableSettings.DefaultServiceName` directly inside
`PerTraceSettings.GetServiceName()`
- Stop passing in `Tracer` instance
- Cleanup usages

## Test coverage

Covered by existing, just a refactoring

## Other details

Could have done it as part of #7530 but trying to keep things small!

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530
- #7532 👈
- #7543
- #7544
andrewlock added a commit that referenced this pull request Oct 22, 2025
…ropriate (#7543)

## Summary of changes

Fix usages of `Tracer.Instance.Settings` to use
`Tracer.Instance.CurrentTracerSettings.Settings` where appropriate

## Reason for change

This PR "fixes" the places that were previously grabbing the
environment/service etc from `TracerSettings` to use the
`MutableSettings` exposed via `CurrentTracerSettings` instead, as the
location where these settings will ultimately exist.

This is effectively still just a refactoring, but prepares for the point
where these settings aren't exposed on `TracerSettings` at all. The
updates in this PR are for cases where you _don't_ have long-lived
services, and rather need to do ad-hoc `Tracer.Instance` grabbing of the
setting values in a global context. Note too that many of these places
_could_ be updated in the future to subscribe to changes if that
provides performance benefits. Also note that I elected not to change
most calls to `IsIntegrationEnabled()` etc in this PR as there are
hundreds of locations. The follow-up PR handles that.

Also found a few cases that were incorrectly assuming that these values
cannot change. Marked them with 'TODO: Subscribe to changes in settings'

## Implementation details

- Mostly find and replace to use `CurrentTracerSettings.Settings`
- Occasional extraction of a variable where it makes sense to avoid
repeated access
- Functionally identical currently (where `MutableSettings` is replaced
on `TracerSettings`) but will be a required change once we stop
replacing `TracerManager`.

## Test coverage

Covered by existing tests

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530
- #7532
- #7543 👈
- #7544
andrewlock added a commit that referenced this pull request Oct 22, 2025
…tIntegrationAnalyticsSampleRate()` (#7544)

## Summary of changes

Fix usages of `IsIntegrationEnabled()`, `IsErrorStatusCode()`, and
`GetIntegrationAnalyticsSampleRate()` to use `MutableSettings` instead
of `TracerSettings`

## Reason for change

These functions are dependent on `MutableSettings`, and are exposed
there, so making sure we call the methods there, and remove the
delegation from `TracerSettings` entirely.

## Implementation details

- Find and replace usages
- Remove the old delegating methods

## Test coverage

Just a refactor, so covered by existing tests

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530
- #7532
- #7543
- #7544 👈
- #7695
andrewlock added a commit that referenced this pull request Oct 31, 2025
…ndows (#7721)

## Summary of changes

Enforces that you can't _change_ the `AgentUri` to be a UDS Uri if
you're on Windows

## Reason for change

The trace exporter doesn't work with UDS on Windows, so we have a check
in `TracerSettings` that disables the pipeline if we find this scenario.
However, users can still _change_ the agent URI at runtime in code (😭).

We currently assume that the data pipeline won't be toggled at runtime
(we _do_ allow for reconfiguring it in general, but not for completely
removing or reintroducing). Changing this to allow the scenario would be
a pain, so instead this PR blocks you from setting a UDS URI in code if
you're on Windows.

The good news is that as far as I can tell, no one does this today, so
while _technically_ it could be considered a breaking change, I think
it's ok.

## Implementation details

- Throw an `ArgumentException` in the Datadog.Trace.Manual library, if
you're on Windows (or .NET FX) and you try to set a UDS agent URI (using
the same "detection" we do in `ExporterSettings`).
- Add a check in the Instrumentation of `Tracer.Configure()` to make
sure it hasn't slipped through. This could happen if a customer was
using an old version of the Datadog.Trace NuGet package with a newer
version of auto instrumentation.

Note that this adds two additional framework references for .NET Core
3.1+, to check if we're on Windows.
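
A minimal sketch of such a guard is shown below. `AgentUriGuard` is a hypothetical name, and recognising a UDS endpoint by a `unix` URI scheme is an assumption; the actual detection in `ExporterSettings` may differ. The OS check is passed in as a parameter so the logic can be exercised on any platform.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AgentUriGuard
{
    // Assumption: a Unix domain socket endpoint is identified by a "unix" URI scheme.
    public static bool IsUnixDomainSocket(Uri agentUri) =>
        string.Equals(agentUri.Scheme, "unix", StringComparison.OrdinalIgnoreCase);

    // Throws when a UDS agent URI is supplied on a platform flagged as Windows.
    public static void ThrowIfUnsupported(Uri agentUri, bool isWindows)
    {
        if (isWindows && IsUnixDomainSocket(agentUri))
        {
            throw new ArgumentException(
                "Unix domain socket agent URIs are not supported on Windows.",
                nameof(agentUri));
        }
    }

    // Convenience overload that detects the current operating system.
    public static void ThrowIfUnsupported(Uri agentUri) =>
        ThrowIfUnsupported(agentUri, RuntimeInformation.IsOSPlatform(OSPlatform.Windows));
}
```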

## Test coverage

Added an extra step to the manual instrumentation integration test to
confirm we throw

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

- #7522
- #7525
- #7530
- #7532
- #7543
- #7544
- #7721 👈
- #7722
- #7695
- #7723
- #7724
andrewlock added a commit that referenced this pull request Nov 3, 2025
## Summary of changes

Add a helper for comparing `ReadOnlyDictionary<>` instances

## Reason for change

As part of the config work, we need to detect if tags have changed when
customers do a manual/remote config update. This helper makes it easy

## Implementation details

Added a `SequenceEqual` extension method.

Note that I used `SequenceEqual` because it _already_ exists in
_System.Linq_, but I could see an argument that it's too easy to use the
wrong one, and instead we could use a different name? `IsSameAs(other)`?

Also, I only wrote this for `ReadOnlyDictionary<>` because that's all we
need, it's what we use for all our setting dictionaries, and it will be
(a tiny bit) faster than making it `IDictionary<>`, but happy to change
if people feel strongly.
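
For illustration, a helper along these lines could look like the sketch below. This is not the code from the PR: the body, the null handling, and the choice to ignore enumeration order are all assumptions. It does show the naming concern raised above, because unlike `System.Linq`'s `SequenceEqual`, this version compares by key rather than by position.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ReadOnlyDictionaryExtensions
{
    // Returns true when both dictionaries hold the same keys with equal values.
    public static bool SequenceEqual<TKey, TValue>(
        this ReadOnlyDictionary<TKey, TValue>? first,
        ReadOnlyDictionary<TKey, TValue>? second)
        where TKey : notnull
    {
        if (ReferenceEquals(first, second)) return true;
        if (first is null || second is null) return false;
        if (first.Count != second.Count) return false;

        var comparer = EqualityComparer<TValue>.Default;
        foreach (var pair in first)
        {
            // Equal counts plus every key of 'first' matching in 'second'
            // implies the two key sets are identical.
            if (!second.TryGetValue(pair.Key, out var value) || !comparer.Equals(pair.Value, value))
            {
                return false;
            }
        }

        return true;
    }
}
```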

## Test coverage

Added unit tests

## Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack


- #7522
- #7525
- #7530
- #7532
- #7543
- #7544
- #7721
- #7722 👈
- #7695
- #7723
- #7724

---------

Co-authored-by: Steven Bouwkamp <[email protected]>
4 participants