-
Notifications
You must be signed in to change notification settings - Fork 3k
[receiver/datadog] Support 128 bits TraceID coming from Datadog #39654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[receiver/datadog] Support 128 bits TraceID coming from Datadog #39654
Conversation
d4a5fca
to
8012d98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to dig in to this issue!
This seems like a very useful fix, but would probably constitute a breaking change for existing users of the component. As such I think we'll need to make it opt-in using a featuregate.
You can check for examples of using featuregates in other components, but they're relatively easy to set up. If you have one set to StageAlpha, it will be disabled by default, and then in later versions we will move it to Beta (enabled by default) and then the gate can later be removed. The gate will then need to be configured at runtime using a flag like --feature-gates=receiver.datadogreceiver.Enable128BitTraceID
Here's an example of what defining the gate looks like in code, copied and modified slightly from another component
var FullTraceIDFeatureGate = featuregate.GlobalRegistry().MustRegister(
"receiver.datadogreceiver.Enable128BitTraceID",
featuregate.StageAlpha,
featuregate.WithRegisterDescription("<description."),
featuregate.WithRegisterFromVersion("v0.125.0"),
)
...
if FullTraceIDFeatureGate.Enabled() {
...
receiver/datadogreceiver/internal/translator/traces_translator.go
Outdated
Show resolved
Hide resolved
@dehaansa Thanks for reviewing! I've added a featuregate in af83cf2, let me know if anything. |
58129d5
to
014c6e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will try to get codeowner review before marking ready for merge.
71347ee
to
fa7e5d3
Compare
When testing activating
Full stack trace
I don't know how to get the details of the Datadog span that causes this problem. Here are my findings: |
receiver/datadogreceiver/internal/translator/traces_translator.go
Outdated
Show resolved
Hide resolved
Example exception 1:
|
9d8cbdb
to
b3f8d1d
Compare
Thanks for testing and the help debugging the issue! I've also improved error handling so as not to reject the whole |
b3f8d1d
to
0f8a876
Compare
I could successfully test the fix:
With otelcol 0.125.0, the traces of the |
Unrelated vulnerability addressed in #39862 |
Could we get a codeowner review? @boostchicken @gouthamve @MovieStoreGuy |
6e567c7
to
b130d80
Compare
Please resolve the conflicts :) |
With this commit, we add support for 128 bits TraceIDs coming from Datadog instrumented services. This can happen when an OTel instrumented service calls a downstream Datadog instrumented one. Datadog instrumentation libraries store the 128 bits TraceID into two different fields: * TraceID: lower 64 bits of the 128 bits TraceID * _dd.p.tid: upper 64 bits of the 128 bits TraceID This commit adds logic that reconstructs the 128 bits TraceID. Before this commit, only the lower 64 bits were used as TraceID. Fixes open-telemetry#36926
This helps with reusing trace id across multiple flushes. This commit also adds a new configuration item: `trace_id_cache_size`, which will be used to size the LRU cache. If translator.ToTraces doesn't get a cache (e.g. nil), it'll disable the feature.
As noted by @cyrille-leclerc, some spans failed to convert their trace id to 128 bits. This was due to a naive string concatenation vs properly splitting the trace id at the byte level as in uInt64ToTraceID. Previously, we were also rejecting the traces in the current ToTraces run. Now, we just skip converting and keep the 64-bit trace id if anything.
b130d80
to
6f904c0
Compare
@MovieStoreGuy done! |
…-telemetry#39654) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description With this commit, we add support for 128 bits TraceIDs coming from Datadog instrumented services. This can happen when an OTel instrumented service calls a downstream Datadog instrumented one. Datadog instrumentation libraries store the 128 bits TraceID into two different fields: * TraceID: lower 64 bits of the 128 bits TraceID * _dd.p.tid: upper 64 bits of the 128 bits TraceID This commit adds logic that reconstructs the 128 bits TraceID. Before this commit, only the lower 64 bits were used as TraceID. <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Fixes open-telemetry#36926 <!--Describe what testing was performed and which tests were added.--> #### Testing Tested the setup with the following chain: OTel Instrumented Service --> Datadog Instrumented Service -> OTel Instrumented Service. The TraceID was maintained on the whole chain. Also added a unit test (`TestToTraces64to128bits`). #### Documentation Updated README.md in [58129d5](open-telemetry@58129d5)
…-telemetry#39654) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description With this commit, we add support for 128 bits TraceIDs coming from Datadog instrumented services. This can happen when an OTel instrumented service calls a downstream Datadog instrumented one. Datadog instrumentation libraries store the 128 bits TraceID into two different fields: * TraceID: lower 64 bits of the 128 bits TraceID * _dd.p.tid: upper 64 bits of the 128 bits TraceID This commit adds logic that reconstructs the 128 bits TraceID. Before this commit, only the lower 64 bits were used as TraceID. <!-- Issue number (e.g. #1234) or full URL to issue, if applicable. --> #### Link to tracking issue Fixes open-telemetry#36926 <!--Describe what testing was performed and which tests were added.--> #### Testing Tested the setup with the following chain: OTel Instrumented Service --> Datadog Instrumented Service -> OTel Instrumented Service. The TraceID was maintained on the whole chain. Also added a unit test (`TestToTraces64to128bits`). #### Documentation Updated README.md in [58129d5](open-telemetry@58129d5)
Description
With this commit, we add support for 128 bits TraceIDs coming from Datadog instrumented services. This can happen when an OTel instrumented service calls a downstream Datadog instrumented one. Datadog instrumentation libraries store the 128 bits TraceID into two different fields:
This commit adds logic that reconstructs the 128 bits TraceID. Before this commit, only the lower 64 bits were used as TraceID.
Link to tracking issue
Fixes #36926
Testing
Tested the setup with the following chain: OTel Instrumented Service --> Datadog Instrumented Service -> OTel Instrumented Service. The TraceID was maintained on the whole chain.
Also added a unit test (
TestToTraces64to128bits
).Documentation
Updated README.md in 58129d5