-
Couldn't load subscription status.
- Fork 701
Bump go.opentelemetry.io/proto/otlp to v1.7.0
#4212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
go.opentelemetry.io/proto/otlp to v1.7.0go.opentelemetry.io/proto/otlp to v1.7.0
no. I made the dumps with something like this +var dumpNo = atomic.Int64{}
+
func (h *ingestHandler) Export(ctx context.Context, er *pprofileotlp.ExportProfilesServiceRequest) (*pprofileotlp.ExportProfilesServiceResponse, error) {
+ dump, _ := proto.Marshal(er)
+ _ = os.WriteFile(fmt.Sprintf("dump_%d", dumpNo.Add(1)), dump, 0666)I suggest you regenerate the first testcase |
|
Other than that it looks good to me. Does the new version work for asprof? 🤞 🥳 |
Ok thanks, let me know if you're happy with the new profiles in
Works like a charm 😃 Just tested with OTLP |
|
Head branch was pushed to by a user without write access
Sure: 75f3b60
Yeah I can do that, I'll follow up in the coming days |
|
I'm looking at the test failure, will push again soon |
Head branch was pushed to by a user without write access
|
Fixed, I wrongly assumed at least one sample type should always be provided |
|
As a side note, I noticed Pyroscope crashes altogether when it ingests an invalid OTLP payload. For instance, I had an off-by-one bug involving the string table in one of my profiles. This is not related to this PR, is it something that should be looked at? |
Do you know when and why this is not the case? How did it work before?
We better not panic and fix the panic. At the same time I thought we had a per-request panic recovery mechanisms. If we do not have them for otlp, we should add them. We don't have to do this in this PR and do a followups if needed. |
| res.dst.PeriodType = res.convertValueTypeBack(src.PeriodType, dictionary) | ||
|
|
||
| var defaultSampleType string | ||
| if len(src.SampleType) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I am confused a bit. What is the value of DefaultSampleTypeIndex if len(src.SampleType) is 0?
Do you think we should reject such profiles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know when and why this is not the case?
I think there should always be one item in Profile.sample_type for each item in Profile.sample.value. Having e.g. len(Profile.sample.value) == 1 and len(Profile.sample_type) == 0 seems like a malformed profile to me.
How did it work before?
There used to be a field called default_sample_type_strindex which would be used to index directly into the string table. However, this has been replaced with default_sample_type_index, which should be used to index into Profile.sample_type. So we didn't need to check anything, we could just take whatever value was at the first index of the string table.
What is the value of DefaultSampleTypeIndex if len(src.SampleType) is 0?
DefaultSampleTypeIndex can be unset, in which case it defaults to zero. If len(src.SampleType) == 0, accessing src.SampleType[DefaultSampleTypeIndex] will cause an out-of-bounds exception. Thus, the value of DefaultSampleTypeIndex is invalid, and the profile is malformed.
Do you think we should reject such profiles?
I could not find an indication in the OTEL profiling spec about whether profiles with an empty sample_type list should be considered malformed and rejected. I would find it logical, otherwise there should be a mandated default sample type.
I'll think about it, and maybe raise this topic at the next Profiling SIG meeting. For the time being, I think assigning a default value to defaultSampleType is a safe choice.
|
Thank you for working on this |
This reverts commit 837db07.
|
The PR caused panics in our internal dev environment, where we dogfood OTLP format, so I had to revert it: #4218 |
|
Hi @kolesnikovae, sorry for that. Where can I find the configuration code for this test? I'd like to check it out locally. If it's using |
| func serviceNameFromSample(sample *otelProfile.Sample, dictionary *otelProfile.ProfilesDictionary) string { | ||
| for _, attributeIndex := range sample.AttributeIndices { | ||
| attribute := p.AttributeTable[attributeIndex] | ||
| attribute := dictionary.AttributeTable[attributeIndex] | ||
| if attribute.Key == serviceNameKey { | ||
| return attribute.Value.GetStringValue() | ||
| } | ||
| } | ||
| return "" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look to deep, but it feels like a major change in the OTel spec, where we add a Dictionary member (to share parts across resource profiles). In the code we assume that the member is always present, and we hit nil deference the first time we access it – on this line.
Although it is an experimental API in dev, let's handle it more gracefully: if specs don't say anything about the fallback, we should drop the profile/request.
|
It's all cool @fandreuz – no worries. I think it's here https://github.com/grafana/pyroscope/pull/4212/files#r2122527475 We should add sanitization for things like broken references, semantic issues, off-by-one errors, etc. All sorts of issues are totally expected – clearly out of scope of the PR – I'll create an issue to make sure we sanitize everything and handle such failures more gracefully. Update: created #4220 |
In this PR I update the OTLP version to
v1.7.0.All unit tests are green, but integration tests in
pkg/test/integration/ingest_otlp_test.goaren't:.pb.binfiles intestdatamust be updated. Is there an automated way to run the eBPF profiler on some known programs to refresh the test data? I put together an updated version of opentelemetry-ebpf-profiler consuming open-telemetry/opentelemetry-collector#13075.