Skip to content

Conversation

gizas
Copy link
Contributor

@gizas gizas commented Apr 4, 2025

Description

This is an ongoing work to add :

  • In the endOp function of ObsReport, errors are classified as either:
    • Downstream errors (refused) - when internal.IsDownstreamError(err) returns true
    • Internal errors - when the error is not a downstream error
  • New metrics otelcol_receiver_internal_errors_spans, otelcol_receiver_internal_errors_metric_points and otelcol_receiver_internal_errors_log_records in case of inetrnal errors returned above
  • Support of otelcol_receiver_requests metric that will determines the outcome of the request:
    • "success" if there's no error
    • "refused" if there's a downstream error
    • "failure" if there's an internal error

Link to tracking issue

Fixes #12207

Testing

Only added the integration tests.

Documentation

See updated receiver/receiverhelper/documentation.md

gizas added 17 commits April 3, 2025 18:30
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Apr 24, 2025
@jade-guiton-dd
Copy link
Contributor

@gizas Are you still working on this? Or should it be marked as ready for review?

@github-actions github-actions bot removed the Stale label Apr 25, 2025
@gizas
Copy link
Contributor Author

gizas commented Apr 25, 2025

@jade-guiton-dd I will try to fix the linting errors and will open it as ready for review. thanks for the ping

@gizas gizas marked this pull request as ready for review April 29, 2025 13:28
@gizas
Copy link
Contributor Author

gizas commented Aug 19, 2025

@jade-guiton-dd I think I have made some progress. I have made the choice to add a new flag additional to existing telemetry.newPipelineTelemetry
So if someone now wants to test this should run:

otelcol --config=config.yaml --feature-gates=+telemetry.newPipelineTelemetry,+telemetry.distinguishDownstreamErrors
Screenshot with some manual tests I have even built and run some manual tests and I can see e2e the metrics appear: Screenshot 2025-08-13 at 1 42 22 PM

Let me know wdyt

Signed-off-by: Andreas Gkizas <[email protected]>
@gizas
Copy link
Contributor Author

gizas commented Aug 19, 2025

Hm, it seems pretty weird that your PR would affect the coverage of tests in the exporterhelper. Are you sure it's related to your changes, and isn't due to a change that was made on main?

No are not related to my changes. But are indirect refernces. See the links above. Ok I will also remove them. I agree will be cleaner

Copy link
Contributor

@jade-guiton-dd jade-guiton-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good, although there are still a few issues to fix. The go.mod required versions are also still wrong.

@jade-guiton-dd
Copy link
Contributor

I'm surprised that test fails considering the feature gate should be disabled by default 🤔

Signed-off-by: Andreas Gkizas <[email protected]>
@jade-guiton-dd
Copy link
Contributor

jade-guiton-dd commented Aug 20, 2025

Looking at the contrib-tests currently running in CI, I don't see any failures for receiver/awsxrayreceiver. I'm wondering if you weren't testing against an older version of your code, with the incorrect logic in obsreport?

@gizas
Copy link
Contributor Author

gizas commented Aug 21, 2025

Looking at the contrib-tests currently running in CI, I don't see any failures for receiver/awsxrayreceiver. I'm wondering if you weren't testing against an older version of your code, with the incorrect logic in obsreport?

Indeed @jade-guiton-dd our comments crossed. I wrote it the comment for the contrib-tests before seeing yours to update the logic. So later I did the updates. (I saw some errors today but unrelated to our changes). I have fixed the lint errors and retriggered the CI. Let's see now

gizas added 2 commits August 21, 2025 10:56
Signed-off-by: Andreas Gkizas <[email protected]>
Signed-off-by: Andreas Gkizas <[email protected]>
@gizas
Copy link
Contributor Author

gizas commented Aug 22, 2025

@jade-guiton-dd do you think are we ready to merge? anything else missing?

Copy link
Contributor

@jade-guiton-dd jade-guiton-dd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're about done, I just have a few more suggestions regarding documentation. Thank you for your patience with all these back-and-forths.

@jade-guiton-dd jade-guiton-dd added the ready-to-merge Code review completed; ready to merge by maintainers label Aug 25, 2025
@mx-psi mx-psi added this pull request to the merge queue Aug 25, 2025
Merged via the queue into open-telemetry:main with commit a3ff02b Aug 25, 2025
60 of 75 checks passed
crobert-1 added a commit to signalfx/splunk-otel-collector that referenced this pull request Aug 28, 2025
otelcol_receiver_failed_metric_points was added upstream in
open-telemetry/opentelemetry-collector#12802
and is enabled by default. This test is failing because it expected
17 metrics, but is now receiving 18.
crobert-1 added a commit to signalfx/splunk-otel-collector that referenced this pull request Aug 28, 2025
* Update OpenTelemetry Dependencies to latest

* Update test to account for new internal telemetry metric

otelcol_receiver_failed_metric_points was added upstream in
open-telemetry/opentelemetry-collector#12802
and is enabled by default. This test is failing because it expected
17 metrics, but is now receiving 18.

* Same as 43d812b but for smart agent receiver teset

---------

Co-authored-by: Curtis Robert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[receiverhelper] Add metric for requests that failed to be received
4 participants