Fix duplicate metrics in AWX subsystem_metrics #15964
I checked the logs from one of the awx_collection tests and it gave this:
I think those artifacts are publicly available. Click on the check output, then Summary, and the artifacts are listed at the bottom. They contain the logs, which often show what actually went wrong, like the trace above.
Hi @AlanCoding, thanks. I've updated my approach based on the failing tests. My initial naive solution was to remove the metrics from one of the subsystems to eliminate the duplicates, but that isn't sufficient. In my recent changes, I renamed the `subsystem_*` metrics into separate metrics for the Callback Receiver and the Dispatcher. This is technically a breaking change in the metrics API because three metrics have been removed and replaced by three metrics for the Callback Receiver and three for the Dispatcher. Can you please enable the checks again?
Hello there!
Hi @AlanCoding can you please re-run the tests?
Friendly ping @AlanCoding @chrismeyersfsu @kdelee
SUMMARY
Fixes #15179.
The original PR (#14775) that introduced this bug included a non-empty `_METRICSLIST` in the `Metrics` class, which was subsequently subclassed twice, by `DispatcherMetrics` and `CallbackMetrics`, causing duplicate metrics. The issue doesn't show up immediately upon startup. Once a workflow starts, these metrics appear twice in the `/metrics` endpoint.

This PR renames the `subsystem_*` metrics into separate metrics for both the `CallbackReceiver` and the `Dispatcher`.

This is technically a breaking change in the metrics API because three metrics have been removed and replaced by three metrics for the `CallbackReceiver` and three for the `Dispatcher`.
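
To illustrate the mechanism described above, here is a toy model of the bug and of the rename. It is a minimal sketch, not the actual AWX code: the class bodies, the `names()` and `exposed_names()` helpers, and every metric name ending in `_example_*` are hypothetical and only mirror the shape of the problem (a non-empty class-level list inherited by two subclasses).

```python
from collections import Counter


class Metrics:
    # Hypothetical stand-in for the base class: a NON-empty class-level list.
    # Every subclass inherits these entries unchanged.
    _METRICSLIST = ["subsystem_example_calls", "subsystem_example_seconds"]

    def names(self):
        return list(self._METRICSLIST)


class DispatcherMetrics(Metrics):
    pass


class CallbackMetrics(Metrics):
    pass


def exposed_names(*subsystems):
    """Concatenate the names each subsystem would publish on one endpoint."""
    out = []
    for subsystem in subsystems:
        out.extend(subsystem.names())
    return out


def duplicates(names):
    """Return the sorted names that occur more than once."""
    return sorted(n for n, count in Counter(names).items() if count > 1)


# Both subclasses publish the inherited names, so each one appears twice.
before = exposed_names(DispatcherMetrics(), CallbackMetrics())


class FixedMetrics:
    # Sketch of the rename: the base list is empty and each subsystem
    # declares its own, distinctly prefixed, names (illustrative only).
    _METRICSLIST = []

    def names(self):
        return list(self._METRICSLIST)


class FixedDispatcherMetrics(FixedMetrics):
    _METRICSLIST = ["dispatcher_example_calls", "dispatcher_example_seconds"]


class FixedCallbackMetrics(FixedMetrics):
    _METRICSLIST = ["callback_example_calls", "callback_example_seconds"]


after = exposed_names(FixedDispatcherMetrics(), FixedCallbackMetrics())

print(duplicates(before))  # the inherited subsystem_* names
print(duplicates(after))   # empty list: no name is shared any more
```

In this model, removing the names from only one subclass would also clear the duplicates, but it would leave that subsystem without its own measurements, which is why distinct per-subsystem names are used here.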
ISSUE TYPE
`/metrics` endpoint:
Prometheus logs:
Without this change, Prometheus drops the duplicate metrics and logs errors:
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
See #15179 (comment) for context
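
For anyone who wants to check a saved `/metrics` payload for the duplication described in the summary, the following is a rough sketch that scans Prometheus text-format output for series (metric name plus label set) that occur more than once. The function name `duplicate_series` and the sample payload are made up for illustration; the sample is not real AWX output.

```python
from collections import Counter


def duplicate_series(exposition_text):
    """Return series identifiers (name plus labels) seen more than once."""
    series = []
    for raw in exposition_text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Sample with labels: keep everything up to the closing brace.
            ident = line[: line.rindex("}") + 1]
        else:
            # Sample without labels: the name is the first token.
            ident = line.split()[0]
        series.append(ident)
    return sorted(s for s, count in Counter(series).items() if count > 1)


# Hypothetical payload, only to exercise the function.
SAMPLE = """\
# HELP example_metric_seconds Illustrative metric
# TYPE example_metric_seconds gauge
example_metric_seconds{node="awx-1"} 0.5
example_metric_seconds{node="awx-1"} 0.7
other_metric 3
"""

print(duplicate_series(SAMPLE))
```

Running it against output captured before and after a workflow starts should show whether any series is emitted twice.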