Open
Description
On an oDAO node, the metrics endpoint takes an excessive amount of time to respond (roughly 19 to 44 seconds per request, see the timings below). This suggests that metrics are collected on demand during each scrape rather than maintained continuously.
Evidence:
- Metric endpoint response times:
  - from localhost:
    time curl -s 0:9102/metrics 0.00s user 0.01s system 0% cpu 19.347 total
  - from prometheus slave:
    time curl http://10.13.0.58:9102/metrics 0.00s user 0.01s system 0% cpu 44.452 total
- Impact visible in monitoring:
  - Significant increase in TCP sockets in the TIME_WAIT state
  - Elevated number of file descriptors held by the rocketpool process
  - No corresponding increase in system load
Suggested improvement:
Consider implementing continuous metric collection instead of on-demand gathering during scrape requests to reduce response latency.