-
Notifications
You must be signed in to change notification settings - Fork 43
[KVBlock.Index] Prometheus Metrics & Logging #53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Maroon Ayoub <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all GIE metrics are added to the controller-runtime metrics registry which is then served from main.
You should be registering metrics into the same registry.
See https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/4ffb5f6dd4f6d004cfbcde6dd4da78009acdfba9/pkg/epp/metrics/metrics.go#L262 for example.
Signed-off-by: Maroon Ayoub <[email protected]>
Thank @elevran, I split the implicit registration call ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(very :-)) minor nit: consider tracking only hits, not both hits and miss
pkg/kvcache/metrics/collector.go
Outdated
// LookupRequests counts how many Lookup() calls have been made. | ||
LookupRequests = prometheus.NewCounter(prometheus.CounterOpts{ | ||
Namespace: "kvcache", Subsystem: "index", Name: "lookup_requests_total", | ||
Help: "Total number of lookup calls", | ||
}) | ||
// KeyLookupResults counts keys examined, labelled by result “hit” or | ||
// “miss”. | ||
KeyLookupResults = prometheus.NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: "kvcache", Subsystem: "index", Name: "lookup_key_total", | ||
Help: "Number of keys looked up by result (hit/miss)", | ||
}, []string{"result"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to decipher what these mean.
if you have two counters, one for total lookups and one for total hits, you can use PromQL later to calculate rates over time periods as well as "trends".
The LookupRequests
seems to be the total lookups. Is KeyLookupResults
just two counters - one for hit and one for miss?
Can't you do miss-rate
= (total
-hit
)/total
on the fly instead of counting both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, done.
Signed-off-by: Maroon Ayoub <[email protected]>
Help: "Total number of lookup calls", | ||
}) | ||
// LookupHits counts how many keys were found in the cache on Lookup(). | ||
LookupHits = prometheus.NewCounter(prometheus.CounterOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do it per request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't that cause an infinite number of metrics (one per request since each would have a different label)?
If this is mostly for debug/validation, maybe we can consider a "tracing" header to the request so it logs the hit ratio when done? Requests without the special header won't log information to avoid flooding the log.
Summary
Added
kvblock.Index
metrics collection for prometheus, with support for logging periodically.