-
Notifications
You must be signed in to change notification settings - Fork 249
A51 update: add ServerMetricRecorder API and QPS field #345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
markdroth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good! Comments are all minor.
I'll let @ejona86 be the approver for this PR, since he was the approver for this gRFC.
Please let me know if you have any questions. Thanks!
markdroth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! @ejona86 should also approve before merging.
|
Updated after today's sync:
|
A51-custom-backend-metrics.md
Outdated
| function recordUtilizationMetric(String name, double value); | ||
| // Records an opaque named metric measurement for the call. | ||
| function recordNamedMetricsMetric(String name, double value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The named_metrics and eps changes do not belong in this PR. We'll make those changes in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. I think I accidentally rebased on top of the follow-up change (eps/named metrics).
| // Records number of queries per second to the server in the range [0, infy). | ||
| // Values outside of the valid range are rejected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the WIP grpc java implementation, they are ignored, not rejected. Should we clarify here if by "rejected" we mean "ignored" - because rejected may imply the method could throw.
On the other hand, I'm a bit afraid if we just ignore the value. I agree that throwing is a bad idea, but just swallowing an error leads to reporting incorrect data. I don't think we can judge if skewing a particular metric is critical to a customer.
One example that come to mind: a customer is storing qps internally in an int, the service gets a huge number of requests, and the int gets overflowed. In this case, they'll call setQps() with a negative value, when in fact there was a huge spike. If another service goes down because of a qps spike, but our service doesn't report the qps increase, an engineer responding to the incident won't be able to identify the source of the request spike correctly.
I wonder if it'd make sense require logging a warning when an incorrect value is presented to the MetricRecorder/CallMetricRecorder. However, depending on how often the data is reported, it might lead to the logs getting flooded. Obviously, we can restrict how often to produce these warnings, but this may be going to far out-of-scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is for C++ and it's from comments in code so I don't think "rejected" may confuse. Java snippet does not have this word.
Regarding the range validation, I believe the caller can validate or log in such cases. Note that recorder APIs already ignore with no feedback today (before this change) and this PR just clarifies that. Today the value will be rejected during proto's range validation and/or when consumed by the client where neither is explicit.
No description provided.