Skip to content

[Merged by Bors] - added debounce to log #4269

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from

Conversation

eserilev
Copy link
Member

@eserilev eserilev commented May 7, 2023

Issue Addressed

#4259

Proposed Changes

debounce spammy Unable to send message to the beacon processor log messages

Additional Info

We could potentially debounce other logs that have the potential to be "spammy".

After some feedback we decided to additionally add the following change:

create a newtype wrapper around mpsc::Sender<BeaconWorkEvent<T>>. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.

@AgeManning
Copy link
Member

This seems fine to me, however CI is failing. Could you run cargo fmt to get CI to pass.

@michaelsproul michaelsproul added the waiting-on-author The reviewer has suggested changes and awaits thier implementation. label May 9, 2023
@eserilev eserilev requested a review from michaelsproul May 9, 2023 05:48
@michaelsproul michaelsproul added ready-for-review The code is ready for review v4.3.0 Estimated Q2 2023 and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels May 25, 2023
Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me but I wouldn't mind input from e.g. @paulhauner on whether he thinks it's OK that we don't de-bounce the log messages by work_type. It could be kinda useful to get one new message for each work_type, but then again, if the node is overwhelmed maybe it doesn't matter too much what is hammering it.

I think if we did want to differentiate by work_type we'd need multiple latches, which might be more trouble than it's worth.

@michaelsproul michaelsproul requested a review from paulhauner May 30, 2023 06:09
@paulhauner
Copy link
Member

@paulhauner on whether he thinks it's OK that we don't de-bounce the log messages by work_type

I like this PR but I also agree with Michael's point. Separately, I've been keen to see some metrics about which work events are being dropped by the BeaconProcessor since I don't think we have anything like that atm. Doing that would mitigate the issue of the log debounce; we'd get less logs but be able to see which message types are being dropped by looking at metrics.

To ensure we're always logging dropped messages, I don't think it would suffice to just add a metric where we're debouncing these logs since there are multiple places where this beacon_processor_send is used. I think the easiest way to ensure we catch all uses would be to create a newtype wrapper around the mpsc::Sender<BeaconWorkEvent<T>> BeaconProcessor. When there is an error on the try_send method on the wrapper, it could increase a counter metric with one label per work type. Similar to this:

metrics::inc_counter_vec(
&metrics::BEACON_PROCESSOR_WORK_EVENTS_RX_COUNT,
&[event.work.str_id()],
);

I'm open to doing this change separately if it's too much work for you at the moment @eserilev ☺️ No pressure.

@paulhauner paulhauner added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Jun 2, 2023
@eserilev
Copy link
Member Author

eserilev commented Jun 2, 2023

To ensure we're always logging dropped messages, I don't think it would suffice to just add a metric where we're debouncing these logs since there are multiple places where this beacon_processor_send is used. I think the easiest way to ensure we catch all uses would be to create a newtype wrapper around the mpsc::Sender<BeaconWorkEvent<T>> BeaconProcessor. When there is an error on the try_send method on the wrapper, it could increase a counter metric with one label per work type. Similar to this:

I like this idea, I can definitely include these changes as part of this PR. Thanks for the feedback

@eserilev
Copy link
Member Author

eserilev commented Jun 3, 2023

@paulhauner I made some changes here that hopefully should reflect the discussions above. It's ready for another review when you get the chance. Thanks!

@eserilev eserilev requested a review from michaelsproul June 5, 2023 07:11
@eserilev
Copy link
Member Author

I should also bounce error logs by work type, I'll push up those changes shortly

@eserilev
Copy link
Member Author

Looks good to me but I wouldn't mind input from e.g. @paulhauner on whether he thinks it's OK that we don't de-bounce the log messages by work_type. It could be kinda useful to get one new message for each work_type, but then again, if the node is overwhelmed maybe it doesn't matter too much what is hammering it.

I think if we did want to differentiate by work_type we'd need multiple latches, which might be more trouble than it's worth.

you're right, we would need additional latches to debounce log by work type. That might make things messy. Since we are now increasing a counter metric per work type, that may be sufficient enough?

@michaelsproul michaelsproul added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Jun 28, 2023
Copy link
Member

@paulhauner paulhauner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is great as is! Thanks @eserilev!

@paulhauner paulhauner added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Jun 29, 2023
@eserilev
Copy link
Member Author

eserilev commented Jun 29, 2023

sorry, there was a linting issue that i just noticed on the new BeaconProcessorSend newtype wrapper impl

pub fn try_send(&self, message: WorkEvent<T>) -> Result<(), TrySendError<WorkEvent<T>>>

 the `Err`-variant is at least 256 bytes
    |
    = help: try reducing the size of `tokio::sync::mpsc::error::TrySendError<beacon_processor::WorkEvent<T>>`, for example by boxing large elements or replacing it with `Box<tokio::sync::mpsc::error::TrySendError<beacon_processor::WorkEvent<T>>>`

I resolved by Boxing TrySendError. I also had to 'unbox' the match statement in router.rs

@paulhauner
Copy link
Member

I resolved by Boxing TrySendError. I also had to 'unbox' the match statement in router.rs

Thanks, I'm happy with this! I've refactored the WorkEvent struct in #4435, so I don't think it's worth digging deeper here to shrink that struct.

@paulhauner
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Jun 30, 2023
## Issue Addressed

[#4259](#4259)

## Proposed Changes

debounce spammy `Unable to send message to the beacon processor` log messages

## Additional Info

We could potentially debounce other logs that have the potential to be "spammy". 

After some feedback we decided to additionally add the following change:

create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
@bors
Copy link

bors bot commented Jun 30, 2023

@bors bors bot changed the title added debounce to log [Merged by Bors] - added debounce to log Jun 30, 2023
@bors bors bot closed this Jun 30, 2023
ghost pushed a commit to oone-world/lighthouse that referenced this pull request Jul 13, 2023
## Issue Addressed

[sigp#4259](sigp#4259)

## Proposed Changes

debounce spammy `Unable to send message to the beacon processor` log messages

## Additional Info

We could potentially debounce other logs that have the potential to be "spammy". 

After some feedback we decided to additionally add the following change:

create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
[sigp#4259](sigp#4259)

debounce spammy `Unable to send message to the beacon processor` log messages

We could potentially debounce other logs that have the potential to be "spammy".

After some feedback we decided to additionally add the following change:

create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
[sigp#4259](sigp#4259)

debounce spammy `Unable to send message to the beacon processor` log messages

We could potentially debounce other logs that have the potential to be "spammy".

After some feedback we decided to additionally add the following change:

create a newtype wrapper around `mpsc::Sender<BeaconWorkEvent<T>>`. When there is an error on the try_send method on the wrapper, we increase a counter metric with one label per work type.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge This PR is ready to merge. v4.3.0 Estimated Q2 2023
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants