Add interrupts collector filtering #3028
CC @rtreffer
Very nice 🥳
var (
	interruptsInclude = kingpin.Flag("collector.interrupts.name-include", "Regexp of interrupts name to include (mutually exclusive to --collector.interrupts.name-exclude).").String()
	interruptsExclude = kingpin.Flag("collector.interrupts.name-exclude", "Regexp of interrupts name to exclude (mutually exclusive to --collector.interrupts.name-include).").String()
	interruptsIncludeZeros = kingpin.Flag("collector.interrupts.include-zeros", "Include interrupts that have a zero value").Default("true").Bool()
Should this be in the README, too?
We don't really have full flag documentation in the README.
1c04df1 to ae6f414
In order to reduce cardinality of the interrupts collector add filtering options

* Add include/exclude regexp filter flags.
* Add boolean flag to include zero values, enabled by default.

Signed-off-by: Ben Kochie <[email protected]>
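The filtering described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the collector's actual code: `nameFilter`, `newNameFilter`, and `keep` are hypothetical names, and the sketch assumes that an empty pattern means "no filter" and that patterns are matched unanchored.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// nameFilter is a hypothetical helper for this sketch; it is not
// node_exporter's own type.
type nameFilter struct {
	include *regexp.Regexp
	exclude *regexp.Regexp
}

// newNameFilter compiles at most one of the two patterns.
// Assumption: an empty string means the filter is not set.
func newNameFilter(include, exclude string) (nameFilter, error) {
	var f nameFilter
	if include != "" && exclude != "" {
		return f, errors.New("include and exclude patterns are mutually exclusive")
	}
	var err error
	if include != "" {
		if f.include, err = regexp.Compile(include); err != nil {
			return f, err
		}
	}
	if exclude != "" {
		if f.exclude, err = regexp.Compile(exclude); err != nil {
			return f, err
		}
	}
	return f, nil
}

// ignored reports whether the name is rejected by the configured pattern.
func (f nameFilter) ignored(name string) bool {
	if f.include != nil {
		return !f.include.MatchString(name)
	}
	if f.exclude != nil {
		return f.exclude.MatchString(name)
	}
	return false
}

// keep decides whether one interrupt sample should be exported.
func keep(f nameFilter, includeZeros bool, name string, value float64) bool {
	if f.ignored(name) {
		return false
	}
	if !includeZeros && value == 0 {
		return false
	}
	return true
}

func main() {
	f, err := newNameFilter("^(TLB|NMI)$", "")
	if err != nil {
		panic(err)
	}
	// Hypothetical sample values, for illustration only.
	samples := map[string]float64{"TLB": 6912, "NMI": 0, "LOC": 42}
	for name, value := range samples {
		fmt.Printf("%s kept=%t\n", name, keep(f, false, name, value))
	}
}
```

Dropping zero-valued samples and restricting names both cut the number of exported series, which is the cardinality reduction the commit is after.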
ae6f414 to 92367ab
To support performance monitoring on mainnet, add a tool that calculates custom metrics and exports them to Prometheus's `node_exporter` through the `textfile` collector. Currently, the total number of TLB shootdowns across all CPUs is exposed as `sum_tlb_shootdowns`, collected once per minute. The latest `node_exporter` does not allow filtering the data of its built-in `interrupts` collector, which could otherwise do this for us (until prometheus/node_exporter#3028 is included in the release branches); enabling that collector unfiltered would add many high-cardinality metrics. NODE-1445
In order to reduce cardinality of the interrupts collector add filtering options

* Add include/exclude regexp filter flags.
* Add boolean flag to include zero values, enabled by default.

Signed-off-by: Ben Kochie <[email protected]>
Signed-off-by: Vitaly Zhuravlev <[email protected]>
* [CHANGE] meminfo: Convert linux implementation to use procfs lib #3049
* [CHANGE] Update logging to use Go log/slog #3097
* [FEATURE] filesystem: Add `node_filesystem_mount_info` metric #2970
* [FEATURE] btrfs: Add metrics for commit statistics #3010
* [FEATURE] interrupts: Add collector include/exclude filtering #3028
* [FEATURE] interrupts: Add "exclude zeros" filtering #3028
* [FEATURE] slabinfo: Add filters for slab name. #3041
* [FEATURE] pressure: add IRQ PSI metrics #3048
* [FEATURE] hwmon: Add include and exclude filter for sensors #3072
* [FEATURE] filesystem: Add NetBSD support #3082
* [FEATURE] netdev: Add ifAlias label #3087
* [FEATURE] hwmon: Add Support for GPU Clock Frequencies #3093
* [FEATURE] Add `exclude[]` URL parameter #3116
* [FEATURE] Add AIX support #3136
* [FEATURE] filesystem: Add fs-types/mount-points include flags #3171
* [FEATURE] netstat: Add collector for tcp packet counters for FreeBSD. #3177
* [ENHANCEMENT] ethtool: Add logging for filtering flags #2979
* [ENHANCEMENT] netstat: Add TCPRcvQDrop to default metrics #3021
* [ENHANCEMENT] diskstats: Add block device rotational #3022
* [ENHANCEMENT] cpu: Support CPU online status #3032
* [ENHANCEMENT] arp: optimize interface name resolution #3133
* [ENHANCEMENT] textfile: Allow specifying multiple directory globs #3135
* [ENHANCEMENT] filesystem: Add reporting of purgeable space on MacOS #3206
* [ENHANCEMENT] ethtool: Skip full scan of NetClass directories #3239
* [BUGFIX] zfs: Prevent `procfs` integer underflow #2961
* [BUGFIX] pressure: Fix collection on systems that do not expose a full CPU stat #3054
* [BUGFIX] cpu: Fix FreeBSD 32-bit host support and plug memory leak #3083
* [BUGFIX] hwmon: Add safety check to hwmon read #3134
* [BUGFIX] zfs: Allow space in dataset name #3186

Signed-off-by: Ben Kochie <[email protected]>
In order to reduce cardinality of the interrupts collector add filtering options