additional metrics for pcidevice and id to name conversion #3425
Hi @SuperQ, this is a continuation of prometheus/procfs#748, with some additional features as well. Please take a look and share your thoughts.
Released and updated procfs: #3444. Please rebase. 🎉
* add sriov, power info support and pci id name resolution
  Signed-off-by: Jain Johny <[email protected]>
* fix/remove debug lines
  Signed-off-by: Jain Johny <[email protected]>
* add numa_node and missing test output file
  Signed-off-by: Jain Johny <[email protected]>
pcideviceNumaNodeDesc = prometheus.NewDesc(
	prometheus.BuildFQName(namespace, pcideviceSubsystem, "numa_node"),
	"NUMA node number for the PCI device. -1 indicates unknown or not available.",
Do we want to just not emit the metric if it's unknown / not available?
Will do.
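For reference, the suggestion above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper name `numaNodeValue` is hypothetical, and the sketch assumes the NUMA node arrives as an optional pointer that is nil when the sysfs attribute is missing and negative when the kernel reports it as unknown.

```go
package main

// numaNodeValue is a hypothetical helper (not part of node_exporter or procfs).
// It returns the value to export for a device's NUMA node and whether the
// metric should be emitted at all.
//
// Assumption: numaNode is nil when the attribute could not be read, and
// negative (typically -1) when the kernel reports the node as unknown.
func numaNodeValue(numaNode *int) (float64, bool) {
	if numaNode == nil || *numaNode < 0 {
		// Unknown or unavailable: the caller should skip the metric
		// instead of exporting a sentinel value.
		return 0, false
	}
	return float64(*numaNode), true
}
```

In a collector's update loop, the metric would then be sent only when the second return value is true, for example by wrapping the `prometheus.MustNewConstMetric` call in an `if v, ok := numaNodeValue(...); ok { ... }` block. With that approach the "-1 indicates unknown" wording in the help text would no longer be needed.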
Looks good, a couple questions.
There are two changes in this PR:
[1] Enable additional metrics, including SR-IOV, power management details, and the NUMA node of the PCI device. This depends on prometheus/procfs#748.
[2] Add an optional feature (disabled by default) that converts PCI vendor/device/class IDs to names using the system pci.ids file or a file passed as an argument.
I had done this earlier in an internal repo. I recently saw the PR for pcidevice_linux.go and thought it would be a good idea to add my work to the same collector.
I have also added nil pointer checks for all optional fields in the sysfs.PciDevice struct.
I understand that some might find [2] unnecessary, but it has been very useful in our use cases (both LLM and human use). Hence I have added it but kept it disabled by default. Please let me know your thoughts. @discordianfish
PS: Tests are failing because of the dependency on prometheus/procfs#748.