Skip to content

Conversation

@jj-asama
Copy link
Contributor

@jj-asama jj-asama commented Sep 19, 2025

There are two changes in this PR.

[1] Enabling additional metrics including SRIOV, power management details and numa node of the pci device. This depends on prometheus/procfs#748
[2] Optional (disabled by default) feature to convert pci vendor/device/class ids to names from the system pci.ids file or a file passed as argument.

I had an internal repo where I had done this earlier. Recently saw the PR for pcidevice_linux.go and thought it will be a good idea to add my work into the same collector.
I have also added nil pointer checks to all optional fields in sysfs.PciDevice struct.

I understand that some might find [2] unnecessary, but it has been very useful in our use cases (both LLM and human use). Hence I have added it but kept disabled by default. Please let me know your thoughts. @discordianfish

PS: Tests are failing because of the dependency on prometheus/procfs#748

@jj-asama
Copy link
Contributor Author

Hi @SuperQ , this is a continuation of prometheus/procfs#748. Has some additional features as well. Please take a look and share your thoughts.

@SuperQ
Copy link
Member

SuperQ commented Oct 22, 2025

Released and updated procfs. #3444

Please rebase. 🎉

* add sriov, power info support and pci id name resolution

Signed-off-by: Jain Johny <[email protected]>

* fix/remove debug lines

Signed-off-by: Jain Johny <[email protected]>

---------

Signed-off-by: Jain Johny <[email protected]>
* add numa_node and missing test output file

Signed-off-by: Jain Johny <[email protected]>
Signed-off-by: Jain Johny <[email protected]>
Signed-off-by: Jain Johny <[email protected]>
Signed-off-by: Jain Johny <[email protected]>

pcideviceNumaNodeDesc = prometheus.NewDesc(
prometheus.BuildFQName(namespace, pcideviceSubsystem, "numa_node"),
"NUMA node number for the PCI device. -1 indicates unknown or not available.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to just not emit the metric if it's unknown / not available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@SuperQ
Copy link
Member

SuperQ commented Oct 22, 2025

Looks good, a couple questions.

@SuperQ SuperQ merged commit 2f8f920 into prometheus:master Oct 22, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants