
perf collector not working on EPYC CPU #2469

@Xaraxia

Host operating system:

4.18.0-348.el8.0.2.x86_64 #1 SMP Sun Nov 14 00:51:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

node_exporter, version 1.3.1 (branch: HEAD, revision: a2321e7b940ddcff26873612bccdf7cd4c42b6b6)
  build user:       root@243aafa5525c
  build date:       20211205-11:09:49
  go version:       go1.17.3
  platform:         linux/amd64

node_exporter command line flags

--collector.perf

Are you running node_exporter in Docker?

No

What did you do that produced an error?

Ran the exporter

What did you expect to see?

Correctly running, able to get metrics

What did you see instead?

(Note: this run is as root to rule out sysctl/capability issues, but we see the same failure when running as a non-root user with those set appropriately.)

ts=2022-09-14T00:55:11.418Z caller=node_exporter.go:182 level=info msg="Starting node_exporter" version="(version=1.3.1, branch=HEAD, revision=a2321e7b940ddcff26873612bccdf7cd4c42b6b6)"
ts=2022-09-14T00:55:11.418Z caller=node_exporter.go:183 level=info msg="Build context" build_context="(go=go1.17.3, user=root@243aafa5525c, date=20211205-11:09:49)"
ts=2022-09-14T00:55:11.418Z caller=node_exporter.go:185 level=warn msg="Node Exporter is running as root user. This exporter is designed to run as unpriviledged user, root is not required."
ts=2022-09-14T00:55:11.419Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+)($|/)
ts=2022-09-14T00:55:11.419Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
panic: Couldn't create metrics handler: couldn't create collector: Failed to setup bus cycles profiler: pid (-1) cpu (0) "no such file or directory"; Failed to setup ref CPU cycles profiler: pid (-1) cpu (0) "no such file or directory"

goroutine 1 [running]:
main.newHandler(0x1, 0x28, {0xbfa3c0, 0xc00035a4c0})
	/app/node_exporter.go:66 +0x245
main.main()
	/app/node_exporter.go:188 +0x1165

I've done a fair bit of digging, and as far as I can tell the error is propagated all the way up the chain from https://pkg.go.dev/golang.org/x/sys/unix#PerfEventOpen

I'm not sure whether the issue is in perf-utils itself or higher/lower in the chain.

Perf is installed, and perf list hw gives the following:

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

So I'd expect bus-cycles to fail, because it is not in that list. For the ref CPU cycles profiler, the source code uses PERF_COUNT_HW_REF_CPU_CYCLES rather than PERF_COUNT_HW_CPU_CYCLES, which looks appropriate from what I'm reading, but ref-cycles is not in the list either, so I suspect it is also not supported on our architecture.
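To make that comparison reproducible, here is a rough sketch of how the perf list hw output could be checked programmatically. The function name parseHardwareEvents is made up for illustration and is not part of node_exporter or perf-utils; it assumes the column layout shown above, and that ref-cycles is the perf name for PERF_COUNT_HW_REF_CPU_CYCLES.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseHardwareEvents (hypothetical helper) extracts event names from
// `perf list hw` output. Aliases joined by " OR " are returned as
// separate names. Lines without the "[Hardware event]" tag are ignored.
func parseHardwareEvents(output string) map[string]bool {
	events := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		idx := strings.Index(line, "[Hardware event]")
		if idx < 0 {
			continue
		}
		for _, name := range strings.Split(line[:idx], " OR ") {
			if name = strings.TrimSpace(name); name != "" {
				events[name] = true
			}
		}
	}
	return events
}

func main() {
	// A shortened copy of the output from this host.
	out := `  branch-instructions OR branches                    [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]`

	events := parseHardwareEvents(out)
	for _, name := range []string{"cpu-cycles", "bus-cycles", "ref-cycles"} {
		fmt.Printf("%-12s listed=%v\n", name, events[name])
	}
}
```

On the output from this host, only cpu-cycles of those three is reported as listed, which matches the two profilers named in the panic message.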

I think there need to be some configuration options for which of these counters are tracked (or automatic detection, but that would be more work to code up for little gain IMO).

Thanks for the work on the exporter.
