-
Couldn't load subscription status.
- Fork 2.8k
Description
Issue Description
Using --sdnotify=healthy to notify systemd when a container is healthy leads to a panic.
Steps to reproduce the issue
Run a rootless container as a systemd service with -sdnotify=healthy.
ExecStart=/usr/bin/podman run --log-level=debug --sdnotify=healthy --cgroups=split --rm --replace --health-cmd="/usr/bin/pg_isready -U postgres" --network=none docker.io/postgres:16
Type=notify
NotifyAccess=all
Describe the results you received
Systemd is not notified and the service start-up times out.
Relevant entries from the journal:
May 08 21:24:06 podman[483918]: 2024-05-08 21:24:06.539529547 +0000 UTC m=+0.157916012 container init 17674e7c92fd7b87a541d6dd58cea965d892f2a65c380170315bf8f1a133c035 (image=docker.io/library/postgres:16, name=postgres, io.containers.autoupdate=registry, PODMAN_SYSTEMD_UNIT=container-postgres.service)
May 08 21:24:06 podman[483918]: time="2024-05-08T21:24:06Z" level=debug msg="Attaching to container 17674e7c92fd7b87a541d6dd58cea965d892f2a65c380170315bf8f1a133c035"
May 08 21:24:06 podman[483918]: time="2024-05-08T21:24:06Z" level=debug msg="Starting container 17674e7c92fd7b87a541d6dd58cea965d892f2a65c380170315bf8f1a133c035 with command [docker-entrypoint.sh postgres]"
May 08 21:24:06 conmon[483930]: conmon 17674e7c92fd7b87a541 <ndebug>: Accepted console connection 11
May 08 21:24:06 podman[483918]: time="2024-05-08T21:24:06Z" level=debug msg="Started container 17674e7c92fd7b87a541d6dd58cea965d892f2a65c380170315bf8f1a133c035"
May 08 21:24:06 podman[483918]: time="2024-05-08T21:24:06Z" level=debug msg="Notify sent successfully"
May 08 21:24:06 podman[483918]: 2024-05-08 21:24:06.573788763 +0000 UTC m=+0.192175219 container start 17674e7c92fd7b87a541d6dd58cea965d892f2a65c380170315bf8f1a133c035 (image=docker.io/library/postgres:16, name=postgres, PODMAN_SYSTEMD_UNIT=container-postgres.service, io.containers.autoupdate=registry)
May 08 21:24:06 podman[483918]: panic: operation not permitted
May 08 21:24:06 podman[483918]: goroutine 28 [running]:
May 08 21:24:06 podman[483918]: github.com/containers/podman/v5/libpod/lock.(*SHMLock).Unlock(0xc000118680?)
May 08 21:24:06 podman[483918]: /build/podman/src/podman/libpod/lock/shm_lock_manager_linux.go:135 +0x48
May 08 21:24:06 podman[483918]: github.com/containers/podman/v5/libpod.(*Container).start(0xc000118680, {0x60103c2dd8f8, 0x60103d06ac60})
May 08 21:24:06 podman[483918]: /build/podman/src/podman/libpod/container_internal.go:1311 +0x4b8
May 08 21:24:06 podman[483918]: github.com/containers/podman/v5/libpod.(*ConmonOCIRuntime).Attach(0xc0001d5e40, 0xc000118680, 0xc0004ee060)
May 08 21:24:06 podman[483918]: /build/podman/src/podman/libpod/oci_conmon_attach_common.go:89 +0x345
May 08 21:24:06 podman[483918]: github.com/containers/podman/v5/libpod.(*Container).StartAndAttach.func2()
May 08 21:24:06 podman[483918]: /build/podman/src/podman/libpod/container_api.go:185 +0x1ab
May 08 21:24:06 podman[483918]: created by github.com/containers/podman/v5/libpod.(*Container).StartAndAttach in goroutine 1
May 08 21:24:06 podman[483918]: /build/podman/src/podman/libpod/container_api.go:173 +0x318
May 08 21:24:06 postgres[483930]:
May 08 21:24:06 conmon[483930]: conmon 17674e7c92fd7b87a541 <nwarn>: Failed to write to remote console socket
May 08 21:24:06 postgres[483930]: PostgreSQL Database directory appears to contain a database; Skipping initialization
May 08 21:24:06 conmon[483930]: conmon 17674e7c92fd7b87a541 <ndebug>: Closing 11
[..]
May 08 21:26:07 podman[484109]: 2024-05-08 21:26:07.478242298 +0000 UTC m=+0.149717560 container health_status 16d63de9b28af0c05314339082a33b7c6c9847ef8c0f3c35348cbf68e5527c2a (image=docker.io/library/postgres:16, name=postgres, health_status=healthy, io.containers.autoupdate=registry, PODMAN_SYSTEMD_UNIT=container-postgres.service)
Note: the "Notify sent successfully" in the log refers to the message where the MAINPID was sent.
There should be another another message with the same text when the "READY" message is sent, which does not happen due to the panic.
Describe the results you expected
The goroutine should not panic, wait for the container to be healty and notify systemd.
podman info output
host:
arch: amd64
buildahVersion: 1.35.3
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- rdma
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: /usr/bin/conmon is owned by conmon 1:2.1.11-1
path: /usr/bin/conmon
version: 'conmon version 2.1.10, commit: e21e7c85b7637e622f21c57675bf1154fc8b1866'
cpuUtilization:
idlePercent: 98.31
systemPercent: 0.86
userPercent: 0.83
cpus: 4
databaseBackend: boltdb
distribution:
distribution: arch
version: unknown
eventLogger: journald
freeLocks: 2048
hostname: vmd79514
idMappings:
gidmap: null
uidmap: null
kernel: 6.6.30-1-lts
linkmode: dynamic
logDriver: journald
memFree: 180203520
memTotal: 8331169792
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: Unknown
package: /usr/lib/podman/netavark is owned by netavark 1.10.3-1
path: /usr/lib/podman/netavark
version: netavark 1.10.3
ociRuntime:
name: crun
package: /usr/bin/crun is owned by crun 1.15-1
path: /usr/bin/crun
version: |-
crun version 1.15
commit: e6eacaf4034e84185fd8780ac9262bbf57082278
rundir: /run/user/0/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: /usr/bin/pasta is owned by passt 2024_04_26.d03c4e2-1
version: |
pasta 2024_04_26.d03c4e2
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: false
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /etc/containers/seccomp.json
selinuxEnabled: false
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: /usr/bin/slirp4netns is owned by slirp4netns 1.3.0-1
version: |-
slirp4netns version 1.3.0
commit: 8a4d4391842f00b9c940bb8f067964427eb0c964
libslirp: 4.7.0
SLIRP_CONFIG_VERSION_MAX: 4
libseccomp: 2.5.5
swapFree: 0
swapTotal: 0
uptime: 77h 40m 24.00s (Approximately 3.21 days)
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries: {}
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 0
paused: 0
running: 0
stopped: 0
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev
graphRoot: /var/lib/containers/storage
graphRootAllocated: 214520823808
graphRootUsed: 126285017088
graphStatus:
Backing Filesystem: btrfs
Native Overlay Diff: "false"
Supports d_type: "true"
Supports shifting: "true"
Supports volatile: "true"
Using metacopy: "true"
imageCopyTmpDir: /var/tmp
imageStore:
number: 1
runRoot: /run/containers/storage
transientStore: false
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 5.0.2
Built: 1713438799
BuiltTime: Thu Apr 18 11:13:19 2024
GitCommit: 3304dd95b8978a8346b96b7d43134990609b3b29-dirty
GoVersion: go1.22.2
Os: linux
OsArch: linux/amd64
Version: 5.0.2Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
No response
Additional information
The commit that added this feature is 0cfd127
This is the relevant part (2nd frame of the backtrace above):
// Wait for the container to turn healthy before sending the READY
// message. This implies that we need to unlock and re-lock the
// container.
if !c.batched {
c.lock.Unlock()
defer c.lock.Lock()
}