ginkgo run in parallel can hang when child command outlives test suite #1191

@Luap99

Description

I have been debugging hangs in ginkgo v2 over the last few weeks, and I think I have finally found the problem.
I created https://github.com/Luap99/ginkgo-hang for a simple reproducer.

This diff seems to fix the issue for me:

diff --git a/internal/output_interceptor_unix.go b/internal/output_interceptor_unix.go
index f5ae15b..70d5647 100644
--- a/internal/output_interceptor_unix.go
+++ b/internal/output_interceptor_unix.go
@@ -26,6 +26,11 @@ func (impl *dupSyscallOutputInterceptorImpl) CreateStdoutStderrClones() (*os.Fil
        stdoutCloneFD, _ := unix.Dup(1)
        stderrCloneFD, _ := unix.Dup(2)
 
+       flags, _ := unix.FcntlInt(uintptr(stdoutCloneFD), unix.F_GETFD, 0)
+       unix.FcntlInt(uintptr(stdoutCloneFD), unix.F_SETFD, flags|unix.FD_CLOEXEC)
+       flags, _ = unix.FcntlInt(uintptr(stderrCloneFD), unix.F_GETFD, 0)
+       unix.FcntlInt(uintptr(stderrCloneFD), unix.F_SETFD, flags|unix.FD_CLOEXEC)
+
        // And then wrap the clone file descriptors in files.
        // One benefit of this (that we don't use yet) is that we can actually write
        // to these files to emit output to the console even though we're intercepting output

By setting FD_CLOEXEC on the cloned fds, we make sure they are never leaked into commands that are executed in the test suite.
I am happy to open a PR if you agree that this is the right approach to fix the problem.
