Add the ability to wire-up listeners before starting a child process #38081

@TheYarin

Description

Is your feature request related to a problem? Please describe.
When registering multiple listeners (callbacks) on the data event of a child process's stdout, there is no way to make the child process wait until all the callbacks are registered before it starts. This leaves a window between registering the first listener and the second in which the first listener may consume the first available chunk; the second listener, once registered, never receives that chunk.

A thinned-down example:

// Expected behaviour scenario
const { exec } = require("child_process");
const p = exec("seq 1000"); // Prints the numbers from 1 to 1000, one per line
let a1 = '';
let a2 = '';
// Both listeners are registered in the same tick, before any output is delivered
p.stdout.on("data", (d) => a1 += d);
p.stdout.on("data", (d) => a2 += d);
// When the child process completes, a1 and a2 will both contain all the numbers from 1 to 1000

// Edge-case scenario (run as a separate script)
const { exec } = require("child_process");
const p = exec("seq 1000");
let a1 = '';
let a2 = '';
p.stdout.on("data", (d) => a1 += d);
// The second listener is registered 500 ms later, after the output has already been consumed
setTimeout(() => {
  p.stdout.on("data", (d) => a2 += d);
}, 500);
// When the child process completes, a1 will contain all the numbers from 1 to 1000 while a2 will remain an empty string

From what I understand from the documentation of child_process, when a child process is started Node.js holds its output in a buffer until a listener is registered (either by binding directly to the 'data' event or by pipe()ing stdout to a writable stream). This behaviour creates two potential problems:

  1. A second listener might not get the same data as the first one.
  2. The child process might output more data than the buffer can hold before any of it is processed.
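
To make the first problem easier to reproduce without timing a real process, here is a minimal sketch that uses a PassThrough stream as a stand-in for p.stdout (an assumption for illustration; a child process's stdout is a different Readable, but it follows the same paused/flowing rules). The variable names are made up for the example.

```javascript
const { PassThrough } = require("stream");

// Assumption: a PassThrough stands in for child.stdout so no process is needed.
const source = new PassThrough();
const flowingBefore = source.readableFlowing; // null: nothing is consuming yet

let first = "";
let second = "";

// Attaching a 'data' listener switches the stream into flowing mode.
source.on("data", (d) => (first += d));
const flowingAfter = source.readableFlowing; // true

source.write("1\n");

setImmediate(() => {
  // By now "1\n" has already been handed to the first listener,
  // so this late listener only sees what arrives from here on.
  source.on("data", (d) => (second += d));
  source.end("2\n");
});

source.on("end", () => console.log({ first, second }));
```

On a typical run, first should end up with both lines and second with only the last one, which is the same gap the edge-case scenario shows with setTimeout.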

Describe the solution you'd like
The solution I propose is to allow wiring up all the listeners and pipes before starting the child process.
With backwards compatibility in mind, I imagine the best way to achieve this is a new option (something like autostart, defaulting to true) in the options parameter of spawn, exec, etc. When it is set to false, those functions would return a ChildProcess instance that has not been started yet, and a new start() method on the ChildProcess class would launch it.
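
To illustrate the shape of the proposal, here is a rough userland sketch. It is not the proposed API: autostart and start() do not exist in child_process today, and DeferredChild and beforeStart are hypothetical names invented for this example. The idea is to queue the wiring callbacks, then launch the process and run all of them in the same tick, so no 'data' event can be delivered in between.

```javascript
const { exec } = require("child_process");

// Hypothetical stand-in for the proposed `autostart: false` + `start()`.
// Neither exists in child_process; all names here are illustrative only.
class DeferredChild {
  constructor(launch) {
    this.launch = launch; // function that actually creates the ChildProcess
    this.setups = [];     // wiring callbacks queued until start()
    this.child = null;
  }

  // Queue a callback that attaches listeners or pipes to the real ChildProcess.
  beforeStart(setup) {
    if (this.child) throw new Error("process already started");
    this.setups.push(setup);
    return this;
  }

  // Launch the process and run every queued callback in the same tick,
  // before the event loop gets a chance to deliver any 'data' event.
  start() {
    if (this.child) throw new Error("process already started");
    this.child = this.launch();
    for (const setup of this.setups) setup(this.child);
    return this.child;
  }
}

let a1 = "";
let a2 = "";
const deferred = new DeferredChild(() => exec("seq 1000"));
deferred.beforeStart((child) => child.stdout.on("data", (d) => (a1 += d)));

// The second listener is registered 500 ms later, as in the edge-case scenario,
// but the process is only launched once both listeners are queued.
setTimeout(() => {
  deferred.beforeStart((child) => child.stdout.on("data", (d) => (a2 += d)));
  deferred.start().on("close", () => console.log(a1 === a2));
}, 500);
```

A built-in option would presumably let listeners attach to the real stdout object before the process exists, which this wrapper cannot do; it only defers the launch.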

Describe alternatives you've considered
The alternatives as I see them are:

  1. Only register a single handler and pass the data around to your multiple destinations.
  2. Try to proxy the readable stream to a second one that is already wired up.
  3. Try your best to minimize that window and hope for the best.
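
The first two alternatives could look roughly like the sketch below. fanOut is a hypothetical helper name, and the proxy is a plain PassThrough from the stream module; both are only one possible way to do it.

```javascript
const { exec } = require("child_process");
const { PassThrough } = require("stream");

// Alternative 1: register ONE 'data' listener and hand each chunk to every consumer.
// `fanOut` is a hypothetical helper name.
function fanOut(readable, consumers) {
  readable.on("data", (chunk) => {
    for (const consume of consumers) consume(chunk);
  });
}

let a1 = "";
let a2 = "";
const p1 = exec("seq 1000");
fanOut(p1.stdout, [(d) => (a1 += d), (d) => (a2 += d)]);
p1.on("close", () => console.log("fan-out equal:", a1 === a2));

// Alternative 2: wire the listeners onto a proxy stream first, then pipe into it.
const proxy = new PassThrough();
let b1 = "";
let b2 = "";
proxy.on("data", (d) => (b1 += d));
proxy.on("data", (d) => (b2 += d));
proxy.on("end", () => console.log("proxy equal:", b1 === b2));

// Only now is the child created; the proxy is already fully wired.
exec("seq 1000").stdout.pipe(proxy);
```

Both variants still require every consumer to be known before the first chunk arrives; a consumer added later misses earlier data just as in the edge-case scenario.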

Metadata

    Labels

    child_process: Issues and PRs related to the child_process subsystem.
    feature request: Issues that request new features to be added to Node.js.
