The problem: No way to read input from argument
I am again dealing with a lot of jq-ing and I keep running into this issue: I want to send data directly into jq via an argv[] argument, not via the standard input descriptor. But so far, and please correct me if I am wrong, this seems to be impossible.
While some more regular/traditional jq users might oppose the idea, it would be an extremely powerful feature that would come in handy in many specialty situations: a subshell, parallel, xargs, execline, socat, or countless other such specialty cases.
More than five years ago I requested something like this and was redirected to --arg and --argjson. While I was thankful, and while those are infinitely useful, they are not the same thing! I haven't been jq-ing that much since then, but now I am once again, and the inability to do this is killing me.
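For context, the closest approximation available today binds the JSON with --argjson and suppresses stdin with -n, but the program must then reference the bound variable instead of operating on . directly (a minimal sketch with made-up sample data):
# works today, but the filter must address $j rather than the usual "."
jq -n --argjson j '{"name":"kube-proxy"}' -r '$j.name'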
For example: more modern GNU xargs versions have grown support for -o, --open-tty. This makes the xargs process able to consume data from its own /dev/stdin, but still, for each record/line handling "execution", xargs rebinds the /dev/stdin of the child ([command] ... {}) to its original controlling terminal again. This allows you to process entries from a file/pipe as usual, yet each record "handler" can still communicate with the user on the tty (for example, for password entry). It is nigh impossible to use jq efficiently in this setup without mucking around with subshelling idioms like X="$(echo "${json_data}" | jq -r '.somefield')". This also requires one to spawn sh -c for each xargs "record", just to be able to do the subshelling.
For example: when using JSON as a "binary-safer" (and structured) string-processing format, especially in shell scripts, which is a very convenient and powerful ability, one often ends up mucking around with
VAR="$(echo "${JSON_DATA}" | jq -r '.somefield')"
again, just to extract the value of .somefield from a specific ${JSON_DATA}. Similarly, despite various modern shell optimizations, this can sometimes (and in certain setups) spawn three sub-processes: a subshell, echo, and jq (!), and it also constructs a pipeline. All just to "lift" a single field (or field chain) from the input JSON.
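With the proposed flag, the same extraction would need neither the echo nor the pipe, only the jq fork itself (a sketch, since --input does not exist yet):
# one fork, no pipeline construction, no stdin plumbing
VAR="$(jq --input "${JSON_DATA}" -r '.somefield')"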
For example: in the execline language (and the socat case is very similar) one would benefit greatly from the ability to access fields of structured input data directly, which would make these tools much more powerful. But because jq cannot read its input from an argument, one has to wrap the "input sending part" into the pipeline command in the case of execline, or into sh -c in the case of socat, to get access to the fields, again.
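To illustrate the execline case (a sketch, assuming execlineb is installed; the sample data is made up):
# without --input, the producer has to be wrapped in execline's
# `pipeline` command just to feed a known string to jq
execlineb -c 'pipeline { echo "{ \"somefield\": 42 }" } jq -r .somefield'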
In a nutshell, this feature would come in incredibly handy in ad-hoc API explorations and quick one-off jobs that iterate over larger datasets using any OS-level iterators or executors that fork a child, where the user would also benefit from /dev/stdin being left alone, or left open for other uses.
While some might argue that for such jobs one should use something like python, that language is not concise enough to cut through large swaths of data being pumped through command lines and pipelines, especially ad hoc. The jq language, on the other hand, is sufficiently terse and syntax-efficient for exactly that kind of work.
Suggested solution
Thus I propose the introduction of an --input / -i argument that would take the next string argument as input and make jq consume it verbatim as its input buffer, preferably ignoring /dev/stdin handling completely. Whether --input should exist in argv[] as a singleton, similarly to the "jq program" argument, is probably best left to the jq maintainers to decide. But to maintain parity with the "jq program" argument handling, and to decrease implementation complexity, I suggest the singleton approach, i.e. exactly one --input allowed: jq would read either from stdin or from the --input argument.
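In other words, the proposed semantics would be (a sketch; neither --input nor -i exists in jq today):
# proposed: the input comes from argv, stdin is never touched
jq --input '{"name":"kube-proxy"}' -r '.name'
# the exact equivalent today costs a pipe and an extra process
echo '{"name":"kube-proxy"}' | jq -r '.name'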
Usage example
This is a little bit contrived, but I hope it illustrates the point well, so please bear with me.
Let's say one needs to perform some specific ad-hoc action for each container managed by cri-o on a k8s node. With --input I could get the .name field of each record directly (as if I were using the Unix-native cut(1)):
crictl ps -o json \
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]' \
| xargs -0 -I'%j' jq --input '%j' -r '"name:" + .name'
Annotation (careful, invalid shell code!):
# gets JSON data from some data producer
crictl ps -o json
# "slice" and massage the dataset for our needs, ie select specific fields
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]'
# now apply the resulting fields as "named columns" in the subcommand
# - we can "reference" fields directly from argv
| xargs -0 -I'%j' jq --input '%j' -r '"name:" + .name'
Without --input, one has to do this instead:
crictl ps -o json \
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]' \
| xargs -0 -I'%j' \
sh -c "printf 'name:%s\n' \$(echo '%j' | jq -r '.name')"
Observe that in the first case, the %j "variable" is just a raw string. The "expansion" is handled by xargs implicitly: it searches for the literal string %j in its own argument vector and simply copy-pastes each replacement into its child's argument vector (jq --input '%j' -r '"name:" + .name'), i.e. the jq subprocess argv literally goes:
From:
['jq', '--input', '%j', '-r', '"name:" + .name']
to
['jq', '--input', '{"name":"kube-proxy","id":"c31ef8zzssddrrtyt"}', '-r', '"name": + .name' ]
after each "line expansion", at the execve
level.
When combined with -0, this makes such executions very safe, with no worry that an in-between shell will somehow mangle the data. And we are not even talking about the reduction in the number of sub-forks, pipes, file descriptors, etc.
Because the maximum length of a single argv element is quite big these days, this allows one to do expansions like these:
crictl ps -o json \
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]' \
| xargs -0 -I'%j' \
sh -c "cmd-do-something-cmd-somewhere --name \$(jq -i '%j' -r '.name') --id \$(jq -i '%j' -r '.id')"
Here we save one echo and two file descriptors (one pipeline) per JSON data field dereference. If we want to be extra explicit:
crictl ps -o json \
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]' \
| xargs -0 -I'%j' \
sh -c "exec cmd-do-something-cmd-somewhere --name \$(exec jq -i '%j' -r '.name') --id \$(exec jq -i '%j' -r '.id')"
But without the --input provision, the most concise form I got to is this (abusing inline shell functions, which removes a lot of safety):
crictl ps -o json \
| jq --raw-output0 -c '[.containers[]|{name:.metadata.name,id:.id}]|.[]' \
| xargs -0 -I'$j' \
sh -c "j(){ jq -r \$@;};d(){ echo '\$j';}; cmd-do-something-cmd-somewhere --name \$(d|j '.name') --id \$(d|j '.id')"
While this might seem more compact (because of the shell "hacks"), it is a lot worse from the standpoint of both execution complexity and string safety.
I believe you can infer much more advanced and even nested usage from here, especially when taking into account more complex jq programs (loaded from files) for the initial jq "selector" part of the pipeline (the jq --raw-output0 -c '[.contain... part).
I hope what I wrote makes sense, and will make you consider this feature.