Skip to content

(#209) Redact inputs in spool files #210

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

LMacchi
Copy link

@LMacchi LMacchi commented Jul 11, 2025

Associated issue: #209

  • Redact metadata['request']['input'] from choria.json
  • Remove wrapper_stdin

Sample after applying changes:

/opt/puppetlabs/mcollective/tasks-spool # cat 9818ae3c4cc5507fb4abf20927f88adb/choria.json | jq
{
  "start_time": 1752186657,
  "caller": "choria=me.mcollective",
  "task": "my_tasks::sleep_with_pwd_ruby",
  "request": {
    "executable": "/opt/puppetlabs/mcollective/tasks-spool/9818ae3c4cc5507fb4abf20927f88adb/files/my_tasks/tasks/sleep_with_pwd_ruby.rb",
    "arguments": [],
    "input": {},
    "stdout": "/opt/puppetlabs/mcollective/tasks-spool/9818ae3c4cc5507fb4abf20927f88adb/stdout",
    "stderr": "/opt/puppetlabs/mcollective/tasks-spool/9818ae3c4cc5507fb4abf20927f88adb/stderr",
    "exitcode": "/opt/puppetlabs/mcollective/tasks-spool/9818ae3c4cc5507fb4abf20927f88adb/exitcode"
  }
}
/opt/puppetlabs/mcollective/tasks-spool # cat 9818ae3c4cc5507fb4abf20927f88adb/wrapper_stdin
cat: wrapper_stdin: No such file or directory

I can still retrieve the task summary and metadata:

~$ mco tasks status 9818ae3c4cc5507fb4abf20927f88adb -v -I server123 --summary

 * [ ============================================================> ] 1 / 1



Summary for task 9818ae3c4cc5507fb4abf20927f88adb

                       Task Name: my_tasks::sleep_with_pwd_ruby
                          Caller: choria=me.mcollective
                       Completed: 1
                         Running: 0

                      Successful: 1
                          Failed: 0

                Average Run Time: 5.09s

And metadata:

~$ mco tasks status 9818ae3c4cc5507fb4abf20927f88adb -v -I server123 --metadata

 * [ ============================================================> ] 1 / 1

  server123


---- Task Stats ----
           Nodes: 1 / 1
     Pass / Fail: 0 / 1
      Start Time: 2025-07-11 19:58:35 +0000
  Discovery Time: 0.00ms
      Agent Time: 1059.72ms
      Total Time: 1059.72ms

@LMacchi
Copy link
Author

LMacchi commented Jul 11, 2025

Happy to test other scenarios or discuss other approaches. I don't think inputs are used after execution of the task itself started, but maybe I missed something like retries?

@ripienaar
Copy link
Member

@smortex you have any opinion on this? Seems like a good change to me

@smortex
Copy link
Member

smortex commented Jul 24, 2025

I have mixed feelings. Due to the way the project currently work, we have to write the input to disk. I understand this can be a concern when this input contain tokens / passwords, but removing the information when it is not needed anymore may gives a false sense of security because at some point the information was there. Also, we log auditing information that contain these values to /var/log/choria-audit.log, so striping the info in one place while it is still available in another is maybe not ideal.

Currently, regular users should not have access to the spool files nor to the log file, so I am not concerned a lot about this (but the few tokens that appear in my infrastructure are short-lived and expire as soon as a a task is terminated, it probably helps).

Ideally, we should have some IPC that avoid writing intermediate files to the disk, and maybe Sensitive data redacted when written to the logs, but that does not look trivial. Also, the output of a command may be sensitive too, and could possibly need some care similar to the input.

@LMacchi can you detail your situation?

@LMacchi
Copy link
Author

LMacchi commented Jul 24, 2025

Oh, no, I hadn't realized about /var/log/choria-audit.log, but I can see the inputs there too.

A bit more context:

My company is trying to deploy Choria, and with any new tech adoption, we go through a compliance process. One of the things our auditors check is secrets in disk. We found the secrets in the inputs wrapper and in choria.json (and now the audit log) and I tried to come up with a MR to mitigate.

All the inputs are marked as sensitive in the metadata, but I couldn't find how to access that from within the choria tasks code. I'm personally not worried about outputs since we managed to drill it into everyone's heads to not output secrets, but I can see it being an issue too.

I like the idea of not saving to disk! Unfortunately I don't have enough knowledge to help with this.

As extra info, we also have PE deployed so I looked at our past compliance testing and found and seems like PE tasks don't rely on inputs in disk and they don't get logged - so it passed our compliance process.

Lemme know if I can provide any more info.

@ripienaar
Copy link
Member

In theory we could introduce a sensitive flag to the Request DDL and the fields will then not be logged to audit log. And we can prevent the input from being written.

It's ok but as @smortex says quite brittle, the better fix is to rethink how these things are run etc, Perforce has gotten a lot more attention to this code, but ultimately for me hard to dedicate time, someone will have to improve this.

Though I think I'll do the DDL stuff anyway that seems valuable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants