Propagate SIGTERM to uv run
#6738
Makes sense, should we be forwarding all signals?
|
I don't think this forwards them per se, it just kills the process if we see them. Other signals might have different meanings, right...? |
|
Oh.. hm. I'll need to look closer. |
| // signal handlers after the command completes. | ||
| let _handler = tokio::spawn(async { while tokio::signal::ctrl_c().await.is_ok() {} }); | ||
|
|
||
| let status = handle.wait().await.context("Child process disappeared")?; | ||
| let mut term_signal = signal(SignalKind::terminate())?; | ||
| let mut int_signal = signal(SignalKind::interrupt())?; |
I think you need to remove the above ctrl_c() handler if we're going to replicate it here, right?
|
Reading about this in https://unix.stackexchange.com/questions/176235/fork-and-how-signals-are-delivered-to-processes, it seems like both the child and the parent should already receive the signal? That's why above we say:
And just eat all the SIGINT we receive in the parent process. |
|
More interesting discussion in docker/compose#10898 (comment) and docker/cli#4402 (comment)
|
|
Basically — I think we should be forwarding all signals to the child process but we'll need an exception for SIGINT in an interactive session because it's probably already been sent to the child process by a shell. 😭 |
|
Hi @charliermarsh , I have tested this PR the same way as I stated in #6724 and sadly it doesn't seem to actually fix the issue 😬 Maybe it's not about sigterm after all? I can't really say. All I know is that the service doesn't stop when you Ctrl+C the service started with docker compose |
|
Thanks @overfl0! I appreciate you trying it out. I can also repro the issue, though hadn't had a chance to test out this fix. |
|
I wonder if it wouldn't be better to replace the current process completely (a la Disclaimer: I'm not sure about the overall design of I see that
/// Spawns a command exec style.
pub fn exec_spawn(cmd: &mut Command) -> Result<Infallible, Error> {
// this is technically only necessary on windows
crate::disable_ctrlc_handler();
#[cfg(unix)]
{
use std::os::unix::process::CommandExt;
let err = cmd.exec();
Err(err.into())
}
#[cfg(windows)]
{
cmd.stdin(Stdio::inherit());
let status = cmd.status()?;
std::process::exit(status.code().unwrap())
}
}
What would be the pros/cons of not using Tokio's |
|
That's also discussed briefly in #3095 |
|
|
||
| // `SIGINT` | ||
| _ = int_signal.recv() => { | ||
| handle.kill().await?; |
Wouldn't it be important to preserve the signal kind, so still a SIGINT to the child?
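To make the question above concrete, here is a minimal, std-only sketch of forwarding a specific signal to a child instead of calling `kill()` on the handle, which always sends SIGKILL. This is not the code in this PR: `forward_signal` and `terminating_signal_after_forward` are hypothetical names, the `kill` binding is declared by hand to avoid external crates, the signal number 15 is hard-coded on the assumption of Linux or macOS, and a `sleep` binary is assumed to be on `PATH`.

```rust
use std::io;
use std::os::unix::process::ExitStatusExt;
use std::process::{Child, Command};

// Hand-written binding to libc's kill(2) so the sketch needs no external crates.
unsafe extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}

/// SIGTERM's number on Linux and macOS (hard-coded assumption for this sketch).
const SIGTERM: i32 = 15;

/// Hypothetical helper: send `sig` to the child, preserving the signal kind.
fn forward_signal(child: &Child, sig: i32) -> io::Result<()> {
    let pid = child.id() as i32;
    // SAFETY: kill(2) takes two integers and does not touch our memory.
    let rc = unsafe { kill(pid, sig) };
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

/// Hypothetical demo: spawn `sleep`, forward SIGTERM, and report which
/// signal (if any) terminated the child.
fn terminating_signal_after_forward() -> io::Result<Option<i32>> {
    let mut child = Command::new("sleep").arg("30").spawn()?;
    forward_signal(&child, SIGTERM)?;
    Ok(child.wait()?.signal())
}
```

With this approach the child's exit status reports the forwarded signal, so a process that installs a SIGTERM or SIGINT handler still gets the chance to shut down gracefully.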
|
I was pointed here from another issue with something that might be related. So just to leave it here: did you consider how an application deployed with Uvicorn (or another web server) might be used with uv? In the context of signal control of the server, that is. Sending SIGHUP, SIGTTIN, or SIGTTOU to your container to scale Uvicorn from its orchestration layer, like Docker Swarm, has meaning, as only the PID 1 process in the container will receive the signal, in the case of Here's some context: https://www.kaggle.com/code/residentmario/best-practices-for-propagating-signals-on-docker Anyway, I don't think it's something I can help with, but I wanted to point it out. Maybe it's a separate issue; I don't know how you would rather track that, but I think it's an important thing to consider. Thanks. |
|
Maybe documenting this consideration would be a good idea? |
|
Hello all, wondering what's missing here for this to be folded into the next release? I see some checks are failing but also that it was approved. If this ensures some propagation it would be super helpful for my team to have as right now we see the same behavior in the original reference issue re: dockerized processes and we depend on being able to properly gracefully shutdown rather than have a forced kill after the sigterm wait period. |
|
This is missing a design and subsequent changes to address my comments at #6738 (comment) |
|
@zanieb I think what's more important is that this PR doesn't actually fix the issue, see my comment below yours: #6738 (comment) |
jupyter-client also sends signals to the process group, so I'm worried about that one too if running the ipykernel using |
Took another look at this and I also confirmed that when sending a signal it was not passed to the underlying process as expected. I now see what @zanieb mentioned about some interactive shells passing signals directly when run from an interactive context, but I'm still not convinced that's really a big deal. At this point I'd rather receive double the signals than keep the current behavior |
|
A few things after exploring where things are going wrong. Keep in mind I do not know Rust and am only looking at it because this tool is written in it; I may need some help to confirm my findings.
I am slow in Rust, as mentioned, but I have been looking at this in earnest, though I recognize that someone who actually knows the language will be much faster to take these on, if indeed these are the right things to address |
|
@charliermarsh @zanieb sorry for the dup PR, but I didn't want to push to your branch (and I'm not sure I could anyway). I added a few commits and I believe I have a fix ready here: #8933, though I might need some guidance in case CI fails
## Summary

This PR builds off of #6738 to fix #6724 (sorry for the new PR @charliermarsh, I didn't want to push to your branch, and I'm not even sure I could).

The reason the original PR doesn't fix the issue described in #6724 is that FastAPI is run in the project context (as I assume a lot of use cases are). This PR adds an extra commit to handle the signals in the project/run.rs file.

~It also addresses the comment [here](https://github.com/astral-sh/uv/pull/6738/files#r1734757548) to not use the tokio ctrl-c method since we are now handling SIGINT ourselves~

Update: tokio handles SIGINT in a platform-agnostic way, and intercepting it ourselves makes the logic more complicated on Windows, so I decided to leave the tokio ctrl-c handler.

~[This comment](https://github.com/astral-sh/uv/pull/6738/files#r1743510140) remains unaddressed, however, the Child process does not have any other methods besides kill() so I don't see how we can "preserve" the interrupt call :/ I tried looking around but no luck.~

Updated: this PR is reduced to only handling SIGTERM propagation on Unix machines, and the SIGTERM sent to the child is preserved by making use of the nix package, instead of relying on tokio, which only allowed for `kill()` on a child process.

## Test Plan

I tested this by building the Docker container locally with these changes and tagging it "myuv", and then using that as the base image in uv-docker-example (and of course following the rest of the repro steps in #6724). In my tests I see that ctrl-c in the docker-compose up command exits the process almost immediately 👍

---------

Co-authored-by: Charlie Marsh <[email protected]>
There should be two functional changes here:

- If we receive SIGINT twice, forward it to the child process
- If the `uv run` child process changes its PGID, then forward SIGINT

Previously, we never forwarded SIGINT to a child process. Instead, we relied on the shell to do so. On Windows, we still do nothing but eat the Ctrl-C events we receive. I cannot see an easy way to send them to the child.

The motivation for these changes should be explained in the comments.

Closes #10952 (in which Ray changes its PGID).

Replaces the (much simpler) #10989 with a more comprehensive approach.

See #6738 (comment) for some previous context.
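The two SIGINT rules described in this commit message can be sketched as a small decision function. This is an illustration only, not uv's implementation: `SigintState` and `should_forward_sigint` are hypothetical names, and the process group IDs are assumed to have been looked up elsewhere.

```rust
/// Hypothetical snapshot of what the parent knows when a SIGINT arrives.
#[derive(Debug, Clone, Copy)]
struct SigintState {
    /// Number of SIGINTs seen so far, including the one being handled.
    sigint_count: u32,
    /// Process group ID of the parent (the `uv run` process).
    parent_pgid: i32,
    /// Process group ID of the child at the time of the signal.
    child_pgid: i32,
}

/// Decide whether the parent should forward SIGINT to the child.
fn should_forward_sigint(state: SigintState) -> bool {
    // A terminal delivers Ctrl-C to the whole foreground process group, so a
    // child that left the parent's group will not have received it.
    let child_left_group = state.child_pgid != state.parent_pgid;
    // A second SIGINT suggests the first one did not reach or stop the child.
    let repeated = state.sigint_count >= 2;
    child_left_group || repeated
}
```

A first Ctrl-C with the child still in the parent's group is not forwarded, since the shell is assumed to have delivered it already. A second Ctrl-C, or a child with a different PGID, is forwarded.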
As I suspected quite some time ago (#6738 (comment)), it's problematic that we don't handle _every_ signal here. This PR adds handling for all of the Unix signals except `SIGCHLD`, `SIGIO`, and `SIGPOLL`, which seem incorrect to forward. Also notable: we _cannot_ handle `SIGKILL`, so if someone sends that to the PID instead of the PGID, they will leave dangling subprocesses.

Instead, we could use `exec` and avoid this handling. However, we'd lose the ability to add a nice error message on failure (e.g., as someone is trying to add in #12201) and, more critically, we'd need to figure out how to clean up resources properly (i.e., temporary directories), which currently happens on `Drop`. In the long term, we'll probably want an option to use `exec`, but we'll need to figure out when to clean up resources or accept that they will dangle. This was last discussed in #3095; discussion on that approach should continue there.

A note on the implementation: I spent time trying to write the handler using a tokio stream, so we could dynamically iterate over a list of signals instead of copy/pasting the implementation. I couldn't get it to work, though, and it didn't seem critical.

Closes #12830
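The forwarding policy described in this commit message can be summarized as a lookup over signal kinds. This is a sketch under stated assumptions, not the handler from the commit: `UnixSignal`, `Handling`, and `handling_for` are hypothetical names, and the enum lists only a small sample of signals.

```rust
/// Hypothetical, deliberately incomplete list of Unix signals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnixSignal {
    Hup,
    Int,
    Quit,
    Term,
    Usr1,
    Usr2,
    Chld,
    Io,
    Poll,
    Kill,
}

/// What the parent does when it would receive a given signal.
#[derive(Debug, PartialEq, Eq)]
enum Handling {
    /// Pass the same signal on to the child process.
    Forward,
    /// Received by the parent but not passed on.
    NotForwarded,
    /// Cannot be caught, so the parent never gets the chance to forward it.
    Uncatchable,
}

fn handling_for(sig: UnixSignal) -> Handling {
    match sig {
        // Named in the commit message as incorrect to forward.
        UnixSignal::Chld | UnixSignal::Io | UnixSignal::Poll => Handling::NotForwarded,
        // SIGKILL terminates the parent immediately; no handler can run.
        UnixSignal::Kill => Handling::Uncatchable,
        UnixSignal::Hup
        | UnixSignal::Int
        | UnixSignal::Quit
        | UnixSignal::Term
        | UnixSignal::Usr1
        | UnixSignal::Usr2 => Handling::Forward,
    }
}
```

The exhaustive `match` means that adding a new variant forces an explicit decision about it. SIGINT appears under `Forward` here for simplicity, although the thread above discusses treating it specially in interactive sessions.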
Summary
See: #6724
Test Plan
(Needs testing.)