[Feat][Core/Dashboard] Add SubprocessModules to the Dashboard routes, and convert HealthzHead #51282
Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: Chi-Sheng Liu <[email protected]>
Added back |
This reverts commit bbed411. Signed-off-by: Chi-Sheng Liu <[email protected]>
raise OSError(
    f"AF_UNIX path length cannot exceed {maxlen} bytes: {result!r}"
)
validate_socket_filepath(result.split("://", 1)[-1])
Do we need to keep .encode("utf-8")?
Do we want to use non-ASCII characters in filenames? If we only use ASCII characters, the length of the string and the length of the encoded bytes will be the same.
I don't know the history, but I think it's safer to keep it.
Added back .encode("utf-8") inside validate_socket_filepath.
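To illustrate why the byte length matters in the exchange above, here is a minimal sketch rather than the PR's actual code. The helper names `max_unix_socket_path_bytes`, `check_socket_filepath`, and `check_socket_address` are hypothetical, and the limits (103 bytes on macOS, 107 elsewhere) are assumed values for the usable part of `sun_path`.

```python
import sys


def max_unix_socket_path_bytes() -> int:
    # Assumed limits (sun_path size minus the trailing NUL byte):
    # 103 on macOS, 107 on Linux. Other platforms may differ.
    return 103 if sys.platform == "darwin" else 107


def check_socket_filepath(filepath: str) -> None:
    """Raise OSError when the UTF-8 encoded path is too long for AF_UNIX."""
    maxlen = max_unix_socket_path_bytes()
    # Measure bytes, not characters: a non-ASCII character takes more
    # than one byte in UTF-8, so len(filepath) can undercount.
    if len(filepath.encode("utf-8")) > maxlen:
        raise OSError(
            f"AF_UNIX path length cannot exceed {maxlen} bytes: {filepath!r}"
        )


def check_socket_address(address: str) -> None:
    # Strip an optional scheme such as "unix://" before validating.
    check_socket_filepath(address.split("://", 1)[-1])
```

With ASCII-only paths the two lengths agree, so the `.encode("utf-8")` call only changes the outcome when a path contains non-ASCII characters.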
    Detect parent process death by checking if ppid is still the same.
    """
    while True:
        ppid = os.getppid()
Return the parent’s process id. When the parent process has exited, on Unix the id returned is the one of the init process (1), on Windows it is still the same id, which may be already reused by another process.
This approach won't work on Windows.
Try multiprocessing.parent_process().is_alive() or a pipe.
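One possible reading of the pipe suggestion, sketched under assumptions: the child keeps the read end of a pipe whose write end is held only by the parent, so an end-of-file on the read end means the parent is gone. The function name `writer_has_exited` is hypothetical, and the demonstration runs in a single process, with closing the write end standing in for the parent exiting. Non-blocking pipe reads as used here are assumed to be available, which is the case on Unix.

```python
import os


def writer_has_exited(read_fd: int) -> bool:
    """Return True once every write end of the pipe has been closed."""
    try:
        # An empty read means EOF: no process holds the write end any more.
        return os.read(read_fd, 1) == b""
    except BlockingIOError:
        # No data yet, but the write end is still open somewhere.
        return False


# Single-process demonstration. In a real setup the parent would keep the
# write end and the child would keep only the read end.
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
alive_before = not writer_has_exited(read_fd)
os.close(write_fd)  # stands in for the parent process exiting
alive_after = not writer_has_exited(read_fd)
os.close(read_fd)
```

For this to work across processes, the child has to close its own inherited copy of the write end; otherwise the EOF never arrives.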
Use psutil instead. It can detect PID reuse.
https://psutil.readthedocs.io/en/latest/index.html#psutil.Process.is_running
Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: Chi-Sheng Liu <[email protected]>
    :param dashboard_head: The DashboardHead instance.
    """
    self._config = config
    self._parent_process = psutil.Process(parent_process_pid)
What if the parent is already dead and the PID has been reused?
This is in the constructor of the child process. So do you mean that the parent process dies immediately even before the child process constructor has run?
I've changed the implementation to use multiprocessing.parent_process().is_alive() because I found that the parent_process() call returns a variable maintained by Python, so it should be able to identify the parent process calling multiprocessing.
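A rough sketch of the approach described in this comment, not the PR's actual implementation: poll `multiprocessing.parent_process().is_alive()` from an asyncio loop and return once the parent is gone. The names `parent_is_alive` and `wait_for_parent_death`, the poll interval, and the choice to treat a missing parent as "alive" are all assumptions made for illustration.

```python
import asyncio
import multiprocessing
from typing import Callable


def parent_is_alive() -> bool:
    """Return False only when a multiprocessing parent exists and has exited."""
    parent = multiprocessing.parent_process()
    if parent is None:
        # This is the main process (not started by multiprocessing),
        # so there is no parent to watch.
        return True
    return parent.is_alive()


async def wait_for_parent_death(
    poll_interval_s: float = 1.0,
    is_alive: Callable[[], bool] = parent_is_alive,
) -> None:
    """Poll until the parent is gone, then return so the caller can shut down."""
    while is_alive():
        await asyncio.sleep(poll_interval_s)
```

Because the parent handle comes from `multiprocessing` itself rather than from a raw PID lookup, this avoids the PID-reuse concern raised earlier in the thread. The `is_alive` parameter exists only so the loop can be exercised without a real child process.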
Signed-off-by: Chi-Sheng Liu <[email protected]>
async def _detect_parent_process_death(self):
    """
    Detect parent process death by checking if ppid is still the same.
The comment needs to be updated.
updated
Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: Chi-Sheng Liu <[email protected]>
… routes, and convert HealthzHead (ray-project#51282)" This reverts commit ae7340d. Signed-off-by: Rui Qiao <[email protected]>
… routes, and convert HealthzHead (#51282)" (#51512) Signed-off-by: Rui Qiao <[email protected]>
…d routes, and convert HealthzHead (ray-project#51282)" (ray-project#51512) This reverts commit 8773682. Signed-off-by: Chi-Sheng Liu <[email protected]>
…d routes, and convert HealthzHead (#51282)" (#51512) (#51523) Signed-off-by: Chi-Sheng Liu <[email protected]>
… and convert HealthzHead (ray-project#51282) Signed-off-by: Chi-Sheng Liu <[email protected]> Signed-off-by: Dhakshin Suriakannu <[email protected]>
… routes, and convert HealthzHead (ray-project#51282)" (ray-project#51512) Signed-off-by: Rui Qiao <[email protected]> Signed-off-by: Dhakshin Suriakannu <[email protected]>
…d routes, and convert HealthzHead (ray-project#51282)" (ray-project#51512) (ray-project#51523) Signed-off-by: Chi-Sheng Liu <[email protected]> Signed-off-by: Dhakshin Suriakannu <[email protected]>
Why are these changes needed?
Based on #51172 and #49864 to convert HealthzHead to a subprocess dashboard module. DashboardHeadModules can still work.
Related issue number
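Checks
I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.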