Commit 09bddbb

logging: include pid in create_file_writer (meta-pytorch#892)
Summary:
Pull Request resolved: meta-pytorch#892

`fbcode//monarch/hyperactor_mesh:hyperactor_mesh_proxy_test` is a standalone self bootstrapping program that uses `ProcessAllocator` to do the following:
- the driver creates a proc/process to host a `ProxyActor`
- initialization of the `ProxyActor` on the new proc/process creates a proc/process to host a `TestActor`

so, executing this program creates a 3 level process hierarchy `driver -> parent -> grandchild` where the `parent` process hosts a single proc/process (rank = 0) with one `ProxyActor` and the `grandchild` a single proc/process (rank = 0) with one `TestActor`.

using this program, i observe that as things stand, logs from the parent and the grandchild (since they share a common rank) are merged in the one file `/tmp/$USER/monarch_log_0.stdout`.

this diff disambiguates proc logs by incorporating the process ID of the mesh owner into the proc's log file name. so, for example, now there will be logs `monarch_log_3529266_0.stdout` (capturing the logs of the parent proc) and `monarch_log_3530444_0.stdout` (capturing the logs of the grandchild proc).

Reviewed By: highker

Differential Revision: D80349615

fbshipit-source-id: 3c37864fa3d5fe327f1d1e679df583fd5e023475
1 parent cac2e47 commit 09bddbb

1 file changed: +12 -1 lines changed
hyperactor_mesh/src/logging.rs

Lines changed: 12 additions & 1 deletion
@@ -383,7 +383,18 @@ fn create_file_writer(
     let (path, filename) = log_file_path(env)?;
     let path = Path::new(&path);
     let mut full_path = PathBuf::from(path);
-    full_path.push(format!("{}_{}.{}", filename, local_rank, suffix));
+
+    // This is the PID of the "owner" of the proc mesh, the proc mesh
+    // this proc "belongs" to. In other words,the PID of the process
+    // that invokes `cmd.spawn()` (where `cmd: &mut
+    // tokio::process::Command`) to start the process that will host
+    // the proc that this file writer relates to.
+    let file_created_by_pid = std::process::id();
+
+    full_path.push(format!(
+        "{}_{}_{}.{}",
+        filename, file_created_by_pid, local_rank, suffix
+    ));
     let file = std::fs::OpenOptions::new()
         .create(true)
         .append(true)
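
The effect of the naming change can be illustrated with a minimal standalone sketch. The helpers `old_log_path` and `new_log_path` below are hypothetical and are not part of `hyperactor_mesh`; they only mirror the two `format!` calls in the diff. The directory argument and the `usize` type for `local_rank` are assumptions made for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical helper mirroring the pre-change naming scheme:
/// `{filename}_{local_rank}.{suffix}`.
fn old_log_path(dir: &str, filename: &str, local_rank: usize, suffix: &str) -> PathBuf {
    let mut full_path = PathBuf::from(dir);
    full_path.push(format!("{}_{}.{}", filename, local_rank, suffix));
    full_path
}

/// Hypothetical helper mirroring the post-change naming scheme:
/// `{filename}_{pid}_{local_rank}.{suffix}`.
/// In the actual change the PID comes from `std::process::id()`; here it is
/// passed in explicitly so the two schemes can be compared deterministically.
fn new_log_path(dir: &str, filename: &str, pid: u32, local_rank: usize, suffix: &str) -> PathBuf {
    let mut full_path = PathBuf::from(dir);
    full_path.push(format!("{}_{}_{}.{}", filename, pid, local_rank, suffix));
    full_path
}
```

With the old scheme, two procs that both have rank 0 map to the same path regardless of which process owns them, which is why their logs were merged. With the new scheme, passing the example PIDs from the summary (3529266 and 3530444) yields `monarch_log_3529266_0.stdout` and `monarch_log_3530444_0.stdout`, so the parent and grandchild logs land in separate files.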
