boeddeker

I am not yet sure what the best way is to provide the files.
For multichannel files, like those used in LibriCSS, scp would be OK, but it does not generalize to multiple files that each contain a single channel.

@keisukekino (CC: @tnakieee) Could you test whether the code works for you?

I ran the code with
mpirun -np 16 python -m pb_chime5.scripts.kaldi_run_rttm_libri_css with storage_dir=exp database_rttm=overlap_ratio_40.0_sil0.1_1.0_session0_actual39.5.rttm job_id=1 number_of_jobs=1
where I replaced mpirun with our HPC command. With 103 cores, it took 4:40 minutes.

end = str(end).zfill(max_digits)

# return f'{file_id}_{speaker_id}-{start}_{end}'
return f'{speaker_id}_{start}-{end}'

Here you can change the file names. The code will save the utterances as {storage_dir}/audio/{example_id}.wav
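
A minimal sketch of how the example id and the output path fit together. The function names `make_example_id` and `utterance_path` and the default `max_digits=7` are assumptions for illustration; only the `{speaker_id}_{start}-{end}` pattern and the `{storage_dir}/audio/{example_id}.wav` layout come from the snippet and comment above.

```python
from pathlib import Path


def make_example_id(speaker_id, start, end, max_digits=7):
    """Build an id like '<speaker>_<start>-<end>' with zero-padded times.

    Zero-padding keeps the ids sortable as plain strings.
    `max_digits` is a hypothetical default, not taken from the PR.
    """
    start = str(start).zfill(max_digits)
    end = str(end).zfill(max_digits)
    return f'{speaker_id}_{start}-{end}'


def utterance_path(storage_dir, example_id):
    """Return {storage_dir}/audio/{example_id}.wav as a Path."""
    return Path(storage_dir) / 'audio' / f'{example_id}.wav'


example_id = make_example_id('spk1', 7200, 15200)
print(utterance_path('exp', example_id).as_posix())
```

To change the file names, only the f-string in `make_example_id` would need to be adapted, as long as the ids stay unique per utterance.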

sessions: set = get_sessions(database_rttm)
assert len(sessions) == 1, sessions

files = list(Path(database_rttm).parent.glob('*.wav'))

I am not sure what the best way is to obtain the observation.
The default is to assume that the wav file next to the rttm file is the observation.
Feel free to propose an alternative (do you know a common way that generalizes to one file per channel?).
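
The default described above can be sketched as follows. The helper name `find_observation_files` is hypothetical, and returning a sorted list (so that one file per channel would also be covered) is an assumption, not what the PR necessarily does.

```python
from pathlib import Path


def find_observation_files(database_rttm):
    """Return all wav files that lie next to the rttm file.

    One multichannel file gives a list with one entry; one file per
    channel gives several entries in sorted (lexicographic) order.
    """
    files = sorted(Path(database_rttm).parent.glob('*.wav'))
    if not files:
        raise FileNotFoundError(
            f'No wav file found next to {database_rttm}'
        )
    return files
```

Relying on the sort order to define the channel order is fragile (e.g. `ch10` sorts before `ch2`), which is one reason an explicit mapping from session to audio paths may be the safer choice.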

@@ -0,0 +1,247 @@
"""
[mpirun -np $(nproc --all)] python -m pb_chime5.scripts.kaldi_run_rttm_libri_css with storage_dir=<...> database_rttm=<...> [activity_rttm=<...>] [session_id=dev] [session_to_audio_paths=<...>.{yaml,json}] job_id=1 number_of_jobs=1

Here you can find how to start this script.
It supports MPI parallelization and Kaldi-style parallelization; both can be used at the same time.
To disable MPI (e.g., when you have a broken installation), uninstall mpi4py.
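
To illustrate how the two levels of parallelization can be combined, here is a minimal sketch. It assumes that `job_id` is 1-based (as in Kaldi's `JOB=1:nj` convention) and that the work is split round-robin; the function `select_work` and its `rank`/`size` arguments are hypothetical and not taken from the script.

```python
def select_work(items, job_id, number_of_jobs, rank=0, size=1):
    """Pick the items that one MPI rank of one Kaldi-style job processes.

    First split the items across jobs (job_id is assumed to be 1-based),
    then split the share of this job across the MPI ranks.
    """
    if not 1 <= job_id <= number_of_jobs:
        raise ValueError(f'job_id={job_id} not in 1..{number_of_jobs}')
    if not 0 <= rank < size:
        raise ValueError(f'rank={rank} not in 0..{size - 1}')
    job_items = items[job_id - 1::number_of_jobs]
    return job_items[rank::size]


# Without MPI (rank=0, size=1) and with a single job, nothing is split.
print(select_work(list(range(10)), job_id=1, number_of_jobs=1))
```

With `job_id=1 number_of_jobs=1`, as in the command above, each rank would see the full list of utterances and only the MPI split would apply.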

@aarora8
aarora8 commented Mar 2, 2021

Hi, thank you for this pull request to run guided source separation with datasets other than CHiME. I am trying to run it with a different dataset, which contains 6 wav files for each session. I created the rttm file and the session_to_audio_paths JSON file.

My session_to_audio_paths JSON file is as follows:
{
"session_id": '[{audio_path}/session_id_ch1.wav,
{audio_path}/session_id_ch2.wav,
{audio_path}/session_id_ch3.wav,
{audio_path}/session_id_ch4.wav,
{audio_path}/session_id_ch5.wav,
{audio_path}/session_id_ch6.wav]'
}

My rttm is as follows:

SPEAKER session_id 1 0.45 0.55 <_NA> <_NA> Speaker_id <_NA>
SPEAKER session_id 1 0.94 0.70 <_NA> <_NA> Speaker_id <_NA>

However, I am getting the following error. Can you please help me with it?

File "pb_chime5/pb_chime5/database/chime5/rttm.py", line 474, in data
audio_path = self._audio_paths[session_id]
TypeError: list indices must be integers or slices, not str

Since your comment above mentions that scp "does not generalize to multiple files" that each contain a single channel,
I wanted to check with you whether what I did above needs some additional changes as well.

I also tried merging the multiple files into a single multichannel file and changed the JSON file as follows:

{
'session_id': '{audio_path}/session_id.wav'
}

However, I am getting the same error:
File "pb_chime5/pb_chime5/database/chime5/rttm.py", line 474, in data
audio_path = self._audio_paths[session_id]
TypeError: list indices must be integers or slices, not str

Thanks,
Ashish

@boeddeker

Hi,

thank you for trying to use this code.
Getting the input right is actually the most difficult part, because of the guide and maybe multiple files.

With "does not generalize to multiple files" I meant the simple scp files (only example id and file path, without sox etc).
Using a json doesn't have this problem and the code should work in both cases. Both of your jsons look correct.

The error you get looks strange: audio_paths should be a dictionary, not a list.
Can you inspect the value of self._audio_paths? I pushed some changes to produce a more verbose error.
Alternatively, you can start the script with pdb (i.e., pb_chime5.scripts.kaldi_run_rttm_libri_css --pdb with ...) and inspect the object manually (or use other tricks, such as print or raising a new exception, as in my modification).

Once you post the value of audio_paths, I hope I can help better.
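
A minimal sketch for checking the JSON content outside of the script. It assumes that the file should contain a JSON object that maps each session id to either one path or a list of paths; the helper name `check_session_to_audio_paths` is hypothetical and not part of pb_chime5.

```python
import json


def check_session_to_audio_paths(text):
    """Parse the JSON text and normalize it to {session_id: [paths]}.

    Raises TypeError when the top-level value is not a dictionary,
    which would match a "list indices must be integers" error later.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise TypeError(
            f'Expected a dict, got {type(data).__name__}: {data!r}'
        )
    normalized = {}
    for session_id, paths in data.items():
        if isinstance(paths, str):
            paths = [paths]  # a single multichannel file
        if not all(isinstance(p, str) for p in paths):
            raise TypeError(f'Invalid paths for {session_id}: {paths!r}')
        normalized[session_id] = list(paths)
    return normalized
```

Calling it on the file content, e.g. `check_session_to_audio_paths(Path(json_file).read_text())` with `json_file` pointing to your own file, shows whether the top-level object is a dictionary. Note that `json.loads` requires double-quoted strings, so a file with single quotes would already fail at the parsing step.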
