Add changes to apply this system to libriCSS #19
base: master
end = str(end).zfill(max_digits)

# return f'{file_id}_{speaker_id}-{start}_{end}'
return f'{speaker_id}_{start}-{end}'
Here you could change the file names. The code will save the utterances as {storage_dir}/audio/{example_id}.wav
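To make the naming scheme concrete, here is a minimal sketch of how an example ID and its output path could be built. The helper names `make_example_id` and `audio_path` and the default `max_digits=7` are assumptions for illustration, not taken from the pull request; only the zero padding, the ID pattern, and the `{storage_dir}/audio/{example_id}.wav` layout come from the snippet and comment above.

```python
from pathlib import Path


def make_example_id(speaker_id, start, end, max_digits=7):
    """Hypothetical helper: build an utterance ID from speaker and sample range.

    Zero padding keeps the lexicographic order of the IDs equal to their
    numeric order. The default of 7 digits is an assumption.
    """
    start = str(start).zfill(max_digits)
    end = str(end).zfill(max_digits)
    return f'{speaker_id}_{start}-{end}'


def audio_path(storage_dir, example_id):
    """Hypothetical helper: path where the enhanced utterance would be saved."""
    return Path(storage_dir) / 'audio' / f'{example_id}.wav'
```

Changing the returned f-string in `make_example_id` is then enough to change the file names.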
sessions: set = get_sessions(database_rttm)
assert len(sessions) == 1, sessions

files = list(Path(database_rttm).parent.glob('*.wav'))
I am not sure what the best way is to obtain the observation.
The default is to assume that the wav file next to the rttm file is the observation.
Feel free to propose an alternative (do you know a common way that generalizes to one file per channel?).
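The default lookup described above can be sketched as follows. The function name `find_observation_files` is hypothetical; the glob pattern is the one from the diff. Sorting is an addition of this sketch so that, if there were one file per channel, the channel order would be deterministic.

```python
from pathlib import Path


def find_observation_files(database_rttm):
    """Hypothetical helper: return the wav files next to the RTTM file.

    Sorted so that the order does not depend on the file system.
    """
    return sorted(Path(database_rttm).parent.glob('*.wav'))
```

A caller could then treat a single hit as a multichannel recording and several hits as one file per channel, but that interpretation is only one possible convention.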
@@ -0,0 +1,247 @@
"""
[mpirun -np $(nproc --all)] python -m pb_chime5.scripts.kaldi_run_rttm_libri_css with storage_dir=<...> database_rttm=<...> [activity_rttm=<...>] [session_id=dev] [session_to_audio_paths=<...>.{yaml,json}] job_id=1 number_of_jobs=1
Here you can find how to start this script.
It supports MPI parallelization and Kaldi-style parallelization (job_id and number_of_jobs). Both can be used at the same time.
To disable MPI (e.g. when you have a broken installation), uninstall mpi4py.
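As an illustration of the Kaldi-style split, the sketch below shows one common way to assign work to a job from a 1-based `job_id` and `number_of_jobs`. This is an assumption about the partitioning scheme, not a description of what the script does internally; `select_for_job` is a hypothetical name.

```python
def select_for_job(items, job_id, number_of_jobs):
    """Hypothetical helper: pick the items handled by one Kaldi-style job.

    job_id is 1-based, as in job_id=1 number_of_jobs=1 above.
    Each item is assigned to exactly one job (round-robin).
    """
    if not 1 <= job_id <= number_of_jobs:
        raise ValueError(f'job_id={job_id} not in 1..{number_of_jobs}')
    return items[job_id - 1::number_of_jobs]
```

With `job_id=1 number_of_jobs=1` a single job receives every item, and MPI can still distribute that job's items over its processes.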
Hi, thank you for this pull request to run guided source separation with datasets other than CHiME. I am trying to run it with a different dataset that contains 6 wav files for each session. I created the rttm file and the session_to_audio_paths JSON file.

My session_to_audio_paths JSON file is as follows:

My rttm is as follows:
SPEAKER session_id 1 0.45 0.55 <_NA> <_NA> Speaker_id <_NA>

However, I am getting the following error. Can you please help me with it?
File "pb_chime5/pb_chime5/database/chime5/rttm.py", line 474, in data

Since your comment above mentions that "it does not generalize to multiple files and each contain a single channel.", I tried merging the multiple files into one multichannel file and got the new JSON file as follows:
{

However, I am getting the same error.

Thanks,
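For reference, a minimal sketch of reading such an RTTM line, assuming the usual RTTM field order (type, file ID, channel, onset in seconds, duration in seconds, two placeholders, speaker name, further placeholders). `Segment` and `parse_rttm_line` are hypothetical names and are not part of pb_chime5; its own parser in `rttm.py` may behave differently.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    file_id: str
    channel: str
    onset: float      # seconds
    duration: float   # seconds, not the end time
    speaker: str


def parse_rttm_line(line):
    """Hypothetical helper: parse one SPEAKER line of an RTTM file."""
    fields = line.split()
    if len(fields) < 8 or fields[0] != 'SPEAKER':
        raise ValueError(f'Not a SPEAKER line: {line!r}')
    return Segment(
        file_id=fields[1],
        channel=fields[2],
        onset=float(fields[3]),
        duration=float(fields[4]),
        speaker=fields[7],
    )
```

Under this reading, the fifth field is a duration, so the example line above would describe a segment starting at 0.45 s that lasts 0.55 s.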
Hi, thank you for trying to use this code.
With "does not generalize to multiple files" I meant the simple
The error you get looks strange. When you post the value of
I am not yet sure what the best way is to provide the files.
For multichannel files like those used in libriCSS, an scp file would be fine, but it does not generalize to multiple files that each contain a single channel.
@keisukekino (CC: @tnakieee) Could you test whether the code works for you?
I ran the code with
mpirun -np 16 python -m pb_chime5.scripts.kaldi_run_rttm_libri_css with storage_dir=exp database_rttm=overlap_ratio_40.0_sil0.1_1.0_session0_actual39.5.rttm job_id=1 number_of_jobs=1
where I replaced mpirun with our HPC command. With 103 cores it took 4:40 minutes.