AuxiliaryASR

This repo contains the training code for the phoneme-level ASR model used for voice conversion (VC) in StarGANv2-VC and for text-mel alignment in TTS with StyleTTS.

Pre-requisites

Python >= 3.7
Clone this repository:

git clone https://github.com/yl4579/AuxiliaryASR.git
cd AuxiliaryASR

Install the Python requirements:

pip install SoundFile torchaudio torch jiwer pyyaml click matplotlib g2p_en librosa

Prepare your own dataset and put train_list.txt and val_list.txt in the Data folder (see the Training section for details).

Training

python train.py --config_path ./Configs/config.yml

Specify the training and validation data in the config.yml file. Each line of a data list must have the format filename.wav|label|speaker_number; see train_list.txt for an example (a subset of LJSpeech). For ASR, speaker_number can simply be 0, but a meaningful number is useful for TTS training (if you need to use this repo for StyleTTS).
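As a minimal sketch of the list format described above, the following Python snippet splits one line into its three fields and checks that the speaker number is an integer. The names ListEntry and parse_list_line and the sample entry are hypothetical and not part of this repo; the actual data loading lives in meldataset.py. The sketch also assumes the label itself contains no "|" character.

```python
from typing import NamedTuple


class ListEntry(NamedTuple):
    path: str
    label: str
    speaker: int


def parse_list_line(line: str) -> ListEntry:
    """Split one 'filename.wav|label|speaker_number' line into its fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != 3:
        raise ValueError(
            f"expected 3 '|'-separated fields, got {len(parts)}: {line!r}"
        )
    path, label, speaker = parts
    # int() raises ValueError if the speaker field is not a number.
    return ListEntry(path, label, int(speaker))


# Hypothetical example entry; speaker 0 is enough for ASR-only training.
entry = parse_list_line("LJ001-0001.wav|Printing, in the only sense|0\n")
print(entry)
```

Running a check like this over every line of train_list.txt and val_list.txt before training can surface malformed entries early.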

Checkpoints and TensorBoard logs are saved in log_dir. To speed up training, set batch_size as large as your GPU memory allows. Note that batch_size = 64 takes around 10 GB of GPU memory.

Languages

This repo is set up for English with the g2p_en package, but it can be trained on other languages. To do so, replace the phonemizer in the meldataset.py file (L86-93) with your own, update the vocabulary file (word_index_dict.txt), and set n_token in config.yml to the number of tokens in the new vocabulary. For other languages, the phonemizer package is a recommended choice.
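To illustrate the vocabulary step, here is a rough sketch that collects the distinct tokens a phonemizer produces over a set of labels and derives a candidate value for n_token. Everything in it is an assumption: toy_phonemizer is a stand-in that simply splits text into characters and should be replaced by a real phonemizer, build_vocab is a hypothetical helper, and the transcripts are made up. The on-disk format of word_index_dict.txt is not covered here, so check the existing file before writing a new one, and adjust the count if extra indices (for example padding or unknown tokens) are reserved.

```python
from typing import Callable, Dict, Iterable, List


def toy_phonemizer(text: str) -> List[str]:
    """Stand-in phonemizer (assumption): one token per non-space character."""
    return [ch for ch in text.lower() if not ch.isspace()]


def build_vocab(
    labels: Iterable[str],
    phonemize: Callable[[str], List[str]],
) -> Dict[str, int]:
    """Map every distinct token to an integer index, in sorted token order."""
    tokens = set()
    for label in labels:
        tokens.update(phonemize(label))
    return {tok: idx for idx, tok in enumerate(sorted(tokens))}


labels = ["hola mundo", "buenos dias"]  # hypothetical transcripts
vocab = build_vocab(labels, toy_phonemizer)
n_token = len(vocab)  # candidate value for n_token in config.yml
print(n_token, vocab)
```

Sorting the tokens keeps the index assignment reproducible across runs, which matters because a trained checkpoint depends on the token-to-index mapping staying fixed.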

References

Acknowledgement

The author would like to thank @tosaka-m for his great repository and valuable discussions.
