# [Doc] update readem for aishell/asr0 #1677
Merged
This PR contains 8 commits, all by Jackwaterveg:

- `ae1b222` [Doc] update readem for aishell/asr0, test=doc
- `a22f29b` test=doc
- `ee96fb4` test=doc
- `88f5595` test=doc
- `1a67038` test=doc
- `f71b9b9` test=doc
- `3c93953` test=doc
- `75c9dc7` test=doc
The first changed file (the AISHELL asr1 README, judging by the download URLs):

````diff
@@ -143,25 +143,14 @@ avg.sh best exp/conformer/checkpoints 20
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
 ## Pretrained Model
-You can get the pretrained transformer or conformer using the scripts below:
-```bash
-# Conformer:
-wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz
-
-# Chunk Conformer:
-wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz
-
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
-
-```
+You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md)
+
 using the `tar` scripts to unpack the model and then you can use the script to test the model.
+
 For example:
 ```
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have process the data and get the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
````

**Review comment** (on the added link line): Same.
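All three READMEs touched by this PR now point to `../../../docs/source/released_model.md`. A quick way to sanity-check such relative links is to reproduce the directory layout in a throwaway tree and `test -f` the path; a minimal sketch, assuming the README lives at `examples/aishell/asr1/README.md` (a layout inferred from the download URLs, not stated in the diff):

```shell
# Build a throwaway tree that mirrors the assumed repo layout,
# then check the relative link from the README's directory.
root=$(mktemp -d)
mkdir -p "$root/examples/aishell/asr1" "$root/docs/source"
touch "$root/docs/source/released_model.md"
cd "$root/examples/aishell/asr1"
if test -f ../../../docs/source/released_model.md; then
    echo "link resolves"
fi
```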
````diff
@@ -206,7 +195,7 @@ In some situations, you want to use the trained model to do the inference for th
 ```
 you can train the model by yourself using ```bash run.sh --stage 0 --stop_stage 3```, or you can download the pretrained model through the script below:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
 tar xzvf transformer.model.tar.gz
 ```
 You can download the audio demo:
````
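Incidentally, this hunk updates the `wget` URL but leaves the context line `tar xzvf transformer.model.tar.gz` pointing at the old archive name. Deriving the archive name from the URL sidesteps that class of mismatch; a small sketch (the URL is taken from the diff above, and the network step is left commented out):

```shell
# basename strips the directory part of the URL, yielding the archive name
# that wget will create, so the wget and tar steps can never disagree.
url="https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz"
tarball=$(basename "$url")
echo "$tarball"   # asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
# wget "$url" && tar xzvf "$tarball"
```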
The second changed file (the LibriSpeech asr1 README, judging by the URLs):
````diff
@@ -151,44 +151,22 @@ avg.sh best exp/conformer/checkpoints 20
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
 ## Pretrained Model
-You can get the pretrained transformer or conformer using the scripts below:
-```bash
-# Conformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/transformer.model.tar.gz
-```
+You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md).
+
 using the `tar` scripts to unpack the model and then you can use the script to test the model.
+
 For example:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have process the data and get the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
 bash local/data.sh --stage 2 --stop_stage 2
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
-The performance of the released models are shown below:
-## Conformer
-train: Epoch 70, 4 V100-32G, best avg: 20
-
-| Model | Params | Config | Augmentation | Test set | Decode method | Loss | WER |
-| --------- | ------- | ------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | attention | 6.433612394332886 | 0.039771 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | ctc_greedy_search | 6.433612394332886 | 0.040342 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | ctc_prefix_beam_search | 6.433612394332886 | 0.040342 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | attention_rescoring | 6.433612394332886 | 0.033761 |
-## Transformer
-train: Epoch 120, 4 V100-32G, 27 Day, best avg: 10
-
-| Model | Params | Config | Augmentation | Test set | Decode method | Loss | WER |
-| ----------- | ------- | --------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention | 6.382194232940674 | 0.049661 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_greedy_search | 6.382194232940674 | 0.049566 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_prefix_beam_search | 6.382194232940674 | 0.049585 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention_rescoring | 6.382194232940674 | 0.038135 |
+The performance of the released models are shown in [here](./RESULTS.md).
 ## Stage 4: CTC Alignment
 If you want to get the alignment between the audio and the text, you can use the ctc alignment. The code of this stage is shown below:
 ```bash
````

**Review comment** (on the added link line): Same as above.
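A note on the `CUDA_VISIBLE_DEVICES= ./local/test.sh ...` lines that appear as context in these hunks: assigning the empty string (rather than leaving the variable unset) masks every GPU from the child process, which is how the test step is forced onto CPU. The mechanism can be seen with a plain child shell:

```shell
# The empty assignment is exported only to the child command; CUDA-aware
# frameworks then see zero visible devices and fall back to CPU.
CUDA_VISIBLE_DEVICES= sh -c 'echo "visible=[${CUDA_VISIBLE_DEVICES}]"'
```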
````diff
@@ -227,8 +205,8 @@ In some situations, you want to use the trained model to do the inference for th
 ```
 you can train the model by yourself using ```bash run.sh --stage 0 --stop_stage 3```, or you can download the pretrained model through the script below:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-tar xzvf conformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
 ```
 You can download the audio demo:
 ```bash
````
The third changed file (the LibriSpeech asr2 README, judging by its title and URLs):
````diff
@@ -1,4 +1,4 @@
-# Transformer/Conformer ASR with Librispeech Asr2
+# Transformer/Conformer ASR with Librispeech ASR2
 
 This example contains code used to train a Transformer or [Conformer](http://arxiv.org/abs/2008.03802) model with [Librispeech dataset](http://www.openslr.org/resources/12) and use some functions in kaldi.
 
````
````diff
@@ -213,44 +213,22 @@ avg.sh latest exp/transformer/checkpoints 10
 ./local/recog.sh --ckpt_prefix exp/transformer/checkpoints/avg_10
 ```
 ## Pretrained Model
-You can get the pretrained transformer using the scripts below:
-```bash
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
-```
+You can get the pretrained models from [this](../../../docs/source/released_model.md).
+
 using the `tar` scripts to unpack the model and then you can use the script to test the model.
+
 For example:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have process the data and get the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
 bash local/data.sh --stage 2 --stop_stage 2
 
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/transformer.yaml exp/ctc/checkpoints/avg_10
 ```
-The performance of the released models are shown below:
-### Transformer
-| Model | Params | GPUS | Averaged Model | Config | Augmentation | Loss |
-| :---------: | :----: | :--------------------: | :--------------: | :-------------------: | :----------: | :-------------: |
-| transformer | 32.52M | 8 Tesla V100-SXM2-32GB | 10-best val_loss | conf/transformer.yaml | spec_aug | 6.3197922706604 |
-
-#### Attention Rescore
-| Test Set | Decode Method | #Snt | #Wrd | Corr | Sub | Del | Ins | Err | S.Err |
-| ---------- | --------------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
-| test-clean | attention | 2620 | 52576 | 96.4 | 2.5 | 1.1 | 0.4 | 4.0 | 34.7 |
-| test-clean | ctc_greedy_search | 2620 | 52576 | 95.9 | 3.7 | 0.4 | 0.5 | 4.6 | 48.0 |
-| test-clean | ctc_prefix_beamsearch | 2620 | 52576 | 95.9 | 3.7 | 0.4 | 0.5 | 4.6 | 47.6 |
-| test-clean | attention_rescore | 2620 | 52576 | 96.8 | 2.9 | 0.3 | 0.4 | 3.7 | 38.0 |
-
-#### JoinCTC
-| Test Set | Decode Method | #Snt | #Wrd | Corr | Sub | Del | Ins | Err | S.Err |
-| ---------- | ----------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
-| test-clean | join_ctc_only_att | 2620 | 52576 | 96.1 | 2.5 | 1.4 | 0.4 | 4.4 | 34.7 |
-| test-clean | join_ctc_w/o_lm | 2620 | 52576 | 97.2 | 2.6 | 0.3 | 0.4 | 3.2 | 34.9 |
-| test-clean | join_ctc_w_lm | 2620 | 52576 | 97.9 | 1.8 | 0.2 | 0.3 | 2.4 | 27.8 |
+The performance of the released models are shown [here](./RESULTS.md).
 
 Compare with [ESPNET](https://github.com/espnet/espnet/blob/master/egs/librispeech/asr1/RESULTS.md#pytorch-large-transformer-with-specaug-4-gpus--transformer-lm-4-gpus) we using 8gpu, but the model size (aheads4-adim256) small than it.
 ## Stage 5: CTC Alignment
````

**Review comment** (on the added link line): Same as above.
**Review comment:** Wouldn't "here" read better? "this" feels a bit odd.