17 changes: 5 additions & 12 deletions examples/aishell/asr0/README.md
@@ -151,21 +151,14 @@ avg.sh best exp/deepspeech2/checkpoints 1
CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1
```
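Setting `CUDA_VISIBLE_DEVICES` to the empty string, as in the test command above, hides every GPU from the child process so the run falls back to CPU. A quick way to see the effect (the `echo` wrapper is just for illustration):

```shell
# An empty CUDA_VISIBLE_DEVICES means "no visible GPU devices" to the
# child process; the variable is set but expands to the empty string.
CUDA_VISIBLE_DEVICES= sh -c 'echo "visible GPUs: [${CUDA_VISIBLE_DEVICES}]"'
# prints: visible GPUs: []
```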
## Pretrained Model
You can get the pretrained transformer or conformer using the scripts below:
```bash
# Deepspeech2 offline:
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz

# Deepspeech2 online:
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/aishell_ds2_online_cer8.00_release.tar.gz
```
You can get the pretrained models from [this](../../../docs/source/released_model.md).

> **Collaborator comment:** Maybe "here" reads better; "this" feels a bit odd.

Use the `tar` command to unpack the model, then run the script below to test it.

For example:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz
tar xzvf ds2.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
source path.sh
# If you have processed the data and generated the manifest file, you can skip the following 2 steps
bash local/data.sh --stage -1 --stop_stage -1
@@ -209,8 +202,8 @@ if [ ${stage} -le 6 ] && [ ${stop_stage} -ge 6 ]; then
```
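The recipes gate every step with `stage`/`stop_stage` flags, as in the `if [ ${stage} -le 6 ] && [ ${stop_stage} -ge 6 ]` guard above: a stage `n` runs only when `stage <= n <= stop_stage`. A minimal sketch of that idiom (the loop and echo are illustrative, not part of the recipe):

```shell
#!/bin/sh
# Sketch of the stage-gating idiom used throughout run.sh: with
# stage=0 and stop_stage=3, only stages 0..3 execute.
stage=0
stop_stage=3
for n in 0 1 2 3 4 5 6; do
    if [ ${stage} -le ${n} ] && [ ${stop_stage} -ge ${n} ]; then
        echo "running stage ${n}"
    fi
done
# prints "running stage 0" through "running stage 3"
```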
You can train the model yourself, or download the pretrained model with the script below:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz
tar xzvf ds2.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
```
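The download-and-unpack round trip above can be exercised offline. The sketch below builds a dummy archive with a made-up `demo_model` layout (the name and file are hypothetical, not from the released checkpoint) and unpacks it with the same `tar xzvf` call:

```shell
#!/bin/sh
# Offline illustration of the pack/unpack step: create a stand-in
# archive, then extract it exactly as the README does for the real
# checkpoint tarballs. demo_model/ is an invented name for this demo.
set -e
mkdir -p demo_model
echo "fake checkpoint" > demo_model/model.pdparams
tar czf demo.model.tar.gz demo_model   # pack
rm -rf demo_model
tar xzvf demo.model.tar.gz             # unpack, recreates demo_model/
cat demo_model/model.pdparams
# prints: fake checkpoint
```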
You can download the audio demo:
```bash
19 changes: 4 additions & 15 deletions examples/aishell/asr1/README.md
@@ -143,25 +143,14 @@ avg.sh best exp/conformer/checkpoints 20
CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
```
## Pretrained Model
You can get the pretrained transformer or conformer using the scripts below:
```bash
# Conformer:
wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz

# Chunk Conformer:
wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz

# Transformer:
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
```
You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md).

> **Collaborator comment:** Same as above.

Use the `tar` command to unpack the model, then run the script below to test it.

For example:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
tar xzvf transformer.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
tar xzvf asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
source path.sh
# If you have processed the data and generated the manifest file, you can skip the following 2 steps
bash local/data.sh --stage -1 --stop_stage -1
@@ -206,7 +195,7 @@ In some situations, you want to use the trained model to do the inference for th
```
You can train the model yourself using `bash run.sh --stage 0 --stop_stage 3`, or download the pretrained model with the script below:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
tar xzvf asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
```
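For repeated runs it can help to skip the download when the tarball is already on disk. This is an assumption layered on top of the recipe, not part of it; the sketch stubs the fetch with `touch` so it runs offline, and the real `wget` command from the block above would replace the stub:

```shell
#!/bin/sh
# Hypothetical caching wrapper: fetch the checkpoint only when it is
# not already present. fetch() is a stand-in for the wget call above.
set -e
MODEL=asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
fetch() { touch "$1"; }   # stand-in for: wget "https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/$1"
if [ ! -f "$MODEL" ]; then
    fetch "$MODEL"
    echo "downloaded $MODEL"
else
    echo "using cached $MODEL"
fi
```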
You can download the audio demo:
36 changes: 7 additions & 29 deletions examples/librispeech/asr1/README.md
@@ -151,44 +151,22 @@ avg.sh best exp/conformer/checkpoints 20
CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
```
## Pretrained Model
You can get the pretrained transformer or conformer using the scripts below:
```bash
# Conformer:
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
# Transformer:
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/transformer.model.tar.gz
```
You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md).
> **Collaborator comment:** Same as above.

Use the `tar` command to unpack the model, then run the script below to test it.

For example:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
tar xzvf transformer.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
source path.sh
# If you have processed the data and generated the manifest file, you can skip the following 2 steps
bash local/data.sh --stage -1 --stop_stage -1
bash local/data.sh --stage 2 --stop_stage 2
CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
```
The performance of the released models is shown below:
## Conformer
train: Epoch 70, 4 V100-32G, best avg: 20

| Model | Params | Config | Augmentation | Test set | Decode method | Loss | WER |
| --------- | ------- | ------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | attention | 6.433612394332886 | 0.039771 |
| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | ctc_greedy_search | 6.433612394332886 | 0.040342 |
| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | ctc_prefix_beam_search | 6.433612394332886 | 0.040342 |
| conformer | 47.63 M | conf/conformer.yaml | spec_aug | test-clean | attention_rescoring | 6.433612394332886 | 0.033761 |
## Transformer
train: Epoch 120, 4 V100-32G, 27 Day, best avg: 10
The performance of the released models is shown [here](./RESULTS.md).

| Model | Params | Config | Augmentation | Test set | Decode method | Loss | WER |
| ----------- | ------- | --------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention | 6.382194232940674 | 0.049661 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_greedy_search | 6.382194232940674 | 0.049566 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_prefix_beam_search | 6.382194232940674 | 0.049585 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention_rescoring | 6.382194232940674 | 0.038135 |
## Stage 4: CTC Alignment
If you want to get the alignment between the audio and the text, you can use CTC alignment. The code for this stage is shown below:
```bash
@@ -227,8 +205,8 @@ In some situations, you want to use the trained model to do the inference for th
```
You can train the model yourself using `bash run.sh --stage 0 --stop_stage 3`, or download the pretrained model with the script below:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
tar xzvf conformer.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
```
You can download the audio demo:
```bash
34 changes: 6 additions & 28 deletions examples/librispeech/asr2/README.md
@@ -1,4 +1,4 @@
# Transformer/Conformer ASR with Librispeech Asr2
# Transformer/Conformer ASR with Librispeech ASR2

This example contains code used to train a Transformer or [Conformer](http://arxiv.org/abs/2008.03802) model on the [Librispeech dataset](http://www.openslr.org/resources/12), and it uses some functions from Kaldi.

@@ -213,44 +213,22 @@ avg.sh latest exp/transformer/checkpoints 10
./local/recog.sh --ckpt_prefix exp/transformer/checkpoints/avg_10
```
## Pretrained Model
You can get the pretrained transformer using the scripts below:
```bash
# Transformer:
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
```
You can get the pretrained models from [this](../../../docs/source/released_model.md).
> **Collaborator comment:** Same as above.

Use the `tar` command to unpack the model, then run the script below to test it.

For example:
```bash
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
tar xzvf transformer.model.tar.gz
wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
tar xzvf asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
source path.sh
# If you have processed the data and generated the manifest file, you can skip the following 2 steps
bash local/data.sh --stage -1 --stop_stage -1
bash local/data.sh --stage 2 --stop_stage 2

CUDA_VISIBLE_DEVICES= ./local/test.sh conf/transformer.yaml exp/ctc/checkpoints/avg_10
```
The performance of the released models is shown below:
### Transformer
| Model | Params | GPUS | Averaged Model | Config | Augmentation | Loss |
| :---------: | :----: | :--------------------: | :--------------: | :-------------------: | :----------: | :-------------: |
| transformer | 32.52M | 8 Tesla V100-SXM2-32GB | 10-best val_loss | conf/transformer.yaml | spec_aug | 6.3197922706604 |

#### Attention Rescore
| Test Set | Decode Method | #Snt | #Wrd | Corr | Sub | Del | Ins | Err | S.Err |
| ---------- | --------------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
| test-clean | attention | 2620 | 52576 | 96.4 | 2.5 | 1.1 | 0.4 | 4.0 | 34.7 |
| test-clean | ctc_greedy_search | 2620 | 52576 | 95.9 | 3.7 | 0.4 | 0.5 | 4.6 | 48.0 |
| test-clean | ctc_prefix_beamsearch | 2620 | 52576 | 95.9 | 3.7 | 0.4 | 0.5 | 4.6 | 47.6 |
| test-clean | attention_rescore | 2620 | 52576 | 96.8 | 2.9 | 0.3 | 0.4 | 3.7 | 38.0 |

#### JoinCTC
| Test Set | Decode Method | #Snt | #Wrd | Corr | Sub | Del | Ins | Err | S.Err |
| ---------- | ----------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
| test-clean | join_ctc_only_att | 2620 | 52576 | 96.1 | 2.5 | 1.4 | 0.4 | 4.4 | 34.7 |
| test-clean | join_ctc_w/o_lm | 2620 | 52576 | 97.2 | 2.6 | 0.3 | 0.4 | 3.2 | 34.9 |
| test-clean | join_ctc_w_lm | 2620 | 52576 | 97.9 | 1.8 | 0.2 | 0.3 | 2.4 | 27.8 |
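In the score tables above, the `Err` column is the sum of the substitution, deletion, and insertion rates (`Sub + Del + Ins`), each reported to one decimal place. A quick arithmetic check with `awk` for the `attention` row of the Attention Rescore table (2.5 + 1.1 + 0.4 = 4.0):

```shell
# Sum the Sub, Del, Ins columns of one row; matches the reported Err.
echo "2.5 1.1 0.4" | awk '{ printf "%.1f\n", $1 + $2 + $3 }'
# prints: 4.0
```

Other rows may appear off by 0.1 because each column is rounded independently before printing.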
The performance of the released models is shown [here](./RESULTS.md).

Compared with [ESPNET](https://github.com/espnet/espnet/blob/master/egs/librispeech/asr1/RESULTS.md#pytorch-large-transformer-with-specaug-4-gpus--transformer-lm-4-gpus), we use 8 GPUs, but our model (aheads4-adim256) is smaller.
## Stage 5: CTC Alignment