@@ -103,12 +103,19 @@ If you want to train the model, you can use the script below to execute stage 0
```bash
bash run.sh --stage 0 --stop_stage 1
```
- or you can run these scripts in the command line (only use CPU).
+ Or you can run these scripts from the command line (using CPU only).
```bash
source path.sh
bash ./local/data.sh
- CUDA_VISIBLE_DEVICES= ./local/train.sh conf/deepspeech2.yaml deepspeech2
+ CUDA_VISIBLE_DEVICES= ./local/train.sh conf/deepspeech2.yaml deepspeech2
```
+ If you want to use a GPU, you can run these scripts from the command line (assuming you have only 1 GPU).
+ ```bash
+ source path.sh
+ bash ./local/data.sh
+ CUDA_VISIBLE_DEVICES=0 ./local/train.sh conf/deepspeech2.yaml deepspeech2
+ ```
+
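If you have more than one GPU, you can list several device ids. This is a sketch assuming the recipe's `train.sh` launches one worker per visible device, as PaddleSpeech recipes typically do; the ids are hypothetical, adjust them to your machine.
```bash
# hypothetical 2-GPU run; the ids 0,1 are an example
CUDA_VISIBLE_DEVICES=0,1 ./local/train.sh conf/deepspeech2.yaml deepspeech2
```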
## Stage 2: Top-k Models Averaging
After training the model, we need to get the final model for testing and inference. A checkpoint is saved at the end of every epoch, so we can either choose the best model based on the validation loss, or sort the checkpoints and average the parameters of the top-k models to get the final model. We can use stage 2 to do this, and the code is shown below:
```bash
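# The actual run.sh snippet for stage 2 is not shown in this hunk. For
# reference, the averaging itself is a single call; this sketch assumes
# avg.sh names its output avg_<k>.pdparams after its count argument,
# matching the avg_10 checkpoint used in the test commands below.
avg.sh best exp/deepspeech2/checkpoints 10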
@@ -148,7 +155,7 @@ source path.sh
bash ./local/data.sh
CUDA_VISIBLE_DEVICES= ./local/train.sh conf/deepspeech2.yaml deepspeech2
avg.sh best exp/deepspeech2/checkpoints 10
- CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1
+ CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_10
```
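The test commands above run on the CPU. As with training, you can decode on a GPU by setting a device id; this assumes `test.sh` detects GPUs from `CUDA_VISIBLE_DEVICES` the same way `train.sh` does.
```bash
# decode on GPU 0 instead of the CPU
CUDA_VISIBLE_DEVICES=0 ./local/test.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_10
```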
## Pretrained Model
You can get the pretrained models from [this](../../../docs/source/released_model.md).
@@ -157,14 +164,14 @@ using the `tar` scripts to unpack the model and then you can use the script to t
For example:
```
- wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
- tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
+ wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz
+ tar xzvf asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz
source path.sh
# If you have processed the data and generated the manifest file, you can skip the following 2 steps
bash local/data.sh --stage -1 --stop_stage -1
bash local/data.sh --stage 2 --stop_stage 2
- CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1
+ CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_10
```
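If you want to inspect the downloaded archive before unpacking it, `tar` can list its contents:
```bash
# list the files in the archive without extracting them
tar tzf asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz
```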
The performance of the released models is shown in [this](./RESULTS.md).
## Stage 4: Static Graph Model Export
@@ -178,7 +185,7 @@ This stage is to transform dygraph to static graph.
If you already have a dynamic graph model, you can run this script:
```bash
source path.sh
- ./local/export.sh deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1 exp/deepspeech2/checkpoints/avg_1.jit offline
+ ./local/export.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_10 exp/deepspeech2/checkpoints/avg_10.jit
```
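The export step writes the static graph files next to the checkpoint. Assuming Paddle's usual `*.pdmodel`/`*.pdiparams` naming for exported graphs (an assumption; the recipe's layout may differ), you can check the output:
```bash
# the exported files should share the avg_10.jit prefix
ls exp/deepspeech2/checkpoints/avg_10.jit*
```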
## Stage 5: Static Graph Model Testing
Similar to stage 3, the static graph model can also be tested.
@@ -190,7 +197,7 @@ Similar to stage 3, the static graph model can also be tested.
```
If you have already exported the static graph, you can run this script:
```bash
- CUDA_VISIBLE_DEVICES= ./local/test_export.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1.jit offline
+ CUDA_VISIBLE_DEVICES= ./local/test_export.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_10.jit
```
## Stage 6: Single Audio File Inference
In some situations, you may want to use the trained model to run inference on a single audio file. You can use stage 6 for this. The code is shown below:
@@ -202,14 +209,14 @@ if [ ${stage} -le 6 ] && [ ${stop_stage} -ge 6 ]; then
```
You can train the model yourself, or download the pretrained model with the script below:
```bash
- wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
- tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
+ wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz
+ tar xzvf asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz
```
You can download the audio demo:
```bash
wget -nc https://paddlespeech.bj.bcebos.com/datasets/single_wav/zh/demo_01_03.wav -P data/
```
You need to prepare an audio file or use the audio demo above; please confirm that the sample rate of the audio is 16 kHz. You can get the result of the audio demo by running the script below.
```bash
- CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_1 data/demo_01_03.wav
+ CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_10 data/demo_01_03.wav
```
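If you use your own audio instead of the demo, you can check the sample rate, and resample if needed, with `sox` (assuming it is installed; `my_audio.wav` is a hypothetical file name):
```bash
soxi -r my_audio.wav                        # print the sample rate
sox my_audio.wav -r 16000 my_audio_16k.wav  # resample to 16 kHz if needed
```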