Rust SAPI engine (and simple CLI) that uses Windows' text-to-speech APIs and also supports piper text-to-speech voices

Licenses: Apache-2.0 (LICENSE-APACHE), MIT (LICENSE-MIT)
Lej77/windows-text-to-speech

Windows text-to-speech

This repository contains a Rust CLI program that uses Windows' text-to-speech APIs to read text passed to the program. You can find the source code in ./crates/windows_tts_cli/.

This repository also contains a text-to-speech engine that uses Microsoft Language Detection to detect the language of the text it receives, selects a voice in that language, and speaks the text using the Windows.Media.SpeechSynthesis.SpeechSynthesizer class. The engine can alternatively use the lingua Rust library for better language detection and/or the piper-rs Rust library for text-to-speech.
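The "detect the language, then select a voice in that language" step can be pictured with a small sketch. This is not the engine's actual code: the `Voice` type, the `pick_voice` function, and the matching order (exact tag first, then primary language) are assumptions made for illustration only.

```rust
/// A voice as this sketch models it: a display name plus a language tag.
/// (Hypothetical type, not taken from the engine's source.)
#[derive(Debug, Clone, PartialEq)]
pub struct Voice {
    pub name: String,
    pub language: String, // e.g. "en-US"
}

/// Returns the primary language subtag, lowercased: "en-US" -> "en".
fn primary_subtag(tag: &str) -> String {
    tag.split(|c: char| c == '-' || c == '_')
        .next()
        .unwrap_or("")
        .to_ascii_lowercase()
}

/// Picks a voice for `detected` (a language tag such as "sv" or "en-GB").
/// Assumed preference order: exact tag match, then same primary language.
/// Returns `None` when no installed voice speaks the detected language.
pub fn pick_voice<'a>(voices: &'a [Voice], detected: &str) -> Option<&'a Voice> {
    voices
        .iter()
        .find(|v| v.language.eq_ignore_ascii_case(detected))
        .or_else(|| {
            let wanted = primary_subtag(detected);
            voices.iter().find(|v| primary_subtag(&v.language) == wanted)
        })
}
```

With voices tagged "en-US" and "sv-SE", a detected language of "sv" would select the "sv-SE" voice, while "de" would yield `None`, so a caller would need some fallback voice.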

Usage of windows_tts_cli

Install Rust, then install this program from source:

cargo install --git "https://github.com/Lej77/windows-text-to-speech" windows_tts_cli
# Latest "windows_tts_cli.exe" will be built and placed inside "%UserProfile%/.cargo/bin/"
windows_tts_cli.exe This text will be read

cargo uninstall windows_tts_cli

If you have cloned this repository, then you can run the code using:

cargo run -- This text will be read

Alternatively, download the windows_tts_cli.exe binary from the latest release and run it from the command line:

./windows_tts_cli.exe This text will be read

If you have Cargo B(inary)Install, it can download the latest release for you:

cargo binstall --git "https://github.com/Lej77/windows-text-to-speech" windows_tts_cli
# Latest "windows_tts_cli.exe" will be downloaded to "%UserProfile%/.cargo/bin/"
windows_tts_cli.exe This text will be read

cargo uninstall windows_tts_cli
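
The examples above pass the text without quotes, which suggests that the program treats every command-line argument as part of the text to read. A minimal sketch of that idea follows; the function name `text_from_args` is hypothetical, and the real CLI may parse its arguments differently (for example, it may accept options).

```rust
/// Joins all arguments after the program name into one string.
/// Assumption: every argument is text to be read, separated by single spaces.
pub fn text_from_args<I>(args: I) -> String
where
    I: IntoIterator<Item = String>,
{
    args.into_iter()
        .skip(1) // the first argument is the program name
        .collect::<Vec<_>>()
        .join(" ")
}
```

In a real program this would be called as `text_from_args(std::env::args())`. Note that the shell collapses runs of whitespace between unquoted words, so quote the text if exact spacing matters.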

Usage of windows_tts_engine

  1. Acquire windows_tts_engine_installer.exe and at least one text-to-speech engine like windows_tts_engine.dll or windows_tts_engine_piper.dll.
    • You can find them in the latest GitHub release.
      • The windows_tts_engine_piper_lingua.dll file is an alternative to windows_tts_engine_piper.dll and should be renamed to that before running the installer.
        • This alternative DLL uses the lingua Rust library for language detection.
        • This DLL is larger and lingua uses more RAM, but it should be slightly better at detecting languages.
    • Or you can build them from source:
      1. Install Rust
      2. Clone this repository:
        git clone https://github.com/Lej77/windows-text-to-speech.git
      3. Build everything in the repository:
        cargo build --release --workspace
      4. You should find the built files inside the ./target/release folder.
  2. Place all files in the same directory and run the installer.
    • The installer is optional: you can instead run regsvr32 ./windows_tts_engine.dll for each of the text-to-speech engine DLLs to install them.
      • This won't add an uninstall entry in the Windows Settings app.
      • This command needs to run with admin rights, otherwise it will fail.
  3. You can now find and select the voice in the Windows Control Panel: open Speech Recognition, then click the Text to Speech link in the left sidebar.
    • The text-to-speech engine will NOT be visible in the modern Settings app under the Time & Languages > Speech > Voices option.
    • The text-to-speech engine WILL be visible in the modern Settings app under the Accessibility > Narrator > Choose a voice option.
  4. If you move the files, you need to reinstall them; otherwise Windows won't be able to find them.
  5. Uninstall the program using the command windows_tts_engine_installer.exe --uninstall or through the Windows Settings app (the Programs and Features panel).
    • Note that the uninstaller won't remove any files, it will only unregister the program from Windows by removing Windows Registry entries.
    • If you installed the text-to-speech engine without the installer, you can uninstall it using regsvr32 /u ./windows_tts_engine.dll. (Use the full path if the terminal isn't in the same folder as the DLL file.)
      • This command needs to run with admin rights, otherwise it will fail.
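
If you register several engine DLLs by hand, the regsvr32 calls described above can be scripted. The following is a minimal sketch using only the Rust standard library; the helper names `regsvr32_args` and `run_regsvr32` are made up for this example and are not part of this repository. It only uses the DLL path and the /u switch mentioned above, and it still has to be run from an elevated (admin) process.

```rust
use std::ffi::OsString;
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds the argument list for `regsvr32`: an optional `/u` (unregister)
/// followed by the path to the DLL.
pub fn regsvr32_args(dll: &Path, uninstall: bool) -> Vec<OsString> {
    let mut args = Vec::new();
    if uninstall {
        args.push(OsString::from("/u"));
    }
    args.push(dll.as_os_str().to_os_string());
    args
}

/// Runs `regsvr32` for one DLL and reports whether it exited successfully.
/// Only meaningful on Windows, where `regsvr32` is available.
pub fn run_regsvr32(dll: &Path, uninstall: bool) -> io::Result<bool> {
    let status = Command::new("regsvr32")
        .args(regsvr32_args(dll, uninstall))
        .status()?;
    Ok(status.success())
}
```

Passing a full path to `run_regsvr32` avoids depending on the current working directory, which matches the advice above about using the full path.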

If you installed the windows_tts_engine_piper.dll text-to-speech engine then it will expect a folder named piper_models inside the same folder as the DLL file. In the piper_models folder you need to put .onnx.json model configs and .onnx model files for the engine to work. You can also add .voice.txt files next to the model files with a single integer in each to specify the voice/speaker used (for models with multiple speakers).

Example file structure:

  • C:\Program Files\Lej77TextToSpeech
    • piper_models
      • en_US-libritts_r-medium.onnx
      • en_US-libritts_r-medium.onnx.json
      • en_US-libritts_r-medium.voice.txt (optional)
    • windows_tts_engine.dll
    • windows_tts_engine_installer.exe
    • windows_tts_engine_piper.dll
    • windows_tts_engine_piper.debug.log (only when debugging, will grow in size without any limit)
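
To check that a piper_models folder follows the layout above, a small standalone scan can help. The sketch below is an illustration, not the engine's loader: the `PiperModel` type and `scan_piper_models` function are hypothetical, and it assumes that a model without a matching .onnx.json file is skipped and that an unreadable or non-integer .voice.txt is treated as absent.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// One model found in the folder (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct PiperModel {
    pub model: PathBuf,       // <name>.onnx
    pub config: PathBuf,      // <name>.onnx.json
    pub speaker: Option<i64>, // integer from <name>.voice.txt, if present
}

/// Lists every `<name>.onnx` in `dir` that has a `<name>.onnx.json` next to it.
pub fn scan_piper_models(dir: &Path) -> io::Result<Vec<PiperModel>> {
    let mut models = Vec::new();
    for entry in fs::read_dir(dir)? {
        let model = entry?.path();
        // Skip everything that is not "<name>.onnx" (including "<name>.onnx.json").
        if model.extension().and_then(|e| e.to_str()) != Some("onnx") {
            continue;
        }
        // "<name>.onnx" -> "<name>.onnx.json"
        let mut config = model.clone().into_os_string();
        config.push(".json");
        let config = PathBuf::from(config);
        if !config.is_file() {
            continue; // assumption: a model without its config is unusable
        }
        // "<name>.onnx" -> "<name>.voice.txt" (optional)
        let speaker = fs::read_to_string(model.with_extension("voice.txt"))
            .ok()
            .and_then(|s| s.trim().parse::<i64>().ok());
        models.push(PiperModel { model, config, speaker });
    }
    // Sort for a stable order, since directory iteration order is unspecified.
    models.sort_by(|a, b| a.model.cmp(&b.model));
    Ok(models)
}
```

An empty result from `scan_piper_models` on your piper_models folder would indicate that no complete model/config pair was found.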

Debugging text-to-speech engine DLL

Both text-to-speech engine DLLs write debug logs when a DLL_NAME.debug.log file is present next to the engine DLL (for example, windows_tts_engine_piper.debug.log next to windows_tts_engine_piper.dll). This is useful if the text-to-speech engine is not working properly and you want to determine why. Make sure to delete the log file when you finish debugging; otherwise the engine will keep appending to it indefinitely and the file can grow quite large.
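
The "log only when the file already exists" behavior can be sketched with the Rust standard library. This is one plausible way to get that behavior, not necessarily how the engine implements it; `append_if_log_exists` is a hypothetical name.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends `message` as one line to `log_path`, but only if that file
/// already exists. Returns `Ok(true)` if the line was written and
/// `Ok(false)` if the file is missing (logging disabled).
pub fn append_if_log_exists(log_path: &Path, message: &str) -> io::Result<bool> {
    // Opening with `append(true)` but without `create(true)` fails with
    // `NotFound` when the file is absent, so the file is never created here.
    match OpenOptions::new().append(true).open(log_path) {
        Ok(mut file) => {
            writeln!(file, "{}", message)?;
            Ok(true)
        }
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    }
}
```

Because the file is opened in append mode and never truncated, it keeps growing until it is deleted, which is why removing it after debugging matters.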

Prerequisites for windows_tts_engine_piper.dll

The windows_tts_engine_piper.dll DLL is not statically linked to the C runtime, so you need to install the Microsoft Visual C++ Runtime to use it.

The piper text-to-speech engine also requires eSpeak NG data files. You can download them from piper-rs's GitHub releases or by simply installing eSpeak NG itself.

References

Text-to-speech on Windows

High quality offline text-to-speech

Develop new text-to-speech voices/engines for legacy Microsoft Speech API (SAPI)

License

This project is released under either:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your choosing.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
