VoiceMode (Cross‑platform CLI)

Simple, robust voice dictation that works on Windows and macOS with only Python installed.

Author: Thomas Rice [email protected]
Website: https://www.thomasrice.com/
Thomas Rice is co‑founder of Minotaur Capital — https://www.minotaurcapital.com/

Highlights:

Global hotkey to start/stop recording (default: F8)
Uses OpenAI speech‑to‑text (gpt-4o-transcribe by default)
Pastes the result into the active app automatically
Cross‑platform start/stop sounds (uses packaged WAVs if present, else generated beeps)
Resilient audio stream with auto‑restart on errors

Quick Start

Prerequisites:

Python 3.11+
An OpenAI API key in the environment: OPENAI_API_KEY=...
- Optionally create a .env file next to this README with OPENAI_API_KEY=... (auto-loaded)

Install dependencies (recommended):

python -m pip install -r requirements.txt

Run directly without installing the package:

python -m voiceapp --help
python -m voiceapp  # Interactive mode (default)

Or install as an editable package (recommended):

python -m pip install -e .
voicemode  # CLI entry point

Notes for macOS:

Grant your terminal app “Accessibility” permission (System Settings → Privacy & Security → Accessibility) so global hotkeys and paste work.
On macOS, paste uses Command+V; on Windows, Ctrl+V.
If keyboard is unavailable, the app falls back to pynput automatically.

Helper launchers in this folder:

./voicemode or ./voicemode.sh (macOS/Linux)
voicemode.bat (Windows)
voicemode.command (macOS double‑click)

All launchers run python -m voiceapp from this folder.

Options

--hotkey       Global hotkey (default: F8)
--model        OpenAI model (default: gpt-4o-transcribe)
--rate         Sample rate in Hz (default: 16000)
--device       Optional input device index (see --list-devices)
--no-sound     Disable start/stop sounds
--list-devices List audio input devices and exit
--push-to-talk Hold hotkey to record; release to transcribe

Background server (Linux/macOS)

Optionally, you can run a small background server that responds to toggle commands. This is useful if you want to bind your desktop hotkey to voicemode toggle without keeping a foreground terminal open.

Start server: voicemode serve
Toggle listening: voicemode toggle
Check status: voicemode status

Notes:

The server uses a Unix domain socket under the app config directory and is supported on Linux and macOS.
On Windows, use the default interactive mode instead.

Sounds

If you’d like custom sounds, place WAV files at:

voiceapp/assets/start.wav
voiceapp/assets/stop.wav

They’ll be packaged when installed. If missing, the app generates short beeps.

Setting your OpenAI key

VoiceMode looks for your key in this order:

Environment variable OPENAI_API_KEY (preferred)
.env file in this folder (auto‑loaded)
Per‑user config file saved by the command below
Simple text files named openai.txt, openai.key, or OPENAI_API_KEY.txt in this folder or in the app config folder

To save your key using the CLI:

voicemode config --set-openai-key sk-...

This writes to a per‑user config file:

Windows: %APPDATA%\VoiceApp\config.json
macOS: ~/Library/Application Support/VoiceApp/config.json
Linux: ~/.config/voiceapp/config.json

Alternatively, create a .env file in this folder containing:

OPENAI_API_KEY=sk-...

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
voiceapp		voiceapp
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock
voicemode		voicemode
voicemode.bat		voicemode.bat
voicemode.command		voicemode.command
voicemode.sh		voicemode.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

VoiceMode (Cross‑platform CLI)

Quick Start

Options

Background server (Linux/macOS)

Sounds

Setting your OpenAI key

About

Uh oh!

Releases

Packages

Contributors 2

Languages

License

thomasrice/voicemode

Folders and files

Latest commit

History

Repository files navigation

VoiceMode (Cross‑platform CLI)

Quick Start

Options

Background server (Linux/macOS)

Sounds

Setting your OpenAI key

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages