Simple, robust voice dictation that works on Windows and macOS with only Python installed.
- Author: Thomas Rice
- Website: https://www.thomasrice.com/
- Thomas Rice is co‑founder of Minotaur Capital — https://www.minotaurcapital.com/
Highlights:
- Global hotkey to start/stop recording (default: F8)
- Uses OpenAI speech‑to‑text (`gpt-4o-mini-transcribe` by default)
- Pastes the result into the active app automatically
- Cross‑platform start/stop sounds (uses packaged WAVs if present, else generated beeps)
- Resilient audio stream with auto‑restart on errors
Prerequisites:
- Python 3.11+
- An OpenAI API key in the environment: `OPENAI_API_KEY=...`
  - Optionally create a `.env` file next to this README with `OPENAI_API_KEY=...` (auto-loaded)
Install dependencies (recommended):

    python -m pip install -r requirements.txt

Run directly without installing the package:

    python -m voiceapp --help
    python -m voiceapp  # Press F8 to toggle listening

Or install as an editable package (recommended):

    python -m pip install -e .
    voicemode  # CLI entry point

Notes for macOS:
- Grant your terminal app “Accessibility” permission (System Settings → Privacy & Security → Accessibility) so global hotkeys and paste work.
- On macOS, paste uses `Command+V`; on Windows, `Ctrl+V`.
- If `keyboard` is unavailable, the app falls back to `pynput` automatically.
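
The platform-specific paste shortcut and the `keyboard` to `pynput` fallback described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names `paste_shortcut` and `pick_backend` are hypothetical, and "unavailable" is approximated here as "not importable", which may be narrower than what the app checks.

```python
import importlib.util
import sys


def paste_shortcut(platform: str = sys.platform) -> str:
    """Return the paste key combination for a platform string (hypothetical helper)."""
    # sys.platform is "darwin" on macOS and "win32" on Windows.
    return "command+v" if platform == "darwin" else "ctrl+v"


def pick_backend(candidates=("keyboard", "pynput")):
    """Return the first importable module name, or None if none is installed.

    Assumption: availability is judged only by whether the module can be found.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("paste shortcut:", paste_shortcut())
    print("hotkey backend:", pick_backend())
```

Checking with `find_spec` avoids importing a library just to learn that it is missing, which keeps the probe cheap and free of side effects.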
Helper launchers in this folder:
- `./voicemode` or `./voicemode.sh` (macOS/Linux)
- `voicemode.bat` (Windows)
- `voicemode.command` (macOS double‑click)

All launchers run `python -m voiceapp` from this folder.
Command-line options:

    --hotkey         Global hotkey (default: F8)
    --model          OpenAI model (default: gpt-4o-mini-transcribe)
    --rate           Sample rate in Hz (default: 16000)
    --device         Optional input device index (see --list-devices)
    --no-sound       Disable start/stop sounds
    --list-devices   List audio input devices and exit
    --push-to-talk   Hold hotkey to record; release to transcribe
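
For readers who want to see how these flags fit together, here is a small `argparse` sketch that mirrors the options and defaults listed above. It is an assumption about how such a parser could look, not the project's real parser; `build_parser` is a hypothetical name, and the `config` sub-command is left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="voicemode")
    parser.add_argument("--hotkey", default="F8", help="Global hotkey")
    parser.add_argument("--model", default="gpt-4o-mini-transcribe", help="OpenAI model")
    parser.add_argument("--rate", type=int, default=16000, help="Sample rate in Hz")
    parser.add_argument("--device", type=int, default=None, help="Input device index")
    parser.add_argument("--no-sound", action="store_true", help="Disable start/stop sounds")
    parser.add_argument("--list-devices", action="store_true", help="List input devices and exit")
    parser.add_argument("--push-to-talk", action="store_true", help="Hold hotkey to record")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With no arguments, the parsed namespace carries the documented defaults; `--push-to-talk --device 2` would switch to hold-to-record on input device 2.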
If you’d like custom sounds, place WAV files at:
- `voiceapp/assets/start.wav`
- `voiceapp/assets/stop.wav`
They’ll be packaged when installed. If missing, the app generates short beeps.
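
The "packaged WAV if present, else a generated beep" behavior can be approximated with the standard library alone. The sketch below is a guess at the general idea, not the app's implementation: `beep_wav_bytes` and `resolve_sound` are hypothetical names, and the tone frequency, length, and volume are arbitrary choices.

```python
import io
import math
import struct
import wave
from pathlib import Path


def beep_wav_bytes(freq_hz: float = 880.0, duration_ms: int = 150, rate: int = 44100) -> bytes:
    """Generate a short mono 16-bit sine beep and return it as WAV file bytes."""
    n_frames = rate * duration_ms // 1000
    frames = bytearray()
    for i in range(n_frames):
        # Scale to 30% of full volume to keep the beep unobtrusive.
        sample = int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        frames += struct.pack("<h", sample)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return buf.getvalue()


def resolve_sound(packaged: Path) -> bytes:
    """Return the packaged WAV's bytes if the file exists, else a generated beep."""
    if packaged.is_file():
        return packaged.read_bytes()
    return beep_wav_bytes()


if __name__ == "__main__":
    data = resolve_sound(Path("voiceapp/assets/start.wav"))
    print(len(data), "bytes of WAV data")
```

Generating the fallback in memory means a missing asset never causes a failure at startup; playback is left out because it depends on the audio backend in use.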
VoiceMode looks for your key in this order:
- Environment variable `OPENAI_API_KEY` (preferred)
- `.env` file in this folder (auto‑loaded)
- Per‑user config file saved by the command below
- Simple text files named `openai.txt`, `openai.key`, or `OPENAI_API_KEY.txt` in this folder or in the app config folder
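
The lookup order above could be expressed along these lines. Treat it as a sketch under stated assumptions: `find_api_key` and `read_dotenv_key` are hypothetical names, the JSON field `openai_api_key` is a guess at the config file's layout, and the `.env` parsing is deliberately minimal.

```python
import json
import os
from pathlib import Path

KEY_FILENAMES = ("openai.txt", "openai.key", "OPENAI_API_KEY.txt")


def read_dotenv_key(path: Path, name: str = "OPENAI_API_KEY"):
    """Return the value of NAME=value from a .env file, or None if absent."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip("'\"") or None
    return None


def find_api_key(env, app_dir: Path, config_dir: Path):
    """Search the documented locations in order; return the first key found."""
    # 1. Environment variable (preferred).
    if env.get("OPENAI_API_KEY"):
        return env["OPENAI_API_KEY"]
    # 2. .env file in the app folder.
    key = read_dotenv_key(app_dir / ".env")
    if key:
        return key
    # 3. Per-user config file. The field name is an assumption.
    config_file = config_dir / "config.json"
    if config_file.is_file():
        key = json.loads(config_file.read_text(encoding="utf-8")).get("openai_api_key")
        if key:
            return key
    # 4. Simple text files in the app folder, then the config folder.
    for folder in (app_dir, config_dir):
        for filename in KEY_FILENAMES:
            candidate = folder / filename
            if candidate.is_file():
                text = candidate.read_text(encoding="utf-8").strip()
                if text:
                    return text
    return None


if __name__ == "__main__":
    found = find_api_key(os.environ, Path.cwd(), Path.home() / ".config" / "voiceapp")
    print("key found" if found else "no key found")
```

Passing the environment and both folders in as arguments keeps the function easy to test without touching real credentials.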
To save your key using the CLI:

    voicemode config --set-openai-key sk-...

This writes to a per‑user config file:
- Windows: `%APPDATA%\VoiceApp\config.json`
- macOS: `~/Library/Application Support/VoiceApp/config.json`
- Linux: `~/.config/voiceapp/config.json`
Alternatively, create a `.env` file in this folder containing:

    OPENAI_API_KEY=sk-...
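
As an illustration of how the per-user config file locations listed earlier might be resolved at runtime, here is a short sketch. It is not taken from the project: `config_path` is a hypothetical helper, and the fallback to `AppData/Roaming` when `APPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def config_path(platform: str = sys.platform, env=os.environ, home=None) -> Path:
    """Return the per-user config.json location for the given platform."""
    home_dir = Path(home) if home is not None else Path.home()
    if platform == "win32":
        # %APPDATA% normally points at the user's roaming profile folder.
        base = Path(env.get("APPDATA", home_dir / "AppData" / "Roaming"))
        return base / "VoiceApp" / "config.json"
    if platform == "darwin":
        return home_dir / "Library" / "Application Support" / "VoiceApp" / "config.json"
    # Linux and other platforms.
    return home_dir / ".config" / "voiceapp" / "config.json"


if __name__ == "__main__":
    print(config_path())
```

Taking `platform`, `env`, and `home` as parameters makes each branch checkable from any operating system.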