Local speech-to-text for desktop using faster-whisper.
Lets you dictate text into any application without sending audio to any cloud service. Everything runs locally on your machine: no internet connection is required after the initial model download.
Currently only tested under Linux with KDE ;)
- Run `./cli.py listen` (the Whisper model is downloaded on first run and cached on disk)
- Hold Scroll Lock to record from your microphone
- Release Scroll Lock — the audio is transcribed locally by faster-whisper
- The transcribed text is copied to the clipboard via `wl-copy` and pasted into the focused window via `ydotool key ctrl+v`
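
The last step of the list above can be sketched in a few lines of Python. This is a minimal illustration, not the code stt2desktop actually uses: the function name `paste_via_clipboard` and the injectable `run` parameter are hypothetical, and the `ctrl+v` key syntax is copied from this README (other ydotool versions may expect raw key codes instead).

```python
import subprocess
from typing import Any, Callable


def paste_via_clipboard(text: str, run: Callable[..., Any] = subprocess.run) -> None:
    """Put `text` on the Wayland clipboard, then simulate Ctrl+V."""
    # wl-copy reads the clipboard content from stdin when no text argument is given.
    run(['wl-copy'], input=text, text=True, check=True)
    # Key syntax as written in this README; it is an assumption that your
    # installed ydotool version accepts it in this form.
    run(['ydotool', 'key', 'ctrl+v'], check=True)
```

Calling `paste_via_clipboard('hello')` needs `wl-copy`, `ydotool` and a running `ydotoold`. Passing your own `run` callable makes it possible to try the function without touching the real clipboard.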
Used tools:
- faster-whisper for local speech recognition
- ydotool to simulate keyboard input (works on Wayland and X11)
- wl-clipboard (`wl-copy`) to paste text via the clipboard, which avoids keyboard layout issues
- chime to play notification sounds
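
To give an idea of what the faster-whisper part looks like, here is a minimal sketch that transcribes an audio file. It assumes the `faster-whisper` package is installed; the helper names `join_segments` and `transcribe_file` are made up for this example and are not part of stt2desktop.

```python
from typing import Any, Iterable


def join_segments(segments: Iterable[Any]) -> str:
    """Join the `.text` of each segment into one line, skipping empty ones."""
    parts = (segment.text.strip() for segment in segments)
    return ' '.join(part for part in parts if part)


def transcribe_file(path: str, model_name: str = 'small') -> str:
    """Transcribe one audio file locally and return the plain text."""
    # Imported lazily so join_segments() works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, device='auto', compute_type='int8')
    # transcribe() returns a lazy generator of segments plus an info object;
    # the actual work happens while the segments are consumed.
    segments, _info = model.transcribe(path)
    return join_segments(segments)
```

The first call with a new model name downloads the model, so it can take a while.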
Requirements: Python 3.12+, a working microphone, wl-clipboard, ydotool and ydotoold:
sudo apt install ydotool ydotoold wl-clipboard
sudo usermod -aG input $USER
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/60-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

Then re-login (or run `newgrp input` in the current shell) so the group change takes effect.
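
If you are unsure whether the group change and the udev rule are active, a small check like the following may help. It is only a sketch: the function names are hypothetical, and it assumes the device node is `/dev/uinput`, as in the udev rule above.

```python
import grp
import os


def in_group(group_name: str = 'input') -> bool:
    """Return True if the current process belongs to `group_name`."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:  # the group does not exist on this system
        return False
    return gid in os.getgroups() or gid == os.getegid()


def can_write(path: str = '/dev/uinput') -> bool:
    """Return True if `path` exists and is writable by the current user."""
    return os.path.exists(path) and os.access(path, os.W_OK)


if __name__ == '__main__':
    print('member of input group:', in_group())
    print('/dev/uinput writable: ', can_write())
```

If the first line prints `False` directly after `usermod`, the current session has most likely not picked up the new group yet.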
You can install "stt2desktop" with pipx:
sudo apt install pipx
pipx install stt2desktop

Then run:
stt2desktop listen

The default global hotkey is Scroll Lock (in German: "Rollen").
You can change it via the --hotkey option (see below).
Suggestions for alternative keys: ctrl_r, alt_r, cmd_r, shift_r ;)
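
The "hold to record, release to transcribe" behaviour can be pictured as a tiny state machine. The sketch below assumes evdev-style key event values (1 = press, 2 = auto-repeat, 0 = release). The class and enum names are hypothetical and not taken from the stt2desktop source.

```python
from enum import Enum

# Key event values as reported by the Linux input layer (evdev).
KEY_RELEASE, KEY_PRESS, KEY_REPEAT = 0, 1, 2


class Action(Enum):
    START_RECORDING = 'start'
    STOP_AND_TRANSCRIBE = 'stop'
    IGNORE = 'ignore'


class HoldToRecord:
    """Hypothetical helper: turns raw hotkey events into recorder actions."""

    def __init__(self) -> None:
        self.recording = False

    def on_key_event(self, value: int) -> Action:
        if value == KEY_PRESS and not self.recording:
            self.recording = True
            return Action.START_RECORDING
        if value == KEY_RELEASE and self.recording:
            self.recording = False
            return Action.STOP_AND_TRANSCRIBE
        # Auto-repeat events and duplicate press/release events are ignored.
        return Action.IGNORE
```

Ignoring auto-repeat and duplicate events means a long key hold triggers exactly one recording and one transcription.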
usage: stt2desktop listen [-h] [LISTEN OPTIONS]
Start the STT listener. Hold the hotkey to record, release to transcribe and insert.
╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help show this help message and exit │
│ -v, --verbosity Verbosity level; e.g.: -v, -vv, -vvv, etc. (repeatable) │
│ --model {tiny_en,tiny,base_en,base,small_en,small,medium_en,medium,large_v1,large_v2,large_v3,large,distil_large_v2, │
│ distil_medium_en,distil_small_en,distil_large_v3,distil_large_v3_5,large_v3_turbo,turbo} │
│ Whisper model to use for transcription. (default: small) │
│ --hotkey STR evdev key name to hold for recording. Release to transcribe and insert text. Examples: │
│ KEY_SCROLLLOCK, KEY_RIGHTCTRL, KEY_RIGHTALT. (default: KEY_SCROLLLOCK) │
│ --sample-rate INT Audio sample rate in Hz. Whisper expects 16000. (default: 16000) │
│ --device STR Device to run inference on, e.g. cpu or cuda. (default: auto) │
│ --compute-type STR Quantization type, e.g. int8, float16, float32. (default: int8) │
│ --num-workers {None}|INT Number of parallel transcription workers. Defaults to CPU count. (default: None) │
│ --sounds, --no-sounds Play notification sounds via chime. (default: True) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
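
The `--device`, `--compute-type` and `--num-workers` options share their names with keyword arguments of faster-whisper's `WhisperModel`. How stt2desktop forwards them internally is an assumption here; the sketch only illustrates the documented "defaults to CPU count" rule with a hypothetical helper `build_model_kwargs`.

```python
import os
from typing import Any, Dict, Optional


def build_model_kwargs(
    device: str = 'auto',
    compute_type: str = 'int8',
    num_workers: Optional[int] = None,
) -> Dict[str, Any]:
    """Translate CLI-style option values into WhisperModel keyword arguments."""
    if num_workers is None:
        # "Defaults to CPU count"; os.cpu_count() may return None, so fall back to 1.
        num_workers = os.cpu_count() or 1
    if num_workers < 1:
        raise ValueError('num_workers must be at least 1')
    return {'device': device, 'compute_type': compute_type, 'num_workers': num_workers}
```

With faster-whisper installed, the result could be used as `WhisperModel('small', **build_model_kwargs())`.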
A selection of the available models, with approximate values:
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| `tiny` | ~75 MB | fastest | lowest |
| `base` | ~145 MB | fast | good |
| `small` | ~460 MB | slower | better (default) |
| `medium` | ~1.5 GB | slow | high |
Larger models produce more accurate transcriptions but take longer to process ;)
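
As a rough aid for choosing a model, the approximate sizes from the table can be turned into a small lookup. This is only an illustration: the numbers are the approximate values from above (1.5 GB taken as 1500 MB), and `largest_model_within` is a hypothetical helper, not a stt2desktop feature.

```python
from typing import Optional

# Approximate download sizes in MB, copied from the table above.
APPROX_SIZE_MB = {'tiny': 75, 'base': 145, 'small': 460, 'medium': 1500}


def largest_model_within(budget_mb: int) -> Optional[str]:
    """Return the largest listed model that fits into `budget_mb`, or None."""
    fitting = [name for name, size in APPROX_SIZE_MB.items() if size <= budget_mb]
    if not fitting:
        return None
    return max(fitting, key=lambda name: APPROX_SIZE_MB[name])
```

The result is one of the names accepted by `--model`; it says nothing about speed, which depends on your hardware.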
Use pavucontrol to check your audio setup and make sure the correct microphone is selected and working.
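
If you have a short WAV recording of your microphone, you can also check its level with a few lines of Python. The sketch assumes a 16-bit PCM WAV file recorded with any tool of your choice; `wav_levels` is a hypothetical helper and not part of stt2desktop.

```python
import array
import math
import sys
import wave
from typing import Tuple


def wav_levels(path: str) -> Tuple[float, float]:
    """Return (peak, rms) of a 16-bit PCM WAV file, each scaled to 0.0..1.0."""
    with wave.open(path, 'rb') as wav:
        if wav.getsampwidth() != 2:
            raise ValueError('expected 16-bit PCM audio')
        samples = array.array('h', wav.readframes(wav.getnframes()))
    if sys.byteorder == 'big':
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0, 0.0
    peak = max(abs(sample) for sample in samples) / 32768
    rms = math.sqrt(sum(sample * sample for sample in samples) / len(samples)) / 32768
    return peak, rms
```

A peak and RMS of (almost) zero usually means that the wrong input device is selected or that the microphone is muted.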
Test audio recording:
./cli.py test-recording

Some terminal commands to check your audio setup:
# List capture devices in PulseAudio sound server:
pactl list sources short
# Check current volume:
pactl list sources | grep -A1 "Name: .*input\|Volume:"
# Display the current state in PipeWire:
wpctl status

Set up loopback mode to hear yourself:
# Start:
pactl load-module module-loopback
# Undo:
pactl unload-module module-loopback

At least uv is needed. Install it, e.g. via pipx:
sudo apt-get install pipx
pipx install uv

Clone the project and run the CLI help commands. A virtual environment will be created/updated automatically.
~$ git clone https://github.com/jedie/stt2desktop.git
~$ cd stt2desktop
~/stt2desktop$ ./cli.py --help
~/stt2desktop$ ./dev-cli.py --help

usage: ./dev-cli.py [-h] {coverage,install,lint,mypy,nox,pip-audit,publish,test,update,update-readme-history,update-test-snapshot-files,version}
╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help show this help message and exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ subcommands ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ (required) │
│ • coverage Run tests and show coverage report. │
│ • install Install requirements and 'stt2desktop' via pip as editable. │
│ • lint Check/fix code style by run: "ruff check --fix" │
│ • mypy Run Mypy (configured in pyproject.toml) │
│ • nox Run nox │
│ • pip-audit │
│ Run pip-audit check against current requirements files │
│ • publish Build and upload this project to PyPi │
│ • test Run unittests │
│ • update Update dependencies (uv.lock) and git pre-commit hooks │
│ • update-readme-history │
│ Update project history base on git commits/tags in README.md Will be exited with 1 if the README.md │
│ was updated otherwise with 0. │
│ │
│ Also, callable via e.g.: │
│ python -m cli_base update-readme-history -v │
│ • update-test-snapshot-files │
│ Update all test snapshot files (by remove and recreate all snapshot files) │
│ • version Print version and exit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
- v0.3.0
- 2026-04-23 - avoid double hotkey processing
- 2026-04-23 - nicer exit
- 2026-04-23 - fix code style
- 2026-04-23 - Use a lock file to ensure that only one instance is running
- 2026-04-23 - restore old clipboard after pasting the STT text
- v0.2.0
- 2026-04-22 - paste text via clipboard to avoid keyboard layout issues
- 2026-04-16 - Add test commands and migrate to ydotool
- v0.1.2
- 2026-03-30 - print warning when not running on Linux
- 2026-03-30 - Update requirements
- 2026-03-27 - Update README
- v0.1.1
- 2026-03-27 - +Proposal for alternative hotkey
- 2026-03-27 - fix color outputs
- 2026-03-27 - Update requirements
- 2026-03-27 - add missing license file.