Speech-to-text with AI-powered formatting. Record your voice, and SpeechForge transcribes it using Groq's Whisper API, then cleans it up with an LLM — fixing errors, removing filler words, and formatting naturally. The result is copied straight to your clipboard.
- Push-to-talk recording — press a hotkey to start recording, press again to process
- AI-powered cleanup — transcriptions are refined by an LLM to fix errors and improve readability
- Multiple profiles — create different prompt profiles for different contexts (coding, email, notes, etc.)
- Per-profile vocabulary — add correction terms and text expansions to each profile
- Customizable hotkeys — configure record/cancel keys and per-profile shortcut keys
- Model selection — choose your preferred Whisper and LLM models from the settings UI
- Native desktop app — runs as a system-tray-style window using NiceGUI + pywebview
- Console mode — headless mode for minimal resource usage
- uv (Python package manager — installs Python for you)
- A Groq API key (free tier available)
- A microphone
- Windows (primary target; may work on macOS/Linux with minor adjustments)
You'll need a terminal (PowerShell) for the setup steps below. Open one by pressing the Windows key, typing powershell, and selecting it.
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Reopen your terminal after installing, then verify with uv --version.
Download from git-scm.com/downloads/win and run the installer with defaults. Reopen your terminal, then verify with git --version.
git clone https://github.com/KennyVaneetvelde/SpeechForge.git
cd SpeechForge
uv sync
uv sync installs the correct Python version and all dependencies automatically.
- Create an account at console.groq.com.
- Go to API Keys and create one.
- Copy the key (gsk_...). It is only shown once.
uv run speechforge
Go to the Settings tab and paste your Groq API key. It's saved automatically to ~/.speechforge/config.yaml.
If recording doesn't work, make sure microphone access is enabled in Windows privacy settings: Settings > Privacy > Microphone — enable both "Microphone access" and "Let desktop apps access your microphone".
- Press Pause to start recording (status changes to "Recording").
- Speak into your microphone.
- Press Pause again to stop.
- The result is copied to your clipboard — paste with Ctrl+V.
If your keyboard doesn't have a Pause key, change the hotkey in Settings.
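
The toggle behavior above can be pictured as a small state machine. The sketch below is illustrative only and is not SpeechForge's actual code: the class, the method names, and the "Idle" and "Processing" state names are assumptions (only "Recording" appears in the status text above). The cancel key is the one listed in the settings table.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"              # assumed name
    RECORDING = "Recording"    # status text mentioned in this README
    PROCESSING = "Processing"  # assumed name


class PushToTalk:
    """Hypothetical model of the record/cancel hotkey behavior."""

    def __init__(self):
        self.state = State.IDLE

    def on_record_key(self):
        # First press starts recording; second press stops and hands off for processing.
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.PROCESSING
        return self.state

    def on_cancel_key(self):
        # Cancel only affects an in-progress recording.
        if self.state is State.RECORDING:
            self.state = State.IDLE
        return self.state

    def on_processing_done(self):
        # Called once the cleaned-up text has been copied to the clipboard.
        if self.state is State.PROCESSING:
            self.state = State.IDLE
        return self.state
```

In this model a record-key press during processing is ignored, which is one reasonable choice; the real app may behave differently.
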
- Launch: uv run speechforge (from the project folder)
- Press the record hotkey (default: Pause)
- Speak
- Press the hotkey again to stop
- Result is on your clipboard — paste wherever you need it
Profiles let you tailor the AI's behavior for different tasks. Each profile has a name, an AI prompt, an optional hotkey, and vocabulary (corrections + expansions). Manage them in the Profiles tab.
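
To make the pieces of a profile concrete, here is a minimal Python sketch. It is an illustration, not SpeechForge's implementation: the Profile field names and the apply_vocabulary helper are hypothetical, and the app may apply vocabulary differently (for example by passing the terms to the LLM rather than doing text replacement).

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Profile:
    """Hypothetical in-memory shape of a profile; field names are assumptions."""
    name: str
    prompt: str
    hotkey: Optional[str] = None
    corrections: Dict[str, str] = field(default_factory=dict)
    expansions: Dict[str, str] = field(default_factory=dict)


def apply_vocabulary(text: str, profile: Profile) -> str:
    """Replace whole-word matches case-insensitively: corrections first, then expansions."""
    for table in (profile.corrections, profile.expansions):
        for term, replacement in table.items():
            pattern = r"\b" + re.escape(term) + r"\b"
            # A function replacement avoids backslash escapes being interpreted.
            text = re.sub(pattern, lambda m, r=replacement: r, text, flags=re.IGNORECASE)
    return text


coding = Profile(
    name="coding",
    prompt="Format the transcript as a concise code comment.",
    hotkey="F8",
    corrections={"pie test": "pytest"},
    expansions={"my sig": "Best regards, Alex"},
)
print(apply_vocabulary("Run pie test first. my sig", coding))
# prints: Run pytest first. Best regards, Alex
```
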
| Setting | Description | Default |
|---|---|---|
| Groq API Key | Your Groq API key (gsk_...) | (none) |
| Record hotkey | Key to start/stop recording | Pause |
| Cancel hotkey | Key to cancel an in-progress recording | Escape |
| Audio input device | Which microphone to use | System default |
| Transcription model | Groq Whisper model for speech-to-text | whisper-large-v3 |
| Processing model | Groq LLM model for text cleanup | llama-3.3-70b-versatile |
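
The defaults in the table can be read as a base layer that saved settings override. The Python sketch below shows that idea only; the key names are hypothetical and are not taken from the real config.yaml, so check your own file for the actual names.

```python
# Default values copied from the table above; key names are assumptions.
DEFAULTS = {
    "groq_api_key": None,            # (none)
    "record_hotkey": "Pause",
    "cancel_hotkey": "Escape",
    "audio_input_device": None,      # None stands in for "System default"
    "transcription_model": "whisper-large-v3",
    "processing_model": "llama-3.3-70b-versatile",
}


def effective_settings(user_values):
    """Overlay user-provided values on the defaults, rejecting unknown keys."""
    unknown = sorted(set(user_values) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown setting(s): {', '.join(unknown)}")
    return {**DEFAULTS, **user_values}


settings = effective_settings({"record_hotkey": "F8"})
print(settings["record_hotkey"], settings["cancel_hotkey"])
# prints: F8 Escape
```
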
uv run speechforge --console
Runs without the GUI. Same hotkey behavior, results still go to clipboard. Quit with Ctrl+C.
uv run speechforge -v
Create a shortcut so you don't need to open a terminal each time. Run this in PowerShell:
$projectDir = "C:\path\to\SpeechForge"
$shortcutPath = "$env:USERPROFILE\Desktop\SpeechForge.lnk"
$shell = New-Object -ComObject WScript.Shell
$shortcut = $shell.CreateShortcut($shortcutPath)
$shortcut.TargetPath = "powershell.exe"
$shortcut.Arguments = "-WindowStyle Hidden -Command `"cd '$projectDir'; uv run speechforge`""
$shortcut.WorkingDirectory = $projectDir
$shortcut.Description = "SpeechForge - Speech-to-text with AI"
$shortcut.Save()
Set $projectDir to your SpeechForge folder and $shortcutPath to wherever you want the shortcut.
Tip: -WindowStyle Hidden hides the terminal. Remove it if you want to see logs.
Tip: For auto-start on login, set $shortcutPath to "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\SpeechForge.lnk".
SpeechForge uses uv to manage Python and dependencies. Key points:
- uv sync installs the correct Python version and all dependencies into an isolated .venv.
- uv run speechforge runs the app using that environment. Always use uv run instead of calling python directly.
- After git pull, run uv sync again to pick up dependency changes.
- The uv.lock file ensures reproducible installs.
See the uv documentation for more.
Configuration and profiles are stored in ~/.speechforge/:
~\.speechforge\
├── config.yaml ← settings (API key, hotkeys, models, audio device)
└── profiles.yaml ← profiles (prompts, vocabulary, hotkeys)
Delete this folder to reset to defaults.
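
If you would rather keep a copy before resetting, the following Python sketch moves the folder aside instead of deleting it. This is a substitute for plain deletion, not something SpeechForge provides: reset_speechforge and the backup naming scheme are hypothetical. Note that the backup still contains your API key.

```python
import shutil
import time
from pathlib import Path


def reset_speechforge(config_dir=None):
    """Move the settings folder aside so the app starts from defaults.

    Returns the backup path, or None if there was nothing to reset.
    """
    config_dir = Path(config_dir) if config_dir else Path.home() / ".speechforge"
    if not config_dir.is_dir():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config_dir.with_name(f"{config_dir.name}.bak-{stamp}")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.move(str(config_dir), str(backup))
    return backup


# Usage (close SpeechForge first):
#   backup = reset_speechforge()
#   print("previous settings moved to", backup)
```

To restore the old settings, rename the backup folder back to .speechforge.
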
- Check the hotkey shown in the Status tab — make sure you're pressing the right key.
- Try a different key in Settings (e.g. F1, F8). Some keys may be captured by other apps.
- Try running as administrator.
- Run with -v and check where the pipeline stops.
- Verify your microphone works in Windows Sound Settings.
- Try selecting your specific microphone in SpeechForge Settings instead of "default".
Paste your API key in the Settings tab. Get one at console.groq.com/keys.
- Verify uv is installed: uv --version
- Make sure you're in the SpeechForge folder (the one with pyproject.toml).
- Reopen your terminal and try again.
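
The first two checks above can also be scripted. This is a small, hypothetical helper (preflight is not part of SpeechForge) that uses only the Python standard library; it assumes that finding uv on PATH and pyproject.toml in the current folder is a good enough signal.

```python
import shutil
from pathlib import Path


def preflight(project_dir="."):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if shutil.which("uv") is None:
        problems.append("uv not found on PATH: reinstall it or reopen your terminal")
    if not (Path(project_dir) / "pyproject.toml").is_file():
        problems.append("pyproject.toml not found: cd into the SpeechForge folder")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["ok: uv is on PATH and pyproject.toml is present"]:
        print(line)
```
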
git pull
uv sync
If it persists, use uv run speechforge --console as a workaround.
- Check your profile prompt — the AI follows whatever instructions you give it.
- Try a different LLM model in Settings.