Skip to content

Eigenwise/SpeechForge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SpeechForge

Speech-to-text with AI-powered formatting. Record your voice, and SpeechForge transcribes it using Groq's Whisper API, then cleans it up with an LLM — fixing errors, removing filler words, and formatting naturally. The result is copied straight to your clipboard.

Features

  • Push-to-talk recording — press a hotkey to start recording, press again to process
  • AI-powered cleanup — transcriptions are refined by an LLM to fix errors and improve readability
  • Multiple profiles — create different prompt profiles for different contexts (coding, email, notes, etc.)
  • Per-profile vocabulary — add correction terms and text expansions to each profile
  • Customizable hotkeys — configure record/cancel keys and per-profile shortcut keys
  • Model selection — choose your preferred Whisper and LLM models from the settings UI
  • Native desktop app — runs as a system-tray-style window using NiceGUI + pywebview
  • Console mode — headless mode for minimal resource usage

Requirements

  • uv (Python package manager — installs Python for you)
  • A Groq API key (free tier available)
  • A microphone
  • Windows (primary target; may work on macOS/Linux with minor adjustments)

Setup

You'll need a terminal (PowerShell) for the setup steps below. Open one by pressing the Windows key, typing powershell, and selecting it.

1. Install uv

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Reopen your terminal after installing, then verify with uv --version.

2. Install Git

Download from git-scm.com/downloads/win and run the installer with defaults. Reopen your terminal, then verify with git --version.

3. Clone and install

git clone https://github.com/KennyVaneetvelde/SpeechForge.git
cd SpeechForge
uv sync

uv sync installs the correct Python version and all dependencies automatically.

4. Get a Groq API key

  1. Create an account at console.groq.com.
  2. Go to API Keys and create one.
  3. Copy the key (gsk_...) — it's only shown once.

5. Launch and configure

uv run speechforge

Go to the Settings tab and paste your Groq API key. It's saved automatically to ~/.speechforge/config.yaml.

6. Microphone access

If recording doesn't work, make sure microphone access is enabled in Windows privacy settings: Settings > Privacy > Microphone — enable both "Microphone access" and "Let desktop apps access your microphone".

7. Test it

  1. Press Pause to start recording (status changes to "Recording").
  2. Speak into your microphone.
  3. Press Pause again to stop.
  4. The result is copied to your clipboard — paste with Ctrl+V.

If your keyboard doesn't have a Pause key, change the hotkey in Settings.


Usage

Basic workflow

  1. Launch: uv run speechforge (from the project folder)
  2. Press the record hotkey (default: Pause)
  3. Speak
  4. Press the hotkey again to stop
  5. Result is on your clipboard — paste wherever you need it

Profiles

Profiles let you tailor the AI's behavior for different tasks. Each profile has a name, an AI prompt, an optional hotkey, and vocabulary (corrections + expansions). Manage them in the Profiles tab.

Settings

Setting Description Default
Groq API Key Your Groq API key (gsk_...) (none)
Record hotkey Key to start/stop recording Pause
Cancel hotkey Key to cancel an in-progress recording Escape
Audio input device Which microphone to use System default
Transcription model Groq Whisper model for speech-to-text whisper-large-v3
Processing model Groq LLM model for text cleanup llama-3.3-70b-versatile

Console mode

uv run speechforge --console

Runs without the GUI. Same hotkey behavior, results still go to clipboard. Quit with Ctrl+C.

Verbose logging

uv run speechforge -v

Creating a shortcut

Create a shortcut so you don't need to open a terminal each time. Run this in PowerShell:

$projectDir = "C:\path\to\SpeechForge"
$shortcutPath = "$env:USERPROFILE\Desktop\SpeechForge.lnk"
$shell = New-Object -ComObject WScript.Shell
$shortcut = $shell.CreateShortcut($shortcutPath)
$shortcut.TargetPath = "powershell.exe"
$shortcut.Arguments = "-WindowStyle Hidden -Command `"cd '$projectDir'; uv run speechforge`""
$shortcut.WorkingDirectory = $projectDir
$shortcut.Description = "SpeechForge - Speech-to-text with AI"
$shortcut.Save()

Set $projectDir to your SpeechForge folder and $shortcutPath to wherever you want the shortcut.

Tip: -WindowStyle Hidden hides the terminal. Remove it if you want to see logs.

Tip: For auto-start on login, set $shortcutPath to "$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\SpeechForge.lnk".


About uv

SpeechForge uses uv to manage Python and dependencies. Key points:

  • uv sync installs the correct Python version and all dependencies into an isolated .venv.
  • uv run speechforge runs the app using that environment. Always use uv run instead of calling python directly.
  • After git pull, run uv sync again to pick up dependency changes.
  • The uv.lock file ensures reproducible installs.

See the uv documentation for more.


Data storage

Configuration and profiles are stored in ~/.speechforge/:

~\.speechforge\
├── config.yaml      ← settings (API key, hotkeys, models, audio device)
└── profiles.yaml    ← profiles (prompts, vocabulary, hotkeys)

Delete this folder to reset to defaults.


Troubleshooting

Nothing happens when I press the record hotkey

  • Check the hotkey shown in the Status tab — make sure you're pressing the right key.
  • Try a different key in Settings (e.g. F1, F8). Some keys may be captured by other apps.
  • Try running as administrator.

No output / nothing pasted

  • Run with -v and check where the pipeline stops.
  • Verify your microphone works in Windows Sound Settings.
  • Try selecting your specific microphone in SpeechForge Settings instead of "default".

Groq API key is required — set it in Settings

Paste your API key in the Settings tab. Get one at console.groq.com/keys.

uv sync fails

  • Verify uv is installed: uv --version
  • Make sure you're in the SpeechForge folder (the one with pyproject.toml).
  • Reopen your terminal and try again.

GUI crashes on launch

git pull
uv sync

If it persists, use uv run speechforge --console as a workaround.

Garbled or wrong results

  • Check your profile prompt — the AI follows whatever instructions you give it.
  • Try a different LLM model in Settings.

About

Speech-to-text with AI processing

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages