A macOS menu bar app that turns speech into refined, ready-to-send text anywhere you type — powered by local Whisper transcription and AI cleanup.
Trigger recording with a global shortcut or wake phrase, speak naturally, and have your words transcribed locally, refined with AI, and inserted directly into the active text field — whether that's a chat window, browser, editor, or AI tool.
No app switching. No copy-pasting. Just speak and send.
- Global shortcut & wake phrase activation — trigger from anywhere on your Mac
- Local Whisper transcription — privacy-first, on-device speech-to-text via WhisperKit
- AI cleanup — refine raw dictation into polished text with configurable prompts
- Flexible AI providers — OpenAI, Anthropic, or custom endpoints with selectable models
- Direct text insertion — results go straight into the active text field
- Spoken send phrase — hands-free recording control
- Auto media pause/resume — pauses Spotify/Apple Music while recording
- Recording history — browse, replay, and reuse past transcriptions
- Custom prompt library — save and switch between cleanup prompts
- Multiple shortcut bindings — assign different shortcuts to different prompts
- Notification sounds — configurable beep presets and volume
- Menu bar app — always available, never in the way
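
As a rough illustration of how shortcut bindings and the prompt library might fit together, here is a minimal sketch. The `Shortcut` and `PromptLibrary` types and all of their fields are hypothetical names chosen for this example, not the app's actual models.

```swift
// Hypothetical types for illustration; the app's real models may differ.
struct Shortcut: Hashable {
    let key: String             // e.g. "D"
    let modifiers: Set<String>  // e.g. ["control", "option"]
}

struct PromptLibrary {
    var prompts: [String: String] = [:]     // prompt name -> prompt text
    var bindings: [Shortcut: String] = [:]  // shortcut -> prompt name
    var defaultPromptName: String? = nil

    /// Returns the cleanup prompt bound to a shortcut, falling back to the
    /// default prompt when the shortcut has no binding of its own.
    func prompt(for shortcut: Shortcut) -> String? {
        let name = bindings[shortcut] ?? defaultPromptName
        return name.flatMap { prompts[$0] }
    }
}
```

Keeping bindings as a map from shortcut to prompt name (rather than to prompt text) means editing a prompt updates every shortcut that uses it.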
- Press your shortcut (or say your wake phrase)
- Background media pauses automatically
- Speak your message
- Say your send phrase or press the shortcut again to stop
- Audio is transcribed locally with Whisper
- AI cleanup refines the transcription (optional)
- Text is inserted into the active text field or copied to clipboard
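
The steps above can be read as a small state machine. The sketch below models them in plain Swift; `PipelineState`, `PipelineEvent`, and `Pipeline` are hypothetical names for illustration and are not taken from the app's source.

```swift
// Hypothetical model of the recording flow; not the app's actual types.
enum PipelineState: Equatable {
    case idle
    case recording
    case transcribing
    case cleaning
    case delivered(String)
}

enum PipelineEvent {
    case trigger              // shortcut pressed or wake phrase heard
    case stop                 // send phrase heard or shortcut pressed again
    case transcribed(String)  // result of local Whisper transcription
    case cleaned(String)      // result of the optional AI cleanup
}

struct Pipeline {
    var cleanupEnabled: Bool
    private(set) var state: PipelineState = .idle

    init(cleanupEnabled: Bool) {
        self.cleanupEnabled = cleanupEnabled
    }

    mutating func handle(_ event: PipelineEvent) {
        switch (state, event) {
        case (.idle, .trigger), (.delivered, .trigger):
            state = .recording
        case (.recording, .stop):
            state = .transcribing
        case (.transcribing, .transcribed(let text)):
            // Cleanup is optional: skip straight to delivery when it is off.
            state = cleanupEnabled ? .cleaning : .delivered(text)
        case (.cleaning, .cleaned(let text)):
            state = .delivered(text)
        default:
            break // ignore events that do not apply in the current state
        }
    }
}
```

Media pausing and text insertion would be side effects attached to the `recording` and `delivered` transitions; they are left out here to keep the sketch self-contained.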
- macOS 14.0 (Sonoma) or later
- Microphone access
- Accessibility permission (for text field detection and insertion)
- Input Monitoring permission (for global keyboard shortcuts)
- Download Mac-Speech-to-AI-to-Text.zip from the releases/ folder
- Unzip and drag Mac Speech to AI to Text.app to your Applications folder
- Open the app — grant Microphone, Accessibility, and Input Monitoring permissions when prompted
git clone https://github.com/savedpixel/mac-speech-to-ai-to-text.git
cd mac-speech-to-ai-to-text
swift build
bash scripts/build-app.sh
open "build/Mac Speech to AI to Text.app"
Or open Package.swift in Xcode and run directly.
| Layer | Technology |
|---|---|
| Language | Swift 5.9+ |
| UI | SwiftUI + AppKit (NSStatusItem) |
| Audio | AVAudioEngine |
| Transcription | WhisperKit (local Whisper inference) |
| Input | CGEvent (shortcuts, key simulation), Accessibility API |
| Speech | Speech framework (wake/send phrase detection) |
| Persistence | UserDefaults / @AppStorage |
| Logging | OSLog |
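
For readers unfamiliar with the Persistence row, this is roughly what storing a settings value in UserDefaults can look like. It is a sketch under assumptions: `CleanupSettings`, `SettingsStore`, and the `cleanupSettings` key are hypothetical names, and the app may store individual values through @AppStorage instead.

```swift
import Foundation

// Hypothetical settings type; field names are assumptions for illustration.
struct CleanupSettings: Codable, Equatable {
    var isEnabled = true
    var provider = "openai"
    var model = ""
}

struct SettingsStore {
    let defaults: UserDefaults
    let key = "cleanupSettings" // hypothetical key name

    /// Encodes the settings as JSON and writes them under a single key.
    func save(_ settings: CleanupSettings) throws {
        let data = try JSONEncoder().encode(settings)
        defaults.set(data, forKey: key)
    }

    /// Reads the settings back, falling back to defaults when nothing
    /// has been stored yet or the stored data cannot be decoded.
    func load() -> CleanupSettings {
        guard let data = defaults.data(forKey: key),
              let settings = try? JSONDecoder().decode(CleanupSettings.self, from: data)
        else { return CleanupSettings() }
        return settings
    }
}
```

Passing the `UserDefaults` instance in, rather than using `UserDefaults.standard` directly, makes it possible to test against a throwaway suite.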
MacSpeechToAIToText/
App/ — Entry point, AppDelegate
Audio/ — Recording, playback, media control, signal sounds
Core/ — Settings, pipeline coordinator, history, AI providers
Input/ — Shortcuts, text insertion, wake/send phrase listeners
Transcription/ — Whisper engine, AI cleanup
UI/ — Menu bar, preferences, recording overlay, history
All settings are accessible from the app's main window:
- Shortcuts — Set global hotkeys, bind prompts to specific shortcuts
- AI Cleanup — Choose provider, model, and endpoint; toggle cleanup on/off
- Prompts — Create and manage cleanup prompt templates
- Audio — Select microphone, configure notification sounds
- Send Phrase — Set spoken phrase to stop recording hands-free
- Wake Phrase — Set spoken phrase to start recording hands-free
- History — Auto-delete old recordings, browse past transcriptions
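
To make the Send Phrase setting concrete, here is a minimal text-matching sketch that checks whether a transcript ends with the configured phrase and returns the message without it. This is only an illustration: the function name is hypothetical, and the app's own listener may detect the phrase from live speech rather than from finished text.

```swift
import Foundation

/// Returns the transcript with a trailing send phrase removed, or nil when
/// the transcript does not end with the phrase. Hypothetical helper.
func strippingSendPhrase(_ phrase: String, from transcript: String) -> String? {
    let trimSet = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)
    let text = transcript.trimmingCharacters(in: trimSet)
    let target = phrase.trimmingCharacters(in: trimSet)

    // Case-insensitive match anchored at the end of the transcript.
    guard !target.isEmpty,
          let range = text.range(of: target,
                                 options: [.caseInsensitive, .anchored, .backwards])
    else { return nil }

    // Require a word boundary so "resend it" does not match "send it".
    if range.lowerBound > text.startIndex {
        let previous = text[text.index(before: range.lowerBound)]
        if previous.isLetter || previous.isNumber { return nil }
    }

    return String(text[..<range.lowerBound])
        .trimmingCharacters(in: .whitespacesAndNewlines)
}
```

For example, with the send phrase "send it", the transcript "Hello there. Send it." yields "Hello there.", while "Please resend it" is left alone.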
- Core transcription runs entirely on-device using Whisper — no audio leaves your Mac
- AI cleanup (optional) sends only the transcribed text to your chosen provider
- No telemetry, no analytics, no accounts
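
To illustrate the second point, a cleanup request to an OpenAI-compatible chat endpoint needs nothing beyond the prompt and the transcribed text. The sketch below assumes that request shape (a `model` plus a list of `messages`); the type and function names are hypothetical, and the endpoint, API key, and model are whatever you configure.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on Linux
#endif

// Hypothetical request types mirroring an OpenAI-compatible chat payload.
struct ChatMessage: Codable, Equatable {
    let role: String
    let content: String
}

struct ChatRequest: Codable, Equatable {
    let model: String
    let messages: [ChatMessage]
}

/// Builds a cleanup request whose body carries only text: the cleanup
/// prompt and the transcript. No audio is attached.
func makeCleanupRequest(endpoint: URL, apiKey: String, model: String,
                        prompt: String, transcript: String) throws -> URLRequest {
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    let payload = ChatRequest(model: model, messages: [
        ChatMessage(role: "system", content: prompt),
        ChatMessage(role: "user", content: transcript),
    ])
    request.httpBody = try JSONEncoder().encode(payload)
    return request
}
```

Other providers use different headers and body shapes, so treat this as one possible format rather than the app's exact request.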
MIT