Skip to content

Latest commit

 

History

History
executable file
·
504 lines (408 loc) · 25.4 KB

File metadata and controls

executable file
·
504 lines (408 loc) · 25.4 KB

plan.md — VoxPen Desktop (語墨桌面版) Development Plan

Vision

System-tray voice dictation for every desktop app. Press a hotkey, speak, text appears at your cursor. No input method switching, no app switching, no friction. Typeless/Wispr Flow experience with BYOK pricing.

Market Context

Product Model Price Our Advantage
Wispr Flow SaaS, closed $10/mo BYOK = near-zero cost, cross-platform
Typeless SaaS, closed $12-30/mo BYOK, open, desktop-native
macOS Dictation Built-in, free Free Multi-provider, LLM refinement, multilingual
Whisper.cpp CLI Open source Free GUI, auto-paste, refinement, no terminal needed

Positioning: Wispr Flow UX + BYOK model + cross-platform (macOS/Windows/Linux)

Prerequisites

  • VoxPen Android v1 shipped (prompts, API patterns proven)
  • Rust toolchain installed (rustup)
  • Tauri v2 CLI installed (cargo install tauri-cli)
  • Node.js + pnpm for React frontend

Phase 0: Project Scaffold (Day 1-2) ✅ COMPLETE

0.1 Tauri Project Init

  • Tauri v2 project with React + TypeScript + Tailwind frontend
  • Configure tauri.conf.json: app id com.voxpen.desktop, hidden settings window, tray icon
  • Capabilities: global-shortcut, store, sql, autostart permissions
  • All Rust dependencies in workspace Cargo.toml + voxpen-core/Cargo.toml
  • AppError enum with thiserror (6 variants, serializable for Tauri)
  • PipelineState enum (7 states matching Android ImeUiState)
  • Language enum with code() and prompt() methods
  • RecordingMode enum (HoldToRecord default, Toggle)
  • audio::encoder::pcm_to_wav() — 44-byte RIFF header, pure Rust
  • pipeline::prompts::for_language() — 4 prompts verbatim from Android
  • API constants: timeouts, base URL matching Android

0.2 Basic System Tray

  • Tray icon (placeholder, idle state)
  • Tray menu: Settings / Quit
  • Click Settings → show settings window
  • App starts with hidden window (tray-only)

Architecture Decision: Workspace Crate

Created src-tauri/crates/voxpen-core/ as a separate workspace crate containing all business logic (types, encoder, prompts, API modules). This crate has no Tauri dependency and can be built/tested independently without Linux system libraries (libgtk, libwebkit, etc.). The Tauri app crate depends on voxpen-core and handles only framework glue (tray, hotkey, IPC commands).

Verification

  • 41 unit tests passing
  • clippy clean (-D warnings)
  • Frontend builds (Vite + React + Tailwind)
  • Full Tauri binary build requires Linux dev libraries (libglib2.0-dev, libgtk-3-dev, libwebkit2gtk-4.1-dev, libasound2-dev)

Deliverable

Empty Tauri app living in system tray, with core types (AppError, PipelineState, Language, RecordingMode) defined and tested.


Phase 1: Hotkey → Record → Transcribe → Paste (Week 1-2)

This is the entire core loop. Get this working end-to-end first.

Note: audio/encoder.rs (pcm_to_wav) already implemented and tested in Phase 0. Phase 1 focuses on the remaining pipeline modules. All new modules go in src-tauri/crates/voxpen-core/src/ (framework-independent, testable without Tauri).

1.1 Groq STT API Client ✅ COMPLETE

File: voxpen-core/src/api/groq.rs

  • SttConfig struct with api_key, model, language, response_format
  • WhisperResponse struct for Groq API response
  • transcribe() async fn with reqwest multipart POST, per-call Bearer auth
  • HTTP error mapping: 401 → ApiKeyMissing, 413 → Transcription, 500 → Transcription
  • 5 wiremock tests: success, 401, 413, 500, custom model

1.2 Transcription Pipeline ✅ COMPLETE

File: voxpen-core/src/pipeline/transcribe.rs

  • transcribe() composing encoder::pcm_to_wav + groq::transcribe
  • Empty PCM validation → AppError::Audio
  • 3 wiremock tests: encode+call, empty rejection, error propagation

1.3 Pipeline Controller (State Machine) ✅ COMPLETE

File: voxpen-core/src/pipeline/controller.rs

  • PipelineController with tokio::sync::watch for state broadcast
  • PipelineConfig struct (API key, language, model)
  • on_start_recording(): validates API key → emits Recording (or Error)
  • on_stop_recording(pcm): emits Processing → transcribe → Result (or Error)
  • current_state(), subscribe(), update_config(), reset()
  • 7 tests: idle start, recording transition, error on missing key, processing on stop, empty pcm, reset, config update

1.4 Audio Recorder Trait ✅ COMPLETE

File: voxpen-core/src/audio/recorder.rs + src-tauri/src/audio.rs

  • AudioRecorder trait: start/stop/is_recording (mockall automock)
  • CpalRecorder concrete implementation (16kHz mono i16, Arc buffer)

1.5 Auto-Paste ✅ COMPLETE

File: voxpen-core/src/input/clipboard.rs + input/paste.rs + src-tauri/src/clipboard.rs + src-tauri/src/keyboard.rs

  • ClipboardManager trait: get_text/set_text (mockall automock)
  • KeySimulator trait: paste (mockall automock)
  • paste_text() orchestration: save → write → paste → sleep(100ms) → restore
  • 4 mock tests: save/restore, write+paste, graceful restore failure, paste failure propagation
  • ArboardClipboard concrete implementation (arboard crate)
  • EnigoKeyboard concrete implementation (Cmd+V on macOS, Ctrl+V on others)

1.6 Tauri Integration ✅ COMPLETE

Files: src-tauri/src/lib.rs, commands.rs, hotkey.rs, state.rs

  • Register global hotkey in Tauri setup() via tauri-plugin-global-shortcut
  • On hotkey press → PipelineController::on_start_recording() + start audio capture
  • On hotkey release → stop capture → PipelineController::on_stop_recording(pcm) → auto-paste
  • Emit pipeline-state Tauri events on each state change (background task with watch channel)
  • Tauri command: save_api_key(provider, key) → Tauri store
  • Tauri command: get_settings() / save_settings() → Tauri store
  • Tauri command: test_api_key(provider, key) → silent WAV test transcription
  • GroqSttProvider / GroqLlmProvider concrete impls reading settings at call time
  • AppState shared state with Arc<Mutex<>> for thread-safe Tauri access
  • Short recording guard (<0.5s → ignore, don't send to API)
  • History commands wired to SQLite via rusqlite (replaced tauri-plugin-sql)
  • Tray tooltip state changes (Ready/Recording/Processing/Done/Error)
  • Hotkey debounce guard (AtomicBool prevents concurrent pipeline runs)
  • Errors emitted to frontend overlay (not just stderr)
  • API key read from Tauri store before env var fallback
  • Settings sync to shared state on save/load
  • History saved after each transcription (original + refined, with UUID, timestamp, duration)

1.7 Minimal Visual Feedback ✅ COMPLETE

  • Tray tooltip state changes: Ready → Recording... → Processing... → Done → Ready
  • Overlay window: secondary always-on-top frameless transparent window
  • Overlay show/hide driven by pipeline state (visible during recording/processing, hidden on idle)
  • Window routing: App.tsx detects window label and renders Settings vs Overlay

Build Note

Full Tauri app compilation requires system libs (libgtk-3-dev, libwebkit2gtk-4.1-dev, libasound2-dev). The CI workflow validates full build. Local verification: voxpen-core tests (89 passing) + frontend build + clippy clean.

Deliverable

Working end-to-end: press hotkey → speak → text appears at cursor. This is a usable MVP.


Phase 2: LLM Refinement + Settings UI (Week 3-4) ✅ COMPLETE

2.1 LLM Refinement Pipeline (Rust) ✅ COMPLETE

  • api/groq.rs: Chat completion endpoint (chat_completion(), ChatConfig)
    • Models: openai/gpt-oss-120b (default) / openai/gpt-oss-20b
    • Temperature: 0.3, max_tokens: 2048
    • 5 wiremock tests: success, 401, 500, default model, empty choices
  • pipeline/refine.rs: Orchestrate text → LLM → refined text
    • Compose prompts::for_language() + groq::chat_completion()
    • 3 wiremock tests: success, empty rejection, error propagation
  • LlmProvider trait + updated PipelineController<S, L>
    • After STT success: if refinement enabled, emit Refining → call LLM → emit Refined
    • If refinement fails: fall back to Result with original text (graceful degradation)
    • 5s timeout via tokio::time::timeout — falls back to raw text
    • 5 new tests: refining→refined, fallback on failure, skip when disabled, timeout, debug masking
  • pipeline/settings.rs: Settings struct (Serialize/Deserialize/Default)

2.2 Settings Window (React) ✅ COMPLETE

  • Tabbed sidebar layout (General, Speech, Refinement, Appearance)
  • GeneralSection: hotkey display, recording mode, auto-paste, launch at login
  • SttSection: provider dropdown, API key (masked + save), language, model
  • RefinementSection: enable toggle, provider, API key, model
  • AppearanceSection: theme (system/light/dark), UI language
  • useSettings hook with debounced save
  • src/lib/tauri.ts invoke wrappers

2.3 Tray Menu Types ✅ COMPLETE

  • Settings struct with defaults (actual tray menu in Phase 1.6)

Deliverable

Full-featured voice dictation with refinement and configurable settings.


Phase 3: Floating Overlay + History (Week 5) ✅ COMPLETE

3.1 Floating Overlay Widget ✅ COMPLETE

  • Overlay React component with pipeline-state event listener
  • States: Recording (red pulse), Processing (spinner), Done (green check, auto-hide 1.5s), Error (red X, auto-hide 3s)
  • Semi-transparent background with backdrop-blur
  • Tauri secondary window (always-on-top, frameless, transparent, show/hide from Rust)

3.2 Transcription History ✅ COMPLETE

  • history.rs: TranscriptionEntry struct with display_text() method
  • SQL schema constants (CREATE, INSERT, QUERY, SEARCH, DELETE, CLEANUP, COUNT)
  • History UI (React): search, expand/collapse, copy, delete with confirmation
  • History tab integrated into Settings sidebar
  • Language badges (colored pills per language)
  • SQLite integration via rusqlite with HistoryDb wrapper (query, search, insert, delete)

3.3 Audio File Transcription ✅ COMPLETE

  • audio/chunker.rs: WAV-aware chunking for files > 25MB
    • Validates RIFF/WAVE headers, splits PCM data, updates chunk headers
    • 6 tests: single chunk, multi-chunk, header preservation, size updates, error cases
  • File transcription UI (deferred to post-v1.0)

Deliverable

Polished experience with visual feedback, history, and file transcription.


Phase 4: UI Polish + Platform Testing (Week 6-7) — i18n + Theme COMPLETE

4.1 Visual Design ✅ COMPLETE

  • Clean, minimal settings window with sidebar navigation
  • Smooth animations in overlay (CSS pulse, spin transitions)
  • Dark mode support (follows system, manual override)
  • useTheme hook: system/light/dark with media query listener
  • App icons generated (all sizes, .icns, .ico) — completed in Phase 5

4.2 i18n ✅ COMPLETE

  • react-i18next + i18next setup
  • en.json + zh-TW.json complete (all UI strings)
  • All React components wired with t() translation calls
  • Language switching in AppearanceSection with i18n.changeLanguage()
  • Tray menu localized (deferred — English-only acceptable for v1.0 MVP)
  • System notifications localized (deferred to post-v1.0)

4.3 Platform Testing (requires Phase 1.6 — Tauri integration)

  • macOS: accessibility/mic permissions, menu bar, code signing, DMG, Apple Silicon + Intel
  • Windows: global hotkey, paste simulation, system tray, NSIS, Win 10+11
  • Linux: X11/Wayland, paste simulation, AppImage, Ubuntu 22.04+

4.4 Edge Cases — Partially COMPLETE

  • No internet → error emitted to overlay (network error from reqwest)
  • Invalid API key → error emitted to overlay (ApiKeyMissing error)
  • Very short recording (<0.5s) → skip and reset (8000 sample guard)
  • Rapid hotkey presses → debounce via AtomicBool processing guard
  • Microphone in use → handle gracefully (deferred — cpal returns error)
  • Very long recording (>5 min) → auto-stop + warn (deferred)
  • System sleep/wake → re-register hotkey (deferred)

Deliverable

Production-ready app tested on all three platforms.


Phase 5: Distribution (Week 8) — Infrastructure COMPLETE

5.1 Platform Icons ✅ COMPLETE

  • Generated icon.icns (macOS), icon.ico (Windows) via tauri icon
  • All icon sizes: 32x32, 128x128, 128x128@2x, icon.png

5.2 Bundle Configuration ✅ COMPLETE

  • DMG with hardened runtime (macOS, min 10.15)
  • NSIS installer with currentUser mode, WebView2 bootstrapper (Windows)
  • AppImage + .deb (Linux)
  • Updater artifacts enabled (createUpdaterArtifacts: "v1Compatible")

5.3 Auto-Update ✅ COMPLETE

  • tauri-plugin-updater + tauri-plugin-process added (Rust + JS)
  • Updater plugin registered in lib.rs
  • Updater + process permissions in desktop.json capabilities
  • UpdateChecker React component with download progress bar
  • i18n strings for update UI (en + zh-TW)
  • Updater signing key generation (first real release)
  • GitHub repo URL in updater endpoint (placeholder set)

5.4 CI/CD Pipelines ✅ COMPLETE

  • .github/workflows/ci.yml: test core + build frontend on push/PR
  • .github/workflows/release.yml: cross-platform build on tag push
    • macOS (arm64 + x64), Windows (x64), Linux (x64)
    • tauri-apps/tauri-action with Rust cache
    • Code signing env vars ready (secrets need to be set)
    • Draft releases with platform-specific asset descriptions
  • Apple Developer account + code signing certificate
  • Apple notarization credentials
  • Windows code signing certificate (optional)

5.5 Build Scripts ✅ COMPLETE

  • pnpm test:core — run voxpen-core tests
  • pnpm lint:core — run clippy on voxpen-core
  • pnpm tauri:build / pnpm tauri:dev convenience scripts

5.6 Website / Landing Page (deferred)

  • Product page with demo GIF/video
  • Download links per platform
  • Setup guide (get API key, install, configure)
  • Homebrew cask formula (optional)

Deliverable

Distribution infrastructure ready. First release requires: signing keys, Apple Developer account, and GitHub repo configuration.


Phase 6: Post-Launch (Month 2+)

6.1 Selected Text Editing (Typeless Feature)

  • User selects text in any app
  • Press different hotkey (e.g., Cmd+Shift+E)
  • Speak editing command: "翻成英文" / "make it shorter" / "改成正式語氣"
  • App reads selected text (simulate Cmd+C)
  • Send to LLM: selected text + voice command
  • Replace selection with LLM output (simulate paste)

6.2 Streaming Transcription

  • Send audio chunks while recording (real-time STT)
  • Display partial transcription in overlay
  • Finalize on recording stop

6.3 On-Device Whisper (Offline)

  • Integrate whisper.cpp via whisper-rs crate (Rust backend ready)
  • Download model on first use (4 tiers: quick/balanced/quality/maximum)
  • Model management commands (download, status, delete)
  • Fallback when offline
  • User choice: cloud (faster, needs internet) vs local (private, slower)

Status: Backend complete, UI temporarily removed from settings. local-whisper is an opt-in Cargo feature (--features local-whisper), not in default build.

Windows build blocker: whisper-rs-sys (bindgen + whisper.cpp) fails to compile on Windows MSVC due to libclang struct layout mismatches. bindgen generates Linux glibc types (_G_fpos_t, _IO_FILE) or produces opaque structs (size=1) depending on environment configuration. Attempted solutions:

  • WHISPER_DONT_GENERATE_BINDINGS=1 → bundled bindings are Linux-only
  • BINDGEN_EXTRA_CLANG_ARGS=--target=x86_64-pc-windows-msvc + MSVC include paths → still fails
  • x64 Native Tools Command Prompt → missing scoop tools (cargo, git, cmake)
  • Combined VS dev shell + scoop → still produces wrong struct sizes

Decision: Ship v1.0 without local whisper in the default build. Enable via cargo build --features local-whisper on macOS/Linux where it compiles cleanly. Revisit Windows support when whisper-rs-sys upstream fixes MSVC bindgen compatibility.

6.4 Additional Languages

  • Add ko, es, fr, de, th, vi with refinement prompts
  • Whisper STT works out-of-box, only prompts need tuning

6.5 Taiwanese Hokkien

  • Same research as Android: 意傳科技 API, 教育部 ASR engine
  • If API available, integrate as custom STT provider

6.6 Sync with Android

  • Shared prompt library (export/import JSON)
  • Shared history (optional cloud sync, user-hosted)

Technical Decisions Log

Why a separate voxpen-core crate?

  1. Testability: Core business logic (encoder, API client, state machine, prompts) can be tested without Linux system libraries (libgtk, libwebkit, etc.)
  2. Separation of concerns: Framework-agnostic code (pure Rust) vs framework glue (Tauri setup, IPC commands, device access)
  3. Faster iteration: cargo test -p voxpen-core runs in seconds, no Tauri build needed
  4. Trait boundaries: Hardware-dependent code (audio, clipboard, keys) uses traits in voxpen-core, with concrete impls in the Tauri crate
  5. Future portability: Core crate could be reused in CLI, other frameworks

Why Tauri over Electron?

  1. Bundle size: Tauri ~5-15MB vs Electron ~150MB+
  2. Memory usage: Tauri uses system webview vs Electron's bundled Chromium
  3. Rust backend: system-level operations (hotkey, paste simulation, audio) are natural in Rust
  4. Startup speed: Tauri apps launch near-instantly
  5. Security: Tauri's permission model is stricter than Electron

Why Rust-side API calls (not React)?

API calls contain secret keys. Running them in Rust means:

  1. Keys never enter the webview/JS context
  2. No CORS issues (Rust HTTP client is not browser-bound)
  3. Can be called without any window open (tray-only mode)
  4. Better error handling and retry logic in Rust

Why not share code with Android (KMP)?

The overlap between Android IME and desktop tray app is minimal:

  • Shared: API endpoint URLs, prompt text, language list
  • Not shared: UI (Compose vs React), audio recording (AudioRecord vs cpal), system integration (IME vs hotkey+paste), architecture (MVVM vs Tauri commands)

Copying prompts and API patterns is simpler than maintaining a shared KMP module.

Why mirror Android's architecture patterns?

  1. Proven patterns: Android v1 shipped successfully with these patterns
  2. Consistency: Same team maintains both codebases — familiar mental model
  3. Portability: Same state machine, same API contracts, same error handling strategy
  4. Testing: Same TDD approach, same test naming, same coverage targets

Hold-to-dictate vs Toggle

Default is hold-to-dictate because:

  1. Matches muscle memory (like walkie-talkie)
  2. Natural endpoint — release = done
  3. Prevents accidental long recordings
  4. Faster for short dictations (most common use case)

Toggle mode available for accessibility and long dictations.

Paste Simulation Strategy

1. Read current clipboard → save to temp
2. Write transcription to clipboard
3. Simulate Cmd+V / Ctrl+V
4. Wait 100ms
5. Restore original clipboard from temp

Same as Typeless/Wispr Flow/1Password. Fragile on some Linux Wayland compositors but reliable on macOS and Windows.


Risk Assessment

Risk Impact Mitigation
macOS Accessibility permission UX High Clear first-run permission flow
Paste simulation fails in some apps Medium Clipboard-only fallback + notification
Wayland paste issues (Linux) Medium Detect Wayland → clipboard-only mode
Apple notarization rejection Medium Follow guidelines strictly, no private APIs
Windows SmartScreen warning Medium Consider EV code signing certificate
cpal audio device issues Low Fallback to system default, clear error messages
Tauri v2 breaking changes Low Pin versions, follow release notes
whisper-rs Windows MSVC build High local-whisper opt-in feature, excluded from default build. Windows ships cloud-only STT.

v1.0 Readiness Assessment

What's DONE (Phases 0-5 software layer)

Layer Status Details
Core types & state machine ✅ 89 tests AppError, PipelineState, Language, RecordingMode
Groq STT API client ✅ tested Multipart POST, error mapping, wiremock
LLM refinement pipeline ✅ tested Chat completion, prompts, 5s timeout fallback
Transcription pipeline ✅ tested PCM→WAV→STT orchestration
Audio encoder ✅ tested 44-byte RIFF header, pure Rust
Audio chunker ✅ tested WAV-aware splitting for >25MB files
Pipeline controller ✅ tested State machine with SttProvider + LlmProvider traits
Auto-paste logic ✅ tested save→write→paste→restore with trait mocks
Hardware traits ✅ defined AudioRecorder, ClipboardManager, KeySimulator
Settings UI ✅ built 5-tab sidebar: General, Speech, Refinement, Appearance, History
Floating overlay ✅ built Recording/Processing/Done/Error states with animations
History UI ✅ built Search, expand, copy, delete with language badges
i18n ✅ built en + zh-TW, all components wired
Theme system ✅ built System/light/dark with media query listener
Auto-update UI ✅ built UpdateChecker with progress bar, bilingual
Bundle config ✅ configured DMG, NSIS, AppImage, .deb, updater artifacts
CI/CD ✅ configured GitHub Actions: CI + cross-platform release
App icons ✅ generated All sizes, .icns, .ico
Tauri integration layer ✅ written Hotkey, commands, state, event emission
CpalRecorder ✅ written 16kHz mono, Arc buffer
ArboardClipboard ✅ written arboard get/set_text
EnigoKeyboard ✅ written Cmd+V / Ctrl+V simulation
IPC commands ✅ written get/save_settings, save/test_api_key
Global hotkey handler ✅ written Press→record, release→transcribe→paste
Pipeline state emission ✅ written watch channel → Tauri events → React

What's NOT DONE (the gap to v1.0)

Missing Item Blocked By Effort
Platform compilation + smoke test System libs / CI ~2-3 days
Platform testing (macOS/Win/Linux) Platform access ~1 week
Total ~1.5 weeks

Phase 4.3 — Platform Testing (~1 week after compilation):

  • macOS: accessibility/mic permissions, menu bar, code signing, DMG
  • Windows: global hotkey, paste simulation, system tray, NSIS
  • Linux: X11/Wayland, paste simulation, AppImage, Ubuntu 22.04+

Phase 5 — External Setup (non-code):

  • Apple Developer account + signing certificate
  • Updater signing key generation
  • GitHub repo URL configuration

Verdict: v1.0 code-complete — needs platform testing

We're at ~95% of v1.0. All business logic is proven (89 core tests), all UI is complete, all Tauri integration is wired:

  • SQLite history: real insert/query/search/delete via rusqlite
  • Overlay: secondary always-on-top frameless window with show/hide
  • Tray: dynamic tooltip updates (Ready/Recording/Processing/Done/Error)
  • Hotkey: debounced with AtomicBool, errors emitted to frontend
  • API keys: read from Tauri store first, env var fallback for dev
  • Settings: synced to shared state on save/load
  • History: entries saved automatically after each transcription

What remains is non-code work:

  • Install system libs and compile the full Tauri app
  • Test on real hardware (macOS, Windows, Linux)
  • Apple Developer account + signing for distribution
  • Total: ~1-1.5 weeks of platform testing

Success Metrics (v1.0)

  • End-to-end latency < 3s (hotkey release → text pasted) on Groq
  • Works in top 10 desktop apps: VS Code, Chrome, Slack, Notion, Word, LINE, Discord, Terminal, Notes, Mail
  • Zero-crash in 1-week dogfooding
  • macOS + Windows builds stable
  • Auto-paste success rate > 95% on macOS and Windows
  • 繁中 + English UI complete
  • Settings persist across restarts (Tauri store + shared state sync)
  • History searchable (SQLite LIKE search via rusqlite)
  • Test coverage ≥ 80% for Rust pipeline modules (89 tests, all core modules covered)

Timeline Summary

Phase Duration Status Deliverable
Phase 0: Scaffold 2 days ✅ COMPLETE Tray app + core types defined and tested
Phase 1.1-1.5: Core logic 2 weeks ✅ COMPLETE STT, LLM, pipeline, state machine, paste (89 tests)
Phase 1.6: Tauri integration ~1 week ✅ COMPLETE Hotkey, commands, state, events, history, API keys
Phase 1.7: Visual feedback ~0.5 week ✅ COMPLETE Tray tooltip, overlay window, show/hide
Phase 2: Refinement + Settings 2 weeks ✅ COMPLETE LLM pipeline + full settings UI
Phase 3: Overlay + History 1 week ✅ COMPLETE Overlay + history UI + SQLite + chunker
Phase 4.1-4.2: i18n + Theme 1 week ✅ COMPLETE Bilingual UI + dark mode
Phase 4.3: Platform testing ~1 week ❌ PENDING Needs system libs for compilation
Phase 4.4: Edge cases ~3 days ✅ PARTIAL Debounce, error emission, short guard done
Phase 5: Distribution infra 1 week ✅ COMPLETE CI/CD, bundle config, auto-update, icons
Total to v1.0 ~1-1.5 weeks remaining ~95% done Code-complete, needs platform testing