Open-source music analysis tool for chord recognition, beat tracking, piano visualizer, guitar diagrams, lyrics synchronization, and experimental melody transcription.
We welcome and appreciate all contributions made to the project, whether it's code, documentation, testing, or feedback. Please see our Contributing Guidelines for details on how to get involved.
If you use or reference ChordMini in your work and find it useful, please cite:
BibTeX:
@misc{phan2026enhancingautomaticchordrecognition,
title={Enhancing Automatic Chord Recognition via Pseudo-Labeling and Knowledge Distillation},
author={Nghia Phan and Rong Jin and Gang Liu and Xiao Dong},
year={2026},
eprint={2602.19778},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2602.19778},
}
Plain text (for non-LaTeX users):
Nghia Phan, Rong Jin, Gang Liu, and Xiao Dong. "Enhancing Automatic Chord Recognition via Pseudo-Labeling and Knowledge Distillation." arXiv preprint arXiv:2602.19778, 2026. https://arxiv.org/abs/2602.19778
Clean, intuitive interface for YouTube search, URL input, and recent video access.
Chord progression visualization with synchronized beat detection and grid layout with add-on features: Roman Numeral Analysis, Key Modulation Signals, Simplified Chord Notation, Enhanced Chord Correction, and song segmentation overlays for structural sections like intro, verse, chorus, bridge, and outro.
Interactive guitar chord diagrams with accurate fingering patterns from the official @tombatossals/chords-db database, featuring multiple chord positions, synchronized beat grid integration, and exact slash-chord matching when the database includes a dedicated inversion shape.
Real-time piano roll visualization with falling MIDI notes synchronized to chord playback. Features a scrolling chord strip, interactive keyboard highlighting, smoother playback-synced rendering, segmentation-aware dynamics shaping, and MIDI file export for importing chord progressions into any DAW.
Sheet Sage can optionally add an estimated melodic line on top of the Piano Visualizer, with separate playback, caching, and MIDI export support. This feature is still experimental: inference is slower than the main beat/chord pipeline, and note timing or accuracy may be limited.
Synchronized lyrics transcription with AI chatbot for contextual music analysis and translation support.
- Node.js 20.9+ and npm 10+
- Python 3.10.x (3.10.16 recommended for the backend)
- Docker (recommended for the standalone Sheet Sage melody service)
- Git LFS (for SongFormer checkpoints)
- Firebase account (free tier)
- Gemini API (free tier)
-
Clone and install
Clone with submodules in one command (for fresh clones):
git lfs install
git clone --recursive https://github.com/ptnghia-j/ChordMiniApp.git
cd ChordMiniApp
git lfs pull
npm install
For an existing clone, pull the latest changes and LFS objects instead:
git pull
git lfs pull
Note
git lfs pull downloads the large SongFormer model files referenced by this repo, including the checkpoint binaries stored as Git LFS objects.
ls -la python_backend/models/Beat-Transformer/
ls -la python_backend/models/Chord-CNN-LSTM/
ls -la python_backend/models/ChordMini/
Note
If chord recognition encounters an issue with FluidSynth, install it for MIDI synthesis.
# --- Windows ---
choco install fluidsynth
# --- macOS ---
brew install fluidsynth
# --- Linux (Debian/Ubuntu-based) ---
sudo apt update
sudo apt install fluidsynth
-
Environment setup
cp .env.example .env.local
Edit .env.local.
Required for local frontend + main Python backend:
PYTHON_API_URL=http://localhost:5001
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your_project_id
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
NEXT_PUBLIC_FIREBASE_APP_ID=your_app_id
Optional feature backends and feature keys:
LOCAL_SONGFORMER_API_URL=http://localhost:8080
LOCAL_SHEETSAGE_API_URL=http://localhost:8082
NEXT_PUBLIC_YOUTUBE_API_KEY=your_youtube_api_key
MUSIC_AI_API_KEY=your_music_ai_key
GEMINI_API_KEY=your_gemini_api_key
GENIUS_API_KEY=your_genius_api_key
After bucket CORS is configured for your production domain, you can also enable direct Firebase/GCS audio redirects with:
AUDIO_PROXY_FIREBASE_REDIRECT_ENABLED=true
NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID is optional and only needed if you want Firebase Analytics.
Important
Native Windows backend installs are not currently reliable because spleeter and madmom still pull in conflicting or outdated dependencies. On Windows x86_64, prefer WSL2/Ubuntu for local development, or build the Docker images locally for linux/amd64 instead of relying on the published Docker Hub tags.
If you are not testing Beat-Transformer, you can skip installing spleeter for now. It is only required by the current Beat-Transformer source-separation path. A newer compatible source-separation package will be considered in a future update.
-
Start Python backend (Terminal 1)
cd python_backend
python -m venv myenv
source myenv/bin/activate  # On Windows: myenv\Scripts\activate
pip install --upgrade pip setuptools wheel
pip install "Cython>=0.29.0" numpy==1.26.4
pip install git+https://github.com/CPJKU/madmom
pip install -r requirements.txt
python app.py
If pip install -r requirements.txt fails with ResolutionImpossible errors involving spleeter, librosa, httpx, or llvmlite, use WSL2/Ubuntu or Docker for the backend rather than continuing with a native Windows install.
If you are not testing Beat-Transformer, you can skip spleeter during install:
grep -v '^spleeter==' requirements.txt | grep -v '^typer==' > requirements_nospleeter.txt
pip install --no-cache-dir -r requirements_nospleeter.txt
Beat-Transformer testing requires spleeter.
If you still need Beat-Transformer and want the more relaxed install chain used by the Dockerfile, install spleeter and typer after the main requirements with --no-deps:
grep -v '^spleeter==' requirements.txt | grep -v '^typer==' > requirements_nospleeter.txt
pip install --no-cache-dir -r requirements_nospleeter.txt
pip install --no-cache-dir --no-deps typer==0.9.0
pip install --no-cache-dir --no-deps spleeter==2.3.2
-
Start frontend (Terminal 2)
npm run dev
-
Optional: start the SongFormer segmentation backend (Terminal 3)
cd SongFormer
docker build -t songformer-backend:local .
docker run --rm -p 8080:8080 songformer-backend:local
The app will use this service for song segmentation. For the standalone service setup, Python workflow, and deployment notes, see SongFormer/README.md.
-
Optional: start the experimental Sheet Sage melody backend (Terminal 4)
cd sheetsage
docker build --platform=linux/amd64 -t sheetsage-backend:local .
docker run --rm --platform=linux/amd64 -p 8082:8082 -v "$(pwd)/cache:/app/cache" sheetsage-backend:local
For the standalone service image, Cloud Run deployment commands, and asset notes, see sheetsage/README.md.
-
Open application
Visit http://localhost:3000
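If the app fails to start or cannot reach Firebase, a quick first check is whether `.env.local` defines every key the environment setup step lists as required. The script below is a minimal sketch of such a check, not part of the repository: the function names are hypothetical, and the parser only handles simple `KEY=VALUE` lines.

```python
from pathlib import Path

# Keys the environment setup step lists as required for the
# local frontend + main Python backend.
REQUIRED_KEYS = [
    "PYTHON_API_URL",
    "NEXT_PUBLIC_FIREBASE_API_KEY",
    "NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN",
    "NEXT_PUBLIC_FIREBASE_PROJECT_ID",
    "NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET",
    "NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID",
    "NEXT_PUBLIC_FIREBASE_APP_ID",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env.local")
    if not env_path.exists():
        print("No .env.local found; run: cp .env.example .env.local")
    else:
        missing = missing_keys(env_path.read_text())
        if missing:
            print("Missing keys: " + ", ".join(missing))
        else:
            print("All required keys are set.")
```

Run it from the repository root. It only confirms that the keys are present, not that the values are valid for your Firebase project.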
Docker instructions live in docker/README.md, including Docker Desktop GUI steps, command-line setup, local image builds, and troubleshooting notes.
Note
If you are installing ChordMini with Docker Desktop, start with the Docker guide.
Some configuration files intentionally remain at the repository root because Next.js, npm, TypeScript, Jest, Tailwind/PostCSS, Vercel, Firebase, Docker, and GitHub Actions discover them there by default. Docker Compose and Firebase support files are grouped under docker and firebase, while obsolete duplicate config snapshots are kept in config/legacy.
-
Create Firebase project
- Visit Firebase Console
- Click "Create a project"
- Follow the setup wizard
-
Enable Firestore Database
- Go to "Firestore Database" in the sidebar
- Click "Create database"
- Choose "Start in test mode" for development
-
Get Firebase configuration
- Go to Project Settings (gear icon)
- Scroll down to "Your apps"
- Click "Add app" → Web app
- Copy the configuration values to your .env.local
-
Create Firestore collections
The app uses the following Firestore collections. They are created automatically on first write (no manual creation required):
- transcriptions: Beat and chord analysis results (docId: ${videoId}_${beatModel}_${chordModel})
- translations: Lyrics translation cache (docId: cacheKey based on content hash)
- lyrics: Music.ai transcription results (docId: videoId)
- keyDetections: Musical key analysis cache (docId: cacheKey)
- segmentationJobs: Async SongFormer segmentation jobs and persisted results (docId: seg_<timestamp>_<uuid>)
- melody: Experimental Sheet Sage melody transcription cache (docId: videoId)
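When inspecting cached results in the Firebase Console, it helps to know how a `transcriptions` document ID is put together. The helper below is a hypothetical illustration of the `${videoId}_${beatModel}_${chordModel}` pattern, not code from the app. The model names in the usage note are placeholders, and the video ID check assumes the usual 11-character YouTube ID alphabet.

```python
import re

# Assumption: YouTube video IDs are 11 characters from A-Z, a-z, 0-9, "_" and "-".
VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def transcription_doc_id(video_id: str, beat_model: str, chord_model: str) -> str:
    """Build a docId following the videoId_beatModel_chordModel pattern."""
    if VIDEO_ID.fullmatch(video_id) is None:
        raise ValueError(f"not an 11-character video ID: {video_id!r}")
    return f"{video_id}_{beat_model}_{chord_model}"
```

For example, `transcription_doc_id("dQw4w9WgXcQ", "madmom", "chord-cnn-lstm")` yields `dQw4w9WgXcQ_madmom_chord-cnn-lstm`. Because video IDs can themselves contain underscores, splitting a docId on `_` is not a reliable way to recover its parts; slicing off the first 11 characters is safer.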
-
Enable Anonymous Authentication
- In Firebase Console: Authentication → Sign-in method → enable Anonymous
-
Configure Firebase Storage
- Set environment variable: NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=your_project_id.appspot.com
- Note: Cloud Storage for Firebase can be used without a paid plan in some setups, but Firebase states that projects using the default *.appspot.com bucket must upgrade to the Blaze plan by February 2, 2026 to keep access to that default bucket.
- Folder structure:
  - audio/ for audio files
  - video/ for optional video files
- Filename pattern requirement: filenames must include the 11-character YouTube video ID in brackets, e.g. audio_[VIDEOID]_timestamp.mp3 (enforced by Storage rules)
- File size limits (enforced by Storage rules):
  - Audio: up to 50MB
  - Video: up to 100MB
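The filename and size requirements above can be pre-checked on the client before an upload is attempted. The sketch below is illustrative only: the real enforcement lives in the project's Storage rules, and this code assumes that 1MB means 1024 * 1024 bytes and that a video ID uses the characters A-Z, a-z, 0-9, `_` and `-`. The function name is hypothetical.

```python
import re

# An 11-character video ID inside square brackets,
# e.g. audio_[dQw4w9WgXcQ]_1700000000.mp3
VIDEO_ID_IN_BRACKETS = re.compile(r"\[([A-Za-z0-9_-]{11})\]")

# Size limits per top-level folder (assumes 1MB = 1024 * 1024 bytes).
MAX_BYTES = {
    "audio": 50 * 1024 * 1024,
    "video": 100 * 1024 * 1024,
}


def check_upload(folder: str, filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    if folder not in MAX_BYTES:
        problems.append(f"unknown folder: {folder}")
    elif size_bytes > MAX_BYTES[folder]:
        problems.append(f"{folder} file exceeds {MAX_BYTES[folder]} bytes")
    if VIDEO_ID_IN_BRACKETS.search(filename) is None:
        problems.append("filename lacks a bracketed 11-character video ID")
    return problems
```

A passing pre-check does not guarantee the upload succeeds, since the deployed rules may apply further conditions such as authentication.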
-
Enable Firebase temp storage for large uploads (optional, recommended for production)
- Add a temporary folder path in Firebase Storage: temp/.
- Deploy storage.rules that allow temporary upload and cleanup for temp/*.
- Keep the max upload size for temp files at 100MB.
- Set server-side cleanup config:
  - FIREBASE_SERVICE_ACCOUNT_KEY (server-only JSON)
  - FIREBASE_TEMP_CLEANUP_CRON (default 0 */12 * * *)
- If upload fails with storage/unauthorized or HTTP 403, verify Anonymous Auth is enabled and rules are deployed to the same Firebase project used in .env.local.
Important
In local development, if Firebase Storage is unavailable, extracted YouTube audio falls back to the ignored local temp/ folder. Those cached files are reused for the same YouTube videoId so yt-dlp does not need to run again, but the folder is not auto-cleaned and can grow large over time. Remove old files from temp/ periodically if disk usage matters.
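Since the local temp/ folder is not cleaned automatically, a small script can handle the periodic cleanup. The following is a minimal sketch rather than a tool shipped with the project: the function name, the seven-day threshold, and the dry-run default are all assumptions chosen for illustration.

```python
import time
from pathlib import Path


def remove_old_files(folder: Path, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Find files in `folder` older than `max_age_days` (by modification time).

    With dry_run=True the files are only reported; with dry_run=False they
    are deleted. Subdirectories are left untouched.
    """
    matched: list[Path] = []
    if not folder.is_dir():
        return matched
    cutoff = time.time() - max_age_days * 86400
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            matched.append(path)
    return matched


if __name__ == "__main__":
    # Report only; pass dry_run=False to actually delete.
    for old_file in remove_old_files(Path("temp"), max_age_days=7):
        print("would remove", old_file)
```

Deleting a cached file means yt-dlp has to extract that video's audio again the next time it is requested.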
# 1. Sign up at music.ai
# 2. Get API key from dashboard
# 3. Add to .env.local
MUSIC_AI_API_KEY=your_key_here
# 1. Visit Google AI Studio
# 2. Generate API key
# 3. Add to .env.local
GEMINI_API_KEY=your_key_here
For local development, you must run the Python backend on localhost:5001:
- URL: http://localhost:5001
- Port Note: Uses port 5001 to avoid a conflict with the macOS AirPlay/AirTunes service on port 5000
Production deployments should set the backend URL privately via the PYTHON_API_URL environment variable.
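To see which of the two local ports is already taken before starting the backend, a short socket probe is enough. This is a sketch under simple assumptions: it only tests whether something accepts TCP connections on the port, it does not identify which service is listening, and the function name is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (5000, 5001):
        print(port, "in use" if port_in_use(port) else "free")
```

If port 5001 reports as in use before you have started the backend, another process has claimed it and `python app.py` is likely to fail to bind.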
- Python 3.10.x (3.10.16 recommended)
- Virtual environment (venv or conda)
- Git for cloning dependencies
- System dependencies (varies by OS)
-
Navigate to backend directory
cd python_backend
-
Create virtual environment
python -m venv myenv
# Activate virtual environment
# On macOS/Linux:
source myenv/bin/activate
# On Windows:
myenv\Scripts\activate
-
Install dependencies
pip install --upgrade pip setuptools wheel
pip install --no-cache-dir "Cython>=0.29.0" numpy==1.26.4
pip install --no-cache-dir git+https://github.com/CPJKU/madmom
pip install --no-cache-dir -r requirements.txt
If you hit ResolutionImpossible errors involving spleeter, librosa, httpx, or llvmlite, the native install path is currently not considered reliable on Windows. Use WSL2/Ubuntu or Docker instead of continuing with a native Windows environment.
If you are not testing Beat-Transformer, you can install without spleeter:
grep -v '^spleeter==' requirements.txt | grep -v '^typer==' > requirements_nospleeter.txt
pip install --no-cache-dir -r requirements_nospleeter.txt
A newer compatible source-separation package will be considered in a future update.
-
Start local backend on port 5001
python app.py
The backend will start on http://localhost:5001 and should display:
Starting Flask app on port 5001
App is ready to serve requests
Note: Using port 5001 to avoid conflict with macOS AirPlay/AirTunes on port 5000
-
Verify backend is running
Open a new terminal and test the backend:
curl http://localhost:5001/health
# Should return: {"status": "healthy"}
-
Start frontend development server
# In the main project directory (new terminal)
npm run dev
The frontend will automatically connect to http://localhost:5001 based on your .env.local configuration.
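The curl health check from the verification step can also be scripted, for example to wait for the backend before launching the frontend. The sketch below uses only the Python standard library and assumes the `/health` endpoint returns the JSON body shown above, `{"status": "healthy"}`. The function name is hypothetical.

```python
import json
import urllib.error
import urllib.request


def backend_is_healthy(base_url: str = "http://localhost:5001", timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/health answers with {"status": "healthy"}."""
    # Bypass any system-wide HTTP proxy, since this targets a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/health", timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


if __name__ == "__main__":
    print("healthy" if backend_is_healthy() else "backend not reachable on port 5001")
```

A `False` result covers both a stopped backend and an unexpected response, so check the backend terminal output to tell the two apart.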
- Beat Detection: Beat-Transformer and madmom models
- Chord Recognition: Chord-CNN-LSTM, BTC-SL, BTC-PL models
- Audio Processing: Support for MP3, WAV, FLAC formats
Create a .env file in the python_backend directory:
# Optional: Redis URL for distributed rate limiting
REDIS_URL=redis://localhost:6379
# Optional: Genius API for lyrics
GENIUS_ACCESS_TOKEN=your_genius_token
# Flask configuration
FLASK_MAX_CONTENT_LENGTH_MB=150
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
Backend connectivity issues:
# 1. Verify backend is running
curl http://localhost:5001/health
# Expected: {"status": "healthy"}
# 2. Check if port 5001 is in use
lsof -i :5001 # macOS/Linux
netstat -ano | findstr :5001 # Windows
# 3. Verify environment configuration
grep PYTHON_API_URL .env.local
# Expected: PYTHON_API_URL=http://localhost:5001
# 4. Check for macOS AirTunes conflict (if using port 5000)
curl -I http://localhost:5000/health
# If you see "Server: AirTunes", that's the conflict we're avoiding
Frontend connection errors:
# Check browser console for errors like:
# "Failed to fetch" or "Network Error"
# This usually means the backend is not running on port 5001
# Restart both frontend and backend:
# Terminal 1 (Backend):
cd python_backend && python app.py
# Terminal 2 (Frontend):
npm run dev
Important
# Ensure virtual environment is activated
source myenv/bin/activate # macOS/Linux
myenv\Scripts\activate # Windows
# Reinstall dependencies
pip install -r requirements.txt
We sincerely thank the following APIs and services for their support and contribution to the project.
- Madmom - github.com/CPJKU/madmom - Beat detection and audio processing
- ISMIR2019-Large-Vocabulary-Chord-Recognition - github.com/music-x-lab/ISMIR2019-Large-Vocabulary-Chord-Recognition - Chord-CNN-LSTM model for chord recognition
- Google Gemini API - AI language model for Roman numeral analysis, enharmonic corrections, and lyrics translation
- YouTube Search API - github.com/damonwonghv/youtube-search-api - YouTube search and video information
- yt-dlp - github.com/yt-dlp/yt-dlp - Browser production extraction and local development extraction
- yt-mp3-go - github.com/vukan322/yt-mp3-go - Rollback-only audio extraction service when explicitly configured
- chords-db - github.com/tombatossals/chords-db - Comprehensive chord database for accurate guitar diagram generation
- LRClib - github.com/tranxuanthang/lrclib - Lyrics synchronization
- Sheet Sage - github.com/chrisdonahue/sheetsage - Experimental melody transcription model
- OpenSheetMusicDisplay - github.com/opensheetmusicdisplay - Sheet music rendering
- Music.ai SDK - AI-powered music transcription









