Stemmy

4-stem audio source separation (drums, bass, vocals, other) using U-Net.

Project Overview

Stemmy is a deep learning project for separating audio tracks into four stems:

Drums
Bass
Vocals
Other

The model uses a U-Net architecture trained on the MUSDB18-HQ dataset.

Repository Structure

AI-Stem-Separation/
├── .github/
│   └── workflows/
│       └── ci.yml               # CI/CD pipeline (linting, testing)
├── README.md                    # Project overview + setup + usage
├── pyproject.toml               # Packaging/build config + deps/extras
├── requirements.txt             # Deprecated — deps live in pyproject.toml
├── ruff.toml                    # Ruff lint/format config
├── src                          # Source package root
│   ├── __init__.py              # Marks src as a package (exports, version, etc.)
│   ├── check_cross_sisdr.py     # Metric/eval script (cross SI-SDR checks)
│   ├── constants.py             # Centralized project-wide constants
│   ├── inference.py             # Inference entry/utilities for separation
│   ├── logging_config.py        # Logging setup/helpers
│   ├── models                   # Model architectures
│   │   ├── __init__.py          # Model module exports
│   │   └── unet_2d.py           # 2D U-Net model definition
│   ├── postprocessing           # Post-separation audio processing
│   │   ├── __init__.py          # Postprocessing exports
│   │   ├── audio.py             # Audio I/O + waveform ops (post)
│   │   ├── pipeline.py          # Postprocessing pipeline orchestration
│   │   ├── spectral.py          # STFT/ISTFT + spectral-domain ops (post)
│   │   └── utility              # Postprocessing helpers
│   │       ├── __init__.py      # Utility exports
│   │       └── output_validator.py # Validates outputs (paths/waveforms/etc.)
│   ├── preprocessing            # Pre-separation audio processing
│   │   ├── __init__.py          # Preprocessing exports
│   │   ├── audio.py             # Audio loading + waveform ops (pre)
│   │   ├── pipeline.py          # Preprocessing pipeline orchestration
│   │   ├── spectral.py          # STFT/feature prep (pre)
│   │   └── utility              # Preprocessing helpers
│   │       ├── __init__.py      # Utility exports
│   │       ├── audio_file_validator.py # Validates input audio files
│   │       └── audio_metadata_extractor.py # Reads SR/channels/duration/etc.
│   ├── tool                     # CLI/tools/scripts
│   │   ├── __init__.py          # Tool module exports
│   │   ├── cli.py               # CLI entry point (commands/options)
│   │   ├── fullsong_eval_masked.py # Full-song evaluation script (masked)
│   │   ├── select_best_checkpoint.py # Chooses best checkpoint from runs
│   │   └── separate_one_track.py # Runs separation for a single track
│   ├── train.py                 # Training entry point / trainer runner
│   └── training                 # Training utilities + datasets
│       ├── __init__.py          # Training module exports
│       ├── checkpointing.py     # Save/load checkpoints utilities
│       ├── musdb18hq_dataset.py # MUSDB18-HQ dataset loader
│       └── stft.py              # STFT utilities used during training
└── tests                        # Test suite
    ├── __init__.py              # Tests package marker
    ├── test_imports.py          # Smoke test: imports + basic wiring
    ├── integration              # Integration tests
    │   ├── __init__.py          # Integration tests package marker
    │   └── test_pipeline.py     # End-to-end pipeline integration test(s)
    ├── postprocessor            # Postprocessing unit tests
    │   ├── __init__.py          # Postprocessor tests package marker
    │   ├── test_audio.py        # Tests post audio helpers
    │   ├── test_pipeline.py     # Tests postprocessing pipeline
    │   ├── test_spectral.py     # Tests post spectral utilities
    │   └── utility              # Postprocessing utility tests
    │       ├── __init__.py      # Utility tests package marker
    │       └── test_output_validator.py # Tests output validation
    └── preprocessor             # Preprocessing unit tests
        ├── __init__.py          # Preprocessor tests package marker
        ├── samples              # Test fixtures (audio samples)
        │   └── plinky_key.wav   # Sample audio used in tests
        ├── test_audio.py        # Tests pre audio helpers
        ├── test_ensure_stereo.py # Tests stereo conversion/validation
        ├── test_normalize_waveform.py # Tests normalization logic
        ├── test_pipeline.py     # Tests preprocessing pipeline
        └── test_spectral.py     # Tests pre spectral utilities
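
The tree lists check_cross_sisdr.py for SI-SDR checks. For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of the commonly used SI-SDR definition (project the estimate onto the reference, then compare the energy of that projection with the energy of the residual). It is an illustration only and not the contents of that script; the function name si_sdr and the eps default are assumptions.

```python
import math


def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB.

    Hypothetical helper for illustration; works on plain sequences of floats.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    # Project the estimate onto the reference to get the scaled target.
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in reference]
    # Everything in the estimate that the scaled reference does not explain.
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))
```

Because the estimate is projected onto the reference first, rescaling the estimate by a nonzero constant leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.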

Getting Started

Prerequisites

Python >=3.9, <3.13
pip (Python package manager)
Virtual environment tool (recommended)
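
The supported Python range can be checked before installing. A minimal sketch using only the standard library; the helper name python_is_supported is hypothetical and not part of the repository.

```python
import sys

# Bounds taken from the Prerequisites list: Python >=3.9, <3.13.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 13)


def python_is_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```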

Installation

Clone the repository:

git clone https://github.com/461x-senior-design/AI-Stem-Separation.git
cd AI-Stem-Separation

Create a virtual environment:

python -m venv .venv

Activate the virtual environment:
- Windows (PowerShell): .venv\Scripts\Activate.ps1
- Windows (CMD): .venv\Scripts\activate.bat
- Linux/macOS: source .venv/bin/activate

Install the project in development mode (includes training, inference, and test dependencies):
```
pip install --upgrade pip
pip install -e ".[dev]"
```
On macOS with Python 3.11, if llvmlite fails to build, add --only-binary llvmlite,numba to the pip install command.

For an inference-only install (the same dependencies PyPI wheel users get), drop the [dev] extra:
```
pip install -e .
```

Running Tests

We use pytest for testing:

# Run all tests
pytest tests/ -v

# Run with coverage report
pytest tests/ --cov=src --cov-report=html
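
For contributors new to pytest, a test module might look roughly like the sketch below. The function peak_normalize is a hypothetical stand-in written for this example; the project's real normalization logic (covered by test_normalize_waveform.py) may behave differently.

```python
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak.

    Hypothetical stand-in, not the project's implementation.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent input: return a copy rather than dividing by zero.
        return list(samples)
    scale = target_peak / peak
    return [s * scale for s in samples]


def test_peak_is_scaled_to_target():
    out = peak_normalize([0.1, -0.5, 0.25])
    assert math.isclose(max(abs(s) for s in out), 1.0)


def test_silence_is_left_unchanged():
    assert peak_normalize([0.0, 0.0]) == [0.0, 0.0]
```

pytest collects functions whose names start with test_ from files named test_*.py, so a file like this placed under tests/ would be picked up by the commands above.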

Code Quality

This project uses Ruff for fast Python linting and formatting.

Linting

# Check for errors
ruff check .

# Auto-fix issues
ruff check . --fix

Formatting

# Check formatting
ruff format --check .

# Apply formatting
ruff format .

Configuration is in ruff.toml.

Development Workflow

Branch Strategy

main - Production-ready code
dev - Integration branch for features
feature/* - Individual feature branches

CI/CD Pipeline

Every push and pull request triggers the following automated checks:

Linting with Ruff
Formatting checks
Unit tests with pytest

See .github/workflows/ci.yml for details.
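
The same three checks can be run locally before pushing. The sketch below chains the commands documented in this README and stops at the first failure. It is a local convenience under that assumption, not a copy of ci.yml, and the names CHECKS and run_checks are hypothetical.

```python
import subprocess

# Commands copied from the Code Quality and Running Tests sections.
CHECKS = [
    ["ruff", "check", "."],
    ["ruff", "format", "--check", "."],
    ["pytest", "tests/", "-v"],
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each command in order; return the first nonzero exit code, else 0."""
    for command in checks:
        print("$", " ".join(command))
        result = runner(command)
        if result.returncode != 0:
            # Stop early so the failing check's output is the last thing shown.
            return result.returncode
    return 0
```

Calling raise SystemExit(run_checks()) from the repository root, inside the activated virtual environment, would make the script's exit status reflect the first failing check. The runner parameter exists only so the function can be exercised without invoking the real tools.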

Dependencies

Core Libraries

PyTorch (≥2.0.0) - Deep learning framework
torchaudio (≥2.0.0) - Audio processing for PyTorch
librosa (≥0.10.0) - Audio analysis
soundfile (≥0.12.0) - Audio file I/O
numpy (≥1.24.0) - Numerical computing
scipy (≥1.10.0) - Scientific computing

CLI & UI

click (≥8.1.0) - Command-line interface
rich (≥13.0.0) - Terminal formatting

Development Tools

pytest (≥7.4.0) - Testing framework
pytest-cov (≥4.1.0) - Coverage reporting
ruff (≥0.1.0) - Linter and formatter
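
To see which of the core libraries are present in the active environment, the standard library's importlib.metadata can report installed versions. A minimal sketch; the helper names are hypothetical, and comparing the reported versions against the minimums listed above is left to the reader.

```python
from importlib import metadata

# Distribution names as published on PyPI; PyTorch is distributed as "torch".
CORE_PACKAGES = ["torch", "torchaudio", "librosa", "soundfile", "numpy", "scipy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Render one 'name version' line per package."""
    return "\n".join(
        f"{name} {version or 'not installed'}" for name, version in versions.items()
    )
```

For example, print(format_report(installed_versions(CORE_PACKAGES))) lists each core package with its version or "not installed".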

Contributing

Create a feature branch: git checkout -b feature/your-feature-name
Make your changes
Ensure tests pass: pytest tests/ -v
Ensure code is formatted: ruff format .
Commit with clear messages
Push and open a pull request to dev

Additional Resources

MUSDB18 Dataset - Official dataset documentation

License

This project is licensed under the GNU Affero General Public License v3.0 or later. See LICENSE.

Commercial licenses may be available separately from the project authors.

Model weights, checkpoints, datasets, and third-party assets are not covered by this source-code license unless explicitly stated.
