A step-by-step guide to get EmergentDB running on your Mac, including all required tools and dependencies.
| Tool | Version | Purpose | Install Command |
|---|---|---|---|
| Xcode Command Line Tools | Latest | C/C++ compiler, linker | xcode-select --install |
| Rust | 1.75+ | Core language | curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh |
| Cargo | 1.75+ | Rust package manager | Installed with Rust |
| Bun | Latest | Fast JS runtime (frontend) | curl -fsSL https://bun.sh/install | bash |
| Python | 3.10+ | Ingestion CLI | brew install python@3.11 |
| Git | 2.x | Version control | Pre-installed on macOS |
| Tool | Purpose | Install Command |
|---|---|---|
| Homebrew | Package manager | /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" |
| cargo-watch | Auto-reload on changes | cargo install cargo-watch |
| jq | JSON processing | brew install jq |
This provides the C/C++ compiler needed for some Rust dependencies.
xcode-select --installA dialog will appear. Click "Install" and wait for completion.
# Download and run the Rust installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Follow the prompts (default installation is fine)
# Then reload your shell configuration
source ~/.cargo/env
# Verify installation
rustc --version # Should show 1.75.0 or higher
cargo --version # Should show 1.75.0 or higher# Install Bun
curl -fsSL https://bun.sh/install | bash
# Reload shell
source ~/.bashrc # or ~/.zshrc
# Verify
bun --version# Using Homebrew (recommended)
brew install python@3.11
# Or download from python.org
# Verify
python3 --version # Should show 3.10+git clone https://github.com/your-repo/emergentdb.git
cd emergentdbFor Apple Silicon (M1/M2/M3/M4) - Recommended:
RUSTFLAGS="-C target-cpu=apple-m1" cargo build --releaseThis enables ARM NEON SIMD optimizations for best performance.
For Intel Macs:
RUSTFLAGS="-C target-cpu=native" cargo build --releaseStandard build (works on all Macs):
cargo build --release# Run tests to ensure everything works
cargo test --workspaceYou should see output like:
test result: ok. 73 passed; 0 failed; 0 ignored
# Default configuration (port 3000, 768-dim vectors)
cargo run --release -p api-serverWith custom settings:
# Change port and vector dimension
PORT=8080 VECTOR_DIM=1536 cargo run --release -p api-serverVerify the server is running:
curl http://localhost:3000/healthExpected response: {"status":"ok"}
Open a new terminal window:
cd frontend
bun install
bun run devOpen your browser to: http://localhost:3000
If you want to ingest documents with Gemini OCR:
cd examples/ingestion
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txtSet up Gemini API key:
# Create .env file in project root
echo "GEMINI_API_KEY=your_api_key_here" > .env
# Or export directly
export GEMINI_API_KEY=your_api_key_hereTest the CLI:
python ingest.py --help# 1. Install tools (one-time)
xcode-select --install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
curl -fsSL https://bun.sh/install | bash
# 2. Build (from project root)
cd emergentdb
RUSTFLAGS="-C target-cpu=apple-m1" cargo build --release
# 3. Run tests
cargo test --workspace
# 4. Start API server (Terminal 1)
cargo run --release -p api-server
# 5. Start frontend (Terminal 2)
cd frontend && bun install && bun run dev# Criterion micro-benchmarks
cargo bench -p vector-core
# Scale benchmark (1K, 10K, 50K vectors)
cargo run --release --example scale_benchmark
# PDF document simulation
cargo run --release --example pdf_benchmarkcd tests
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install comparison databases
pip install numpy lancedb chromadb
# Run comparison
python3 scale_comparison.pyResults are saved to tests/benchmark_results/.
curl -X POST http://localhost:3000/vectors/insert \
-H "Content-Type: application/json" \
-d '{
"id": 1,
"vector": [0.1, 0.2, 0.3, ...], # 768 dimensions
"metadata": {"source": "test.pdf"}
}'curl -X POST http://localhost:3000/vectors/search \
-H "Content-Type: application/json" \
-d '{
"query": [0.1, 0.2, 0.3, ...], # 768 dimensions
"k": 10
}'curl -X POST http://localhost:3000/qd/evolve \
-H "Content-Type: application/json" \
-d '{"sample_size": 1000, "generations": 10}'# Install Xcode Command Line Tools
xcode-select --install# Find process using port 3000
lsof -i :3000
# Kill the process
kill -9 <PID># Use release profile for faster runtime (slower build)
cargo build --release
# Use debug for faster builds (slower runtime)
cargo build# Reload shell configuration
source ~/.bashrc # or ~/.zshrc
# Or specify full path
~/.bun/bin/bun --versionIf performance is lower than expected on Apple Silicon:
# Verify you're using the optimized build
RUSTFLAGS="-C target-cpu=apple-m1" cargo build --release
# Check CPU features
sysctl -a | grep machdep.cpu.features# Check Python version
python3 --version
# Use specific version if multiple installed
/usr/local/bin/python3.11 --version
# Create venv with specific version
/usr/local/bin/python3.11 -m venv venv# Install cargo-watch
cargo install cargo-watch
# Run with auto-reload
cargo watch -x "run --release -p api-server"# All tests
cargo test --workspace
# Specific crate
cargo test -p vector-core
cargo test -p qd-engine
# Specific test
cargo test test_hnsw_search
# With output
cargo test -- --nocapturecargo fmt # Format code
cargo clippy # Lint code- Always use
--releasefor production workloads - Use CPU-specific flags for best SIMD performance:
- Apple Silicon:
RUSTFLAGS="-C target-cpu=apple-m1" - Intel:
RUSTFLAGS="-C target-cpu=native"
- Apple Silicon:
- Match vector dimensions to your embedding model exactly
- Use batch insert for large datasets
- Let evolution complete - the QD engine needs time to find optimal configurations
| Variable | Default | Description |
|---|---|---|
PORT |
3000 | API server port |
VECTOR_DIM |
768 | Vector dimension |
GEMINI_API_KEY |
- | Gemini API key (for ingestion) |
RUST_LOG |
- | Logging level (debug, info, warn, error) |
- Run tests:
cargo test --workspace - Check API health:
curl http://localhost:3000/health - View logs:
RUST_LOG=debug cargo run --release -p api-server - GitHub Issues: Report bugs and request features