This guide provides step-by-step instructions for building and running Pinguin on Windows on Arm devices, specifically optimized for the Arm AI Developer Challenge 2025.
- Windows on Arm Device: Snapdragon X Elite, Surface Pro X, or similar
- RAM: Minimum 4GB, 8GB+ recommended
- Storage: 10GB free space (5GB for models, 5GB for documents)
- CPU: Modern Arm64 processor (Snapdragon X Elite, SQ1, SQ2, etc.)
- Windows 11 on Arm (build 22000 or later)
- Visual Studio 2022 with C++ build tools (Arm64)
- Node.js 18+ (Arm64 build)
- Python 3.10+ (Arm64 build)
- Git for Windows (Arm64)
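The requirements above can be checked from a script before installing anything. The following is a minimal sketch and not part of Pinguin: the function names are illustrative, and the constants simply encode the minimums listed above (10GB free space, Node.js 18+, Python 3.10+).

```python
import platform
import re
import shutil

# Thresholds copied from the requirement lists above.
MIN_FREE_GB = 10
MIN_NODE = (18,)
MIN_PYTHON = (3, 10)


def is_arm64(machine: str) -> bool:
    """Return True for the machine names that Arm64 systems commonly report."""
    return machine.lower() in {"arm64", "aarch64"}


def parse_version(text: str) -> tuple:
    """Extract the first dotted version number from text such as 'v18.17.0'."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if not match:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text: str, minimum: tuple) -> bool:
    """Compare a version string against a minimum given as a tuple of ints."""
    return parse_version(version_text) >= minimum


def free_gb(path: str = ".") -> float:
    """Free disk space, in GiB, on the volume that contains path."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    print("Arm64 machine:", is_arm64(platform.machine()))
    print("Python OK:", meets_minimum(platform.python_version(), MIN_PYTHON))
    print("Free space OK:", free_gb() >= MIN_FREE_GB)
    for tool in ("node", "git"):
        print(f"{tool} on PATH:", shutil.which(tool) is not None)
```

To check Node.js as well, pass the output of `node --version` to `meets_minimum(..., MIN_NODE)`. Note that an x64 Python running under emulation reports an x64 machine type, so a negative result may mean the interpreter, not the device, is the wrong architecture.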
If you're a hackathon judge testing Pinguin, follow these simplified steps:
- Visit ollama.com
- Download "Ollama for Windows (ARM64)"
- Run the installer
- Ollama will start automatically in the background
- Go to Pinguin Releases
- Download Pinguin-Setup-1.0.0-arm64.exe
- Double-click Pinguin-Setup-1.0.0-arm64.exe
- Follow the installation wizard
- Launch Pinguin from Start Menu or desktop shortcut
When you launch Pinguin for the first time:
- Welcome Screen: Click "Next"
- LLM Setup:
  - Recommended: llama3.2:3b (fast, 2GB)
  - Alternative: qwen2.5:3b (better reasoning, 2GB)
  - Click "Download" - this will take 2-5 minutes
  - Wait for "Model Ready" status
- Embedding Setup:
  - Recommended: nomic-embed-text (fast, 274MB)
  - Alternative: mxbai-embed-large (better quality, 670MB)
  - Click "Download" - this will take 1-2 minutes
  - Wait for "Model Ready" status
- Finish: Click "Finish" to start using Pinguin
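If the wizard reports a problem, it can help to confirm outside the app which models Ollama actually has. The sketch below is an assumption-laden illustration, not Pinguin's own code: it queries Ollama's `/api/tags` endpoint on the default port and compares the result with the recommended models. The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"
# The recommended models from the setup steps above.
REQUIRED_MODELS = ["llama3.2:3b", "nomic-embed-text"]


def installed_models(tags_response: dict) -> set:
    """Collect model names from the JSON returned by /api/tags."""
    return {model["name"] for model in tags_response.get("models", [])}


def missing_models(tags_response: dict, required: list) -> list:
    """Return the required models that are not installed yet."""
    installed = installed_models(tags_response)
    missing = []
    for name in required:
        # Ollama lists a model pulled without a tag as "<name>:latest".
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed:
            missing.append(name)
    return missing


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Query a running Ollama server; raises URLError if it is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With Ollama running, `missing_models(fetch_tags(), REQUIRED_MODELS)` returns an empty list once both downloads have finished.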
- Upload a Document:
  - Click "Documents" in sidebar
  - Click "Upload Document"
  - Recommended: Select a text-based PDF, DOCX, or TXT file for first test
  - Wait for processing (10-30 seconds for text-based documents)
  - Note: Scanned PDFs with OCR can take 20-30 minutes - use text-based documents for best experience
- Ask a Question:
  - Go to "Chat" tab
  - Type a question about your document
  - Press Enter or click Send
  - First query may take 1-2 minutes (models loading into memory)
  - Subsequent queries: 30-50 seconds depending on complexity
  - Watch the AI generate an answer with sources
- Verify Arm Optimization:
  - Open Task Manager / Activity Monitor
  - Check CPU usage during inference
  - Note the efficient performance on Arm hardware
Troubleshooting: If the UI doesn't update after sending a message, navigate to another chat and back. See KNOWN_ISSUES.md for complete details and workarounds.
For developers who want to build Pinguin from source:
git clone https://github.com/Kehn-Marv/Pinguin.git
cd Pinguin
npm install
This will install all Electron and React dependencies, including Arm64-specific native modules.
Windows on Arm
cd backend
python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
cd ..
macOS / Linux
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd ..
Pinguin requires Tesseract (OCR) and Poppler (PDF processing):
Automated Setup (Recommended)
npm run setup-architecture-binaries
This script will:
- Detect your Arm architecture
- Download Arm64 builds of Tesseract and Poppler
- Extract them to extraResources/
- Verify installation
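The verification step can also be done by hand. The sketch below is not the project's setup script; it is a hypothetical helper that searches the `extraResources/` folders and then the system PATH for the two executables Pinguin needs. It assumes the Poppler check can be reduced to finding `pdftoppm`, one of Poppler's command-line tools.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def find_binary(name: str, extra_dirs: Iterable[Path] = ()) -> Optional[Path]:
    """Search extra_dirs recursively for an executable, then fall back to PATH."""
    exe_names = {name, f"{name}.exe"}
    for directory in extra_dirs:
        if not directory.is_dir():
            continue  # a folder that was never extracted is simply skipped
        for candidate in sorted(directory.rglob("*")):
            if candidate.is_file() and candidate.name in exe_names:
                return candidate
    on_path = shutil.which(name)
    return Path(on_path) if on_path else None


if __name__ == "__main__":
    resources = Path("extraResources")
    search_dirs = [resources / "tesseract", resources / "poppler"]
    for tool in ("tesseract", "pdftoppm"):
        found = find_binary(tool, search_dirs)
        print(f"{tool}: {found if found else 'NOT FOUND'}")
```

Run it from the repository root. A `NOT FOUND` line means the corresponding manual setup step is still needed.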
Manual Setup
If automated setup fails, download manually:
Tesseract (Windows on Arm)
- Download from UB-Mannheim/tesseract
- Extract to extraResources/tesseract/win32-arm64/
Poppler (Windows on Arm)
- Download from oschwartz10612/poppler-windows
- Extract to extraResources/poppler/
macOS (Apple Silicon)
brew install tesseract poppler
Linux Arm64
sudo apt-get install tesseract-ocr poppler-utils
Development Build
npm start
This starts Electron in development mode with hot reload.
Production Build
npm run make
This creates a distributable package in the out/ directory.
Build for Windows on Arm
npm run make -- --arch=arm64 --platform=win32
After building, test the application:
.\out\Pinguin-win32-arm64\Pinguin.exe
Pinguin uses several native Node.js modules that must be compiled for Arm64:
# Rebuild native modules for Arm64
npm rebuild --arch=arm64
Key native modules:
- better-sqlite3: Database operations
- electron: Native Electron bindings
- Various cryptography modules
Some Python packages have Arm-specific wheels:
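# On a native Arm64 Python, a plain "pip install numpy" picks a wheel matching the interpreter when one is published.
# To fetch Linux aarch64 wheels from a different machine, pip requires --target together with --platform
# (./arm64-packages is an example directory name):
pip install --platform manylinux2014_aarch64 --only-binary=:all: --target ./arm64-packages numpy
Packages with Arm optimizations:
- numpy: BLAS/LAPACK with Arm NEON
- chromadb: Vector operations
- sentence-transformers: Model inference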
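Arm wheels are only selected when the Python interpreter itself is an Arm64 build; an x64 interpreter running under emulation will install x64 wheels. The following sketch, with a hypothetical helper name, shows one way to tell the two apart using the standard library's platform tag.

```python
import platform
import sysconfig


def is_arm64_interpreter(platform_tag: str, machine: str) -> bool:
    """Decide whether a Python build targets Arm64.

    platform_tag is the value of sysconfig.get_platform(), for example
    "win-arm64", "win-amd64", or "linux-aarch64".
    """
    tag = platform_tag.lower()
    if tag.endswith(("arm64", "aarch64")):
        return True
    # macOS universal2 builds contain both architectures, so fall back to
    # the machine type the running process reports.
    if tag.endswith("universal2"):
        return machine.lower() == "arm64"
    return False


if __name__ == "__main__":
    tag = sysconfig.get_platform()
    print(f"{tag}: Arm64 interpreter = {is_arm64_interpreter(tag, platform.machine())}")
```

Run this inside the activated `venv` so that it reports on the interpreter the backend will actually use.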
Ollama automatically detects Arm architecture and uses optimized builds:
# Verify Ollama is using Arm64
ollama --version
# Should show: ollama version x.x.x (arm64)
# Check available models
ollama list
# Pull Arm-optimized models
ollama pull llama3.2:3b
ollama pull nomic-embed-text
Environment Variables
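# These lines use POSIX shell syntax; in PowerShell write $env:NAME = "value" instead
# Optimize for Arm NEON
export GGML_NEON=1
# Set thread count (adjust for your CPU)
export OMP_NUM_THREADS=8
# Allow Ollama to process two requests in parallel
export OLLAMA_NUM_PARALLEL=2
Model Quantization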
Use quantized models for better performance:
- llama3.2:3b-q4_0 - 4-bit quantization (faster)
- llama3.2:3b-q8_0 - 8-bit quantization (balanced)
- llama3.2:3b - Full precision (best quality)
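The trade-off between these variants is mostly about memory. As a rough back-of-the-envelope sketch (an estimate, not a measurement of Pinguin or Ollama), the space needed for model weights is the parameter count times the bits per weight. The parameter count of 3 billion and the nominal bit widths below are assumptions for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    This ignores the KV cache, runtime overhead, and the per-block scale
    data that real quantized formats store, so actual usage is higher.
    """
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 3e9  # assumed round figure for a "3B" model
    for label, bits in (("4-bit", 4), ("8-bit", 8), ("16-bit", 16)):
        print(f"{label}: ~{weight_memory_gb(n_params, bits):.1f} GB")
```

For 3 billion parameters this gives about 1.5 GB at 4 bits, 3 GB at 8 bits, and 6 GB at 16 bits, which is why the smaller quantizations are the safer choice on machines near the 4GB minimum.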
Error: Cannot find module 'electron'
npm install --force
npm rebuild electron
Error: Python module not found
cd backend
pip install -r requirements.txt --force-reinstall
Error: Tesseract not found
npm run verify-tesseract
# If this fails, run: npm run setup-tesseract
Ollama Connection Failed
# Check if Ollama is running
curl http://localhost:11434/api/tags
# If not running, start it
ollama serve
Backend Won't Start
# Check port 8000 is free
# Windows: netstat -ano | findstr :8000
# macOS/Linux: netstat -an | grep 8000
# Kill process using port 8000
# Windows: taskkill /F /PID <pid>
# macOS/Linux: kill -9 <pid>
Slow Performance
- Use smaller models (3B instead of 7B)
- Reduce batch size in settings
- Close other applications
- Ensure adequate cooling (Arm CPUs can throttle)
Windows on Arm
- Install Visual C++ Redistributable (Arm64)
- Disable antivirus temporarily during build
- Run PowerShell as Administrator for scripts
macOS (Apple Silicon)
- Install Rosetta 2 for compatibility: softwareupdate --install-rosetta
- Grant Full Disk Access to Terminal in System Preferences
- Use native Arm64 terminal, not Rosetta
Linux Arm64
- Install build essentials: sudo apt-get install build-essential
- Update to latest kernel for best Arm support
- Check SELinux/AppArmor isn't blocking Electron
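The port checks from the "Ollama Connection Failed" and "Backend Won't Start" steps above differ between Windows and Unix shells. A portable alternative is sketched below; the function name is hypothetical, and the port numbers are the ones used in the commands above (11434 for Ollama, 8000 for the backend).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Ollama listening on 11434:", port_in_use(11434))
    print("Port 8000 already taken:", port_in_use(8000))
```

Before starting Pinguin, the first line should print `True` and the second `False`. This only tells whether a port is occupied; finding the owning process still needs `netstat` or a similar tool.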
- npm install: ~3 minutes
- npm run make: ~5 minutes
- Total build time: ~8 minutes
Startup Time
- Cold start: ~4 seconds
- Warm start: ~2 seconds
Document Processing
- PDF (10 pages): ~5 seconds
- PDF with OCR (10 pages): ~30 seconds
- DOCX (50 pages): ~8 seconds
Inference Speed (llama3.2:3b)
- Tokens per second: 25-40 (CPU only)
- Query latency: 2-4 seconds
- Embedding generation: ~100ms
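Inference speed varies by device, so it is worth measuring on your own hardware. The sketch below is one possible way to do so, not the method used to produce the figures above: it sends a non-streaming request to Ollama's `/api/generate` endpoint and derives tokens per second from the `eval_count` and `eval_duration` (nanoseconds) fields of the response. The helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def generate(model: str, prompt: str, timeout: float = 300.0) -> dict:
    """Send one non-streaming generation request to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def tokens_per_second(response: dict) -> float:
    """Generation speed from the eval_count and eval_duration (ns) fields."""
    eval_count = response["eval_count"]
    eval_duration_ns = response["eval_duration"]
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)
```

For example, `tokens_per_second(generate("llama3.2:3b", "Summarize Arm NEON in one sentence."))` reports the speed of a single request. Run it twice and use the second result, since the first call includes loading the model into memory.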
Memory Usage
- Idle: ~300MB
- With 3B model: ~2.5GB
- With 7B model: ~5GB
- Peak during processing: ~3.5GB
Pinguin uses GitHub Actions for automated builds:
# .github/workflows/build-arm.yml
name: Build Arm64
on: [push, pull_request]
jobs:
  build:
    runs-on: [self-hosted, ARM64]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
          architecture: 'arm64'
      - run: npm install
      - run: npm run make
Windows (Squirrel)
npm run makeSquirrel
macOS (DMG)
npm run make -- --targets=@electron-forge/maker-dmg
Linux (AppImage)
npm run make -- --targets=@electron-forge/maker-appimage
For production releases, sign your binaries:
Windows
signtool sign /f certificate.pfx /p password /fd SHA256 /tr http://timestamp.digicert.com /td SHA256 Pinguin.exe
macOS
codesign --deep --force --verify --verbose --sign "Developer ID" Pinguin.app
- Electron Documentation
- Ollama Documentation
- Arm Developer Resources
- Node.js Arm64 Support
- Python on Arm
For build issues or questions:
- Open an issue on GitHub
- Email: kehnmarv30@gmail.com
- Check Discussions