Skip to content

Releases: getainode/ainode

v0.4.11 — dashboard tells the truth (Phase 1+2)

18 Jun 03:08

Choose a tag to compare

[0.4.11] — 2026-06-17

Fixed (dashboard now reports the truth — Phase 1+2)

  • Phantom READY eliminated/api/status engine_ready is now a live /v1/models probe each call (latched engine.ready no longer trusted); the instances panel + master node card derive status from it (READY vs STARTING) instead of a hardcoded READY.
  • Per-node VRAM no longer stuck at 0%metrics/collector.py falls back to psutil for GB10 unified memory (nvidia-smi reports N/A), and the UI merges live GPU metrics into the (local) node so the memory ring shows real %. (Per-peer VRAM still pending a metrics fan-out — Phase 3.)
  • Cluster graphic shows distributed membership — participating nodes are stamped with the model + TP=N from the authoritative /api/cluster/resources distributed_instance (the old path keyed off active_sharding, which is null while serving, so workers showed nothing).
  • Ray status no longer falsely "not installed"/api/sharding/status derives Ray health from the head engine when a distributed instance is serving (the orchestrator container has no ray binary to probe).
  • DELETE works on a dead/phantom instanceunload force-clears (stopped: true) when the engine is unreachable instead of blocking on a SIGTERM that can't confirm.

Changed

  • AUTO sharding now defaults to Tensor parallel (was Pipeline) — matches the distributed launch path and this hardware. Launch UI defaults the Sharding pill to Tensor and relabels the node selector "Tensor Parallel Size (nodes)".
  • Launch dropdown marks model state — loaded (●) / on-disk (○) so you can tell what's downloaded.

v0.4.7 — UI polish, dynamic cluster sizing, NFS shared storage

16 Apr 20:41

Choose a tag to compare

v0.4.7

Five UI fixes and infrastructure documentation for shared model storage.

Fixed

Image drop overlay stuck
Dragging a file across the chat window showed "DROP IMAGES TO ATTACH" that couldn't be dismissed. Now clears on Escape, click-outside, or dragging away.

Master stuck on "starting..." forever
When the master had no model configured (fresh install or idle cluster), the topology showed a permanent loading overlay. Now correctly shows "online" — the server IS ready, there's just no model to wait for.

Minimum Nodes capped at 3
Was hardcoded. Now populated dynamically from the discovered cluster size — shows 1-2-3-4 with four nodes.

Launch model dropdown incomplete
Only showed models loaded in vLLM. Now merges catalog + disk-scanned models so everything you've downloaded appears.

Download button on already-downloaded models
Live catalog tabs (Trending, Most Used, Latest, Search HF) now check disk presence. Shows "Downloaded" badge and blocks re-download with a toast. Fixes #35.

Shared model storage

AINode supports NFS-shared model storage across a cluster. Download a model once to a central NVMe-over-TCP volume (NAS, MikroTik ROSA, TrueNAS, etc.), export via NFS, and mount on every node. Set models_dir in each node's config to the shared path — all nodes serve the same weights, zero duplicate downloads.

// ~/.ainode/config.json on each node
{ "models_dir": "/mnt/shared-models" }

Upgrade

ainode update

Full changelog · Docs

v0.4.6 — Download button fix

16 Apr 04:01

Choose a tag to compare

Fixed

  • Download button no longer shows for models already on disk. All catalog views (trending, latest, HF search, main) now check disk presence. Re-downloading blocked with a toast. Fixes #35.
ainode update

Changelog · Docs

v0.4.5 — Master node overlay, smart Update All button

16 Apr 03:13

Choose a tag to compare

v0.4.5

Fixed

Master node shows its identity while loading
The master node circle now always shows the node name, GPU type, and crown as soon as it's discovered — even while vLLM is still warming up. A subtle spinning arc + dim veil + "starting..." overlay communicates the loading state without hiding the node's identity. Fades out cleanly when the engine is ready.

"Update all" button is now context-aware
Hidden by default. Only appears in the CLUSTER pill when a newer version is available on GHCR (/api/version/check). Shows the target version number in the button label. Hides again after a successful update.

Upgrade

ainode update

Full changelog · Docs

v0.4.4 — AWQ fix, cluster update button, loading animation

16 Apr 02:53

Choose a tag to compare

What's new in v0.4.4

Fixed

AWQ models on GB10 now work correctly.
vLLM auto-upgrades AWQ → awq_marlin (a fused Marlin kernel), but awq_marlin CUDA kernels aren't compiled for sm_12.1 (GB10/Blackwell) in the base image. AINode now pins --quantization awq automatically when loading an AWQ model, preventing the upgrade.
Fixes #34 — reported by Chennu@riai360.

Added

⬆ Update all nodes from the master UI
The CLUSTER pill in the topology view has a new ⬆ Update all button. Click to update every node in the cluster simultaneously — master SSHes into workers in parallel, runs docker pull + restart, then updates itself last. Live per-node progress panel shows pending → updating → done/failed.

Topology loading animation
Before the engine is ready, the cluster canvas shows a pulsating "Loading..." circle at center (same size as the real master node). When the engine comes online, the loading ghost cross-fades out and the real node fades in. Worker nodes fade in individually as they're discovered.

Install / upgrade

# Fresh install
curl -fsSL https://ainode.dev/install | bash

# Upgrade existing install  
ainode update

# Update entire cluster from master UI
# → open http://<master>:3000 → click ⬆ Update all in the cluster pill

Images

docker pull ghcr.io/getainode/ainode:0.4.4
docker pull argentaios/ainode:0.4.4

Full changelog · Docs

v0.4.3 — Training: Artifacts, Merge, Eval, W&B

16 Apr 00:50

Choose a tag to compare

What's new in v0.4.3

Full training pipeline — from raw dataset to deployable adapter, entirely in the browser.

Training: Artifact retrieval

  • Download any training output file (adapter weights, tokenizer, checkpoints) directly from the UI or API
  • GET /api/training/jobs/{id}/output — list artifacts
  • GET /api/training/jobs/{id}/output/{filename} — stream download

Training: LoRA merge

  • Merge a LoRA/QLoRA adapter into the base model with one click
  • POST /api/training/jobs/{id}/merge — async merge via PEFT.merge_and_unload()
  • Merged model ready for vLLM inference

Training: Checkpoint resume

  • Resume interrupted or failed jobs from the latest checkpoint
  • POST /api/training/jobs/{id}/resume

Training: Evaluation loop

  • Configurable train/eval split (default 10%)
  • eval_loss + eval_samples_per_second reported in real-time progress
  • Best checkpoint saved automatically

Training: W&B integration

  • Set wandb_project to stream loss curves to Weights & Biases

Training: Custom templates

  • Save your own training templates from the wizard
  • POST /api/training/templates — persisted to disk

Other fixes (v0.4.2 features also in this image)

  • Cancel in-progress downloads (✕ button)
  • Downloaded models show correctly in catalog + "Launch Model" button
  • Version update badge in top bar — click to update from the browser
  • pynvml FutureWarning suppressed from logs

Install / upgrade

# Fresh install
curl -fsSL https://ainode.dev/install | bash

# Upgrade existing install
ainode update

Container images

docker pull ghcr.io/getainode/ainode:0.4.3   # GHCR (canonical)
docker pull argentaios/ainode:0.4.3            # Docker Hub mirror

Full changelog · Docs

AINode v0.4.0 — container-native distribution

15 Apr 01:14

Choose a tag to compare

Highlights

AINode is now a single-container install: docker pull ghcr.io/getainode/ainode:0.4.0.
No host Python venv, no source-built vLLM. Web UI, OpenAI-compatible API, and
GB10-patched vLLM all version-locked in one image.

What's in this release

  • Unified image — one docker run per node, systemd unit on the host.
  • Three node modes: solo, head, member. The head orchestrates
    cross-node tensor-parallel via a patched NCCL (dgxspark-3node-ring);
    members broadcast their presence on UDP 5679 and reserve GPUs for
    Ray workers placed by the head.
  • UI auto-wires distributed launches — pick Minimum Nodes ≥ 2 +
    Tensor in Launch Instance, the UI writes config and hot-swaps the
    engine.
  • Real multi-node cluster topology — aggregated VRAM across members,
    peer IPs captured via UDP recvfrom, "DISTRIBUTED · TP=N" badges.
  • NFS-shared model storage pattern + docs.
  • Verified TP=2 cross-node inference on NVIDIA GB10: 61 GB of model
    weights on each GPU, NCCL over RoCE @ 200 Gb/s, ~35 tok/s for warm
    1.5B model.
  • Honest State-of-Distributed-Inference section in the README.
    What works, what doesn't, lessons learned, and our "why 3 nodes is
    harder than 2, 4 is probably easier" hypothesis.

Install

curl -fsSL https://ainode.dev/install | bash

or directly:

docker pull ghcr.io/getainode/ainode:0.4.0
docker pull argentos/ainode:0.4.0   # Docker Hub mirror

Screenshots

See README.md on main for the full product tour.

Known limitations

  • 3-node TP requires a proper mesh or dedicated switch subnet — see
    Networking requirements
    in the README.
  • Browser-based fine-tuning UI is scaffolded but not yet validated
    end-to-end on real GPUs. Tracked as roadmap item.
  • Ray over VPN (Tailscale) doesn't work for NCCL — use physical
    cables or a dedicated switch.

Artifacts

  • ghcr.io/getainode/ainode:0.4.0 (primary)
  • argentos/ainode:0.4.0 (Docker Hub mirror)

Contributors

This release was driven by Jason Brashear with AI-pair-programming via
Claude Opus 4.6 (1M context). All code and docs in this repo are
Apache-2.0.