Releases · getainode/ainode

18 Jun 03:08

webdevtodayjason

v0.4.11

305293b

v0.4.11 — dashboard tells the truth (Phase 1+2) Latest

Latest

[0.4.11] — 2026-06-17

Fixed (dashboard now reports the truth — Phase 1+2)

Phantom READY eliminated — /api/status engine_ready is now a live /v1/models probe each call (latched engine.ready no longer trusted); the instances panel + master node card derive status from it (READY vs STARTING) instead of a hardcoded READY.
Per-node VRAM no longer stuck at 0% — metrics/collector.py falls back to psutil for GB10 unified memory (nvidia-smi reports N/A), and the UI merges live GPU metrics into the (local) node so the memory ring shows real %. (Per-peer VRAM still pending a metrics fan-out — Phase 3.)
Cluster graphic shows distributed membership — participating nodes are stamped with the model + TP=N from the authoritative /api/cluster/resources distributed_instance (the old path keyed off active_sharding, which is null while serving, so workers showed nothing).
Ray status no longer falsely "not installed" — /api/sharding/status derives Ray health from the head engine when a distributed instance is serving (the orchestrator container has no ray binary to probe).
DELETE works on a dead/phantom instance — unload force-clears (stopped: true) when the engine is unreachable instead of blocking on a SIGTERM that can't confirm.

Changed

AUTO sharding now defaults to Tensor parallel (was Pipeline) — matches the distributed launch path and this hardware. Launch UI defaults the Sharding pill to Tensor and relabels the node selector "Tensor Parallel Size (nodes)".
Launch dropdown marks model state — loaded (●) / on-disk (○) so you can tell what's downloaded.

Assets 2

16 Apr 20:41

webdevtodayjason

v0.4.7

74bce93

v0.4.7 — UI polish, dynamic cluster sizing, NFS shared storage

v0.4.7

Five UI fixes and infrastructure documentation for shared model storage.

Fixed

Image drop overlay stuck
Dragging a file across the chat window showed "DROP IMAGES TO ATTACH" that couldn't be dismissed. Now clears on Escape, click-outside, or dragging away.

Master stuck on "starting..." forever
When the master had no model configured (fresh install or idle cluster), the topology showed a permanent loading overlay. Now correctly shows "online" — the server IS ready, there's just no model to wait for.

Minimum Nodes capped at 3
Was hardcoded. Now populated dynamically from the discovered cluster size — shows 1-2-3-4 with four nodes.

Launch model dropdown incomplete
Only showed models loaded in vLLM. Now merges catalog + disk-scanned models so everything you've downloaded appears.

Download button on already-downloaded models
Live catalog tabs (Trending, Most Used, Latest, Search HF) now check disk presence. Shows "Downloaded" badge and blocks re-download with a toast. Fixes #35.

Shared model storage

AINode supports NFS-shared model storage across a cluster. Download a model once to a central NVMe-over-TCP volume (NAS, MikroTik ROSA, TrueNAS, etc.), export via NFS, and mount on every node. Set models_dir in each node's config to the shared path — all nodes serve the same weights, zero duplicate downloads.

// ~/.ainode/config.json on each node
{ "models_dir": "/mnt/shared-models" }

Upgrade

ainode update

Full changelog · Docs

Assets 2

16 Apr 04:01

webdevtodayjason

v0.4.6

cc7fa1c

v0.4.6 — Download button fix

Fixed

Download button no longer shows for models already on disk. All catalog views (trending, latest, HF search, main) now check disk presence. Re-downloading blocked with a toast. Fixes #35.

ainode update

Changelog · Docs

Assets 2

16 Apr 03:13

webdevtodayjason

v0.4.5

cb9954b

v0.4.5 — Master node overlay, smart Update All button

v0.4.5

Fixed

Master node shows its identity while loading
The master node circle now always shows the node name, GPU type, and crown as soon as it's discovered — even while vLLM is still warming up. A subtle spinning arc + dim veil + "starting..." overlay communicates the loading state without hiding the node's identity. Fades out cleanly when the engine is ready.

"Update all" button is now context-aware
Hidden by default. Only appears in the CLUSTER pill when a newer version is available on GHCR (/api/version/check). Shows the target version number in the button label. Hides again after a successful update.

Upgrade

ainode update

Full changelog · Docs

Assets 2

16 Apr 02:53

webdevtodayjason

v0.4.4

5b51ee4

v0.4.4 — AWQ fix, cluster update button, loading animation

What's new in v0.4.4

Fixed

AWQ models on GB10 now work correctly.
vLLM auto-upgrades AWQ → awq_marlin (a fused Marlin kernel), but awq_marlin CUDA kernels aren't compiled for sm_12.1 (GB10/Blackwell) in the base image. AINode now pins --quantization awq automatically when loading an AWQ model, preventing the upgrade.
Fixes #34 — reported by Chennu@riai360.

Added

⬆ Update all nodes from the master UI
The CLUSTER pill in the topology view has a new ⬆ Update all button. Click to update every node in the cluster simultaneously — master SSHes into workers in parallel, runs docker pull + restart, then updates itself last. Live per-node progress panel shows pending → updating → done/failed.

Topology loading animation
Before the engine is ready, the cluster canvas shows a pulsating "Loading..." circle at center (same size as the real master node). When the engine comes online, the loading ghost cross-fades out and the real node fades in. Worker nodes fade in individually as they're discovered.

Install / upgrade

# Fresh install
curl -fsSL https://ainode.dev/install | bash

# Upgrade existing install  
ainode update

# Update entire cluster from master UI
# → open http://<master>:3000 → click ⬆ Update all in the cluster pill

Images

docker pull ghcr.io/getainode/ainode:0.4.4
docker pull argentaios/ainode:0.4.4

Full changelog · Docs

Assets 2

16 Apr 00:50

webdevtodayjason

v0.4.3

3f3a48b

v0.4.3 — Training: Artifacts, Merge, Eval, W&B

What's new in v0.4.3

Full training pipeline — from raw dataset to deployable adapter, entirely in the browser.

Training: Artifact retrieval

Download any training output file (adapter weights, tokenizer, checkpoints) directly from the UI or API
GET /api/training/jobs/{id}/output — list artifacts
GET /api/training/jobs/{id}/output/{filename} — stream download

Training: LoRA merge

Merge a LoRA/QLoRA adapter into the base model with one click
POST /api/training/jobs/{id}/merge — async merge via PEFT.merge_and_unload()
Merged model ready for vLLM inference

Training: Checkpoint resume

Resume interrupted or failed jobs from the latest checkpoint
POST /api/training/jobs/{id}/resume

Training: Evaluation loop

Configurable train/eval split (default 10%)
eval_loss + eval_samples_per_second reported in real-time progress
Best checkpoint saved automatically

Training: W&B integration

Set wandb_project to stream loss curves to Weights & Biases

Training: Custom templates

Save your own training templates from the wizard
POST /api/training/templates — persisted to disk

Other fixes (v0.4.2 features also in this image)

Cancel in-progress downloads (✕ button)
Downloaded models show correctly in catalog + "Launch Model" button
Version update badge in top bar — click to update from the browser
pynvml FutureWarning suppressed from logs

Install / upgrade

# Fresh install
curl -fsSL https://ainode.dev/install | bash

# Upgrade existing install
ainode update

Container images

docker pull ghcr.io/getainode/ainode:0.4.3   # GHCR (canonical)
docker pull argentaios/ainode:0.4.3            # Docker Hub mirror

Full changelog · Docs

Assets 2

15 Apr 01:14

webdevtodayjason

v0.4.0

818fb5c

AINode v0.4.0 — container-native distribution

Highlights

AINode is now a single-container install: docker pull ghcr.io/getainode/ainode:0.4.0.
No host Python venv, no source-built vLLM. Web UI, OpenAI-compatible API, and
GB10-patched vLLM all version-locked in one image.

What's in this release

Unified image — one docker run per node, systemd unit on the host.
Three node modes: solo, head, member. The head orchestrates
cross-node tensor-parallel via a patched NCCL (dgxspark-3node-ring);
members broadcast their presence on UDP 5679 and reserve GPUs for
Ray workers placed by the head.
UI auto-wires distributed launches — pick Minimum Nodes ≥ 2 +
Tensor in Launch Instance, the UI writes config and hot-swaps the
engine.
Real multi-node cluster topology — aggregated VRAM across members,
peer IPs captured via UDP recvfrom, "DISTRIBUTED · TP=N" badges.
NFS-shared model storage pattern + docs.
Verified TP=2 cross-node inference on NVIDIA GB10: 61 GB of model
weights on each GPU, NCCL over RoCE @ 200 Gb/s, ~35 tok/s for warm
1.5B model.
Honest State-of-Distributed-Inference section in the README.
What works, what doesn't, lessons learned, and our "why 3 nodes is
harder than 2, 4 is probably easier" hypothesis.

Install

curl -fsSL https://ainode.dev/install | bash

or directly:

docker pull ghcr.io/getainode/ainode:0.4.0
docker pull argentos/ainode:0.4.0   # Docker Hub mirror

Screenshots

See README.md on main for the full product tour.

Known limitations

3-node TP requires a proper mesh or dedicated switch subnet — see
Networking requirements
in the README.
Browser-based fine-tuning UI is scaffolded but not yet validated
end-to-end on real GPUs. Tracked as roadmap item.
Ray over VPN (Tailscale) doesn't work for NCCL — use physical
cables or a dedicated switch.

Artifacts

ghcr.io/getainode/ainode:0.4.0 (primary)
argentos/ainode:0.4.0 (Docker Hub mirror)

Contributors

This release was driven by Jason Brashear with AI-pair-programming via
Claude Opus 4.6 (1M context). All code and docs in this repo are
Apache-2.0.

Assets 2

Uh oh!

Releases: getainode/ainode

v0.4.11 — dashboard tells the truth (Phase 1+2)

[0.4.11] — 2026-06-17

Fixed (dashboard now reports the truth — Phase 1+2)

Changed

Uh oh!

v0.4.7 — UI polish, dynamic cluster sizing, NFS shared storage

v0.4.7

Fixed

Shared model storage

Upgrade

Uh oh!

v0.4.6 — Download button fix

Fixed

Uh oh!

v0.4.5 — Master node overlay, smart Update All button

v0.4.5

Fixed

Upgrade

Uh oh!

v0.4.4 — AWQ fix, cluster update button, loading animation

What's new in v0.4.4

Fixed

Added

Install / upgrade

Images

Uh oh!

v0.4.3 — Training: Artifacts, Merge, Eval, W&B

What's new in v0.4.3

Training: Artifact retrieval

Training: LoRA merge

Training: Checkpoint resume

Training: Evaluation loop

Training: W&B integration

Training: Custom templates

Other fixes (v0.4.2 features also in this image)

Install / upgrade

Container images

Uh oh!

AINode v0.4.0 — container-native distribution

Highlights

What's in this release

Install

Screenshots

Known limitations

Artifacts

Contributors

Uh oh!