unified-memory

Here are 37 public repositories matching this topic...

hogeheer499-commits / strix-halo-guide

AMD Strix Halo local LLM guide: setup for Ryzen AI MAX+ 395 / Radeon 8060S, Ollama, llama.cpp Vulkan/RADV, ROCm, raw evidence, direct 100 t/s 30B Qwen, 140 t/s CHADROCK MTP, 120B GGUF.

Updated Jun 21, 2026
Python

parallelArchitect / sparkview

Operator-grade GPU monitor for NVIDIA GPUs with native GB10 / DGX Spark coherent UMA support — PSI pressure, clock detection, ConnectX-7 network layer

python monitoring gpu cuda tui nvidia psi unified-memory gb10 dgx-spark

Updated May 31, 2026
Python

hamtun24 / openuma

Unified Memory Abstraction Layer for AI Inference on AMD APUs and Intel iGPUs

rust machine-learning inference unified-memory llm

Updated Apr 3, 2026
Rust

real-space / tfQMRgpu

A CUDA implementation of the transpose-free Quasi-Minimal Residual method

c library cplusplus fortran gpu cuda complex-numbers gpu-computing sparse-linear-systems multiprecision sparse-matrix linear-solvers sparse-linear-solver iterative-algorithms block-sparsity sparse- unified-memory tfqmr

Updated Sep 2, 2025
C++

raspoli / mlx-serve

Local inference server for Apple Silicon — hot-swaps MLX models (LLM, vision, embeddings, TTS, STT) via OpenAI API

python macos machine-learning text-to-speech embeddings speech-to-text inference-server mlx model-serving fastapi unified-memory apple-silicon openai-api llm local-inference vision-language-model local-llm mlx-lm openai-compatible

Updated Jun 23, 2026
Python

parallelArchitect / cuda-unified-memory-analyzer

NVIDIA GPU Unified Memory diagnostic tool: architecture-aware, measurement-based, PCIe/coherent transport detection

gpu thrashing

Updated May 4, 2026
Cuda

ChrisJR035 / Talos-O-Architecture

Talos-O (Omni): A sovereign, embodied agentic organism forged on AMD Strix Halo. Integrating the Chimera Kernel (Linux 7.0), Zero-Copy Introspection, and the Phronesis Engine. Built from First Principles.

zero-copy linux-kernel first-principles linux-kernel-hacking unified-memory embodied-ai unified-memory-parallelism ryzen-ai sovereign-ai strix-halo ryzen-ai-max first-principles-ai rocm-6-2 neo-techne phronesis

Updated Apr 21, 2026
Python

shumbul / Accelerated-Computing

Fundamentals of Accelerated Computing C/C++ is a course provided by NVIDIA.

cuda nvidia high-performance-computing accelerated-computing unified-memory

Updated Oct 9, 2020
Cuda

CINOAdam / nvml-unified-shim

NVML unified memory shim for NVIDIA DGX Spark Grace Blackwell GB10 - enables MAX Engine, PyTorch, and GPU monitoring

machine-learning tensorflow gpu cuda pytorch nvidia nvml arm64 unified-memory max-engine dgx-spark grace-blackwell

Updated Jan 28, 2026
C

sadopc / unified-db-2

Apple Silicon Unified Memory for GPU-Accelerated Analytics — TPC-H benchmarks across DuckDB, NumPy, and MLX

benchmark numpy gpu-computing mlx tpc-h unified-memory duckdb apple-silicon apple-m4 gpu-analytics

Updated Feb 18, 2026
Python

lintenn / cudaAddVectors-explicit-vs-unified-memory

Performance comparison of two different forms of memory management in CUDA

c performance memory cuda memory-management explicit unified-memory

Updated Oct 3, 2021
Cuda

parallelArchitect / gb10-kernel-probe

Empirical kernel scheduling characterization for NVIDIA GB10 (SM121a). Sweeps GEMM tile configurations, classifies PTX instruction paths, captures hardware telemetry