# on-device-llm

Here are 33 public repositories matching this topic...

google-ai-edge / LiteRT-LM

LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language Models on edge devices.

edge-ai on-device-ai on-device-llm

Updated Jun 18, 2026
C++

HiveForensics-AI / knolo-core

KnoLo Core is a local-first knowledge base engine built for small language models (small LLMs). It packages your documents into a compact .knolo file and enables fully deterministic querying, with no embeddings, vector databases, or cloud services required. Designed for on-device and edge LLM deployments.

offline-first knowledge-base document-retrieval edge-computing edge-ai local-first lexical-search offline-llm rag-alternative vector-database-alternative small-llms on-device-llm retrieval-engine deterministic-search knolo

Updated May 19, 2026
TypeScript

es617 / hunch

On-device shell command generator for macOS Tahoe. Uses Apple's 3B model with dynamic few-shot retrieval from 21k tldr examples.

shell zsh cli terminal developer-tools on-device-ai foundation-models llm-tools apple-intelligence on-device-llm local-llm-macos

Updated Apr 21, 2026
Python

amareshhebbar / TrueNorth

Declare the outcome, skip the logic. The developer-first infrastructure engine for structured, multi-turn AI conversations with built-in safety, async follow-ups, and native compliance.

Updated Jun 17, 2026
Python

Jibar-OS / JibarOS

Android 16 fork. AI as a platform primitive. Twelve capabilities, one shared runtime, every app. OEM-pluggable. Apache 2.0.

android operating-system aosp vlm platform-service multimodal on-device-ai llm oir ai-runtime android-16 on-device-llm inference-runtime jibaros

Updated May 6, 2026
Shell

rufolangus / AAOSP

Agentic Android Open Source Project (AAOSP) — Android fork with native LLM system service, MCP-aware apps, and an agent-driven launcher. On-device Qwen 2.5 via llama.cpp. Apps declare tools in their manifest. The OS runs the model.

android agent mcp aosp android-framework edge-ai tool-use jetpack-compose cuttlefish on-device-ai system-service ai-agent llm android-15 llama-cpp agentic qwen model-context-protocol on-device-llm

Updated Apr 23, 2026
Java

whyisitworking / llama-bro

High-performance Android SDK for on-device LLM inference (GGUF). Privacy-focused, offline-first, and powered by llama.cpp with a clean Kotlin Coroutines API.

android cmake ai ndk android-library llama android-app android-package on-device-ai ndk-jni ai-assistant llamacpp llama-cpp on-device-models on-device-inference on-device-llm

Updated Mar 27, 2026
Kotlin

john-rocky / PrivateFoundationModels

Apple FoundationModels API on iOS 18+. Same call site, native passthrough on iOS 26 (Apple Intelligence), CoreML / MLX backends on older OSes. Drop-in source compatible.

macos swift ios mlx swift-package coreml on-device-ai llm generative-ai apple-neural-engine visionos mlx-swift apple-intelligence foundationmodels on-device-llm

Updated May 14, 2026
Swift

carrycooldude / Nova-IDE

on-device-ai on-device-inference on-device-llm qualcomm-gpu

Updated Jun 4, 2026
JavaScript

llostinthesauce / fm-teardown

Reverse-engineering notes on fm, Apple's Foundation Models CLI in macOS 27: on-device model catalog (9M/85M/300M/3B + code/vision/speech), Private Cloud Compute, Siri local<->cloud routing, and the OpenAI-compatible 'fm serve' API.

macos cli apple reverse-engineering siri on-device-ai apple-silicon foundation-models llm apple-intelligence openai-compatible on-device-llm private-cloud-compute

Updated Jun 10, 2026

wittyunforgiving119 / PrivateFoundationModels

Run Apple Intelligence, CoreML, and MLX models using a unified Swift interface for local language model sessions on iOS and macOS.

macos swift ios mlx swift-package coreml on-device-ai llm generative-ai apple-neural-engine visionos mlx-swift apple-intelligence foundationmodels on-device-llm

Updated Jun 16, 2026
Swift

RaccoonOnion / ash

Ash — offline survival assistant for iOS. Gemma 4 E2B/E4B fully on-device (text · image · voice) with RAG-grounded answers over 56 emergency-response packs. Built for the Kaggle Gemma 4 Good Hackathon.

ios offline-first survival flutter gemma emergency-response objectbox litert rag hnsw on-device-ai minilm speculative-decoding on-device-llm litert-lm gemma-4

Updated Jun 10, 2026
Dart

avisre / snapdragon-npu-llm

Run LLMs on Snapdragon NPU — including the 'unsupported' 8 Gen 1 (Hexagon v69). Verified at 31 tok/s on OnePlus 10 Pro.

snapdragon qnn edge-ai llm-inference oneplus-10-pro mobile-llm executorch on-device-llm android-llm hexagon-npu qualcomm-ai snapdragon-8-gen-1 samsung-galaxy-s22 hexagon-v69

Updated Jun 15, 2026
Shell

YueLich / aios-wiki

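📱 A panoramic knowledge base on mobile AI operating systems: 334+ in-depth pages covering on-device large models, AI agents, chip adaptation, and inference optimization | Automatically updated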

wiki xiaomi arxiv knowledge-base quantization npu inference-optimization edge-ai ai-os harmonyos mobile-ai ai-assistant llm mobile-agent on-device-llm

Updated Jun 15, 2026

dsngeu / on-device-ai-model

iOS app that runs a local LLM on-device to transcribe meetings and generate structured notes — action items, decisions, and summaries. No cloud, no API keys, no data leaves the phone.

llama meeting-notes privacy-first edge-ai local-llm gguf privacy-first-ai on-device-llm

Updated May 26, 2026
Swift

sagar-develop / litertlm-kmp

Kotlin Multiplatform engine for running Gemma LLMs on-device on Android via LiteRT-LM — stateful KV-cache chat sessions, resumable model management, function calling. Includes NativeLM, a private Local AI chat app. AGPL-3.0 / commercial.

android kotlin ios clean-architecture gemma litert kotlin-multiplatform rag edge-ai mediapipe local-llm kotlin-inject offline-ai structured-outputs on-device-llm dual-licensing android-llm gemma-4

Updated Jun 8, 2026
Kotlin

Jibar-OS / .github

JibarOS organization profile.

Updated Apr 23, 2026

coreline-ai / kotlin_llm_playlists

Android local music recommendation app powered by on-device LLM and RAG

audio android kotlin open-source ai music-recommendation rag llm on-device-llm coreline-ai

Updated May 3, 2026

uny / ondevice-llm

Unified Kotlin API for on-device LLMs using each platform's built-in models.

android kotlin ios mlkit kotlin-multiplatform foundation-models llm gemini-nano apple-intelligence on-device-llm

Updated May 31, 2026
Kotlin

privane-ai / privane-core

Execution infrastructure for local-first AI. Reason locally, execute globally.

mcp webgpu ai-agents local-first playwright secure-sandbox llm-orchestration ai-infrastructure agentic-workflows hybrid-inference sovereign-ai on-device-llm

Updated May 26, 2026
TypeScript
