long-horizon

qiqihezh / agentic-grpo-longhorizon

Fixing GRPO training collapse in long-horizon multi-tool agents. A lightweight PRM-Lite + LATA joint approach achieves +37% over vanilla GRPO on τ-bench airline (50-task, multi-turn).

reinforcement-learning long-horizon qwen agentic-ai tool-calling process-reward-model grpo tau-bench multi-turn-agents

Updated May 11, 2026
Python

avanturist322 / awesome-memory-vla

🧠 Awesome Memory-VLA: A curated list of Visual-Language-Action models with memory

robotics memory vla pomdp vlm embodied-ai long-horizon visual-language-models long-context-modeling visual-language-action-models memory-vlm memory-vla

Updated Jun 17, 2026

JiayuJeff / PlanBench-XL

Official Repository for our paper: PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems

agent tool-use llm long-horizon long-horizon-agents

Updated Jun 24, 2026
Python

kwanyoungpark / MAC

Code for Scalable Offline Model-Based RL with Action Chunking

reinforcement-learning model-based-reinforcement-learning offline-reinforcement-learning long-horizon action-chunking

Updated Feb 20, 2026
Python

kwanyoungpark / LEQ

Code for Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning

reinforcement-learning model-based-reinforcement-learning offline-reinforcement-learning long-horizon

Updated Feb 6, 2025
Python

TheSeamau5 / envoi

The simplest way to build long-horizon environments

ai rl agents long-horizon agentic-ai long-horizon-ai

Updated Apr 8, 2026
Python

mturan33 / isaac-g1-hierarchical

VLM-RL Hierarchical Loco-Manipulation for Long-Horizon Tasks with the G1 Robot in Isaac Lab/Sim

vlm g1 ppo semantic-map isaacsim loco-manipulation isaac-sim unitree long-horizon long-horizon-robotic-manipulation isaac-lab isaaclab unitree-g1 long-horizon-ai long-horizon-manipulation long-horizon-intelligence long-horizon-tasks

Updated May 20, 2026
Python

vornicx / Midas

Local-first, eval-first memory for long-horizon AI agents — no LLM at ingest. Python SDK + MCP server with source-traceable recall, belief revision, selective forgetting, and reproducible benchmarks.

python mcp embeddings ai-agents rag llm long-horizon agent-memory

Updated Jun 25, 2026
Python

3xcaffeine / frontier-swe-openenv

A family of long-horizon software-engineering environments for OpenEnv, adapted from https://github.com/Proximal-Labs/frontier-swe

rl-environment long-horizon openenv agent-harness

Updated Apr 26, 2026
C

zjunlp / MobileMem

MobileMem: On-Device Memory for Continually Evolving Agents

agent benchmark natural-language-processing mobile memory phone mobile-app artificial-intelligence dataset large-language-models long-horizon mobilemem mobile-omni

Updated Jun 24, 2026
Python

hayoungjungg / SciConBench

Official repository for the paper: Can AI Agents Synthesize Scientific Conclusions?

benchmark ai-agents long-form long-horizon agentic-workflow scientific-conclusion-synthesis clean-room-evaluation

Updated Jun 11, 2026
Python

Stanford-CongLab / LabHorizon

Pushing the Limits of Laboratory 3D Perception and Long-Horizon Planning via Protocol-Aligned Action Prediction

agent science benchmark ai assets protocol dataset 3d planing llm long-horizon

Updated Jun 13, 2026
Python

minar09 / steady-forcing

Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion

autoregressive-models long-horizon long-video fixed-camera steady-forcing nature-flow static-view static-stability motion-persistence

Updated Jun 16, 2026
Python

Aditya-Ranjan1234 / Long-Horizon-Memory-V2

A real-world-inspired environment for selective context retention under noise. It evaluates an LLM's ability to manage a fixed-capacity memory buffer, retaining high-value information while filtering out distractors.

learning environment context retention long-horizon

Updated Apr 25, 2026
Jupyter Notebook

OtherPowers / clawdbot

OpenClaw humanity infusions OtherPowers Creative Intelligence Agency. 🦞

intelligence agency creative long-horizon walksonthebeach age-of-aquarius-tech sgidoula

Updated Feb 24, 2026
TypeScript

Aditya-Ranjan1234 / Long-Horizon-Memory-V2-Dashboard

Dashboard for a real-world-inspired environment for selective context retention under noise. The environment evaluates an LLM's ability to manage a fixed-capacity memory buffer, retaining high-value information while filtering out distractors.

monitoring dashboard reinforment-learning long-horizon

Updated Apr 25, 2026
Python

natb1 / commons.systems

Portable harness for autonomous, long-horizon, multi-agent workflows

autonomous-agents long-horizon agentic-workflow ai-coding agent-orchestration claude-code coding-agent harness-engineering

Updated Jun 25, 2026
TypeScript

Here are 25 public repositories matching this topic...

zai-org / GLM-5

microsoft / delegate52

abundant-ai / swe-marathon

qiqihezh / agentic-grpo-longhorizon

avanturist322 / awesome-memory-vla

JiayuJeff / PlanBench-XL

kwanyoungpark / MAC

kwanyoungpark / LEQ

TheSeamau5 / envoi

mturan33 / isaac-g1-hierarchical

vornicx / Midas

3xcaffeine / frontier-swe-openenv

zjunlp / MobileMem

hayoungjungg / SciConBench

Stanford-CongLab / LabHorizon

minar09 / steady-forcing

Aditya-Ranjan1234 / Long-Horizon-Memory-V2

OtherPowers / clawdbot

Aditya-Ranjan1234 / Long-Horizon-Memory-V2-Dashboard

natb1 / commons.systems
