GLM-5: From Vibe Coding to Agentic Engineering
Updated Jun 20, 2026
Code that accompanies the paper release for "LLMs Corrupt Your Documents When You Delegate"
SWE-Marathon: an ultra long-horizon SWE benchmark
Fixing GRPO training collapse in long-horizon multi-tool agents. A lightweight PRM-Lite + LATA joint approach achieves +37% over vanilla GRPO on τ-bench airline (50-task, multi-turn).
🧠 Awesome Memory-VLA: A curated list of Visual-Language-Action models with memory
Official Repository for our paper: PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
Code for Scalable Offline Model-Based RL with Action chunking
Code for Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning
The simplest way to build long-horizon environments
VLM-RL Hierarchical Loco-Manipulation For Long-Horizon Tasks With G1 robot in Isaac Lab/Sim
Local-first, eval-first memory for long-horizon AI agents — no LLM at ingest. Python SDK + MCP server with source-traceable recall, belief revision, selective forgetting, and reproducible benchmarks.
A family of long-horizon software-engineering environments for OpenEnv, adapted from https://github.com/Proximal-Labs/frontier-swe
MobileMem: On-Device Memory for Continually Evolving Agents
Official repository for the paper: Can AI Agents Synthesize Scientific Conclusions?
Steady-Forcing: Balancing Spatial Persistence and Motion Continuity in Long-Horizon Nature Video Diffusion
A real-world inspired environment for selective context retention under noise. It evaluates an LLM's ability to manage a fixed-capacity memory buffer, retaining high-value information while filtering out distractors
OpenClaw humanity infusions OtherPowers Creative Intelligence Agency. 🦞
Dashboard for a real-world inspired environment for selective context retention under noise. It evaluates an LLM's ability to manage a fixed-capacity memory buffer, retaining high-value information while filtering out distractors
Portable harness for autonomous, long-horizon, multi-agent workflows