NanoRollout

Scale digital agent rollouts without pain.

Quick Start | Agent RL | Blog | Huggingface | Wandb

NanoRollout Overview

What is NanoRollout?

Scaling digital agents is bottlenecked by environments, which demand resources (CPU/memory) orthogonal to those of model training (GPU). NanoRollout is a lightweight rollout repo that (1) decouples agent harnesses (e.g., OpenHands, mini-swe-agent, Terminus2, OSWorld-MM-Agent, Cocoa-Agent) and environment backends (e.g., Docker, Modal, AWS EC2) from the trainer logic, so each can be developed and scaled independently; and (2) unifies the rollout service for evaluation, distillation, and reinforcement learning (RL) behind a single rollout server endpoint where clients submit a task and receive a trajectory.

NanoRollout powers fast parallel evaluation (SWE-Bench Verified in 18 min with 500 workers), large-scale distillation (300K+ trajectories → Mocha-Coder-32B), and stable RL at large batch sizes (bsz 4,096 → Mocha-RL-Alpha-32B), and it integrates with miles, veRL, and tunix.

Installation

git clone https://github.com/cocoa-org/NanoRollout.git
cd NanoRollout

We recommend using uv with Python 3.12.

uv python pin 3.12
uv sync

This creates or reuses the project virtual environment and installs NanoRollout from pyproject.toml/uv.lock.

If you prefer a minimal editable install instead of syncing the lockfile:

uv python pin 3.12
uv venv
uv pip install -e .

Check that the CLI is available:

nro --help

For RL training, also fetch the trainers/ submodule:

git submodule update --init --recursive

Supported Agents

Domain	Benchmark	Harness (`--agent`)	Sandbox (`--env-type`)
SWE	SWE-Bench Verified / Pro	`oh-core` (OpenHands), `oh-lite`, `mini-swe-agent`, `r2egym`, `claude-code`, `qwen-code`, `opencode`	`docker`, `modal`, `enroot`
Terminal	Terminal-Bench 2.0	`terminus2`, `mini-swe-agent`, `claude-code`, `qwen-code`, `opencode`	`docker`, `modal`, `enroot`
Computer Use	OSWorld-Verified	`qwen3vl-mmagents`	`aws`, `docker`, et al.
Unified	CocoaBench	`cocoa-agent`	`docker`, `modal`

Quick Start

`nro run` — Synchronous Rollout

Run a single SWE instance directly from the CLI:

nro run \
  --task swe --agent oh-core \
  --model-name deepseek-v4-flash \
  --base-url https://api.deepseek.com/v1 --api-key $OPENAI_API_KEY \
  --env-type docker --instance-id django__django-11095

Scale to 500 parallel workers on Modal:

nro run \
  --task swe --agent oh-core \
  --model-name deepseek-v4-flash \
  --base-url https://api.deepseek.com/v1 --api-key $OPENAI_API_KEY \
  --env-type modal \
  --request-file examples/eval/swe/data/swebench_verified.jsonl \
  --concurrency 500

nro run is best suited when environment resources are managed externally (e.g. Modal), so no Ray is needed. For self-hosted model endpoints (e.g. vLLM, SGLang), replace --base-url with your local endpoint (e.g. --base-url http://<server-ip>:8000/v1). For detailed examples across tasks (SWE-Bench, Terminal-Bench, OSWorld, CocoaBench) and agents, see examples/eval/.
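To drive the same command from a Python script (for example, to sweep over agents or sandboxes), one option is to assemble the argument list programmatically. The sketch below uses only the flags that appear in the commands above; the helper name `build_nro_run_argv` is hypothetical and not part of NanoRollout.

```python
import os
import shlex


def build_nro_run_argv(task, agent, model_name, base_url, api_key, env_type,
                       instance_id=None, request_file=None, concurrency=None):
    """Assemble an `nro run` argument list (hypothetical helper).

    Only flags shown in this README are emitted; optional flags are
    added when a value is supplied.
    """
    argv = [
        "nro", "run",
        "--task", task,
        "--agent", agent,
        "--model-name", model_name,
        "--base-url", base_url,
        "--api-key", api_key,
        "--env-type", env_type,
    ]
    if instance_id is not None:
        argv += ["--instance-id", instance_id]
    if request_file is not None:
        argv += ["--request-file", request_file]
    if concurrency is not None:
        argv += ["--concurrency", str(concurrency)]
    return argv


# Mirror the single-instance Docker command from above.
argv = build_nro_run_argv(
    task="swe",
    agent="oh-core",
    model_name="deepseek-v4-flash",
    base_url="https://api.deepseek.com/v1",
    api_key=os.environ.get("OPENAI_API_KEY", ""),
    env_type="docker",
    instance_id="django__django-11095",
)
# Print a copy-pasteable shell form of the command for inspection.
print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would launch the rollout. Building a list rather than a shell string avoids quoting problems with API keys and URLs.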

`nro serve` — Async Rollout Server

For evaluation, distillation, or RL training at scale, we recommend starting an async rollout server: it handles asynchronous requests flexibly and runs on resources you manage yourself (e.g., CPU/RAM).

ray start --head
nro serve host=0.0.0.0 port=11000 concurrency=64

Clients submit tasks to POST /run and receive trajectories with rewards and messages:

curl -s http://localhost:11000/run \
  -H "Content-Type: application/json" \
  -d '{
    "instance_id": "django__django-11095",
    "task": "swe", "agent": "oh-core",
    "model_name": "deepseek-v4-flash",
    "base_url": "https://api.deepseek.com/v1",
    "api_key": "<your-api-key>"
  }'
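The same request can be issued from Python using only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, the payload fields are copied from the curl example, and it assumes the server replies with a JSON body (the exact response schema is not documented here).

```python
import json
import urllib.request


def build_run_request(server_url, task):
    """Build a POST request for the /run endpoint (hypothetical helper)."""
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(server_url, task, timeout_s=3600.0):
    """Send one task and return the parsed reply.

    Assumption: the reply is JSON. The generous default timeout is a
    guess, chosen because an agent rollout can run for a long time.
    """
    request = build_run_request(server_url, task)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Payload fields taken from the curl example above.
example_task = {
    "instance_id": "django__django-11095",
    "task": "swe",
    "agent": "oh-core",
    "model_name": "deepseek-v4-flash",
    "base_url": "https://api.deepseek.com/v1",
    "api_key": "<your-api-key>",
}
```

With a server running, `submit_task("http://localhost:11000", example_task)` would block until the rollout finishes and return the decoded reply.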

RL trainers (miles, veRL, tunix) call this endpoint to generate rollout batches during training. See examples/server/ for multi-node Ray cluster setup.

Agent RL

NanoRollout serves trajectories to RL trainers through the same POST /run endpoint. Start nro serve (see Quick Start) first, then point your trainer at NANOROLLOUT_URL=http://<host>:11000. We have validated integration with miles, veRL, and tunix; veRL and tunix reference code is coming soon.
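From the trainer's point of view, generating a rollout batch amounts to fanning many `/run` requests out to the server and gathering the replies. The sketch below shows one way to do that with a bounded thread pool. It is not the code used by miles, veRL, or tunix: `collect_batch` and `rollout_server_url` are hypothetical names, and `submit` stands for any callable that POSTs one task to `/run` and returns the trajectory.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def rollout_server_url(default="http://localhost:11000"):
    """Read the server address from NANOROLLOUT_URL, with a local fallback."""
    return os.environ.get("NANOROLLOUT_URL", default).rstrip("/")


def collect_batch(tasks, submit, max_in_flight=64):
    """Run `submit(task)` for every task with bounded concurrency.

    Results are returned in the same order as `tasks`. A failed rollout is
    recorded instead of raised, so one bad task does not discard the batch.
    """
    def safe_submit(task):
        try:
            return {"task": task, "trajectory": submit(task), "error": None}
        except Exception as exc:  # keep going; the caller decides what to drop
            return {"task": task, "trajectory": None, "error": repr(exc)}

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Executor.map preserves input order even when calls finish out of order.
        return list(pool.map(safe_submit, tasks))
```

Threads are sufficient here because each call spends its time waiting on the network. How failed rollouts should be treated (retried, masked, or dropped) is a trainer-side decision this sketch leaves open.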

miles

On the miles side, a TITO proxy captures the exact tokens and logprobs from agent calls, so the trainer sees the same token stream the agent saw. See miles/examples/nanorollout for the launch script, hyperparameters, and full setup of an example that trains Qwen3-4B-Instruct.
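To see why capturing the exact tokens matters, consider what can happen if a trainer instead re-tokenizes the agent's decoded text. The toy example below is not miles code: the three-entry vocabulary and the greedy longest-match tokenizer are invented purely for illustration.

```python
# Toy vocabulary, invented for illustration only.
VOCAB = {"a": 0, "b": 1, "ab": 2}
ID_TO_TEXT = {token_id: piece for piece, token_id in VOCAB.items()}


def decode(token_ids):
    """Concatenate the text of each token id."""
    return "".join(ID_TO_TEXT[i] for i in token_ids)


def encode(text):
    """Greedy longest-match tokenization over VOCAB."""
    pieces = sorted(VOCAB, key=len, reverse=True)
    ids, pos = [], 0
    while pos < len(text):
        for piece in pieces:
            if text.startswith(piece, pos):
                ids.append(VOCAB[piece])
                pos += len(piece)
                break
        else:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
    return ids


sampled = [0, 1]               # the policy sampled "a", then "b"
text = decode(sampled)         # "ab"
retokenized = encode(text)     # [2]: a different sequence for the same text

# The sampled logprobs belong to tokens [0, 1]; scoring [2] in the trainer
# would pair them with a sequence the policy never actually produced.
print(sampled, retokenized, sampled == retokenized)
```

Recording the sampled token ids and their logprobs at generation time sidesteps this mismatch, which is the property the paragraph above describes.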

Contributing

NanoRollout is an open-source effort to democratize large-scale agent training and evaluation. We are actively seeking collaborators to help build the future of digital agent infra.

Submit PRs: We welcome contributions to both the core code and expansion of agent harnesses or benchmarks.
Join the Discussion: Have an idea or need help? Chat with us on Discord.
Report Bugs: Use GitHub Issues to report bugs or request new features.

Citation

If you use NanoRollout in academic work, please cite it using the following BibTeX entry:

@misc{nanorollout,
  title  = {NanoRollout: A Lightweight Infra for Digital Agent Rollout at Scale},
  author = {Wang, Junli and Cheng, Zhoujun and Zhang, Yuxuan and Hao, Shibo and Tang, Yao and Hu, Zhiting and Ammanabrolu, Prithviraj and Zhang, Hao},
  year   = {2026},
  howpublished = {\url{https://cocoa-org.notion.site/nanorollout}},
}
