
On-disk gadget cache (#23, part 1) #27

Merged
ricardojrdez merged 1 commit into master from feature-gadget-cache on Jun 23, 2026

@ricardojrdez

Member

First part of #23 (scalability). Adds an opt-in on-disk cache; parallel scanning will follow in a second PR, so this does not close the issue yet.

--cache / --cache-dir

With --cache, the raw (vaddr, bytes) gadget records discovered for a binary are stored on disk and reused on later runs over the same file and options, skipping the scan. --cache-dir overrides the location (default: $XDG_CACHE_HOME/rop3).
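
As an illustration of how the location could be resolved, here is a minimal sketch. The helper names `default_cache_dir` and `resolve_cache_dir` are hypothetical and not part of rop3. The fallback to `~/.cache` when `XDG_CACHE_HOME` is unset, empty or relative follows the XDG Base Directory convention and is an assumption about rop3's behaviour.

```python
import os
from pathlib import Path


def default_cache_dir(env=None):
    """Return <XDG cache home>/rop3 (hypothetical helper, not rop3's API)."""
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME")
    # Assumption: follow the XDG convention of falling back to ~/.cache
    # when the variable is unset, empty, or not an absolute path.
    if not base or not os.path.isabs(base):
        base = Path.home() / ".cache"
    return Path(base) / "rop3"


def resolve_cache_dir(cache_dir=None, env=None):
    """An explicit --cache-dir value wins; otherwise use the default."""
    return Path(cache_dir) if cache_dir else default_cache_dir(env)
```

An explicit `cache_dir` argument takes precedence, which mirrors how --cache-dir overrides the default.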

The cache key binds the file content hash to every parameter that affects the record set: depth, flags, arch slice, badchars, badchar-bytes, and section layout (the layout captures --base). A changed binary or a changed option therefore misses cleanly.
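
A sketch of how such a key might be derived. The function name `cache_key`, the use of SHA-256 and the JSON canonicalisation are assumptions for illustration; the actual GadgetCache implementation may differ.

```python
import hashlib
import json


def cache_key(content: bytes, **params) -> str:
    """Derive a hex key from file content plus scan parameters (illustrative)."""
    h = hashlib.sha256()
    # Bind the content hash first, so any change to the binary changes the key.
    h.update(hashlib.sha256(content).digest())
    # Serialise parameters in a canonical form: sorted keys, fixed separators.
    # Values JSON cannot encode (e.g. bytes) fall back to their repr().
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"), default=repr)
    h.update(canonical.encode("utf-8"))
    return h.hexdigest()


# Hypothetical parameter names, mirroring the list in the description.
key = cache_key(b"\x7fELF...", depth=5, arch="x86_64", badchars="000a", base=0x400000)
```

Sorting the parameter names keeps the key independent of argument order, so the same options always map to the same entry.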

Design notes

  • Capstone decode objects are not serialisable, so only the raw (vaddr, bytes) records are cached; decodes and the nearest-symbol annotation are always rebuilt locally.
  • The scan is split from gadget construction. _scan does a single disassembly pass, so the default (non-cached) path has no extra work; on a warm cache the scan is skipped entirely and gadgets are rebuilt from the records.
  • Writes are atomic (os.replace) and best-effort (a write failure logs a warning, never aborts).
  • Wired through GadFinder, the Rop3 API (cache=, cache_dir=) and the CLI.
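
The atomic, best-effort write described above can be sketched as follows. The JSON on-disk format, the file naming and the function names `store_records` and `load_records` are assumptions for illustration only; the PR states only that raw (vaddr, bytes) records are stored and that writes go through os.replace.

```python
import json
import logging
import os
import tempfile
from pathlib import Path

log = logging.getLogger("cache_sketch")


def store_records(cache_dir, key, records):
    """Write records atomically. Returns False (and logs) on I/O failure."""
    target = Path(cache_dir) / f"{key}.json"
    tmp_name = None
    try:
        target.parent.mkdir(parents=True, exist_ok=True)
        payload = [[vaddr, data.hex()] for vaddr, data in records]
        # Write to a temp file in the same directory, then rename over the
        # target: readers see either the old entry or the complete new one.
        fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
        os.replace(tmp_name, target)
        return True
    except OSError as exc:
        # Best-effort: warn and carry on, never abort the run.
        log.warning("cache write failed for %s: %s", target, exc)
        if tmp_name is not None:
            try:
                os.unlink(tmp_name)
            except OSError:
                pass
        return False


def load_records(cache_dir, key):
    """Return cached (vaddr, bytes) records, or None on a miss or bad entry."""
    target = Path(cache_dir) / f"{key}.json"
    try:
        with open(target, encoding="utf-8") as fh:
            payload = json.load(fh)
        return [(vaddr, bytes.fromhex(hexstr)) for vaddr, hexstr in payload]
    except (OSError, ValueError, TypeError):
        return None
```

Only plain integers and byte strings cross the disk boundary here, which is why the decode objects never need serialising.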

Testing

  • 90 tests pass on Python 3.11 and 3.13 (+5).
  • test_cache.py covers key invalidation (content and params), store/load round-trip, and that a warm cache yields the same gadgets as a cold scan and as the uncached path.
  • End-to-end verified: cold vs warm output identical, cached vs uncached identical, distinct options produce distinct cache entries.

Part of #23.

--cache stores the raw (vaddr, bytes) gadget records discovered for a
binary and reuses them on later runs over the same file and options,
skipping the scan. --cache-dir overrides the location (default:
$XDG_CACHE_HOME/rop3).

- cache.py: GadgetCache keys entries by the file content hash plus every
  parameter that affects the record set (depth, flags, arch, badchars,
  badchar-bytes, section layout). Writes are atomic and best-effort.
- gadfinder: the scan is split from gadget construction. _scan does a single
  disassembly pass (no regression on the default, non-cached path); only the
  raw records are cached, and decodes/symbol are always rebuilt locally so
  the (unpicklable) capstone objects never need serialising.
- wired through GadFinder, the Rop3 API and the CLI.

This is part 1 of #23 (cache); parallel scanning will follow.

Tests: 90 pass (3.11/3.13). New test_cache.py covers key invalidation,
store/load round-trip, and that a warm cache yields the same gadgets as a
cold scan (and as the uncached path).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ricardojrdez ricardojrdez merged commit 93b5436 into master Jun 23, 2026
3 checks passed
@ricardojrdez ricardojrdez deleted the feature-gadget-cache branch June 23, 2026 20:35