Layouts

Precise reference for every storage layout: the (x, y) → flat index mapping, the chunk/storage model, halo semantics, world-size constraints, and the exact CSV layout string each emits.

Adjacent docs: docs/workloads.md — per-workload algorithms and the CSV schema layouts feed into; docs/perf/README.md — sanitizer / profile / optimization-record builds for verifying layout codegen (e.g. confirming lx & (B-1) constant-folds).

This document covers mechanics of the implemented layouts only. For the pipeline of shipped / planned / considered layouts (the source of truth for what exists), see roadmap.md → Potential layouts. For measured numbers, see RESULTS.md.

Source: src/layouts/* (square), src/hex/layouts/* (hex). Registries (the authoritative name → type mapping) are src/layouts/registry.cpp and src/hex/registry.cpp.

What every layout provides

All layouts model the same world as a structure-of-arrays (one flat array per field); a layout is purely a permutation of logical tiles into flat slots. Each satisfies the LayoutT / HexLayoutT concept (src/layouts/layout.h, src/hex/layout.h), and is dispatched through a std::variant + std::visit so every workload call monomorphizes on the concrete type (no virtual dispatch). Key members:

Member	Meaning
`index(x, y)`	The flat index for logical tile `(x, y)` (hex: `(q, r)`). The defining operation.
`chunk_side()`	Chunk side `B` for chunked layouts; 0 for non-chunked. Span workloads use it to compute `chunks_touched`; 0 means "not applicable".
`storage_size()`	Flat-array extent. Equals `W·H` for canonical layouts; larger for persistent-halo layouts. All SoA arrays and any buffer indexed by `index()` are sized to this.
`populate_halos(buf)`	Fills halo cells from canonical centers (persistent-halo layouts only; no-op otherwise).
`radius_spans_exact`, `rect_spans`	Emit physical-address-sorted spans for the disk / rectangle.
`stencil8_update` / `stencil6_update` (+ `_to`)	The dense-stencil kernel; the `_to` form takes explicit in/out buffers (used by `diffusion`).
`row_indices(...)`	Fill a row of indices via stride math (used by `flow_direction`).
`has_uniform_stride_interior()`	`true` when chunk interiors admit `±1`/`±B` pointer math; `false` only for `hex_gosper_7`. Workloads with chunk-aware fast paths branch on this.
`chunk_id_for_sort(x, y)`	A cache-locality bucket key (diagnostic; not currently used by any shipping workload — see README → Potential optimizations → Tested, rejected).

Chunk and storage models

Non-chunked (row_major*, hex_row_major, hex_gosper_7): chunk_side() == 0.
Canonical chunked / curve: the grid is tiled into B × B chunks. index = chunk_id · B² + ly · B + lx, where (lx, ly) is the cell's position inside its chunk and chunk_id orders the chunks. Storage is W·H. The chunk order is what distinguishes the families:
- chunked_row_major — chunk_id = cy · chunks_x + cx (row-major).
- morton / hilbert — chunk_id = Encoder(cx, cy) via a precomputed chunk_id_lut_ (so index() pays one LUT load, not a per-call encode).
Persistent halo (*_halo_*): each chunk is stored as (B+2)² cells — a 1-cell border of halo copies around the B × B centers. Center (lx, ly) sits at storage offset chunk_id · (B+2)² + (ly+1)·(B+2) + (lx+1). Halos hold copies of the adjacent chunks' edge cells, populated once at world init via populate_halos, so the stencil kernel reads all 9 neighbors from one chunk's storage with no cross-chunk index() and no boundary branch. Cost: storage overhead (B+2)²/B² (1.13× at B=32, 1.06× at B=64); halo cells are not addressable through index() — only the stencil kernel touches them.

Curve encoders (`src/layouts/curve_encoders.h`)

Morton (Z-order): bit-interleave x and y. Works for any range.
Hilbert: the standard power-of-two xy → d encode (Wikipedia / Press et al. reference). Better locality than Morton (consecutive chunk_ids are spatial neighbors), at a higher encode cost — amortized to a LUT load here.

Square layouts (`layout_bench`, `grid_kind = "square"`)

Registered in src/layouts/registry.cpp. All carry grid_kind = "square".

`layout` string	`chunk_side()`	Index / storage	World-size constraint	Notes
`row_major_dense`	0	`y·W + x`	none	Baseline. Emits exact Euclidean row intervals for `radius` (same precision the chunked variants get).
`row_major_dense_simd`	0	`y·W + x`	none	Same mapping as `row_major_dense`; only the `stencil8` interior loop differs (Google Highway, runtime ISA dispatch).
`chunked_row_major_32`	32	`chunk_id·B² + ly·B + lx`, `chunk_id = cy·chunks_x + cx`	`W%32==0`, `H%32==0`	Per-call halo scratch in `stencil8` (B=32).
`chunked_row_major_64`	64	as above, `B=64`	`W%64==0`, `H%64==0`
`chunked_row_major_halo_32`	32	`(B+2)²`-per-chunk storage	`W%32==0`, `H%32==0`	Persistent halo (1.13× storage).
`chunked_row_major_halo_64`	64	`(B+2)²`-per-chunk storage	`W%64==0`, `H%64==0`	Persistent halo (1.06× storage).
`morton_chunked_32`	32	curve `chunk_id` via Morton LUT	`W%32==0`, square chunk grid, power-of-two side	Per-call halo (B=32).
`hilbert_chunked_32`	32	curve `chunk_id` via Hilbert LUT	same as Morton	Per-call halo (B=32).
`block_hilbert_32`	32	identical to `hilbert_chunked_32`	same	Type-distinct stub for the future kernel-filtered variant; algorithmically the clipped Hilbert (a milestone-1 duplicate of `hilbert_chunked_32`).
`morton_chunked_halo_32` / `_64`	32 / 64	Morton order + persistent halo	curve constraint + `W%B==0`
`hilbert_chunked_halo_32` / `_64`	32 / 64	Hilbert order + persistent halo	same
`block_hilbert_halo_32` / `_64`	32 / 64	as `hilbert_chunked_halo_*`	same

B-sweep variants (opt-in)

Built only with -DLAYOUT_BENCH_CURVE_B_SWEEP=ON; diagnostic only. They add morton_chunked_{16,64,128}, hilbert_chunked_{16,64,128}, block_hilbert_{16,64,128} (same families, varied chunk side). Boundary fraction in stencil8 ≈ 4/B; curve LUT footprint = (W/B)² × 8 B.

Hex layouts (`hex_layout_bench`, `grid_kind = "hex"`)

Registered in src/hex/registry.cpp. The world is a parallelogram 0 ≤ q < W, 0 ≤ r < H in axial coordinates (pointy-top hexes, six constant neighbor offsets — no parity branch). All carry grid_kind = "hex".

`layout` string	`chunk_side()`	Index / storage	World-size constraint	Notes
`hex_row_major`	0	`r·W + q`	none	Byte-for-byte the square row-major formula; only neighbor/distance semantics differ.
`hex_chunked_row_major_32` / `_64`	32 / 64	`chunk_id·B² + lr·B + lq`, `chunk_id = cr·chunks_q + cq`	`W%B==0`, `H%B==0`	B×B parallelogram chunks.
`hex_chunked_row_major_halo_32` / `_64`	32 / 64	`(B+2)²`-per-chunk storage	`W%B==0`, `H%B==0`	Persistent halo.
`hex_morton_chunked_32`	32	Morton-ordered chunks (LUT)	curve constraint	Encoder reused from the square side (operates on integer pairs).
`hex_hilbert_chunked_32`	32	Hilbert-ordered chunks (LUT)	curve constraint
`hex_morton_chunked_halo_32` / `_64`	32 / 64	Morton order + persistent halo	curve + `W%B==0`
`hex_hilbert_chunked_halo_32` / `_64`	32 / 64	Hilbert order + persistent halo	same
`hex_gosper_7`	0	per-cell `index_table_` lookup (size `W·H`)	`W%7==0`, `H%7==0`	The only hex-native chunk shape: 7-cell Gosper clusters (center + 6 neighbors). Cluster ordering is Hilbert over `(cluster_a, cluster_b)` as a stand-in for the true Gosper L-system (the L-system needs `7^k` cluster counts). `has_uniform_stride_interior() == false` → workloads fall back to per-cell loops.

Hex world-size note

gcd(7, 32) = 1, so no single W satisfies both Gosper-7 (W % 7 == 0) and the power-of-two-chunk-side requirement of the curve layouts at B = 32. hex_layout_bench runs each requested (W, H) and silently skips layouts that throw at construction. Two suggested worlds: W = 256 for the chunked / curve family, W = 224 (= 32 × 7) for hex_row_major, hex_chunked_row_major_32, and hex_gosper_7.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Layouts

What every layout provides

Chunk and storage models

Curve encoders (`src/layouts/curve_encoders.h`)

Square layouts (`layout_bench`, `grid_kind = "square"`)

B-sweep variants (opt-in)

Hex layouts (`hex_layout_bench`, `grid_kind = "hex"`)

Hex world-size note

FilesExpand file tree

layouts.md

Latest commit

History

layouts.md

File metadata and controls

Layouts

What every layout provides

Chunk and storage models

Curve encoders (src/layouts/curve_encoders.h)

Square layouts (layout_bench, grid_kind = "square")

B-sweep variants (opt-in)

Hex layouts (hex_layout_bench, grid_kind = "hex")

Hex world-size note

Curve encoders (`src/layouts/curve_encoders.h`)

Square layouts (`layout_bench`, `grid_kind = "square"`)

Hex layouts (`hex_layout_bench`, `grid_kind = "hex"`)