A three-phase pipeline for efficient ghost/gradient stability mapping of the Transitional Planck Mass (TPM) model in EFTCAMB, and for accelerating Cobaya MCMC chains via a neural network viability prior.
- Scientific Context
- Pipeline Overview
- Repository Structure
- Requirements
- Phase 1 — Sobol Stability Mapping
- Phase 2 — Neural Network Training
- Phase 3 — Fast Prior in Cobaya
- Visualisation
- Mathematical Background
- Performance Notes
- References
The Transitional Planck Mass (TPM) model is a modified gravity theory
implemented within the EFTCAMB framework
(Hu et al. 2014; Raveri et al. 2014). It describes a smooth, Gaussian-shaped
transition of the effective Planck mass at early times — parameterised via the
EFT function
| Symbol | Script name | Physical meaning | Prior range |
|---|---|---|---|
Log_aT |
Scale factor at the centre of the transition | ||
sig |
Width of the transition in e-folds | ||
M |
Amplitude of the Planck-mass shift | ||
c |
Late-time kinetic braiding coefficient |
For typical values
EFTCAMB enforces ghost and gradient stability conditions on the scalar perturbations at every step of an MCMC chain:
-
No-ghost: the kinetic matrix of scalar perturbations must be positive
definite, i.e.
$\alpha_K + \tfrac{3}{2}\alpha_B^2 > 0$ . For the TPM model this reduces to requiring$c_0 < 0$ , which ensures$\alpha_K > 0$ . -
No-gradient instability: the squared sound speed of scalar perturbations
must be positive throughout the entire cosmic history,
$c_s^2(a) > 0\ \forall, a$ . This depends non-trivially on all four parameters and cannot be reduced to a simple analytic inequality.
Because a single EFTCAMB stability evaluation takes
┌──────────────────────────────────────────────────────────────────────────┐
│ PHASE 1 — tpm_stability_map.py │
│ │
│ Sobol QMC sampling in (Log_aT, sig, M, c) space │
│ ↓ parallel EFTCAMB evaluation (ghost + gradient stability flags) │
│ K-NN boundary uncertainty score → adaptive Sobol refinement │
│ ↓ │
│ tpm_stability_map.pkl · {points: (N,4), stable: (N,) bool} │
└──────────────────────────────┬───────────────────────────────────────────┘
│
┌──────────────────────────────▼───────────────────────────────────────────┐
│ PHASE 2 — Stability_nn.py │
│ │
│ StandardScaler normalisation + 80/20 train/val split │
│ ↓ MLP 4 → 64 → 64 → 32 → 1 (BCEWithLogitsLoss, Adam) │
│ Best checkpoint saved on minimum validation loss │
│ ↓ │
│ tpm_stability_model.pt (PyTorch weights) │
│ tpm_scaler.pkl (fitted StandardScaler) │
└──────────────────────────────┬───────────────────────────────────────────┘
│
┌──────────────────────────────▼───────────────────────────────────────────┐
│ PHASE 3 — NN_fast_prior_TPM.py (Cobaya Likelihood) │
│ │
│ At each MCMC step: │
│ θ → scale → forward pass → logit z │
│ z > 0 ⟹ logp = 0.0 (pass to EFTCAMB and likelihoods) │
│ z ≤ 0 ⟹ logp = −∞ (reject immediately, skip EFTCAMB) │
│ │
│ Cost: ~0.1 ms/step vs ~10 s for a full EFTCAMB call │
└──────────────────────────────────────────────────────────────────────────┘
.
├── tpm_stability_map.py # Phase 1: Sobol sampling + EFTCAMB labelling
├── tpm_stability_viz.py # Visualisation: scatter corner + heatmap corner
├── Stability_nn.py # Phase 2: MLP training
├── NN_fast_prior_TPM.py # Phase 3: Cobaya Likelihood wrapper
├── TPM.yaml # Reference Cobaya YAML (MCMC run configuration)
└── README.md
Produced artefacts (not tracked by git):
├── tpm_stability_map.pkl # Labelled point cloud from Phase 1
├── tpm_stability_model.pt # Trained MLP weights from Phase 2
└── tpm_scaler.pkl # Fitted StandardScaler from Phase 2
cobaya >= 3.3
camb / eftcamb (your custom EFTCAMB-patched build)
torch >= 2.0
scikit-learn >= 1.3
scipy >= 1.11
numpy
matplotlib
pyyaml
This pipeline requires the EFTCAMB-patched version of CAMB that includes the
TPM model, but it can be generalized to any model. Point Cobaya to your build via the path field
in the YAML:
theory:
camb:
path: /path/to/your/eftcamb# Smoke test — verifies the pipeline end-to-end (~5 min on 4 workers)
python tpm_stability_map.py --yaml TPM.yaml --base 64 --refine-iters 0 --workers 4
# Production run
python tpm_stability_map.py \
--yaml TPM.yaml \
--base 4096 \
--refine-iters 3 \
--refine-batch 1024 \
--workers 36 \
--output tpm_stability_map.pkl| Argument | Default | Description |
|---|---|---|
--yaml |
TPM.yaml |
Path to the Cobaya YAML |
--base |
4096 |
Number of base Sobol points |
--refine-iters |
3 |
Number of boundary-refinement passes |
--refine-batch |
1024 |
Points added per refinement pass |
--workers |
ncpu − 1 |
Parallel worker processes |
--seed |
42 |
Sobol scramble seed |
--output |
tpm_stability_map.pkl |
Output pickle path |
--serial |
flag | Disable multiprocessing (for debugging) |
Each worker runs one CAMB instance. Set the following before launching to
avoid OpenMP oversubscription (
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1python Stability_nn.py \
--data tpm_stability_map.pkl \
--epochs 1000 \
--lr 0.001| Argument | Default | Description |
|---|---|---|
--data |
(hardcoded path) | Path to the Phase 1 pickle |
--epochs |
1000 |
Training epochs |
--lr |
0.001 |
Adam learning rate |
| File | Description |
|---|---|
tpm_stability_model.pt |
Best model weights (selected on validation loss) |
tpm_scaler.pkl |
Fitted StandardScaler — must be paired with the weights |
likelihood:
# Physical likelihoods (unchanged)
planck_2018_highl_plik.TTTEEE_lite_native: null
act_dr6_lenslike.ACTDR6LensLike:
variant: actplanck_baseline
lens_only: false
planck_2018_lowl.TT: null
planck_2018_lowl.EE: null
bao.desi_dr2.desi_bao_lrg1: null
sn.desdovekie: null
# Neural network fast prior
NN_fast_prior_TPM.NNFastPrior:
python_path: /path/to/this/repo
model_path: /path/to/tpm_stability_model.pt
scaler_path: /path/to/tpm_scaler.pklpython tpm_stability_viz.py tpm_stability_map.pkl \
--bins 25 \
--min-count 3 \
--outdir figs/Produces two figures:
| File | Content |
|---|---|
tpm_stability_scatter.png |
Corner scatter: 2-D projections of the labelled point cloud. Green = stable, red = unstable. Diagonal panels show 1-D histograms. |
tpm_stability_heatmap.png |
Corner heatmap: |
Note on sample size. The heatmap bins each require at least
--min-countpoints to display a colour (grey = masked). With the smoke-test sample of 64 points and the default--bins 25, the expected number of points per 2-D cell is$64/625 \approx 0.1$ , so almost all cells will appear grey. Use--bins 4 --min-count 2for a coarse but visible map from a small sample.
The value
"If I fix $x_i$ and $x_j$ to values in this bin, what fraction of the remaining 2-D space $(x_k, x_l)$ yields a stable model?"
A single panel cannot distinguish between a sharp boundary crossing and a
uniformly mixed region — the corner layout (all
A grid of
A Sobol sequence is a deterministic, quasi-random sequence designed so that
every new point fills the largest empty region of the parameter space. The
quality of a point set is measured by its star-discrepancy
compared to
The practical consequence is that the Sobol sequence achieves the same coverage uniformity as a grid but with exponentially fewer points in high dimensions.
The sequence is constructed via base-2 arithmetic using a set of direction
numbers that ensure successive points are always placed in the least-covered
sub-interval. The scrambled variant (used here via scipy.stats.qmc.Sobol)
randomises the starting direction while preserving the low-discrepancy property,
which provides an unbiased estimator and enables error estimation via
independent scrambled replicates.
An important implementation detail: the balance property of a Sobol sequence is
strongest when
After the base Sobol sample is labelled, the script identifies the
stability boundary
Step 1 — normalisation. The four parameters live on very different scales.
Euclidean distances in raw parameter space are dominated by whichever parameter
has the largest range. To make the geometry meaningful, all coordinates are
first mapped to
Step 2 — local stable fraction. For each point
where
Step 3 — uncertainty score. A scalar uncertainty is derived from
This function peaks at
Step 4 — bounding box refinement. The top 10% of points by
This adaptive procedure is iterated
For each parameter point, the script builds a minimal Cobaya model: the EFTCAMB
theory is configured exactly as in TPM.yaml (same flags: EFTflag=3,
DesignerEFTmodel=3, ghost and gradient stability enabled), but the physical
likelihoods are replaced by the no-op one likelihood which always returns 0.
The six ΛCDM parameters are pinned at the centres of their ref distributions:
| Parameter | Value |
|---|---|
ombh2 |
0.0224 |
omch2 |
0.118 |
tau |
0.055 |
logA |
3.05 |
ns |
0.965 |
H0 |
72.0 |
Calling model.loglike(θ) then triggers only the EFTCAMB background
initialisation — the step at which the stability conditions are evaluated.
If EFTCAMB declares the model stable, the call returns np.isfinite(loglike).
The multiprocessing.Pool with a pool initialiser: each worker process builds
its own Cobaya model once at startup (a ~10 s overhead) and then receives
parameter points one at a time, amortising the initialisation cost over hundreds
of evaluations.
The multiprocessing start method is set to spawn rather than fork. This is
necessary because fork would copy the parent process's memory, including any
OpenMP thread-pool state initialised by numpy or CAMB imports. An OpenMP runtime
that has been forked is in an undefined state and can deadlock. With spawn,
each worker starts a fresh Python interpreter and initialises OpenMP cleanly
under its own OMP_NUM_THREADS setting.
The neural network is a Multi-Layer Perceptron (MLP) that maps the 4-dimensional normalised parameter vector to a single real number (the logit):
The architecture consists of three hidden layers with ReLU activations:
Input: θ̃ ∈ ℝ⁴
│
├─ Linear(4 → 64) + ReLU
├─ Linear(64 → 64) + ReLU
├─ Linear(64 → 32) + ReLU
└─ Linear(32 → 1) ← raw logit z ∈ ℝ
Why this depth? The stability boundary
Why ReLU? The ReLU activation
Output and probability interpretation. The final layer produces an
unconstrained scalar
The sign of
-
$z > 0 \Leftrightarrow P > 0.5$ : predict stable. -
$z \leq 0 \Leftrightarrow P \leq 0.5$ : predict unstable.
Preprocessing. Before training, the inputs are standardised with a
StandardScaler:
where tpm_scaler.pkl and must be applied identically
at inference time — the network weights are calibrated to normalised inputs and
are meaningless without it.
Loss function. The network is trained to minimise the binary cross-entropy with logits:
This loss penalises confident wrong predictions exponentially: predicting
BCEWithLogitsLoss — which fuses
the sigmoid and the logarithm into a single numerically stable operation via
the log-sum-exp trick — avoids underflow when
Optimiser. Adam (Adaptive Moment Estimation) is used with learning rate
This adapts the effective step size for each weight independently, making it
robust to the different curvatures of the loss along different parameter
directions — particularly useful when the stability boundary has very different
sharpness in the
Model selection. The training set uses 80% of the labelled data, with the remaining 20% held out as a validation set. Validation loss and accuracy are computed every 100 epochs without gradient updates. The checkpoint with the lowest validation loss is saved, rather than the last epoch, to prevent overfitting.
NNFastPrior is registered with Cobaya as a Likelihood, making it part of
the posterior:
The fast prior contributes
The potential source of bias is at the boundary: the network has finite
accuracy, so some truly stable points near z > 0 can be lowered (e.g. to z > −1.0, corresponding to
| Configuration | Evaluations | Indicative wall time |
|---|---|---|
Smoke test (--base 64, 4 workers) |
64 | ~5 min |
Production (--base 4096 + |
~7200 | 4–12 h |
| NN inference (per MCMC step) | — | ~0.1 ms |
| EFTCAMB background (per MCMC step) | — | ~5–20 s |
The fast prior is most effective when the unstable region is large. If, for example, 60% of proposed MCMC points fall in the unstable region, the fast prior reduces EFTCAMB calls by ~60%, cutting MCMC wall time proportionally.
For background-only EFTCAMB calls (as in Phase 1), the parallelisable fraction
of the code is small (background solver is mostly serial, unlike the
perturbation and lensing modules). One OpenMP thread per worker
(OMP_NUM_THREADS=1) therefore maximises throughput by avoiding thread
synchronisation overhead. The recommended setup for a 40-core node is
--workers 36 with OMP_NUM_THREADS=1, leaving 4 cores free for the OS and
the master Python process.
- Benevento, G. et al. 2022, ApJ 935, 156 — TPM model: definition, EFT mapping, and first MCMC constraints.
- Kable, J. et al. 2023, ApJ 959, 143 — TPM constraints with SPT data.
- Hu, B. et al. 2014 — EFTCAMB: arXiv:1312.5742
- Raveri, M. et al. 2014 — EFTCosmoMC: arXiv:1405.1022
- Frusciante, N. & Perenon, L. 2020, Phys. Rep. 857, 1 — EFTCAMB review.
- Deffayet, C. et al. 2010, JCAP 10, 026 — Kinetic gravity braiding.
- Joe, S. & Kuo, F. Y. 2008, SIAM J. Sci. Comput. — Sobol direction numbers.
- Kingma, D. P. & Ba, J. 2015 — Adam optimiser: arXiv:1412.6980
Lorenzo Baldazzi
PhD student, Università degli Studi di Roma Tor Vergata
2026
