Code for the paper "When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data".
The repository compares representative gene regulatory network (GRN) inference methods under controlled single-cell pathologies: dropout, latent confounding, cell-type mixing, feedback, graph density, sample size, and pseudotime drift.
- src/simulator.py: synthetic GRN simulator with linear and nonlinear SCMs.
- src/methods.py: inference methods used in the benchmark.
- src/metrics.py: AUPRC and error decomposition metrics.
- src/experiments.py: experiment grids and runners.
- src/plotting.py: figure and table generation.
- src/run_all.py: command-line entry point.
Generated files are written to:
- results/: experiment CSVs.
- figures/: paper figures in PNG and PDF.
- tables/: summary CSV tables.
These output directories are ignored by git so runs can be regenerated locally.
This project uses Python 3.13 and uv.

    uv sync

You can also run the code with any Python 3.13 environment that has the dependencies listed in pyproject.toml.
Run a small linear-SCM sweep and build figures/tables:

    uv run python -m src.run_all --quick

Run the full default linear-SCM sweep:

    uv run python -m src.run_all

The default command writes results/results.csv, then builds the standard figures and tables.
Run this linear sweep before running optional sweeps by themselves. The plotting code uses results/results.csv as the baseline for shared figures.
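A small pre-flight check can make this ordering explicit. The sketch below is a hypothetical helper, not part of the repository; the only thing it takes from this README is the results/results.csv path, and the name `baseline_ready` is invented for illustration.

```python
from pathlib import Path

# Path taken from this README; the helper itself is illustrative only.
BASELINE_CSV = Path("results") / "results.csv"


def baseline_ready(path: Path = BASELINE_CSV) -> bool:
    """Return True if the linear-sweep CSV exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if baseline_ready():
        print(f"Found {BASELINE_CSV}; optional sweeps can be plotted against it.")
    else:
        print(f"Missing {BASELINE_CSV}; run `uv run python -m src.run_all` first.")
```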
The linear sweep is the main synthetic benchmark. It sweeps each pathology independently over five levels and evaluates all methods over multiple random seeds.

    uv run python -m src.run_all --n-seeds 10

Use fewer seeds for iteration:

    uv run python -m src.run_all --quick

Run the same pathology grid with the nonlinear tanh SCM:

    uv run python -m src.run_all --nonlinear --n-seeds 10

This command expects the linear results to exist because the plotting step overlays nonlinear results against the linear baseline. To run both in one command, use --all or run the linear sweep first.
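To make the linear versus tanh distinction concrete, here is a minimal sketch of sampling one cell from an additive-noise structural causal model over a small acyclic graph. It is not the code in src/simulator.py: the function name `sample_cell`, the weight dictionary, the Gaussian noise, and the choice to apply tanh to the summed parent input are all assumptions made for illustration, and the sketch ignores pathologies such as feedback and dropout.

```python
import math
import random


def sample_cell(weights, order, nonlinear=False, noise_sd=1.0, rng=None):
    """Sample expression values for one cell from an additive-noise SCM.

    weights[(parent, child)] is the edge weight and `order` is a topological
    ordering of the genes. With nonlinear=True the summed parent input is
    passed through tanh before noise is added (an assumed functional form).
    """
    rng = rng or random.Random()
    x = {}
    for gene in order:
        # Sum contributions from parents, which were sampled earlier in `order`.
        drive = sum(w * x[p] for (p, c), w in weights.items() if c == gene)
        if nonlinear:
            drive = math.tanh(drive)
        x[gene] = drive + rng.gauss(0.0, noise_sd)
    return x


# Hypothetical 3-gene chain: g0 -> g1 -> g2.
weights = {("g0", "g1"): 1.5, ("g1", "g2"): -0.8}
order = ["g0", "g1", "g2"]

linear_cell = sample_cell(weights, order, rng=random.Random(0))
tanh_cell = sample_cell(weights, order, nonlinear=True, rng=random.Random(0))
```

With the same seed, the root gene g0 takes the same value in both cells because it has no parents; the two models only diverge downstream, where tanh bounds the parent contribution to the interval (-1, 1).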
Run the joint dropout x confounders x density sweep:

    uv run python -m src.run_all --interaction --interaction-seeds 5

The interaction results are saved to results/results_interaction.csv.
As above, run the linear sweep first if results/results.csv does not already exist.
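The joint sweep differs from the independent pathology sweeps in the shape of its grid: varying one factor at a time holds the others fixed, whereas the interaction sweep takes the Cartesian product of the factors. The sketch below illustrates the two shapes. The level values, the defaults, and both function names are made up for this example; the real grids are defined in src/experiments.py and may differ.

```python
from itertools import product

# Hypothetical levels and defaults; the actual values live in src/experiments.py.
LEVELS = {
    "dropout": [0.0, 0.2, 0.4, 0.6, 0.8],
    "confounders": [0, 1, 2, 4, 8],
    "density": [0.05, 0.1, 0.2, 0.3, 0.4],
}
DEFAULTS = {"dropout": 0.0, "confounders": 0, "density": 0.1}


def one_at_a_time(levels, defaults):
    """Vary each factor alone, holding the others at their default value."""
    return [
        {**defaults, name: value}
        for name, values in levels.items()
        for value in values
    ]


def full_factorial(levels):
    """Every combination of every level of every factor."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


independent = one_at_a_time(LEVELS, DEFAULTS)  # 3 factors x 5 levels = 15 configs
joint = full_factorial(LEVELS)                 # 5 ** 3 = 125 configs
```

The joint grid grows multiplicatively with the number of factors, so each additional seed costs far more in the interaction sweep than in the independent sweeps.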
Run the linear, nonlinear, and interaction experiments:

    uv run python -m src.run_all --all

If the result CSVs already exist, regenerate figures and tables without recomputing experiments:

    uv run python -m src.run_all --figures-only
    uv run python -m src.run_all --nonlinear --figures-only
    uv run python -m src.run_all --interaction --figures-only
    uv run python -m src.run_all --all --figures-only

The benchmark includes:
- Pearson correlation.
- Mutual information.
- GENIE3-style random forest feature importance.
- PC-style conditional independence testing.
- GES-style greedy BIC search.
- NOTEARS.
Undirected and directed AUPRC are reported for every method. Error decomposition is computed at a top-K threshold where K is the number of true directed edges.
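As a rough illustration of these metrics, the sketch below computes average precision (a common step-wise estimate of AUPRC) on directed and undirected edge scores, then classifies the top-K predictions. It is not the code in src/metrics.py. The three error categories (correct, reversed, spurious), the max-over-directions rule for collapsing to undirected scores, and all function names are assumptions; the paper's decomposition may use different or finer categories.

```python
def average_precision(scores, truth):
    """Average precision over ranked edges, a step-wise estimate of AUPRC.

    scores: {edge: score}; truth: set of true edges. Ties are broken by
    sort order, which a real implementation may handle differently.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits, total = 0, 0.0
    for rank, edge in enumerate(ranked, start=1):
        if edge in truth:
            hits += 1
            total += hits / rank  # precision at each recalled true edge
    return total / len(truth) if truth else 0.0


def undirected(scores):
    """Collapse directed scores to unordered pairs, keeping the max score."""
    out = {}
    for (a, b), s in scores.items():
        key = frozenset((a, b))
        out[key] = max(s, out.get(key, float("-inf")))
    return out


def top_k_errors(scores, truth):
    """Classify the top-K predicted edges, with K = number of true directed edges."""
    k = len(truth)
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    counts = {"correct": 0, "reversed": 0, "spurious": 0}
    for a, b in top:
        if (a, b) in truth:
            counts["correct"] += 1
        elif (b, a) in truth:
            counts["reversed"] += 1
        else:
            counts["spurious"] += 1
    return counts


# Toy example: true graph A -> B -> C, with made-up method scores.
truth = {("A", "B"), ("B", "C")}
scores = {("A", "B"): 0.9, ("C", "B"): 0.8, ("B", "C"): 0.7,
          ("A", "C"): 0.4, ("B", "A"): 0.2, ("C", "A"): 0.1}

directed_ap = average_precision(scores, truth)
undirected_ap = average_precision(undirected(scores), {frozenset(e) for e in truth})
errors = top_k_errors(scores, truth)
```

In this toy case the reversed edge C -> B ranks above the true edge B -> C, so directed average precision drops to 5/6 while undirected average precision stays at 1.0, and the top-2 decomposition reports one correct and one reversed edge.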
This code is released under the MIT License.