Tools Catalog — Automated Empirical Research & Causal Inference

Curated index of software tools for automated empirical research and causal inference, annotated with license and maintenance status. It is separate from the agent skills under ../skills/. Source of truth: tools.json. Rebuild with python3 scripts/build-tools-catalog.py.

Summary

334 tools across 6 categories.

Category	Count
Causal-inference & treatment-effect libraries	31
Econometrics & quasi-experimental libraries	170
Causal discovery / structure learning	25
Autonomous research & data-science agents	51
MCP servers (data & stats execution)	48
Benchmarks & datasets	9

By language	Tools	By maintenance	Tools	By license	Tools
Python	164	🟢 active	181	permissive (MIT/BSD/Apache/…)	183
R	109	🟡 maintained	94	copyleft (GPL/AGPL/LGPL/CeCILL)	104
Stata	53	🔴 dormant	59	unverified / unmapped	39
TypeScript	16			proprietary / non-OSI / custom	8
Julia	11
C++	6
Java	3
JavaScript	3
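
As a quick sanity check on the summary figures above, the sketch below re-adds the counts as transcribed from the tables. The dictionary names are illustrative only and are not taken from tools.json or the build script. The language counts add up to more than 334, presumably because multi-language tools (for example those listed as R · Python · Stata) are counted once per language; that reading is an assumption, not something the catalog states.

```python
# Counts copied by hand from the summary tables; names are illustrative.
TOTAL_TOOLS = 334

by_category = {
    "Causal-inference & treatment-effect libraries": 31,
    "Econometrics & quasi-experimental libraries": 170,
    "Causal discovery / structure learning": 25,
    "Autonomous research & data-science agents": 51,
    "MCP servers (data & stats execution)": 48,
    "Benchmarks & datasets": 9,
}
by_maintenance = {"active": 181, "maintained": 94, "dormant": 59}
by_license = {
    "permissive": 183,
    "copyleft": 104,
    "unverified / unmapped": 39,
    "proprietary / non-OSI / custom": 8,
}
by_language = {
    "Python": 164, "R": 109, "Stata": 53, "TypeScript": 16,
    "Julia": 11, "C++": 6, "Java": 3, "JavaScript": 3,
}

# Each of these breakdowns partitions the catalog, so each must sum to the total.
for label, counts in [
    ("category", by_category),
    ("maintenance", by_maintenance),
    ("license", by_license),
]:
    assert sum(counts.values()) == TOTAL_TOOLS, label

# Languages overlap (one tool can list several), so this sum exceeds the total.
language_total = sum(by_language.values())
print(language_total - TOTAL_TOOLS, "extra language entries")
```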

last_activity and stars_approx are point-in-time snapshots from the curation pass (see README.md for caveats). Status: 🟢 active ≈ commit within ~6 months · 🟡 maintained ≈ within ~2 years · 🔴 dormant ≈ older.
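
The status legend can be read as a simple threshold rule on last_activity. The sketch below is one possible reading, assuming a 'YYYY-MM' string and a caller-supplied snapshot date. The function name, the exact 6- and 24-month cut-offs, and the snapshot date are assumptions made for illustration; this is not the logic of scripts/build-tools-catalog.py. Because the catalog's thresholds are approximate, a strict cut-off like this will not reproduce every row's label.

```python
from datetime import date

# Thresholds read off the status legend; the catalog marks them as approximate.
ACTIVE_MONTHS = 6
MAINTAINED_MONTHS = 24


def classify_status(last_activity: str, snapshot: date) -> str:
    """Return 'active', 'maintained' or 'dormant' for a 'YYYY-MM' string."""
    parts = last_activity.split("-")
    if len(parts) != 2:
        # Year-only or missing dates are left for the curator to resolve.
        raise ValueError(f"expected 'YYYY-MM', got {last_activity!r}")
    year, month = int(parts[0]), int(parts[1])
    # Whole calendar months between the last activity and the snapshot.
    months_ago = (snapshot.year - year) * 12 + (snapshot.month - month)
    if months_ago <= ACTIVE_MONTHS:
        return "active"
    if months_ago <= MAINTAINED_MONTHS:
        return "maintained"
    return "dormant"


# Hypothetical snapshot date, chosen only for illustration.
snapshot = date(2026, 6, 1)
for stamp in ("2026-05", "2025-01", "2019-08"):
    print(stamp, classify_status(stamp, snapshot))
```

Rows that carry only a year, or no date at all, fall outside this rule and raise ValueError here rather than being guessed.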

Causal-inference & treatment-effect libraries (31)

Tool	Lang	License	Status	What it does
Ananke	Python	Apache-2.0	🟡 maintained · 2023-12	Python package for causal inference using graphical models (DAGs, ADMGs, chain graphs) supporting nonparametric identification and semiparametric estimation under unmeasured confounding.
bartCause	R	GPL-2.0-or-later	🟢 active · 2025-12	R package for causal inference using Bayesian Additive Regression Trees (BART), fitting response/treatment models to estimate ATE/ATT/ITE.
bcf	R · C++	unverified	🟡 maintained · 2023-01	Reference R implementation of Bayesian Causal Forests (Hahn-Murray-Carvalho) for heterogeneous treatment-effect estimation with regularized treatment-effect priors.
CATENets	Python	BSD-3-Clause	🟡 maintained · 2023-08	sklearn-style JAX/PyTorch implementations of neural-network CATE estimators including TARNet, CFRNet, DragonNet, SNet, FlexTENet, and NN meta-learners.
causal-curve	Python	MIT	🟡 maintained · 2024-05	Python package for estimating causal dose-response curves (continuous-treatment effects) from observational data with confidence intervals.
CausalImpact	R	Apache-2.0	🟢 active · 2026-03	Google's R package estimating the causal effect of an intervention on a time series using a Bayesian structural time-series counterfactual model.
Causalinference	Python	BSD-3-Clause	🟡 maintained · 2025-06	Classic Python package for treatment-effect estimation via propensity-score estimation, trimming, subclassification, matching, weighting, and least-squares.
causallib	Python	Apache-2.0	🟢 active · 2026-05	IBM's scikit-learn-style package for estimating causal effects from observational data via IPW, standardization, doubly-robust (AIPW), and matching estimators.
CausalLift	Python	BSD-2-Clause	🔴 dormant · 2019-08	Uplift modeling package based on the T-learner targeting which customers to treat, usable with both A/B-test and observational data.
CausalML	Python	Apache-2.0	🟢 active · 2026-05	Uber's uplift modeling and causal ML toolkit providing CATE/ITE estimation via S/T/X/R meta-learners, uplift trees/forests, and tree-based treatment selection.
CausalPy	Python	Apache-2.0	🟢 active · 2026-05	PyMC Labs package for Bayesian (and OLS) causal inference in quasi-experimental designs including synthetic control, interrupted time series, difference-in-differences, regression discontinuity, and instrumental variables.
causalToolbox	R	GPL-3.0	🔴 dormant · 2021	R toolbox for heterogeneous treatment effects implementing S/T/X/M/DR meta-learners with honest random forests and BART base learners (now mirrored at forestry-labs/causalToolbox).
CausalTune	Python	Apache-2.0	🟡 maintained · 2024-12	AutoML library for automated tuning and out-of-sample (energy-score) selection of causal estimators wrapping EconML/DoWhy via FLAML.
DoubleML (Python)	Python	BSD-3-Clause	🟢 active · 2026-05	Object-oriented implementation of the double/debiased machine learning framework on top of scikit-learn for partially linear, IV, and interactive regression models.
DoubleML (R)	R	MIT	🟢 active · 2026-05	R implementation of the double/debiased machine learning framework built on the mlr3 ecosystem for orthogonal-score estimation of treatment effects.
DoWhy	Python	MIT	🟢 active · 2025-11	End-to-end Python causal inference library that models assumptions as a causal graph and provides a four-step identify/estimate/refute API with refutation-based robustness tests.
EconML	Python	MIT	🟢 active · 2026-06	Microsoft ALICE project package for estimating heterogeneous treatment effects (CATE) from observational data using double machine learning, orthogonal/causal forests, DRLearner, DeepIV and meta-learners.
grf	R · C++	GPL-3.0	🟢 active · 2026-04	Generalized Random Forests for nonparametric heterogeneous treatment-effect estimation (causal forests), including IV, multi-arm, and survival forests with honest confidence intervals.
ltmle	R	GPL-2.0	🟡 maintained · 2023-04	R package for longitudinal targeted maximum likelihood estimation (and IPTW/G-computation) of treatment/censoring-specific mean outcomes and marginal structural models.
MendelianRandomization	R	GPL-2.0-or-later	🟢 active · 2024-04	CRAN R package implementing many summary-data Mendelian randomization methods (IVW, MR-Egger, median, mode, contamination-mixture, cML, debiased IVW) for causal effect estimation.
metalearners	Python	BSD-3-Clause	🟢 active · 2025-06	QuantCo's library for CATE estimation with S/T/X/R/DR meta-learners featuring sound cross-fitting, multi-treatment support, and SHAP/optuna integrations.
policytree	R · C++	MIT	🟢 active · 2026-02	R package learning optimal shallow decision-tree treatment policies via doubly-robust empirical welfare maximization using grf scores.
pylift	Python	BSD-2-Clause	🔴 dormant · 2022-11	Wayfair's uplift modeling package implementing the Transformed Outcome method with uplift evaluation/visualization tools (repository archived).
scikit-uplift	Python	MIT	🔴 dormant · 2022-08	scikit-learn-style uplift modeling package providing solo-model/two-model/class-transformation approaches plus uplift metrics and visualizations.
stochtree	Python · R · C++	MIT	🟢 active · 2026-05	Stochastic tree ensembles (BART/XBART/BCF) in R and Python for supervised learning and Bayesian Causal Forest treatment-effect estimation.
tfcausalimpact	Python	Apache-2.0	🟡 maintained · 2025-01	Python port of Google's CausalImpact built on TensorFlow Probability for Bayesian structural time-series intervention analysis.
tmle (R)	R	BSD-3-Clause	🟢 active · 2025-08	Gruber & van der Laan's R package for targeted maximum likelihood estimation of ATE/ATT/ATC for a binary point treatment with SuperLearner-based nuisance estimation.
tmle3	R	GPL-3.0	🔴 dormant · 2021-03	Generalized targeted learning (TMLE) framework from the tlverse providing a unified interface for estimating a range of causal target parameters.
TwoSampleMR	R	MIT	🟢 active · 2026-05	R package for two-sample Mendelian randomization using GWAS summary data, interfacing the IEU OpenGWAS database with IVW, MR-Egger, median and mode estimators.
UpliftML	Python	Apache-2.0	🔴 dormant · 2022-12	Booking.com's scalable uplift modeling package with PySpark/H2O implementations of metalearners, uplift random forests, and retrospective/constrained estimation.
zEpid	Python	MIT	🟡 maintained · 2022-10	Epidemiology analysis package with causal inference estimators including IPTW/AIPW, g-formula (parametric and Monte Carlo), and TMLE.

Econometrics & quasi-experimental libraries (170)

Tool	Lang	License	Status	What it does
access	Python	BSD-3-Clause	🟢 active · 2025-12	Classical and novel spatial accessibility-to-services measures (floating catchment, gravity, RAAM) within the PySAL ecosystem.
admetan	Stata	GPL-3.0-only	🔴 dormant · 2019-02	Stata module providing comprehensive aggregate-data meta-analysis and forest plots; deprecated since 2020 in favor of metan/ipdmetan.
AER	R	GPL-2.0-or-later	🟢 active · 2026-02	Applied Econometrics with R companion package providing IV (ivreg), tobit, count and other econometric estimators and datasets.
allsynth	Stata	GPL-3.0	🟡 maintained	Wrapper around synth automating bias-correction, in-space placebo inference and stacked multi-unit synthetic control (Wiltshire).
anesrake	R	GPL-2.0-or-later	🔴 dormant · 2018-04	Implements ANES-style iterative raking to weight survey data to known target population margins with automatic variable selection.
ARDL	R	GPL-3.0-only	🟢 active · 2026-05	Builds ARDL and unrestricted/restricted error-correction models and runs the Pesaran-Shin-Smith (2001) bounds test for cointegration.
augsynth	R	MIT	🟡 maintained · 2024	Augmented synthetic control method (and multisynth for staggered adoption) that de-biases SCM when pre-treatment fit is imperfect.
AutoregressiveModels.jl	Julia	MIT	🟡 maintained · 2024-04	Julia toolkit for vector autoregressions with OLS estimation and structural impulse-response computation with bootstrap confidence bands.
autumn	R	MIT	🟡 maintained · 2024-01	Performs fast, tidy-friendly iterative proportional fitting (raking) to generate survey weights matching target population distributions.
bacondecomp	R	MIT	🔴 dormant · 2020-01	Goodman-Bacon decomposition of two-way fixed-effects DiD estimates into their underlying 2x2 comparison weights.
balance	Python	MIT	🟢 active · 2026-06	Workflow and methods (IPW, raking, post-stratification) for adjusting biased samples to infer about a target population.
bayesmeta	R	GPL-2.0-or-later	🟡 maintained · 2025-08	Bayesian random-effects meta-analysis and meta-regression, returning posterior and predictive distributions and shrinkage estimates.
binscatter	Stata	unverified	🟡 maintained	Generates binned scatterplots to visualize conditional means / OLS relationships in Stata.
binsreg	R · Python · Stata	GPL-3.0	🟢 active · 2026-05	Binscatter least-squares, quantile and GLM regression with valid confidence bands and shape-restriction tests.
boottest	Stata	GPL-3.0	🟢 active · 2026-04	Fast wild bootstrap (null-imposed) and score bootstrap for cluster-robust inference with few clusters in Stata.
brms	R	GPL-2.0-only	🟢 active · 2026-06	Fits Bayesian generalized (non-)linear multivariate multilevel models via Stan, widely used as the multilevel-regression engine for MRP.
BVAR	R	GPL-3.0-only	🟡 maintained · 2024-02	Hierarchical Bayesian VAR estimation with Giannone-Lenza-Primiceri conjugate-prior selection, computing impulse responses, forecasts and FEVD.
bvarsv	R	GPL-2.0-or-later	🔴 dormant · 2015-10	Bayesian estimation of a time-varying-parameter VAR with stochastic volatility (Primiceri 2005) for posterior predictive densities and impulse responses.
cem	R	GPL-2.0	🔴 dormant · 2022-09	Coarsened exact matching for reducing imbalance between treatment and control groups in observational data.
clubSandwich	R	GPL-3.0	🟢 active · 2026-05	Cluster-robust (CR2) variance estimators with small-sample corrections and Satterthwaite/Wald hypothesis tests.
cobalt	R	GPL-2.0-or-later	🟢 active · 2026-05	Balance tables and Love plots for samples preprocessed by matching, weighting or subclassification.
coefplot	Stata	MIT	🟢 active · 2025-08	Plots coefficients/confidence intervals from estimation results or matrices (widely used for event-study graphs).
compute.es	R	GPL-2.0-or-later	🟢 active · 2026-01	Converts a wide range of test statistics into effect sizes (d, g, r, z', OR) with variances, CIs, and p-values for meta-analysis.
csdid (Python)	Python	MIT	🟢 active · 2025	Python port of the Callaway & Sant'Anna group-time ATT estimator for staggered DiD.
csdid (Stata)	Stata	MIT	🟢 active · 2025	Stata implementation of Callaway & Sant'Anna group-time ATTs with panel and repeated cross-section support (Rios-Avila).
csdid2	Stata	MIT	🟢 active · 2025	Faster all-Mata reimplementation of csdid (Callaway-Sant'Anna staggered DiD) with extended functionality.
designmatch	R	GPL-2.0-or-later	🟢 active · 2026-02	Constructs matched samples that are balanced and representative by design via mixed-integer programming (cardinality/optimal matching).
did	R	GPL-2.0	🟢 active · 2025-12	Implements Callaway & Sant'Anna group-time average treatment effects for staggered difference-in-differences with multiple periods.
did2s	R	MIT	🟢 active · 2026-03	Two-stage difference-in-differences estimator (Gardner 2021) robust to heterogeneous treatment effects under staggered adoption.
did_imputation	Stata	GPL-3.0	🟡 maintained · 2024	Borusyak, Jaravel & Spiess imputation estimator and event-study plotting (did_imputation/event_plot) for staggered DiD in Stata.
didimputation	R	MIT	🟡 maintained · 2024	Imputation-based DiD estimator of Borusyak, Jaravel & Spiess (2021/2024) for staggered treatment timing.
DIDmultiplegt	R · Stata	MIT	🟢 active · 2026-02	de Chaisemartin & D'Haultfoeuille heterogeneity-robust DiD estimators for multiple groups, periods and non-binary treatments (original version).
DIDmultiplegtDYN	R · Stata	MIT	🟢 active · 2026-05	Dynamic (event-study) heterogeneity-robust DiD estimator allowing treatments that switch on and off multiple times.
differences	Python	GPL-3.0	🟢 active · 2026-04	Difference-in-differences estimation in Python (Callaway-Sant'Anna and related estimators) for staggered adoption with heterogeneous effects.
dmetar	R	GPL-3.0-only	🟡 maintained · 2025-05	Companion package of helper functions for the 'Doing Meta-Analysis in R' guide, extending meta, metafor, and netmeta with diagnostics and visualizations.
drdid (Stata)	Stata	MIT	🟢 active · 2025	Doubly-robust difference-in-differences estimators (Sant'Anna & Zhao 2020) for Stata; the building block for csdid.
dynamac	R	GPL-2.0-or-later	🔴 dormant · 2022-11	Estimates single-equation ARDL/error-correction models, dynamically simulates and plots their responses, and tests for cointegration (Jordan & Philips).
ebal	R	GPL-2.0-or-later	🟢 active · 2026-04	Entropy balancing reweighting so covariate moments match user-specified targets in observational studies (Hainmueller).
Econometrics.jl	Julia	ISC	🟡 maintained · 2024-12	General econometrics package for Julia covering panel models, IV and discrete-choice estimators.
esc	R	GPL-3.0-only	🔴 dormant · 2023-09	Computes effect sizes and their variances (d, g, r, OR, etc.) from diverse reported statistics for use in meta-analysis.
esda	Python	BSD-3-Clause	🟢 active · 2026-03	Exploratory spatial data analysis: global and local autocorrelation (Moran's I, Geary, Getis-Ord, local Moran/LISA) for continuous and binary areal data.
estimatr	R	MIT	🟡 maintained · 2025-02	Fast design-based OLS/IV estimators (lm_robust, iv_robust, difference_in_means) with robust and cluster-robust standard errors.
estout (esttab)	Stata	MIT	🟢 active · 2026-04	Produces publication-quality regression tables (esttab/estout) exportable to LaTeX, RTF, HTML and CSV.
etwfe	R	MIT	🟢 active · 2026-03	Extended two-way fixed effects (Wooldridge) DiD via saturated cohort-by-time interactions plus marginal-effects aggregation.
eventstudyinteract	Stata	MIT	🟡 maintained · 2023	Sun & Abraham interaction-weighted event-study estimator robust to heterogeneous treatment effects under staggered timing.
eventstudyr	R	MIT	🟢 active · 2026-04	Estimates and plots linear panel event-study models following Freyaldenhoven et al., including sup-t bands and pre-trend tests.
fect	R	MIT	🟢 active · 2026-05	Counterfactual estimators for causal panel analysis (two-way FE, interactive fixed effects, matrix completion) with diagnostic tests.
FixedEffectModels.jl	Julia	MIT	🟢 active · 2026-04	Estimates linear models with high-dimensional fixed effects and instrumental variables in Julia (reghdfe/fixest analog).
fixest	R	GPL-3.0	🟢 active · 2026-05	Fast and user-friendly estimation of OLS, GLM and IV models with multiple high-dimensional fixed effects, with built-in clustered/robust inference and event-study tooling.
ftools	Stata	MIT	🟢 active · 2026-01	Fast Mata-based data manipulation backend (collapse/merge/egen) that powers reghdfe and other Stata commands.
fwildclusterboot	R	GPL-3.0	🔴 dormant · 2023-07	Fast wild cluster bootstrap inference for OLS/IV with few clusters (R port of boottest); archived on CRAN, source remains on GitHub.
GeoDa	C++	GPL-3.0-only	🟢 active · 2025-09	Cross-platform desktop GUI for exploratory spatial data analysis, LISA mapping, spatial weights and basic spatial regression on lattice data.
giddy	Python	BSD-3-Clause	🟢 active · 2025-12	Geospatial distribution dynamics: spatial Markov chains, rank/mobility and directional LISA analysis of longitudinal spatial data.
GLFixedEffectModels.jl	Julia	MIT	🟢 active · 2026-03	Estimates GLMs (logit, Poisson, etc.) with high-dimensional fixed effects in Julia (ppmlhdfe analog).
gsynth	R	MIT	🟢 active · 2026-03	Generalized synthetic control imputing counterfactuals via interactive fixed-effects models, supporting multiple treated units and staggered timing.
gtools	Stata	MIT	🟡 maintained · 2024-06	C-plugin accelerated versions of common Stata data commands (collapse, egen, reshape, pctile) used in large-panel workflows.
HonestDiD	R	MIT	🟢 active · 2026-04	Robust inference and sensitivity analysis for DiD/event-study designs under relaxations of the parallel-trends assumption (Rambachan & Roth).
ipdmetan	Stata	GPL-3.0-only	🔴 dormant · 2022-10	Stata module for two-stage individual-participant-data meta-analysis with subgroup and forest-plot support.
ipfn	Python	MIT	🟡 maintained · 2024-05	Implements N-dimensional iterative proportional fitting to adjust a data matrix so its margins match specified target totals.
ipfraking	Stata	GPL-3.0-only	🔴 dormant · 2018-05	Stata module performing iterative proportional fitting (raking) to calibrate complex survey weights to control totals with trimming and diagnostics.
ivmodel	R	GPL-2.0	🔴 dormant · 2023-04	IV estimation with weak-instrument-robust inference (AR, CLR), power and sensitivity analysis for a single endogenous regressor.
ivreg	R	GPL-2.0-or-later	🟢 active · 2026-03	Instrumental-variables (2SLS/2SM/2SMM) regression with weak-instrument and endogeneity diagnostics.
ivreg2	Stata	GPL-3.0	🟡 maintained · 2024-08	Extended IV/2SLS/LIML/GMM estimation with weak-instrument and overidentification diagnostics (Baum, Schaffer & Stillman).
ivreghdfe	Stata	MIT	🟢 active · 2025-12	Combines ivreg2 and reghdfe to run IV/2SLS/GMM regressions with many high-dimensional fixed effects.
kmatch	Stata	MIT	🟢 active · 2026-02	Multivariate-distance and propensity-score matching with entropy balancing, IPW, CEM and regression adjustment.
lfe	R	Apache-2.0	🟡 maintained · 2025-02	Estimates linear models with multiple high-dimensional group fixed effects (and IV) by transforming away factors before OLS.
libpysal	Python	BSD-3-Clause	🟢 active · 2026-01	Core PySAL components: spatial weights construction, computational geometry, graphs, and I/O underpinning the spatial-econometrics stack.
linearmodels	Python	NCSA	🟢 active · 2025-10	Panel (fixed/random effects), IV/2SLS-GMM, system and asset-pricing estimators missing from statsmodels.
localprojections	Python	MIT	🔴 dormant · 2023-09	Implements Jordà (2005) local-projection impulse responses for single-entity time series and panel data, including threshold/state-dependent variants.
LocalProjections.jl	Julia	MIT	🟡 maintained · 2024-04	Julia implementation of local-projection methods for impulse-response estimation, including lag-augmented and smoothed local projections.
locproj	Stata	GPL-3.0-only	🟢 active · 2026-02	Stata (SSC) command estimating linear and nonlinear local-projection IRFs for time-series and panel data, supporting IV and quantile-regression variants (Ugarte Ruiz).
lpirfs	R	GPL-2.0-or-later	🟢 active · 2025-12	Estimates linear and nonlinear (state-dependent) impulse responses via Jordà (2005) local projections for time-series and panel data, with identified-shock and IV options.
marginaleffects	R	GPL-3.0	🟢 active · 2026-02	Computes predictions, marginal effects/slopes, comparisons and marginal means with delta-method or simulation inference for 100+ model classes.
MatchIt	R	GPL-2.0-or-later	🟢 active · 2025-05	Unified interface to nearest-neighbor, optimal, full, genetic and coarsened-exact matching for covariate balance in observational studies.
meta	R	GPL-2.0	🟢 active · 2026-05	Standard meta-analysis methods including fixed/random-effects models, meta-regression, bias tests, and forest/funnel plots.
meta	Stata	proprietary	🟢 active · 2026-06	Stata's built-in meta suite for fixed/random-effects meta-analysis, meta-regression, forest/funnel plots, and small-study-effect tests.
metabias	Stata	GPL-3.0-only	🔴 dormant · 2010-12	Stata module testing for small-study effects / funnel-plot asymmetry (Egger, Begg, Harbord tests) in meta-analysis.
metafor	R	GPL-2.0-or-later	🟢 active · 2026-05	Comprehensive R package for conducting meta-analyses, including effect-size computation, fixed/random/mixed-effects models, moderators, and forest/funnel plots.
metan	Stata	GPL-3.0-only	🟡 maintained · 2024-07	Comprehensive Stata module for fixed- and random-effects meta-analysis of binary, continuous, or generic effect estimates with flexible forest plots.
metareg	Stata	GPL-3.0-only	🔴 dormant · 2009-01	Stata module performing random-effects meta-regression on study-level summary data with permutation-test p-values.
metaSEM	R	GPL-2.0-or-later	🟢 active · 2026-05	Conducts meta-analysis via structural equation modeling (using OpenMx/lavaan), including fixed/random-effects and meta-analytic SEM on correlation matrices.
mgwr	Python	BSD-3-Clause	🟡 maintained · 2024-01	Calibration, inference and prediction for (multiscale) geographically weighted regression across GLM families with model diagnostics.
modelsummary	R	GPL-3.0	🟢 active · 2026-02	Publication-quality regression and summary tables (and coefficient plots) for many model classes in multiple output formats.
mvmeta	Stata	GPL-3.0-only	🔴 dormant · 2022-04	Stata module for multivariate random-effects meta-analysis and meta-regression on point estimates, variances, and covariances.
netmeta	R	GPL-2.0-or-later	🟢 active · 2026-05	Frequentist network meta-analysis for simultaneously comparing multiple treatments across studies, with inconsistency assessment and network graphs.
network	Stata	GPL-3.0-only	🔴 dormant · 2018-04	Stata module for network (mixed-treatment-comparison) meta-analysis using contrast-based multivariate meta-regression with inconsistency checks.
optmatch	R	MIT	🟡 maintained · 2024-09	Optimal bipartite matching using minimum-cost flow for distance/propensity-score matched designs.
outreg2	Stata	unverified	🔴 dormant · 2014-08	Produces formatted regression-output tables for Word/Excel/LaTeX from Stata estimation results (Roy Wada).
panelView	R	MIT	🟡 maintained · 2024-06	Visualizes treatment status, missingness and outcome dynamics for panel/DiD datasets.
plm	R	GPL-2.0-or-later	🟢 active · 2025-11	Comprehensive panel-data econometrics toolkit with fixed/random effects estimators, robust covariances and panel diagnostic tests.
ppmlhdfe	Stata	MIT	🟢 active · 2026-01	Poisson pseudo-maximum-likelihood regression with multiple high-dimensional fixed effects and robust separation handling.
PracTools	R	GPL-3.0-only	🟢 active · 2026-01	Tools and datasets for designing complex survey samples, computing sample sizes, and constructing/weighting survey samples.
pretrends	R	MIT	🟡 maintained · 2024	Computes the power of pre-trends tests and visualizes detectable violations of parallel trends in event studies.
psmatch2	Stata	unverified	🔴 dormant · 2018-02	Mahalanobis and propensity-score matching with common-support graphing and covariate-imbalance testing (Leuven & Sianesi).
psychmeta	R	GPL-3.0-or-later	🟡 maintained · 2024-06	Psychometric meta-analysis toolkit for bare-bones and artifact-corrected meta-analysis of correlations and d-values (Hunter-Schmidt methods).
puniform	R	GPL-2.0-or-later	🟢 active · 2025-12	Publication-bias-correcting meta-analysis methods (p-uniform / p-uniform*) based on the distribution of conditional p-values.
pyfixest	Python	MIT	🟢 active · 2026-04	Fast high-dimensional fixed-effects OLS/IV/Poisson regression in Python following fixest syntax, with clustered and wild-bootstrap inference.
PyMARE	Python	MIT	🟡 maintained · 2025-04	Python meta-analysis and regression engine providing mixed-effects meta-regression estimators and effect-size combination.
pysal	Python	BSD-3-Clause	🟢 active · 2026-01	Meta-package bundling the Python Spatial Analysis Library submodules (libpysal, esda, spreg, mgwr, giddy, etc.) for spatial analysis and econometrics.
PySVAR	Python	unverified	🟡 maintained · 2024-06	Small Python package for SVAR estimation and impulse responses across recursive (Cholesky), sign-restriction and optimization-based identification schemes.
pysyncon	Python	MIT	🟡 maintained · 2025-01	Python implementation of classic, robust, augmented and penalized synthetic control plus synthetic DiD.
pysynthdid	Python	Apache-2.0	🔴 dormant · 2023	Python implementation of the synthetic difference-in-differences (SDID) estimator.
PythonMeta	Python	GPL-3.0-only	🔴 dormant · 2021-11	Python module for meta-analysis in evidence-based-medicine systematic reviews, with fixed/random-effects pooling and forest/funnel plots.
quantipy3	Python	MIT	🟢 active · 2026-04	Python 3 survey-data processing and analysis toolkit handling multiple-choice data, metadata, and case weighting (including raking).
rddensity	R · Python · Stata	GPL-3.0	🟢 active · 2025	Manipulation (density-discontinuity) testing for RD designs using local polynomial density estimators (McCrary-style sorting test).
rdlocrand	R · Python · Stata	GPL-3.0	🟢 active · 2026-05	Local-randomization methods for estimation, inference and window selection in regression discontinuity designs.
rdmulti	R · Python · Stata	GPL-3.0	🟡 maintained · 2025	RD estimation and inference with multiple cutoffs or multiple running variables/scores.
rdpower	R · Python · Stata	GPL-3.0	🟡 maintained · 2025	Power, sample-size and minimum-detectable-effect calculations for regression discontinuity designs.
rdrobust	R · Python · Stata	GPL-3.0	🟢 active · 2026-05	Estimation, robust bias-corrected inference and plotting for sharp/fuzzy regression discontinuity designs via local polynomials.
reghdfe	Stata	MIT	🟢 active · 2026-01	Linear regression with multiple high-dimensional fixed effects and clustered/robust standard errors in Stata.
RegressionTables.jl	Julia	MIT	🟢 active · 2025-10	Generates publication-quality regression tables (esttab/stargazer analog) for Julia models.
regsensitivity	Stata	MIT	🟡 maintained	Regression sensitivity analysis (Masten & Poirier breakdown frontiers) quantifying robustness to omitted-variable bias.
rgeoda	R	GPL-2.0-or-later	🟢 active · 2026-02	R interface to libgeoda/GeoDa for ESDA, LISA spatial autocorrelation, spatial clustering and regionalization.
RoBMA	R	GPL-3.0-only	🟢 active · 2026-06	Robust Bayesian model-averaged meta-analysis that adjusts for publication bias via selection models and PET-PEESE ensembles.
robumeta	R	GPL-2.0-only	🔴 dormant · 2023-03	Robust variance estimation (RVE) meta-regression with large- and small-sample estimators for dependent effect sizes without distributional assumptions.
rstanarm	R	GPL-3.0-only	🟢 active · 2026-06	Bayesian applied regression modeling with Stan using familiar R formula syntax, commonly used to fit the multilevel models in MRP.
S2sls	R	GPL-2.0-or-later	🔴 dormant · 2016-08	Minimal package fitting a spatial-lag instrumental-variable regression by spatial two-stage least squares.
samplics	Python	MIT	🟢 active · 2026-03	Design-based analysis of complex survey data covering sample selection, weighting/calibration, estimation, and small area estimation.
sampling	R	GPL-2.0-or-later	🟡 maintained · 2025-07	Provides survey sampling selection algorithms and calibration/weight estimators including variance estimation for complex designs.
sandwich	R	GPL-2.0-or-later	🟡 maintained · 2024-09	Object-oriented model-robust covariance matrix estimators (HC, HAC, clustered, panel-corrected).
scpi	R · Python · Stata	MIT	🟢 active · 2025	Estimation, prediction-interval inference and graphics for synthetic control (scest/scpi), including multiple treated units and staggered adoption.
sdid	Stata	GPL-3.0	🟢 active · 2025	Synthetic difference-in-differences estimation with inference and graphics for Stata (Arkhangelsky et al. 2021).
sensemakr	R · Python · Stata	GPL-3.0	🟡 maintained · 2024-07	Sensitivity analysis to unobserved confounders for OLS via robustness values and contour plots (Cinelli & Hazlett).
spaMM	R	CeCILL-2.0	🟢 active · 2026-04	Fits mixed-effect models with spatially correlated random effects (geostatistical and Markov-random-field GLMMs) via Laplace/h-likelihood approximations.
SpatialDependence.jl	Julia	MIT	🟢 active · 2025-12	Julia package for spatial weights matrices, spatial-autocorrelation tests (global/local Moran, Geary, Getis-Ord, LISA) and choropleth ESDA.
spatialEco	R	GPL-3.0-only	🟢 active · 2026-05	Utilities for spatial data manipulation, sampling and modelling including autologistic models, spatial smoothing and landscape/point-pattern metrics.
spatialreg	R	GPL-2.0-only	🟢 active · 2026-03	Estimates spatial cross-sectional lattice/areal models (SAR, SEM, SAC, Durbin) by maximum likelihood, spatial 2SLS and GMM following Cliff-Ord and Kelejian-Prucha.
spdep	R	GPL-2.0-or-later	🟢 active · 2026-05	Builds spatial weights matrices from contiguities/distances and computes spatial-autocorrelation tests (Moran's I, Geary's C, Getis-Ord, local LISA).
spglm	Python	BSD-3-Clause	🔴 dormant · 2023-10	Sparse-compatible generalized linear models (Gaussian, Poisson, logistic) serving as the estimation base for PySAL's spint and GWR modules.
sphet	R	GPL-2.0-only	🟡 maintained · 2024-12	Fits Cliff-Ord spatial autoregressive models with heteroskedastic innovations via GMM/IV, including spatial HAC standard errors.
spint	Python	BSD-3-Clause	🔴 dormant · 2020-09	Calibrates gravity-type spatial interaction models (unconstrained and production/attraction-constrained Poisson) via entropy maximization.
splm	R	GPL-2.0-only	🟡 maintained · 2023-12	Maximum-likelihood and GM estimation plus diagnostic testing of fixed/random-effects econometric models for spatial panel data (Millo & Piras).
spmoran	R	GPL-2.0-or-later	🟡 maintained · 2024-12	Estimates Moran-eigenvector spatial/spatio-temporal regression models with spatially varying coefficients for Gaussian and non-Gaussian data.
sppack (spreg/spivreg/spmat)	Stata	GPL-3.0-only	🔴 dormant · 2018-12	Community Stata (SSC) precursor to the official Sp suite, by Drukker, Peng, Prucha & Raciborski: builds spatial-weighting matrices (spmat) and fits SAR/SEM/SAC models by ML and GS2SLS (spreg, spivreg).
spreg	Python	BSD-3-Clause	🟢 active · 2026-05	PySAL spatial econometric regression: OLS/2SLS with spatial lag and error (SAR/SEM/SARAR/Durbin), GM/ML estimators, panel and regimes models.
spsur	R	GPL-3.0-only	🟢 active · 2025-09	Tests and estimates spatial Seemingly Unrelated Regression (SUR-SLM/SEM/SDM/SLX) systems by maximum likelihood and three-stage least squares.
sptotal	R	GPL-2.0-or-later	🔴 dormant · 2023-09	Finite-population block kriging to predict totals and weighted sums from spatially autocorrelated sample data (Ver Hoef 2008).
sreweight	Stata	GPL-3.0-only	🔴 dormant · 2014-01	Stata module that reweights survey microdata to external aggregate totals using Deville-Särndal calibration methods.
srvyr	R	GPL-2.0-or-later	🟢 active · 2026-03	Provides dplyr-like syntax for computing summary statistics on complex survey data by wrapping the survey package.
stackedev	Stata	unverified	🟡 maintained	Stacked event-study estimator (Cengiz et al.) that builds clean cohort-vs-never-treated stacks to avoid bad TWFE comparisons.
staggered	R	unverified	🟢 active · 2025-12	Efficient estimators (Roth & Sant'Anna) for difference-in-differences settings with randomized/as-good-as-random treatment timing.
Stata lpirf / ivlpirf	Stata	proprietary	🟢 active · 2026-01	Official Stata (18+) commands estimating Jordà local-projection impulse-response functions, with ivlpirf adding instrumental-variables identification.
Stata sp (spregress/spxtregress/spivregress)	Stata	proprietary	🟢 active · 2026-01	Official Stata Sp suite fitting cross-sectional and panel spatial autoregressive models (SAR/SEM/SAC, with endogenous covariates) by ML and GS2SLS.
Stata var / svar / varbasic	Stata	proprietary	🟢 active · 2026-01	Official Stata time-series suite estimating reduced-form and structural VARs (var, svar, varbasic) with IRF/FEVD via the irf subsystem.
statsmodels	Python	BSD-3-Clause	🟢 active · 2025-12	General-purpose statistical modeling library (OLS/GLM, robust/clustered SE, panel and time-series tools); a foundation rather than a quasi-experimental-specific package.
survey	R	GPL-2.0-or-later	🟢 active · 2026-02	Analysis of complex survey samples including design-based summary statistics, generalized linear models, calibration and raking of survey weights.
svars	R	MIT	🟡 maintained · 2025-10	Data-driven identification of structural VARs (changes in volatility, GARCH, independent-component analysis, non-Gaussian ML) with IRFs and bootstrap inference.
svy	Stata	proprietary	🟢 active · 2026-06	Stata's built-in survey-data prefix and estimators that account for sampling weights, stratification, and clustering in complex survey designs.
svyweight	R	GPL-3.0-only	🟢 active · 2026-03	Quickly and flexibly applies rake weighting to survey data, extending the survey package's weighting interface to correct for non-response.
Synth	R	GPL-2.0-or-later	🟢 active · 2026-04	Classic synthetic control method (Abadie, Diamond & Hainmueller) for comparative case studies with a single treated unit.
synth	Stata	unverified	🟡 maintained	Original Stata implementation of the synthetic control method (Abadie, Diamond & Hainmueller).
synth_runner	Stata	unverified	🔴 dormant · 2017-08	Automates running synth across treated units/placebos to perform inference and produce synthetic-control plots.
SynthControl.jl	Julia	MIT	🟡 maintained · 2024-02	Pure-Julia synthetic control and synthetic difference-in-differences estimators (beta).
synthdid	R	BSD-3-Clause	🟡 maintained · 2024	Reference R implementation of the synthetic difference-in-differences (SDID) estimator of Arkhangelsky et al. (2021).
synthdid.py	Python	MIT	🟡 maintained · 2025	Python port of synthetic DiD supporting SDID/SC/DiD estimators with bootstrap, placebo and jackknife inference.
SyntheticControlMethods	Python	Apache-2.0	🔴 dormant · 2023	Python package for classic and Differenced (robust) synthetic control estimation with placebo-based inference.
tsDyn	R	GPL-2.0-or-later	🟡 maintained · 2024-10	Nonlinear and regime-switching time-series models including linear VAR/VECM and threshold TVAR/TVECM with associated cointegration tests.
varexternalinstrument	R	MIT	🔴 dormant · 2019-07	Identifies VAR impulse responses using a high-frequency external instrument (proxy-SVAR / Gertler-Karadi), extending models fit with the vars package.
vars	R	GPL-2.0-or-later	🟡 maintained · 2024-03	Estimation, lag selection, diagnostics, forecasting, Granger causality, IRFs and FEVD for VAR models plus SVAR and SVEC estimation (Pfaff).
VARsignR	R	GPL-3.0-only	🔴 dormant · 2015-12	Identifies structural shocks in Bayesian VARs via sign restrictions (Uhlig rejection and penalty, Rubio-Ramirez QR, Fry-Pagan median target).
Vcov.jl	Julia	unverified	🟢 active · 2026-03	Provides robust and clustered variance-covariance estimators as a backend for Julia regression packages.
VectorAutoregressions.jl	Julia	MIT	🔴 dormant · 2022-06	Julia VAR/BVAR/FAVAR estimation with IRF identification (Cholesky, long-run, sign restrictions) and asymptotic/bootstrap confidence bands.
weakiv	Stata	unverified	🔴 dormant	Weak-instrument-robust tests and confidence sets (AR, CLR, K) for IV/probit/tobit models (Finlay, Magnusson & Schaffer).
weightipy	Python	MIT	🟢 active · 2026-02	A modern, lightweight RIM (iterative raking) library for weighting survey/people data, a fork-style successor to quantipy's weighting.
WeightIt	R	GPL-2.0-or-later	🟢 active · 2026-04	Generates balancing weights (propensity scores, entropy balancing, CBPS, energy balancing) for binary, multi-category and continuous treatments.
weightr	R	GPL-2.0-or-later	🔴 dormant · 2019-07	Estimates the Vevea and Hedges (1995) weight-function model to assess and correct for publication bias in meta-analysis.
weights	R	GPL-2.0-or-later	🟡 maintained · 2025-06	Computes weighted descriptive statistics and tests (weighted correlations, t-tests, chi-squared) plus weighted graphics for survey data.
wildboottest	Python	MIT	🟡 maintained · 2024-08	Fast wild cluster bootstrap algorithms for inference on OLS coefficients in Python.
WildBootTests.jl	Julia	unverified	🟡 maintained	Julia engine for fast wild (cluster) bootstrap tests and confidence sets, used as the backend for boottest and fwildclusterboot.
xsmle	Stata	unverified	🔴 dormant · 2017-01	Stata (SSC) command estimating fixed/random-effects spatial panel models (SAR, SEM, Durbin, dynamic) by quasi-maximum likelihood with direct/indirect/total effects (Belotti, Hughes & Piano Mortari).

Causal discovery / structure learning (25)

Tool	Lang	License	Status	What it does
AVICI	Python	MIT	🟡 maintained · 2025-02	Amortized variational inference for causal structure learning (NeurIPS 2022), predicting causal graphs directly from data via a trained neural network.
benchpress	Python · R	GPL-2.0	🟢 active · 2026-05	Snakemake workflow to run, develop and benchmark causal-discovery/structure-learning algorithms across many libraries (bnlearn, pcalg, causal-learn, gCastle, Tetrad, etc.) with data generators and metrics.
bnlearn (Python)	Python	MIT	🟢 active · 2026-03	Independent Python package (built on pgmpy) for Bayesian network structure learning, parameter learning, inference and sampling.
bnlearn (R)	R	GPL-2.0-or-later	🟡 maintained · 2025-08	Widely used R package for Bayesian network structure learning (constraint-based, score-based, hybrid), parameter learning and inference.
Causal Discovery Toolbox (CDT)	Python	MIT	🟡 maintained · 2025-10	Python package for graph and pairwise causal discovery, bridging to R packages (pcalg, bnlearn) and providing deep-learning-based methods.
causal-cmd	Java	unverified	🟢 active · 2026-03	Command-line interface wrapping the Tetrad causal-discovery algorithms for running searches on data files from a shell.
causal-learn	Python	MIT	🟢 active · 2026-06	Comprehensive Python library of classic and state-of-the-art causal discovery algorithms (PC, FCI, GES, LiNGAM, Granger, etc.) for learning causal structure from observational data.
causaldag	Python	BSD-3-Clause	🔴 dormant · 2023	Python package for creating, manipulating and learning causal DAGs, including GSP/IGSP permutation-based and interventional structure-learning algorithms.
CausalDisco	Python	BSD-3-Clause	🔴 dormant · 2023-11	Python package of baseline causal-discovery algorithms and analytics tools (varsortability, sortnregress) for benchmarking structure learning.
CausalNex	Python	Apache-2.0	🟡 maintained · 2024-06	Python library for learning Bayesian network structure (NOTEARS-based) and reasoning about causal relationships for decision-making.
causica	Python	MIT	🟡 maintained · 2024-12	Microsoft's deep-learning library for end-to-end causal discovery and inference, including the DECI amortized causal-discovery model.
DAGMA	Python	Apache-2.0	🟡 maintained · 2024-01	Python package learning DAGs via continuous optimization using an M-matrix log-determinant acyclicity characterization (DAGMA).
dodiscover	Python	MIT	🟢 active · 2026-05	PyWhy's experimental causal discovery package providing constraint-based and other global structure-learning algorithms with a scikit-learn-style API.
gCastle	Python	Apache-2.0	🟢 active · 2026-06	Python causal structure learning toolbox emphasizing gradient-based methods (NOTEARS, GraN-DAG, etc.) plus data simulators and SHD/F1 evaluation metrics.
gimme	R	GPL-2.0-or-later	🟢 active · 2026-03	R package (Group Iterative Multiple Model Estimation) that recovers group- and individual-level directed contemporaneous/lagged network structure from time series via unified SEM search.
LiNGAM	Python	MIT	🟢 active · 2026-05	Python package implementing the LiNGAM family (ICA-LiNGAM, DirectLiNGAM, VAR-LiNGAM, RCD, etc.) for causal discovery in linear non-Gaussian models.
NOTEARS	Python	Apache-2.0	🟢 active · 2026-05	Reference implementation of NO TEARS, casting DAG structure learning as a continuous optimization with a smooth acyclicity constraint.
pcalg	R	GPL-2.0-or-later	🟡 maintained · 2024-09	Canonical R package for graphical-model causal structure learning (PC, FCI, RFCI, GIES) and causal effect estimation (IDA).
pgmpy	Python	MIT	🟢 active · 2026-06	Python toolkit for probabilistic graphical models with Bayesian network structure learning (PC, Hill-Climb, etc.), parameter learning, inference and causal reasoning.
py-tetrad	Python · Java	MIT	🟢 active · 2026-05	Python interface (via JPype) exposing the Java Tetrad causal-discovery algorithms in Python workflows.
pyAgrum / aGrUM	Python · C++	LGPL-3.0-or-MIT	🟢 active · 2026-01	C++/Python library for probabilistic graphical models (Bayesian networks) with structure learning and causal do-calculus support.
pywhy-graphs	Python	MIT	🟢 active · 2026-05	NetworkX-compliant causal graph data structures (ADMG, PAG, CPDAG) underpinning the PyWhy causal-discovery ecosystem.
Tetrad	Java	GPL-3.0	🟢 active · 2026-06	Long-running Java toolkit and GUI for causal discovery and graphical-causal-model search, the reference implementation of many constraint- and score-based algorithms.
tigramite	Python	GPL-3.0	🟢 active · 2026-01	Python package for causal discovery in time series via the PCMCI/PCMCI+/LPCMCI family of conditional-independence-based algorithms.
typed-DAG (t-DAG)	Python	Apache-2.0	🔴 dormant · 2023-07	Reference implementation of causal discovery with typed directed acyclic graphs, integrating variable-type knowledge into structure learning.

Autonomous research & data-science agents (51)

Tool	Lang	License	Status	What it does
Agent Laboratory	Python	MIT	🟡 maintained · 2025-08	End-to-end autonomous research workflow with literature-review, experimentation, and report-writing phases (and AgentRxiv shared-preprint collaboration) to turn a human research idea into a paper plus code.
Agentic Data Scientist	Python	MIT	🟢 active · 2026-05	An adaptive multi-agent framework (Google ADK + Claude Agent SDK) that separates planning from execution with continuous validation to complete end-to-end data-science tasks.
AI Data Science Team	Python	MIT	🟢 active · 2025-12	A library of specialized LLM agents (data cleaning, EDA, feature engineering, SQL, H2O AutoML, visualization) orchestrated by a supervisor to automate common data-science tasks.
AI-Researcher	Python	unverified	🟡 maintained · 2025-10	Fully autonomous research system (NeurIPS 2025) that runs the whole pipeline from literature review and idea generation through algorithm implementation to manuscript writing, primarily for AI/ML research.
AIDE (aideml)	Python	MIT	🟢 active · 2026-05	Tree-search ML-engineering agent that autonomously drafts, debugs, and benchmarks code to maximize a user-defined metric, reaching strong Kaggle/MLE-bench performance. (Overlaps with the data-science agent bucket.)
Auto-Analyst	TypeScript · Python	MIT	🟢 active · 2026-05	A modular multi-agent AI data-scientist platform (DSPy-based) automating cleaning, statistical analysis, scikit-learn modeling, and Plotly visualization.
Auto-Deep-Research	Python	MIT	🟡 maintained · 2025-02	A cost-efficient open Deep Research alternative (built on the AutoAgent framework) that autonomously gathers and synthesizes web information; strong on GAIA.
AutoGluon Assistant (MLZero)	Python	Apache-2.0	🟢 active · 2026-03	A multi-agent system that transforms raw multimodal data (tabular, image, text, audio) into trained ML solutions end-to-end with zero human intervention, using MCTS-guided code generation over AutoGluon.
AutoKaggle	Python	Apache-2.0	🟡 maintained · 2024-12	A multi-agent framework with five cooperating agents that autonomously complete Kaggle tabular competitions across six pipeline phases.
AutoMind	Python	MIT	🟢 active · 2025-10	An adaptive, knowledge-grounded data-science agent using an expert knowledge base plus agentic tree search to build ML pipelines (beats AIDE on MLE-bench).
AutoResearchClaw	Python	MIT	🟢 active · 2026-06	Self-reinforcing 23-stage autonomous research pipeline (literature discovery, multi-agent hypothesis debate, sandboxed self-healing experiments, peer review, LaTeX export) that turns an idea into a conference-ready paper.
AutoSurvey	Python	unverified	🔴 dormant · 2025-02	NeurIPS 2024 method that automatically writes comprehensive literature surveys via retrieval, parallel subsection drafting by specialized LLMs, and iterative refinement with automated evaluation.
Aviary	Python	Apache-2.0	🟢 active · 2026-06	Gymnasium-style framework of language-agent environments for challenging scientific tasks (literature QA, DNA manipulation, protein engineering) used to build and train autonomous research agents.
Biomni	Python	Apache-2.0	🟢 active · 2025-10	A general-purpose autonomous biomedical research agent combining LLM reasoning, retrieval-augmented planning, and code execution over a large library of biomedical tools.
ChemCrow	Python	MIT	🔴 dormant · 2024-03	An LLM agent augmented with chemistry tools (RDKit, paper-qa, reaction/retrosynthesis databases) that autonomously solves reasoning-intensive chemistry tasks.
Coscientist	Python	Apache-2.0 (Commons Clause)	🔴 dormant	An LLM-driven autonomous lab agent (from the Nature paper) that plans, designs, and optimizes chemical experiments and synthesis.
Curie	Python	Apache-2.0	🟡 maintained · 2025-09	AI agent framework for rigorous, automated scientific experimentation that handles the full hypothesis-to-analysis loop (experiment design, environment setup, execution, analysis) with reproducibility guarantees.
CycleResearcher	Python	unverified	🟢 active · 2026-03	Open-source ecosystem of trained models (CycleResearcher + CycleReviewer) that iteratively generate research papers and improve them via automated peer review, focused on ML research.
Data Formulator	TypeScript · Python	MIT	🟢 active · 2026-05	An AI tool with data-loading, exploration, and chart-style-refinement agents that transform and visualize data via a blend of UI interactions and natural language.
Data-Copilot	Python	MIT	🔴 dormant · 2023	An LLM agent that self-designs interface tools then dispatches them to autonomously query, process, analyze, and visualize (financial) data.
data-to-paper	Python	MIT	🟡 maintained · 2025-07	Multi-agent system that goes from a raw dataset and research goal to a verifiable, data-traceable scientific paper, emphasizing reproducibility in data-driven (e.g. biomedical/clinical) research.
DataMind	Python	Apache-2.0	🟢 active · 2026-06	An open data-synthesis + agent-training recipe yielding generalist data-analytic LLMs (DataMind-7B/14B) that do multi-step, code-based reasoning over CSV/Excel/SQLite.
deep-research (dzhng)	TypeScript	MIT	🟢 active · 2026-04	A compact open-source deep-research agent that recursively searches, scrapes, and reasons over the web to produce reports, tracking goals across iterations.
DeepAnalyze	Python	MIT	🟢 active · 2026-03	An agentic LLM (DeepAnalyze-8B) that autonomously runs the end-to-end data-science pipeline from raw structured/semi-structured/unstructured data to analyst-grade research reports.
DeepEye	Python · TypeScript	Apache-2.0	🟢 active · 2026-05	A production-ready 'self-driving' data agent system that autonomously orchestrates multi-step workflows to produce dashboards, analytical reports, and data videos from heterogeneous data.
DS-Agent	Python	unverified	🔴 dormant · 2024	An ICML'24 data-science agent that uses case-based reasoning over Kaggle expert knowledge to iteratively build and train ML models across tabular/text/time-series.
freephdlabor	Python	MIT	🟢 active · 2026-05	Customizable multi-agent framework (ManagerAgent orchestrating Ideation/Experiment/Writeup agents) for building personalized systems that run continuous autonomous research toward publication-grade reports.
GPT Researcher	Python · TypeScript	Apache-2.0	🟢 active · 2026-05	Autonomous deep-research agent that plans sub-questions, scrapes and aggregates many web/local sources, and synthesizes a long-form cited research report. (Also relevant to the data-science/deep-research bucket.)
Jupyter AI	Python	BSD-3-Clause	🟢 active · 2026-04	A JupyterLab extension (v3) connecting agentic AI models to notebooks so they can read/write files, run code, and act via a built-in MCP server for data work.
LIDA	Python	MIT	🔴 dormant · 2024-03	An LLM agent that automatically summarizes data, generates analysis goals, and writes/executes/edits visualization code (treating viz as code) across grammars.
MetaGPT (Data Interpreter / SELA)	Python	MIT	🟢 active · 2026-01	Multi-agent framework whose Data Interpreter (and SELA tree-search AutoML extension) agent plans, writes, and self-debugs code to solve data-analysis, ML, and modeling tasks.
MLE-Agent	Python	MIT	🔴 dormant · 2024-10	An AI companion that autonomously builds ML/AI baselines and end-to-end solutions (incl. Kaggle) with integrated arXiv/paper search.
MLR-Copilot	Python	unverified	🟡 maintained · 2025-03	Machine-learning research assistant framework where LLM agents autonomously generate research ideas from papers and implement/execute the corresponding experiments.
Open Deep Research (LangChain)	Python	MIT	🟢 active · 2025-08	A configurable, fully open-source deep-research agent (LangGraph-based) that works across many model/search providers; ranks on Deep Research Bench.
Open Deep Research (nickscamara/Firecrawl)	TypeScript	Apache-2.0	🟡 maintained · 2025-02	An open Deep Research clone that reasons over large amounts of web data extracted via Firecrawl to generate research analyses.
Open Interpreter	Python	AGPL-3.0	🔴 dormant · 2024-10	A natural-language code-execution agent that runs Python/shell locally to plot, clean, and analyze datasets (and general computer tasks), with human approval of generated code.
OpenResearcher	Python	Apache-2.0	🔴 dormant · 2024-10	AI research-assistant platform that uses retrieval-augmented generation over scientific literature to autonomously answer research questions, summarize, and recommend papers with source citations.
PandasAI	Python	MIT	🟢 active · 2025-10	A conversational data-analysis agent that turns natural-language questions over CSV/SQL/parquet data lakes into executed analysis code and charts.
Paper2Code (PaperCoder)	Python	Apache-2.0	🟢 active · 2026-03	Multi-agent LLM system that autonomously converts an ML research paper into a faithful, runnable code repository via planning, analysis, and generation stages.
PaperQA2	Python	Apache-2.0	🟢 active · 2026-03	Agentic high-accuracy RAG system over full-text scientific literature that autonomously retrieves, ranks, and synthesizes cited answers and literature summaries with superhuman accuracy on QA/contradiction tasks.
RD-Agent	Python	MIT	🟢 active · 2026-05	Microsoft's R&D automation framework that iteratively proposes hypotheses and implements/evolves them as code, targeting data-driven R&D such as quantitative finance factor/model discovery and ML engineering.
ResearchAgent	Python	unverified	🟡 maintained · 2025-08	LLM system (NAACL 2025) that iteratively generates research problems, methods, and experiment designs grounded in an academic citation graph, refined by collaborating reviewing agents.
Robin	Python	Apache-2.0	🟢 active · 2026-04	Multi-agent system (built on Aviary/PaperQA) that automates therapeutics discovery by generating hypotheses, proposing experiments, and analyzing experimental data, demonstrated by identifying a novel dry-AMD drug candidate.
STORM / Co-STORM	Python	MIT	🟡 maintained · 2025-09	LLM knowledge-curation system that researches a topic via multi-perspective simulated expert conversations and web search to autonomously synthesize a full, Wikipedia-style cited report (Co-STORM adds human-in-the-loop).
SurveyX	Python	unverified	🟡 maintained · 2026-01	Academic survey-automation system that takes a title and keywords and autonomously retrieves literature and generates a structured, cited survey paper (open-source release is offline-only; full service is hosted).
TableGPT Agent	Python	Apache-2.0	🟡 maintained · 2025-03	A LangGraph-based pre-built agent for the TableGPT2 model that answers analytical questions and runs code over tabular datasets.
TaskWeaver	Python	MIT	🔴 dormant · 2026-03	A code-first agent framework that plans and executes data-analytics tasks via generated Python, with stateful code/plugin memory (repo archived March 2026).
The AI Scientist	Python	AI Scientist Source Code License v1.0 (custom, Responsible-AI based)	🟡 maintained · 2025-12	Fully automated pipeline that generates ML research ideas, writes and runs experiment code, and drafts complete LaTeX papers with an automated reviewer, in machine-learning domains.
The AI Scientist-v2	Python	AI Scientist Source Code License v1.0 (custom, Responsible-AI based)	🟡 maintained · 2025-12	Template-free successor to The AI Scientist that uses agentic tree search and an experiment-manager agent to autonomously produce workshop-level ML papers end-to-end.
The Virtual Lab	Python	MIT	🟡 maintained · 2025-12	Team of LLM agents (AI PI, domain researchers, scientific critic) that hold structured meetings to autonomously design scientific pipelines, demonstrated by designing new SARS-CoV-2 nanobodies.
Virtual Scientists (VirSci)	Python	Apache-2.0	🟡 maintained · 2025-07	Multi-agent 'science of science' system (ACL 2025) that simulates teams of scientist agents through team organization and inter/intra-team discussion to autonomously generate and evaluate novel research ideas.

MCP servers (data & stats execution) (48)

Tool	Lang	License	Status	Data source / what it serves
Academix	Python	MIT	🟢 active · 2026-02	Aggregator: OpenAlex, DBLP, Semantic Scholar, arXiv, Crossref
akshare-one MCP	Python	MIT	🟢 active · 2026-03	AKShare (Chinese stock market data)
Alpha Vantage MCP (calvernaz)	Python	Apache-2.0	🟢 active · 2026-02	Alpha Vantage (stocks, FX, crypto)
Alpha Vantage MCP Server (official)	Python	MIT	🟢 active · 2026-05	Alpha Vantage (stocks, FX, crypto, fundamentals)
ArXiv MCP Server	Python	Apache-2.0	🟢 active · 2026-05	arXiv (preprints)
BEA MCP Server (mcp-bea)	TypeScript	unverified	🟡 maintained · 2026-01	BEA (US Bureau of Economic Analysis, GDP/income)
bioRxiv MCP Server	Python	unverified	🟡 maintained · 2025-03	bioRxiv (biology preprints)
BLS Labor MCP Server	TypeScript	NOASSERTION	🟢 active · 2026-06	BLS (US Bureau of Labor Statistics)
Crossref MCP Server (JackKuo666)	Python	unverified	🟡 maintained · 2025-04	Crossref (DOI metadata, 150M+ works)
Data Commons Agent Toolkit (official MCP)	Python	Apache-2.0	🟢 active · 2026-06	Google Data Commons (unified public datasets)
Data.gov MCP Server	JavaScript	MIT	🟡 maintained · 2025-04	Data.gov (US government open data catalog)
doi-mcp (citation verifier)	TypeScript	unverified	🟢 active · 2026-05	Aggregator: Crossref, OpenAlex, etc. (citation verification by DOI)
Eurostat MCP (ano-kuhanathan)	Python	MIT	🟡 maintained · 2026-01	Eurostat (EU official statistics)
Eurostat MCP (dcerecedo)	Python	NOASSERTION	🟢 active · 2026-03	Eurostat (EU official statistics)
FinanceMCP (Tushare + Binance)	JavaScript	MIT	🟢 active · 2026-05	Tushare (China A-shares, macro) + Binance (crypto)
FRED MCP Server (stefanoamorelli)	TypeScript	AGPL-3.0	🟢 active · 2026-05	FRED (Federal Reserve Economic Data, 800k+ series)
IMF Data MCP Server	Python	Apache-2.0	🟢 active · 2026-04	IMF (data.imf.org SDMX API)
Jupyter MCP Server (Datalayer)	Python	BSD-3-Clause	🟢 active · 2026-05	MCP server for Jupyter that lets an agent execute notebook cells and run Python/code in a live kernel with multimodal output.
MCP-DBLP	Python	MIT	🟢 active · 2026-04	DBLP (computer-science bibliography)
mcp-fred (cfdude)	Python	unverified	🟢 active · 2026-03	FRED (Federal Reserve Economic Data)
mcp-stata (tmonk)	Python	AGPL-3.0	🟢 active · 2026-05	Lightweight Stata MCP server that executes commands, inspects data, retrieves stored r()/e() results, and views graphs in a chat interface.
mcptools (Model Context Protocol for R)	R	NOASSERTION	🟢 active · 2026-03	Posit's official R package that turns a running R session into an MCP server (and client) so agents can execute R code and call R functions as tools.
Nasdaq Data Link MCP Server	Python	MIT	🟡 maintained · 2025-10	Nasdaq Data Link / Quandl (alternative + financial time series)
OECD MCP Server	TypeScript	MIT	🟢 active · 2026-04	OECD (SDMX, 5,000+ datasets)
OpenAlex MCP (reetp14)	TypeScript	MIT	🟡 maintained · 2025-07	OpenAlex (scholarly works, authors, institutions)
OpenAlex Research MCP	JavaScript	MIT	🟢 active · 2026-05	OpenAlex (240M+ scholarly works)
OpenEcon Data MCP Server	Python	NOASSERTION	🟢 active · 2026-05	Aggregator: FRED, World Bank, IMF, Eurostat, BIS, UN Comtrade (330K indicators)
Paper Search MCP	Python	MIT	🟢 active · 2026-05	Aggregator: arXiv, PubMed, bioRxiv, Semantic Scholar, OpenAlex, Crossref, CORE, dblp, etc.
paper-distill-mcp	Python	AGPL-3.0	🟢 active · 2026-03	Scholarly sources (paper search/curation)
PubMed MCP Server (cyanheads)	TypeScript	Apache-2.0	🟢 active · 2026-06	PubMed + Europe PMC + Unpaywall (biomedical literature/full text)
PubMed MCP Server (JackKuo666)	Python	MIT	🟡 maintained · 2025-05	PubMed (35M+ biomedical citations)
pubmed-search-mcp (u9401066)	Python	NOASSERTION	🟢 active · 2026-05	Aggregator: PubMed, Europe PMC, CORE, OpenAlex (biomedical)
RMCP (R MCP Server)	Python	MIT	🟡 maintained · 2025-12	MCP server exposing 50+ R statistical-analysis tools (regression, econometrics, time series, ML) backed by CRAN packages for AI agents.
SEC EDGAR MCP	Python	AGPL-3.0	🟢 active · 2026-05	SEC EDGAR (US public-company filings, XBRL financials)
Semantic Scholar MCP Server (JackKuo666)	Python	unverified	🟡 maintained · 2025-03	Semantic Scholar (200M+ papers, citations)
Simple PubMed MCP (andybrandt)	Python	MIT	🟢 active · 2026-03	PubMed (biomedical literature)
Stata MCP (hanlulong)	Python	MIT	🟢 active · 2026-04	Stata MCP extension for VS Code, Cursor and Antigravity that executes Stata commands and do-files from an AI assistant.
Stata MCP (SepineTam / mcp-for-stata)	Python	AGPL-3.0	🟢 active · 2026-06	MCP server that lets an LLM agent write and run Stata regressions and econometric do-files for paper replication and hypothesis testing (repo renamed to mcp-for-stata).
StatsPAI MCP Server	Python	MIT	🟢 active · 2026-06	Agent-native causal inference and econometrics toolkit (DiD, IV, RDD, synth, DML, Bayesian, causal discovery) exposed as an MCP server with 900+ estimator tools.
Tushare MCP (buuzzy)	Python	MIT	🟢 active · 2026-02	Tushare (China A-shares financial data)
Unpaywall MCP Server	TypeScript	MIT	🟢 active · 2026-04	Unpaywall (open-access full text by DOI)
US Census Bureau Data API MCP (official)	TypeScript	CC0-1.0	🟢 active · 2026-03	US Census Bureau Data API (ACS, demographics)
US Government Open Data MCP	TypeScript	MIT	🟢 active · 2026-04	40+ US government APIs (Treasury, FRED, Congress, FDA, CDC, FEC)
World Bank Data360 MCP (official)	Python	NOASSERTION	🟢 active · 2026-06	World Bank Data360 (development indicators, 200+ countries)
World Bank Open Data MCP (anshumax)	Python	unverified	🟡 maintained · 2025-08	World Bank Open Data API
Yahoo Finance MCP (Alex2Yang97)	Python	MIT	🟢 active · 2026-03	Yahoo Finance (via yfinance)
yfinance MCP (narumiruna)	Python	MIT	🟢 active · 2026-06	Yahoo Finance (via yfinance)
Zotero MCP	Python	MIT	🟢 active · 2026-05	Zotero (personal reference library)

Benchmarks & datasets (9)

Tool	Lang	License	Status	What it does
ACIC Competition data (aciccomp)	R	unverified	🔴 dormant · 2020-07	R packages with the data-generating processes and simulated datasets (with known ground-truth effects) from the 2016/2017 Atlantic Causal Inference Conference competitions.
bnlearn Bayesian Network Repository	R	unverified	🟡 maintained · 2025	Curated collection of reference Bayesian networks (ASIA, ALARM, HEPAR2, etc.) in multiple formats with known ground-truth structure; the standard benchmark for structure-learning evaluation.
CausalBench	Python	Apache-2.0	🟡 maintained · 2025-06	GSK benchmark suite (with curated single-cell perturbation datasets) for evaluating network/causal-graph inference methods from gene-perturbation data.
causaldata	R · Python · Stata	unverified	🟡 maintained · 2024-11	R/Python/Stata packages providing the example datasets (LaLonde, NSW, etc.) used in The Effect, Causal Inference: The Mixtape, and What If textbooks.
CEVAE datasets (IHDP)	Python	unverified	🔴 dormant · 2020-07	Reference repo bundling the widely-cited IHDP (Infant Health and Development Program) semi-synthetic benchmark CSVs with known ground-truth treatment effects used across ITE papers.
JustCause	Python	MIT	🔴 dormant · 2020-03	Python framework providing standard causal-inference benchmark datasets (IHDP, IBM ACIC) plus synthetic-data generation and baseline comparison for evaluating ITE methods.
RealCause	Python	MIT	🔴 dormant · 2021-03	Realistic causal-inference benchmark that fits generative models to real data (LaLonde PSID/CPS, Twins) to produce samples with known ground-truth treatment effects.
Tübingen Cause-Effect Pairs	—	unverified	🟡 maintained · 2023	Standard benchmark of ~100 bivariate cause-effect pairs with ground-truth causal direction for evaluating pairwise causal-discovery methods (Mooij et al. 2016).
WhyNot	Python	MIT	🔴 dormant · 2020-06	Python sandbox of dynamic simulators with known ground-truth causal effects for stress-testing causal-inference and sequential decision-making methods.

Inclusion ≠ endorsement. Licenses/activity were verified during curation but change over time; confirm upstream before relying on a tool in a high-trust context. To propose a tool, see README.md.
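
Because the Status column is a point-in-time snapshot, it can help to re-derive a tool's bucket from its last-activity stamp before relying on it. The sketch below is illustrative and is not part of scripts/build-tools-catalog.py: the function names, the exact day cutoffs (183 and 730 days standing in for "~6 months" and "~2 years"), and the choice to treat a year-only stamp as January of that year are all assumptions.

```python
from datetime import date

# Assumed cutoffs: the catalog only says "~6 months" and "~2 years".
ACTIVE_DAYS = 183
MAINTAINED_DAYS = 730


def parse_status_cell(cell):
    """Split a Status cell such as '🟢 active · 2026-05' into (label, date or None)."""
    label_part, _, stamp = cell.partition("·")
    label = label_part.split()[-1]  # drop the leading emoji marker
    stamp = stamp.strip()
    if not stamp:
        return label, None  # some rows carry no date at all
    parts = [int(p) for p in stamp.split("-")]
    year = parts[0]
    # Assumption: a year-only stamp is read as January (the most conservative reading).
    month = parts[1] if len(parts) > 1 else 1
    return label, date(year, month, 1)


def bucket(last_activity, today):
    """Map a last-activity date to active / maintained / dormant by age in days."""
    age_days = (today - last_activity).days
    if age_days <= ACTIVE_DAYS:
        return "active"
    if age_days <= MAINTAINED_DAYS:
        return "maintained"
    return "dormant"


if __name__ == "__main__":
    today = date(2026, 6, 15)  # hypothetical re-check date
    for cell in ["🟢 active · 2026-05", "🟡 maintained · 2025-10", "🔴 dormant · 2019-07"]:
        listed, stamp = parse_status_cell(cell)
        derived = bucket(stamp, today) if stamp else "unknown"
        print(f"{cell!r}: listed={listed}, derived={derived}")
```

A mismatch between the listed and derived labels does not by itself mean the catalog is wrong (an archived repository with recent commits is listed as dormant, for example); it only flags rows worth confirming upstream.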
