Tools Catalog — Automated Empirical Research & Causal Inference

Curated, license- and maintenance-aware index of software tools for automated empirical research and causal inference — distinct from the agent skills under ../skills/. Source of truth: tools.json. Rebuild with python3 scripts/build-tools-catalog.py.

Summary

334 tools across 6 categories.

| Category | Count |
|---|---|
| Causal-inference & treatment-effect libraries | 31 |
| Econometrics & quasi-experimental libraries | 170 |
| Causal discovery / structure learning | 25 |
| Autonomous research & data-science agents | 51 |
| MCP servers (data & stats execution) | 48 |
| Benchmarks & datasets | 9 |

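| By language | Tools | By maintenance | Tools | By license | Tools |
|---|---|---|---|---|---|
| Python | 164 | 🟢 active | 181 | permissive (MIT/BSD/Apache/…) | 183 |
| R | 109 | 🟡 maintained | 94 | copyleft (GPL/AGPL/LGPL/CeCILL) | 104 |
| Stata | 53 | 🔴 dormant | 59 | unverified / unmapped | 39 |
| TypeScript | 16 | | | proprietary / non-OSI / custom | 8 |
| Julia | 11 | | | | |
| C++ | 6 | | | | |
| Java | 3 | | | | |
| JavaScript | 3 | | | | |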
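
The language counts sum to 365, more than the 334 tools listed. That is consistent with multi-language entries (for example `R · Python · Stata`) being counted once per language, although the catalog does not state this explicitly. A minimal sketch of such a tally follows. The `lang` field name and the sample records are illustrative assumptions, not the actual `tools.json` schema.

```python
from collections import Counter

# Illustrative records only; the real tools.json schema may differ.
SAMPLE_TOOLS = [
    {"name": "rdrobust", "lang": "R · Python · Stata"},
    {"name": "grf", "lang": "R · C++"},
    {"name": "DoWhy", "lang": "Python"},
]


def count_by_language(tools):
    """Count tools per language, counting a multi-language tool once per language."""
    counts = Counter()
    for tool in tools:
        # Languages are assumed to be separated by a middle dot, as in the tables.
        for lang in tool["lang"].split("·"):
            counts[lang.strip()] += 1
    return counts


language_counts = count_by_language(SAMPLE_TOOLS)
```

With this counting rule the per-language totals can exceed the number of tools, as in the summary table: three sample tools yield six language entries.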

last_activity and stars_approx are point-in-time snapshots from the curation pass (see README.md for caveats). Status: 🟢 active ≈ commit within ~6 months · 🟡 maintained ≈ within ~2 years · 🔴 dormant ≈ older.
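
The status thresholds above can be sketched as a small classifier. This is an illustration under stated assumptions, not the logic of `scripts/build-tools-catalog.py`: the function names are hypothetical, `last_activity` is assumed to be a `YYYY-MM` or `YYYY` string as displayed in the tables, a year-only value is treated as December of that year, and "~6 months" and "~2 years" are taken as hard cutoffs of 6 and 24 calendar months. The catalog marks these boundaries as approximate, so its own labels may not match this rule exactly.

```python
from datetime import date


def parse_last_activity(value: str) -> date:
    """Parse 'YYYY-MM' or 'YYYY'; a bare year is assumed to mean December."""
    parts = value.split("-")
    year = int(parts[0])
    month = int(parts[1]) if len(parts) > 1 else 12
    return date(year, month, 1)


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def classify_status(last_activity: str, as_of: date) -> str:
    """Map a last-activity snapshot to active / maintained / dormant."""
    age = months_between(parse_last_activity(last_activity), as_of)
    if age <= 6:       # assumed cutoff for "~6 months"
        return "active"
    if age <= 24:      # assumed cutoff for "~2 years"
        return "maintained"
    return "dormant"


# The reference date is passed explicitly because the values are snapshots.
example = classify_status("2024-12", as_of=date(2026, 6, 1))
```

Passing `as_of` explicitly keeps the result reproducible for a given curation pass rather than depending on the day the code runs.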

Causal-inference & treatment-effect libraries (31)

Tool Lang License Status What it does
Ananke Python Apache-2.0 🟡 maintained · 2023-12 Python package for causal inference using graphical models (DAGs, ADMGs, chain graphs) supporting nonparametric identification and semiparametric estimation under unmeasured confounding.
bartCause R GPL-2.0-or-later 🟢 active · 2025-12 R package for causal inference using Bayesian Additive Regression Trees (BART), fitting response/treatment models to estimate ATE/ATT/ITE.
bcf R · C++ unverified 🟡 maintained · 2023-01 Reference R implementation of Bayesian Causal Forests (Hahn-Murray-Carvalho) for heterogeneous treatment-effect estimation with regularized treatment-effect priors.
CATENets Python BSD-3-Clause 🟡 maintained · 2023-08 sklearn-style JAX/PyTorch implementations of neural-network CATE estimators including TARNet, CFRNet, DragonNet, SNet, FlexTENet, and NN meta-learners.
causal-curve Python MIT 🟡 maintained · 2024-05 Python package for estimating causal dose-response curves (continuous-treatment effects) from observational data with confidence intervals.
CausalImpact R Apache-2.0 🟢 active · 2026-03 Google's R package estimating the causal effect of an intervention on a time series using a Bayesian structural time-series counterfactual model.
Causalinference Python BSD-3-Clause 🟡 maintained · 2025-06 Classic Python package for treatment-effect estimation via propensity-score estimation, trimming, subclassification, matching, weighting, and least-squares.
causallib Python Apache-2.0 🟢 active · 2026-05 IBM's scikit-learn-style package for estimating causal effects from observational data via IPW, standardization, doubly-robust (AIPW), and matching estimators.
CausalLift Python BSD-2-Clause 🔴 dormant · 2019-08 T-learner-based uplift modeling package for deciding which customers to treat, usable with both A/B-test and observational data.
CausalML Python Apache-2.0 🟢 active · 2026-05 Uber's uplift modeling and causal ML toolkit providing CATE/ITE estimation via S/T/X/R meta-learners, uplift trees/forests, and tree-based treatment selection.
CausalPy Python Apache-2.0 🟢 active · 2026-05 PyMC Labs package for Bayesian (and OLS) causal inference in quasi-experimental designs including synthetic control, interrupted time series, difference-in-differences, regression discontinuity, and instrumental variables.
causalToolbox R GPL-3.0 🔴 dormant · 2021 R toolbox for heterogeneous treatment effects implementing S/T/X/M/DR meta-learners with honest random forests and BART base learners (now mirrored at forestry-labs/causalToolbox).
CausalTune Python Apache-2.0 🟡 maintained · 2024-12 AutoML library for automated tuning and out-of-sample (energy-score) selection of causal estimators wrapping EconML/DoWhy via FLAML.
DoubleML (Python) Python BSD-3-Clause 🟢 active · 2026-05 Object-oriented implementation of the double/debiased machine learning framework on top of scikit-learn for partially linear, IV, and interactive regression models.
DoubleML (R) R MIT 🟢 active · 2026-05 R implementation of the double/debiased machine learning framework built on the mlr3 ecosystem for orthogonal-score estimation of treatment effects.
DoWhy Python MIT 🟢 active · 2025-11 End-to-end Python causal inference library that models assumptions as a causal graph and provides a four-step model/identify/estimate/refute API with refutation-based robustness tests.
EconML Python MIT 🟢 active · 2026-06 Microsoft ALICE project package for estimating heterogeneous treatment effects (CATE) from observational data using double machine learning, orthogonal/causal forests, DRLearner, DeepIV and meta-learners.
grf R · C++ GPL-3.0 🟢 active · 2026-04 Generalized Random Forests for nonparametric heterogeneous treatment-effect estimation (causal forests), including IV, multi-arm, and survival forests with honest confidence intervals.
ltmle R GPL-2.0 🟡 maintained · 2023-04 R package for longitudinal targeted maximum likelihood estimation (and IPTW/G-computation) of treatment/censoring-specific mean outcomes and marginal structural models.
MendelianRandomization R GPL-2.0-or-later 🟢 active · 2024-04 CRAN R package implementing many summary-data Mendelian randomization methods (IVW, MR-Egger, median, mode, contamination-mixture, cML, debiased IVW) for causal effect estimation.
metalearners Python BSD-3-Clause 🟢 active · 2025-06 QuantCo's library for CATE estimation with S/T/X/R/DR meta-learners featuring sound cross-fitting, multi-treatment support, and SHAP/optuna integrations.
policytree R · C++ MIT 🟢 active · 2026-02 R package learning optimal shallow decision-tree treatment policies via doubly-robust empirical welfare maximization using grf scores.
pylift Python BSD-2-Clause 🔴 dormant · 2022-11 Wayfair's uplift modeling package implementing the Transformed Outcome method with uplift evaluation/visualization tools (repository archived).
scikit-uplift Python MIT 🔴 dormant · 2022-08 scikit-learn-style uplift modeling package providing solo-model/two-model/class-transformation approaches plus uplift metrics and visualizations.
stochtree Python · R · C++ MIT 🟢 active · 2026-05 Stochastic tree ensembles (BART/XBART/BCF) in R and Python for supervised learning and Bayesian Causal Forest treatment-effect estimation.
tfcausalimpact Python Apache-2.0 🟡 maintained · 2025-01 Python port of Google's CausalImpact built on TensorFlow Probability for Bayesian structural time-series intervention analysis.
tmle (R) R BSD-3-Clause 🟢 active · 2025-08 Gruber & van der Laan's R package for targeted maximum likelihood estimation of ATE/ATT/ATC for a binary point treatment with SuperLearner-based nuisance estimation.
tmle3 R GPL-3.0 🔴 dormant · 2021-03 Generalized targeted learning (TMLE) framework from the tlverse providing a unified interface for estimating a range of causal target parameters.
TwoSampleMR R MIT 🟢 active · 2026-05 R package for two-sample Mendelian randomization using GWAS summary data, interfacing the IEU OpenGWAS database with IVW, MR-Egger, median and mode estimators.
UpliftML Python Apache-2.0 🔴 dormant · 2022-12 Booking.com's scalable uplift modeling package with PySpark/H2O implementations of metalearners, uplift random forests, and retrospective/constrained estimation.
zEpid Python MIT 🟡 maintained · 2022-10 Epidemiology analysis package with causal inference estimators including IPTW/AIPW, g-formula (parametric and Monte Carlo), and TMLE.

Econometrics & quasi-experimental libraries (170)

Tool Lang License Status What it does
access Python BSD-3-Clause 🟢 active · 2025-12 Classical and novel spatial accessibility-to-services measures (floating catchment, gravity, RAAM) within the PySAL ecosystem.
admetan Stata GPL-3.0-only 🔴 dormant · 2019-02 Stata module providing comprehensive aggregate-data meta-analysis and forest plots; deprecated since 2020 in favor of metan/ipdmetan.
AER R GPL-2.0-or-later 🟢 active · 2026-02 Applied Econometrics with R companion package providing IV (ivreg), tobit, count and other econometric estimators and datasets.
allsynth Stata GPL-3.0 🟡 maintained Wrapper around synth automating bias-correction, in-space placebo inference and stacked multi-unit synthetic control (Wiltshire).
anesrake R GPL-2.0-or-later 🔴 dormant · 2018-04 Implements ANES-style iterative raking to weight survey data to known target population margins with automatic variable selection.
ARDL R GPL-3.0-only 🟢 active · 2026-05 Builds ARDL and unrestricted/restricted error-correction models and runs the Pesaran-Shin-Smith (2001) bounds test for cointegration.
augsynth R MIT 🟡 maintained · 2024 Augmented synthetic control method (and multisynth for staggered adoption) that de-biases SCM when pre-treatment fit is imperfect.
AutoregressiveModels.jl Julia MIT 🟡 maintained · 2024-04 Julia toolkit for vector autoregressions with OLS estimation and structural impulse-response computation with bootstrap confidence bands.
autumn R MIT 🟡 maintained · 2024-01 Performs fast, tidy-friendly iterative proportional fitting (raking) to generate survey weights matching target population distributions.
bacondecomp R MIT 🔴 dormant · 2020-01 Goodman-Bacon decomposition of two-way fixed-effects DiD estimates into their underlying 2x2 comparison weights.
balance Python MIT 🟢 active · 2026-06 Workflow and methods (IPW, raking, post-stratification) for adjusting biased samples to make inferences about a target population.
bayesmeta R GPL-2.0-or-later 🟡 maintained · 2025-08 Bayesian random-effects meta-analysis and meta-regression, returning posterior and predictive distributions and shrinkage estimates.
binscatter Stata unverified 🟡 maintained Generates binned scatterplots to visualize conditional means / OLS relationships in Stata.
binsreg R · Python · Stata GPL-3.0 🟢 active · 2026-05 Binscatter least-squares, quantile and GLM regression with valid confidence bands and shape-restriction tests.
boottest Stata GPL-3.0 🟢 active · 2026-04 Fast wild bootstrap (null-imposed) and score bootstrap for cluster-robust inference with few clusters in Stata.
brms R GPL-2.0-only 🟢 active · 2026-06 Fits Bayesian generalized (non-)linear multivariate multilevel models via Stan, widely used as the multilevel-regression engine for MRP.
BVAR R GPL-3.0-only 🟡 maintained · 2024-02 Hierarchical Bayesian VAR estimation with Giannone-Lenza-Primiceri conjugate-prior selection, computing impulse responses, forecasts and FEVD.
bvarsv R GPL-2.0-or-later 🔴 dormant · 2015-10 Bayesian estimation of a time-varying-parameter VAR with stochastic volatility (Primiceri 2005) for posterior predictive densities and impulse responses.
cem R GPL-2.0 🔴 dormant · 2022-09 Coarsened exact matching for reducing imbalance between treatment and control groups in observational data.
clubSandwich R GPL-3.0 🟢 active · 2026-05 Cluster-robust (CR2) variance estimators with small-sample corrections and Satterthwaite/Wald hypothesis tests.
cobalt R GPL-2.0-or-later 🟢 active · 2026-05 Balance tables and love-plots for samples preprocessed by matching, weighting or subclassification.
coefplot Stata MIT 🟢 active · 2025-08 Plots coefficients/confidence intervals from estimation results or matrices (widely used for event-study graphs).
compute.es R GPL-2.0-or-later 🟢 active · 2026-01 Converts a wide range of test statistics into effect sizes (d, g, r, z', OR) with variances, CIs, and p-values for meta-analysis.
csdid (Python) Python MIT 🟢 active · 2025 Python port of the Callaway & Sant'Anna group-time ATT estimator for staggered DiD.
csdid (Stata) Stata MIT 🟢 active · 2025 Stata implementation of Callaway & Sant'Anna group-time ATTs with panel and repeated cross-section support (Rios-Avila).
csdid2 Stata MIT 🟢 active · 2025 Faster all-Mata reimplementation of csdid (Callaway-Sant'Anna staggered DiD) with extended functionality.
designmatch R GPL-2.0-or-later 🟢 active · 2026-02 Constructs matched samples that are balanced and representative by design via mixed-integer programming (cardinality/optimal matching).
did R GPL-2.0 🟢 active · 2025-12 Implements Callaway & Sant'Anna group-time average treatment effects for staggered difference-in-differences with multiple periods.
did2s R MIT 🟢 active · 2026-03 Two-stage difference-in-differences estimator (Gardner 2021) robust to heterogeneous treatment effects under staggered adoption.
did_imputation Stata GPL-3.0 🟡 maintained · 2024 Borusyak, Jaravel & Spiess imputation estimator and event-study plotting (did_imputation/event_plot) for staggered DiD in Stata.
didimputation R MIT 🟡 maintained · 2024 Imputation-based DiD estimator of Borusyak, Jaravel & Spiess (2021/2024) for staggered treatment timing.
DIDmultiplegt R · Stata MIT 🟢 active · 2026-02 de Chaisemartin & D'Haultfoeuille heterogeneity-robust DiD estimators for multiple groups, periods and non-binary treatments (original version).
DIDmultiplegtDYN R · Stata MIT 🟢 active · 2026-05 Dynamic (event-study) heterogeneity-robust DiD estimator allowing treatments that switch on and off multiple times.
differences Python GPL-3.0 🟢 active · 2026-04 Difference-in-differences estimation in Python (Callaway-Sant'Anna and related estimators) for staggered adoption with heterogeneous effects.
dmetar R GPL-3.0-only 🟡 maintained · 2025-05 Companion package of helper functions for the 'Doing Meta-Analysis in R' guide, extending meta, metafor, and netmeta with diagnostics and visualizations.
drdid (Stata) Stata MIT 🟢 active · 2025 Doubly-robust difference-in-differences estimators (Sant'Anna & Zhao 2020) for Stata; the building block for csdid.
dynamac R GPL-2.0-or-later 🔴 dormant · 2022-11 Estimates single-equation ARDL/error-correction models, dynamically simulates and plots their responses, and tests for cointegration (Jordan & Philips).
ebal R GPL-2.0-or-later 🟢 active · 2026-04 Entropy balancing reweighting so covariate moments match user-specified targets in observational studies (Hainmueller).
Econometrics.jl Julia ISC 🟡 maintained · 2024-12 General econometrics package for Julia covering panel models, IV and discrete-choice estimators.
esc R GPL-3.0-only 🔴 dormant · 2023-09 Computes effect sizes and their variances (d, g, r, OR, etc.) from diverse reported statistics for use in meta-analysis.
esda Python BSD-3-Clause 🟢 active · 2026-03 Exploratory spatial data analysis: global and local autocorrelation (Moran's I, Geary, Getis-Ord, local Moran/LISA) for continuous and binary areal data.
estimatr R MIT 🟡 maintained · 2025-02 Fast design-based OLS/IV estimators (lm_robust, iv_robust, difference_in_means) with robust and cluster-robust standard errors.
estout (esttab) Stata MIT 🟢 active · 2026-04 Produces publication-quality regression tables (esttab/estout) exportable to LaTeX, RTF, HTML and CSV.
etwfe R MIT 🟢 active · 2026-03 Extended two-way fixed effects (Wooldridge) DiD via saturated cohort-by-time interactions plus marginal-effects aggregation.
eventstudyinteract Stata MIT 🟡 maintained · 2023 Sun & Abraham interaction-weighted event-study estimator robust to heterogeneous treatment effects under staggered timing.
eventstudyr R MIT 🟢 active · 2026-04 Estimates and plots linear panel event-study models following Freyaldenhoven et al., including sup-t bands and pre-trend tests.
fect R MIT 🟢 active · 2026-05 Counterfactual estimators for causal panel analysis (two-way FE, interactive fixed effects, matrix completion) with diagnostic tests.
FixedEffectModels.jl Julia MIT 🟢 active · 2026-04 Estimates linear models with high-dimensional fixed effects and instrumental variables in Julia (reghdfe/fixest analog).
fixest R GPL-3.0 🟢 active · 2026-05 Fast and user-friendly estimation of OLS, GLM and IV models with multiple high-dimensional fixed effects, with built-in clustered/robust inference and event-study tooling.
ftools Stata MIT 🟢 active · 2026-01 Fast Mata-based data manipulation backend (collapse/merge/egen) that powers reghdfe and other Stata commands.
fwildclusterboot R GPL-3.0 🔴 dormant · 2023-07 Fast wild cluster bootstrap inference for OLS/IV with few clusters (R port of boottest); archived on CRAN, source remains on GitHub.
GeoDa C++ GPL-3.0-only 🟢 active · 2025-09 Cross-platform desktop GUI for exploratory spatial data analysis, LISA mapping, spatial weights and basic spatial regression on lattice data.
giddy Python BSD-3-Clause 🟢 active · 2025-12 Geospatial distribution dynamics: spatial Markov chains, rank/mobility and directional LISA analysis of longitudinal spatial data.
GLFixedEffectModels.jl Julia MIT 🟢 active · 2026-03 Estimates GLMs (logit, Poisson, etc.) with high-dimensional fixed effects in Julia (ppmlhdfe analog).
gsynth R MIT 🟢 active · 2026-03 Generalized synthetic control imputing counterfactuals via interactive fixed-effects models, supporting multiple treated units and staggered timing.
gtools Stata MIT 🟡 maintained · 2024-06 C-plugin accelerated versions of common Stata data commands (collapse, egen, reshape, pctile) used in large-panel workflows.
HonestDiD R MIT 🟢 active · 2026-04 Robust inference and sensitivity analysis for DiD/event-study designs under relaxations of the parallel-trends assumption (Rambachan & Roth).
ipdmetan Stata GPL-3.0-only 🔴 dormant · 2022-10 Stata module for two-stage individual-participant-data meta-analysis with subgroup and forest-plot support.
ipfn Python MIT 🟡 maintained · 2024-05 Implements N-dimensional iterative proportional fitting to adjust a data matrix so its margins match specified target totals.
ipfraking Stata GPL-3.0-only 🔴 dormant · 2018-05 Stata module performing iterative proportional fitting (raking) to calibrate complex survey weights to control totals with trimming and diagnostics.
ivmodel R GPL-2.0 🔴 dormant · 2023-04 IV estimation with weak-instrument-robust inference (AR, CLR), power and sensitivity analysis for a single endogenous regressor.
ivreg R GPL-2.0-or-later 🟢 active · 2026-03 Instrumental-variables (2SLS/2SM/2SMM) regression with weak-instrument and endogeneity diagnostics.
ivreg2 Stata GPL-3.0 🟡 maintained · 2024-08 Extended IV/2SLS/LIML/GMM estimation with weak-instrument and overidentification diagnostics (Baum, Schaffer & Stillman).
ivreghdfe Stata MIT 🟢 active · 2025-12 Combines ivreg2 and reghdfe to run IV/2SLS/GMM regressions with many high-dimensional fixed effects.
kmatch Stata MIT 🟢 active · 2026-02 Multivariate-distance and propensity-score matching with entropy balancing, IPW, CEM and regression adjustment.
lfe R Apache-2.0 🟡 maintained · 2025-02 Estimates linear models with multiple high-dimensional group fixed effects (and IV) by transforming away factors before OLS.
libpysal Python BSD-3-Clause 🟢 active · 2026-01 Core PySAL components: spatial weights construction, computational geometry, graphs, and I/O underpinning the spatial-econometrics stack.
linearmodels Python NCSA 🟢 active · 2025-10 Panel (fixed/random effects), IV/2SLS-GMM, system and asset-pricing estimators missing from statsmodels.
localprojections Python MIT 🔴 dormant · 2023-09 Implements Jordà (2005) local-projection impulse responses for single-entity time series and panel data, including threshold/state-dependent variants.
LocalProjections.jl Julia MIT 🟡 maintained · 2024-04 Julia implementation of local-projection methods for impulse-response estimation, including lag-augmented and smoothed local projections.
locproj Stata GPL-3.0-only 🟢 active · 2026-02 Stata (SSC) command estimating linear and nonlinear local-projection IRFs for time-series and panel data, supporting IV and quantile-regression variants (Ugarte Ruiz).
lpirfs R GPL-2.0-or-later 🟢 active · 2025-12 Estimates linear and nonlinear (state-dependent) impulse responses via Jordà (2005) local projections for time-series and panel data, with identified-shock and IV options.
marginaleffects R GPL-3.0 🟢 active · 2026-02 Computes predictions, marginal effects/slopes, comparisons and marginal means with delta-method or simulation inference for 100+ model classes.
MatchIt R GPL-2.0-or-later 🟢 active · 2025-05 Unified interface to nearest-neighbor, optimal, full, genetic and coarsened-exact matching for covariate balance in observational studies.
meta R GPL-2.0 🟢 active · 2026-05 Standard meta-analysis methods including fixed/random-effects models, meta-regression, bias tests, and forest/funnel plots.
meta Stata proprietary 🟢 active · 2026-06 Stata's built-in meta suite for fixed/random-effects meta-analysis, meta-regression, forest/funnel plots, and small-study-effect tests.
metabias Stata GPL-3.0-only 🔴 dormant · 2010-12 Stata module testing for small-study effects / funnel-plot asymmetry (Egger, Begg, Harbord tests) in meta-analysis.
metafor R GPL-2.0-or-later 🟢 active · 2026-05 Comprehensive R package for conducting meta-analyses, including effect-size computation, fixed/random/mixed-effects models, moderators, and forest/funnel plots.
metan Stata GPL-3.0-only 🟡 maintained · 2024-07 Comprehensive Stata module for fixed- and random-effects meta-analysis of binary, continuous, or generic effect estimates with flexible forest plots.
metareg Stata GPL-3.0-only 🔴 dormant · 2009-01 Stata module performing random-effects meta-regression on study-level summary data with permutation-test p-values.
metaSEM R GPL-2.0-or-later 🟢 active · 2026-05 Conducts meta-analysis via structural equation modeling (using OpenMx/lavaan), including fixed/random-effects and meta-analytic SEM on correlation matrices.
mgwr Python BSD-3-Clause 🟡 maintained · 2024-01 Calibration, inference and prediction for (multiscale) geographically weighted regression across GLM families with model diagnostics.
modelsummary R GPL-3.0 🟢 active · 2026-02 Publication-quality regression and summary tables (and coefficient plots) for many model classes in multiple output formats.
mvmeta Stata GPL-3.0-only 🔴 dormant · 2022-04 Stata module for multivariate random-effects meta-analysis and meta-regression on point estimates, variances, and covariances.
netmeta R GPL-2.0-or-later 🟢 active · 2026-05 Frequentist network meta-analysis for simultaneously comparing multiple treatments across studies, with inconsistency assessment and network graphs.
network Stata GPL-3.0-only 🔴 dormant · 2018-04 Stata module for network (mixed-treatment-comparison) meta-analysis using contrast-based multivariate meta-regression with inconsistency checks.
optmatch R MIT 🟡 maintained · 2024-09 Optimal bipartite matching using minimum-cost flow for distance/propensity-score matched designs.
outreg2 Stata unverified 🔴 dormant · 2014-08 Produces formatted regression-output tables for Word/Excel/LaTeX from Stata estimation results (Roy Wada).
panelView R MIT 🟡 maintained · 2024-06 Visualizes treatment status, missingness and outcome dynamics for panel/DiD datasets.
plm R GPL-2.0-or-later 🟢 active · 2025-11 Comprehensive panel-data econometrics toolkit with fixed/random effects estimators, robust covariances and panel diagnostic tests.
ppmlhdfe Stata MIT 🟢 active · 2026-01 Poisson pseudo-maximum-likelihood regression with multiple high-dimensional fixed effects and robust separation handling.
PracTools R GPL-3.0-only 🟢 active · 2026-01 Tools and datasets for designing complex survey samples, computing sample sizes, and constructing/weighting survey samples.
pretrends R MIT 🟡 maintained · 2024 Computes the power of pre-trends tests and visualizes detectable violations of parallel trends in event studies.
psmatch2 Stata unverified 🔴 dormant · 2018-02 Mahalanobis and propensity-score matching with common-support graphing and covariate-imbalance testing (Leuven & Sianesi).
psychmeta R GPL-3.0-or-later 🟡 maintained · 2024-06 Psychometric meta-analysis toolkit for bare-bones and artifact-corrected meta-analysis of correlations and d-values (Hunter-Schmidt methods).
puniform R GPL-2.0-or-later 🟢 active · 2025-12 Publication-bias-correcting meta-analysis methods (p-uniform / p-uniform*) based on the distribution of conditional p-values.
pyfixest Python MIT 🟢 active · 2026-04 Fast high-dimensional fixed-effects OLS/IV/Poisson regression in Python following fixest syntax, with clustered and wild-bootstrap inference.
PyMARE Python MIT 🟡 maintained · 2025-04 Python meta-analysis and regression engine providing mixed-effects meta-regression estimators and effect-size combination.
pysal Python BSD-3-Clause 🟢 active · 2026-01 Meta-package bundling the Python Spatial Analysis Library submodules (libpysal, esda, spreg, mgwr, giddy, etc.) for spatial analysis and econometrics.
PySVAR Python unverified 🟡 maintained · 2024-06 Small Python package for SVAR estimation and impulse responses across recursive (Cholesky), sign-restriction and optimization-based identification schemes.
pysyncon Python MIT 🟡 maintained · 2025-01 Python implementation of classic, robust, augmented and penalized synthetic control plus synthetic DiD.
pysynthdid Python Apache-2.0 🔴 dormant · 2023 Python implementation of the synthetic difference-in-differences (SDID) estimator.
PythonMeta Python GPL-3.0-only 🔴 dormant · 2021-11 Python module for meta-analysis in evidence-based-medicine systematic reviews, with fixed/random-effects pooling and forest/funnel plots.
quantipy3 Python MIT 🟢 active · 2026-04 Python 3 survey-data processing and analysis toolkit handling multiple-choice data, metadata, and case weighting (including raking).
rddensity R · Python · Stata GPL-3.0 🟢 active · 2025 Manipulation (density-discontinuity) testing for RD designs using local polynomial density estimators (McCrary-style sorting test).
rdlocrand R · Python · Stata GPL-3.0 🟢 active · 2026-05 Local-randomization methods for estimation, inference and window selection in regression discontinuity designs.
rdmulti R · Python · Stata GPL-3.0 🟡 maintained · 2025 RD estimation and inference with multiple cutoffs or multiple running variables/scores.
rdpower R · Python · Stata GPL-3.0 🟡 maintained · 2025 Power, sample-size and minimum-detectable-effect calculations for regression discontinuity designs.
rdrobust R · Python · Stata GPL-3.0 🟢 active · 2026-05 Estimation, robust bias-corrected inference and plotting for sharp/fuzzy regression discontinuity designs via local polynomials.
reghdfe Stata MIT 🟢 active · 2026-01 Linear regression with multiple high-dimensional fixed effects and clustered/robust standard errors in Stata.
RegressionTables.jl Julia MIT 🟢 active · 2025-10 Generates publication-quality regression tables (esttab/stargazer analog) for Julia models.
regsensitivity Stata MIT 🟡 maintained Regression sensitivity analysis (Masten & Poirier breakdown frontiers) quantifying robustness to omitted-variable bias.
rgeoda R GPL-2.0-or-later 🟢 active · 2026-02 R interface to libgeoda/GeoDa for ESDA, LISA spatial autocorrelation, spatial clustering and regionalization.
RoBMA R GPL-3.0-only 🟢 active · 2026-06 Robust Bayesian model-averaged meta-analysis that adjusts for publication bias via selection models and PET-PEESE ensembles.
robumeta R GPL-2.0-only 🔴 dormant · 2023-03 Robust variance estimation (RVE) meta-regression with large- and small-sample estimators for dependent effect sizes without distributional assumptions.
rstanarm R GPL-3.0-only 🟢 active · 2026-06 Bayesian applied regression modeling with Stan using familiar R formula syntax, commonly used to fit the multilevel models in MRP.
S2sls R GPL-2.0-or-later 🔴 dormant · 2016-08 Minimal package fitting a spatial-lag instrumental-variable regression by spatial two-stage least squares.
samplics Python MIT 🟢 active · 2026-03 Design-based analysis of complex survey data covering sample selection, weighting/calibration, estimation, and small area estimation.
sampling R GPL-2.0-or-later 🟡 maintained · 2025-07 Provides survey sampling selection algorithms and calibration/weight estimators including variance estimation for complex designs.
sandwich R GPL-2.0-or-later 🟡 maintained · 2024-09 Object-oriented model-robust covariance matrix estimators (HC, HAC, clustered, panel-corrected).
scpi R · Python · Stata MIT 🟢 active · 2025 Estimation, prediction-interval inference and graphics for synthetic control (scest/scpi), including multiple treated units and staggered adoption.
sdid Stata GPL-3.0 🟢 active · 2025 Synthetic difference-in-differences estimation with inference and graphics for Stata (Arkhangelsky et al. 2021).
sensemakr R · Python · Stata GPL-3.0 🟡 maintained · 2024-07 Sensitivity analysis to unobserved confounders for OLS via robustness values and contour plots (Cinelli & Hazlett).
spaMM R CeCILL-2.0 🟢 active · 2026-04 Fits mixed-effect models with spatially correlated random effects (geostatistical and Markov-random-field GLMMs) via Laplace/h-likelihood approximations.
SpatialDependence.jl Julia MIT 🟢 active · 2025-12 Julia package for spatial weights matrices, spatial-autocorrelation tests (global/local Moran, Geary, Getis-Ord, LISA) and choropleth ESDA.
spatialEco R GPL-3.0-only 🟢 active · 2026-05 Utilities for spatial data manipulation, sampling and modelling including autologistic models, spatial smoothing and landscape/point-pattern metrics.
spatialreg R GPL-2.0-only 🟢 active · 2026-03 Estimates spatial cross-sectional lattice/areal models (SAR, SEM, SAC, Durbin) by maximum likelihood, spatial 2SLS and GMM following Cliff-Ord and Kelejian-Prucha.
spdep R GPL-2.0-or-later 🟢 active · 2026-05 Builds spatial weights matrices from contiguities/distances and computes spatial-autocorrelation tests (Moran's I, Geary's C, Getis-Ord, local LISA).
spglm Python BSD-3-Clause 🔴 dormant · 2023-10 Sparse-compatible generalized linear models (Gaussian, Poisson, logistic) serving as the estimation base for PySAL's spint and GWR modules.
sphet R GPL-2.0-only 🟡 maintained · 2024-12 Fits Cliff-Ord spatial autoregressive models with heteroskedastic innovations via GMM/IV, including spatial HAC standard errors.
spint Python BSD-3-Clause 🔴 dormant · 2020-09 Calibrates gravity-type spatial interaction models (unconstrained and production/attraction-constrained Poisson) via entropy maximization.
splm R GPL-2.0-only 🟡 maintained · 2023-12 Maximum-likelihood and GM estimation plus diagnostic testing of fixed/random-effects econometric models for spatial panel data (Millo & Piras).
spmoran R GPL-2.0-or-later 🟡 maintained · 2024-12 Estimates Moran-eigenvector spatial/spatio-temporal regression models with spatially varying coefficients for Gaussian and non-Gaussian data.
sppack (spreg/spivreg/spmat) Stata GPL-3.0-only 🔴 dormant · 2018-12 Community Stata (SSC) precursor to official Sp: builds spatial-weighting matrices (spmat) and fits SAR/SEM/SAC by ML and GS2SLS (spreg, spivreg) by Drukker, Peng, Prucha & Raciborski.
spreg Python BSD-3-Clause 🟢 active · 2026-05 PySAL spatial econometric regression: OLS/2SLS with spatial lag and error (SAR/SEM/SARAR/Durbin), GM/ML estimators, panel and regimes models.
spsur R GPL-3.0-only 🟢 active · 2025-09 Tests and estimates spatial Seemingly Unrelated Regression (SUR-SLM/SEM/SDM/SLX) systems by maximum likelihood and three-stage least squares.
sptotal R GPL-2.0-or-later 🔴 dormant · 2023-09 Finite-population block kriging to predict totals and weighted sums from spatially autocorrelated sample data (Ver Hoef 2008).
sreweight Stata GPL-3.0-only 🔴 dormant · 2014-01 Stata module that reweights survey microdata to external aggregate totals using Deville-Särndal calibration methods.
srvyr R GPL-2.0-or-later 🟢 active · 2026-03 Provides dplyr-like syntax for computing summary statistics on complex survey data by wrapping the survey package.
stackedev Stata unverified 🟡 maintained Stacked event-study estimator (Cengiz et al.) that builds clean cohort-vs-never-treated stacks to avoid bad TWFE comparisons.
staggered R unverified 🟢 active · 2025-12 Efficient estimators (Roth & Sant'Anna) for difference-in-differences settings with randomized/as-good-as-random treatment timing.
Stata lpirf / ivlpirf Stata proprietary 🟢 active · 2026-01 Official Stata (18+) commands estimating Jordà local-projection impulse-response functions, with ivlpirf adding instrumental-variables identification.
Stata sp (spregress/spxtregress/spivregress) Stata proprietary 🟢 active · 2026-01 Official Stata Sp suite fitting cross-sectional and panel spatial autoregressive models (SAR/SEM/SAC, with endogenous covariates) by ML and GS2SLS.
Stata var / svar / varbasic Stata proprietary 🟢 active · 2026-01 Official Stata time-series suite estimating reduced-form and structural VARs (var, svar, varbasic) with IRF/FEVD via the irf subsystem.
statsmodels Python BSD-3-Clause 🟢 active · 2025-12 General-purpose statistical modeling library (OLS/GLM, robust/clustered SE, panel and time-series tools); a foundation rather than a quasi-experimental-specific package.
survey R GPL-2.0-or-later 🟢 active · 2026-02 Analysis of complex survey samples including design-based summary statistics, generalized linear models, calibration and raking of survey weights.
svars R MIT 🟡 maintained · 2025-10 Data-driven identification of structural VARs (changes in volatility, GARCH, independent-component analysis, non-Gaussian ML) with IRFs and bootstrap inference.
svy Stata proprietary 🟢 active · 2026-06 Stata's built-in survey-data prefix and estimators that account for sampling weights, stratification, and clustering in complex survey designs.
svyweight R GPL-3.0-only 🟢 active · 2026-03 Quickly and flexibly applies rake weighting to survey data, extending the survey package's weighting interface to correct for non-response.
Synth R GPL-2.0-or-later 🟢 active · 2026-04 Classic synthetic control method (Abadie, Diamond & Hainmueller) for comparative case studies with a single treated unit.
synth Stata unverified 🟡 maintained Original Stata implementation of the synthetic control method (Abadie, Diamond & Hainmueller).
synth_runner Stata unverified 🔴 dormant · 2017-08 Automates running synth across treated units/placebos to perform inference and produce synthetic-control plots.
SynthControl.jl Julia MIT 🟡 maintained · 2024-02 Pure-Julia synthetic control and synthetic difference-in-differences estimators (beta).
synthdid R BSD-3-Clause 🟡 maintained · 2024 Reference R implementation of the synthetic difference-in-differences (SDID) estimator of Arkhangelsky et al. (2021).
synthdid.py Python MIT 🟡 maintained · 2025 Python port of synthetic DiD supporting SDID/SC/DiD estimators with bootstrap, placebo and jackknife inference.
SyntheticControlMethods Python Apache-2.0 🔴 dormant · 2023 Python package for classic and differenced (robust) synthetic control estimation with placebo-based inference.
tsDyn R GPL-2.0-or-later 🟡 maintained · 2024-10 Nonlinear and regime-switching time-series models (threshold TVAR/TVECM) alongside linear VAR/VECM, with associated cointegration tests.
varexternalinstrument R MIT 🔴 dormant · 2019-07 Identifies VAR impulse responses using a high-frequency external instrument (proxy-SVAR / Gertler-Karadi), extending models fit with the vars package.
vars R GPL-2.0-or-later 🟡 maintained · 2024-03 Estimation, lag selection, diagnostics, forecasting, Granger causality, IRFs and FEVD for VAR models plus SVAR and SVEC estimation (Pfaff).
VARsignR R GPL-3.0-only 🔴 dormant · 2015-12 Identifies structural shocks in Bayesian VARs via sign restrictions (Uhlig rejection and penalty, Rubio-Ramirez QR, Fry-Pagan median target).
Vcov.jl Julia unverified 🟢 active · 2026-03 Provides robust and clustered variance-covariance estimators as a backend for Julia regression packages.
VectorAutoregressions.jl Julia MIT 🔴 dormant · 2022-06 Julia VAR/BVAR/FAVAR estimation with IRF identification (Cholesky, long-run, sign restrictions) and asymptotic/bootstrap confidence bands.
weakiv Stata unverified 🔴 dormant Weak-instrument-robust tests and confidence sets (AR, CLR, K) for IV/probit/tobit models (Finlay, Magnusson & Schaffer).
weightipy Python MIT 🟢 active · 2026-02 Lightweight RIM (iterative raking) library for weighting survey/people data; a successor to quantipy's weighting module.
WeightIt R GPL-2.0-or-later 🟢 active · 2026-04 Generates balancing weights (propensity scores, entropy balancing, CBPS, energy balancing) for binary, multi-category and continuous treatments.
weightr R GPL-2.0-or-later 🔴 dormant · 2019-07 Estimates the Vevea and Hedges (1995) weight-function model to assess and correct for publication bias in meta-analysis.
weights R GPL-2.0-or-later 🟡 maintained · 2025-06 Computes weighted descriptive statistics and tests (weighted correlations, t-tests, chi-squared) plus weighted graphics for survey data.
wildboottest Python MIT 🟡 maintained · 2024-08 Fast wild cluster bootstrap algorithms for inference on OLS coefficients in Python.
WildBootTests.jl Julia unverified 🟡 maintained Julia engine for fast wild (cluster) bootstrap tests and confidence sets, used as the backend for boottest and fwildclusterboot.
xsmle Stata unverified 🔴 dormant · 2017-01 Stata (SSC) command estimating fixed/random-effects spatial panel models (SAR, SEM, Durbin, dynamic) by quasi-maximum likelihood with direct/indirect/total effects (Belotti, Hughes & Piano Mortari).

Causal discovery / structure learning (25)

Tool Lang License Status What it does
AVICI Python MIT 🟡 maintained · 2025-02 Amortized variational inference for causal structure learning (NeurIPS 2022), predicting causal graphs directly from data via a trained neural network.
benchpress Python · R GPL-2.0 🟢 active · 2026-05 Snakemake workflow to run, develop and benchmark causal-discovery/structure-learning algorithms across many libraries (bnlearn, pcalg, causal-learn, gCastle, Tetrad, etc.) with data generators and metrics.
bnlearn (Python) Python MIT 🟢 active · 2026-03 Independent Python package (built on pgmpy) for Bayesian network structure learning, parameter learning, inference and sampling.
bnlearn (R) R GPL-2.0-or-later 🟡 maintained · 2025-08 Widely used R package for Bayesian network structure learning (constraint-based, score-based, hybrid), parameter learning and inference.
Causal Discovery Toolbox (CDT) Python MIT 🟡 maintained · 2025-10 Python package for graph and pairwise causal discovery, bridging to R packages (pcalg, bnlearn) and providing deep-learning-based methods.
causal-cmd Java unverified 🟢 active · 2026-03 Command-line interface wrapping the Tetrad causal-discovery algorithms for running searches on data files from a shell.
causal-learn Python MIT 🟢 active · 2026-06 Comprehensive Python library of classic and state-of-the-art causal discovery algorithms (PC, FCI, GES, LiNGAM, Granger, etc.) for learning causal structure from observational data.
causaldag Python BSD-3-Clause 🔴 dormant · 2023 Python package for creating, manipulating and learning causal DAGs, including GSP/IGSP permutation-based and interventional structure-learning algorithms.
CausalDisco Python BSD-3-Clause 🔴 dormant · 2023-11 Python package of baseline causal-discovery algorithms and analytics tools (varsortability, sortnregress) for benchmarking structure learning.
CausalNex Python Apache-2.0 🟡 maintained · 2024-06 Python library for learning Bayesian network structure (NOTEARS-based) and reasoning about causal relationships for decision-making.
causica Python MIT 🟡 maintained · 2024-12 Microsoft's deep-learning library for end-to-end causal discovery and inference, including the DECI (Deep End-to-end Causal Inference) model.
DAGMA Python Apache-2.0 🟡 maintained · 2024-01 Python package that learns DAGs via continuous optimization, using an M-matrix log-determinant characterization of acyclicity.
dodiscover Python MIT 🟢 active · 2026-05 PyWhy's experimental causal discovery package providing constraint-based and other global structure-learning algorithms with a scikit-learn-style API.
gCastle Python Apache-2.0 🟢 active · 2026-06 Python causal structure learning toolbox emphasizing gradient-based methods (NOTEARS, GraN-DAG, etc.) plus data simulators and SHD/F1 evaluation metrics.
gimme R GPL-2.0-or-later 🟢 active · 2026-03 R package (Group Iterative Multiple Model Estimation) that recovers group- and individual-level directed contemporaneous/lagged network structure from time series via unified SEM search.
LiNGAM Python MIT 🟢 active · 2026-05 Python package implementing the LiNGAM family (ICA-LiNGAM, DirectLiNGAM, VAR-LiNGAM, RCD, etc.) for causal discovery in linear non-Gaussian models.
NOTEARS Python Apache-2.0 🟢 active · 2026-05 Reference implementation of NO TEARS, casting DAG structure learning as a continuous optimization with a smooth acyclicity constraint.
pcalg R GPL-2.0-or-later 🟡 maintained · 2024-09 Canonical R package for graphical-model causal structure learning (PC, FCI, RFCI, GIES) and causal effect estimation (IDA).
pgmpy Python MIT 🟢 active · 2026-06 Python toolkit for probabilistic graphical models with Bayesian network structure learning (PC, Hill-Climb, etc.), parameter learning, inference and causal reasoning.
py-tetrad Python · Java MIT 🟢 active · 2026-05 Python interface (via JPype) exposing the Java Tetrad causal-discovery algorithms in Python workflows.
pyAgrum / aGrUM Python · C++ LGPL-3.0-or-MIT 🟢 active · 2026-01 C++/Python library for probabilistic graphical models (Bayesian networks) with structure learning and causal do-calculus support.
pywhy-graphs Python MIT 🟢 active · 2026-05 NetworkX-compliant causal graph data structures (ADMG, PAG, CPDAG) underpinning the PyWhy causal-discovery ecosystem.
Tetrad Java GPL-3.0 🟢 active · 2026-06 Long-running Java toolkit and GUI for causal discovery and graphical-causal-model search, the reference implementation of many constraint- and score-based algorithms.
tigramite Python GPL-3.0 🟢 active · 2026-01 Python package for causal discovery in time series via the PCMCI/PCMCI+/LPCMCI family of conditional-independence-based algorithms.
typed-DAG (t-DAG) Python Apache-2.0 🔴 dormant · 2023-07 Reference implementation of causal discovery with typed directed acyclic graphs, integrating variable-type knowledge into structure learning.

Autonomous research & data-science agents (51)

Tool Lang License Status What it does
Agent Laboratory Python MIT 🟡 maintained · 2025-08 End-to-end autonomous research workflow with literature-review, experimentation, and report-writing phases (and AgentRxiv shared-preprint collaboration) to turn a human research idea into a paper plus code.
Agentic Data Scientist Python MIT 🟢 active · 2026-05 An adaptive multi-agent framework (Google ADK + Claude Agent SDK) that separates planning from execution with continuous validation to complete end-to-end data-science tasks.
AI Data Science Team Python MIT 🟢 active · 2025-12 A library of specialized LLM agents (data cleaning, EDA, feature engineering, SQL, H2O AutoML, visualization) orchestrated by a supervisor to automate common data-science tasks.
AI-Researcher Python unverified 🟡 maintained · 2025-10 Fully autonomous research system (NeurIPS 2025) that runs the whole pipeline from literature review and idea generation through algorithm implementation to manuscript writing, primarily for AI/ML research.
AIDE (aideml) Python MIT 🟢 active · 2026-05 Tree-search ML-engineering agent that autonomously drafts, debugs, and benchmarks code to maximize a user-defined metric, reaching strong Kaggle/MLE-bench performance. (Overlaps with the data-science agent bucket.)
Auto-Analyst TypeScript · Python MIT 🟢 active · 2026-05 A modular multi-agent AI data-scientist platform (DSPy-based) automating cleaning, statistical analysis, scikit-learn modeling, and Plotly visualization.
Auto-Deep-Research Python MIT 🟡 maintained · 2025-02 A cost-efficient open Deep Research alternative (built on the AutoAgent framework) that autonomously gathers and synthesizes web information; reports strong results on the GAIA benchmark.
AutoGluon Assistant (MLZero) Python Apache-2.0 🟢 active · 2026-03 A multi-agent system that transforms raw multimodal data (tabular, image, text, audio) into trained ML solutions end-to-end with zero human intervention, using MCTS-guided code generation over AutoGluon.
AutoKaggle Python Apache-2.0 🟡 maintained · 2024-12 A multi-agent framework with five cooperating agents that autonomously complete Kaggle tabular competitions across six pipeline phases.
AutoMind Python MIT 🟢 active · 2025-10 An adaptive, knowledge-grounded data-science agent using an expert knowledge base plus agentic tree search to build ML pipelines (reported to outperform AIDE on MLE-bench).
AutoResearchClaw Python MIT 🟢 active · 2026-06 Self-reinforcing 23-stage autonomous research pipeline (literature discovery, multi-agent hypothesis debate, sandboxed self-healing experiments, peer review, LaTeX export) that turns an idea into a conference-ready paper.
AutoSurvey Python unverified 🔴 dormant · 2025-02 NeurIPS 2024 method that automatically writes comprehensive literature surveys via retrieval, parallel subsection drafting by specialized LLMs, and iterative refinement with automated evaluation.
Aviary Python Apache-2.0 🟢 active · 2026-06 Gym-style framework of language-agent environments for challenging scientific tasks (literature QA, DNA manipulation, protein engineering), used to build and train autonomous research agents.
Biomni Python Apache-2.0 🟢 active · 2025-10 A general-purpose autonomous biomedical research agent combining LLM reasoning, retrieval-augmented planning, and code execution over a large library of biomedical tools.
ChemCrow Python MIT 🔴 dormant · 2024-03 An LLM agent augmented with chemistry tools (RDKit, paper-qa, reaction/retrosynthesis databases) that autonomously solves reasoning-intensive chemistry tasks.
Coscientist Python Apache-2.0 (Commons Clause) 🔴 dormant An LLM-driven autonomous lab agent (from the Nature paper) that plans, designs, and optimizes chemical experiments and synthesis.
Curie Python Apache-2.0 🟡 maintained · 2025-09 AI agent framework for rigorous, automated scientific experimentation that handles the full hypothesis-to-analysis loop (experiment design, environment setup, execution, analysis) with reproducibility guarantees.
CycleResearcher Python unverified 🟢 active · 2026-03 Open-source ecosystem of trained models (CycleResearcher + CycleReviewer) that iteratively generate research papers and improve them via automated peer review, focused on ML research.
Data Formulator TypeScript · Python MIT 🟢 active · 2026-05 An AI tool with data-loading, exploration, and chart-style-refinement agents that transform and visualize data via a blend of UI interactions and natural language.
Data-Copilot Python MIT 🔴 dormant · 2023 An LLM agent that self-designs interface tools then dispatches them to autonomously query, process, analyze, and visualize (financial) data.
data-to-paper Python MIT 🟡 maintained · 2025-07 Multi-agent system that goes from a raw dataset and research goal to a verifiable, data-traceable scientific paper, emphasizing reproducibility in data-driven (e.g. biomedical/clinical) research.
DataMind Python Apache-2.0 🟢 active · 2026-06 An open data-synthesis + agent-training recipe yielding generalist data-analytic LLMs (DataMind-7B/14B) that do multi-step, code-based reasoning over CSV/Excel/SQLite.
deep-research (dzhng) TypeScript MIT 🟢 active · 2026-04 A compact open-source deep-research agent that recursively searches, scrapes, and reasons over the web to produce reports, tracking goals across iterations.
DeepAnalyze Python MIT 🟢 active · 2026-03 An agentic LLM (DeepAnalyze-8B) that autonomously runs the end-to-end data-science pipeline from raw structured/semi-structured/unstructured data to analyst-grade research reports.
DeepEye Python · TypeScript Apache-2.0 🟢 active · 2026-05 A production-ready 'self-driving' data agent system that autonomously orchestrates multi-step workflows to produce dashboards, analytical reports, and data videos from heterogeneous data.
DS-Agent Python unverified 🔴 dormant · 2024 An ICML'24 data-science agent that uses case-based reasoning over Kaggle expert knowledge to iteratively build and train ML models across tabular/text/time-series.
freephdlabor Python MIT 🟢 active · 2026-05 Customizable multi-agent framework (ManagerAgent orchestrating Ideation/Experiment/Writeup agents) for building personalized systems that run continuous autonomous research toward publication-grade reports.
GPT Researcher Python · TypeScript Apache-2.0 🟢 active · 2026-05 Autonomous deep-research agent that plans sub-questions, scrapes and aggregates many web/local sources, and synthesizes a long-form cited research report. (Also relevant to the data-science/deep-research bucket.)
Jupyter AI Python BSD-3-Clause 🟢 active · 2026-04 A JupyterLab extension (v3) connecting agentic AI models to notebooks so they can read/write files, run code, and act via a built-in MCP server for data work.
LIDA Python MIT 🔴 dormant · 2024-03 An LLM agent that automatically summarizes data, generates analysis goals, and writes/executes/edits visualization code (treating viz as code) across visualization grammars.
MetaGPT (Data Interpreter / SELA) Python MIT 🟢 active · 2026-01 Multi-agent framework whose Data Interpreter (and SELA tree-search AutoML extension) agent plans, writes, and self-debugs code to solve data-analysis, ML, and modeling tasks.
MLE-Agent Python MIT 🔴 dormant · 2024-10 An AI companion that autonomously builds ML/AI baselines and end-to-end solutions (incl. Kaggle) with integrated arXiv/paper search.
MLR-Copilot Python unverified 🟡 maintained · 2025-03 Machine-learning research assistant framework where LLM agents autonomously generate research ideas from papers and implement/execute the corresponding experiments.
Open Deep Research (LangChain) Python MIT 🟢 active · 2025-08 A configurable, fully open-source deep-research agent (LangGraph-based) that works across many model/search providers; appears in the Deep Research Bench rankings.
Open Deep Research (nickscamara/Firecrawl) TypeScript Apache-2.0 🟡 maintained · 2025-02 An open Deep Research clone that reasons over large amounts of web data extracted via Firecrawl to generate research analyses.
Open Interpreter Python AGPL-3.0 🔴 dormant · 2024-10 A natural-language code-execution agent that runs Python/shell locally to plot, clean, and analyze datasets (and general computer tasks), with human approval of generated code.
OpenResearcher Python Apache-2.0 🔴 dormant · 2024-10 AI research-assistant platform that uses retrieval-augmented generation over scientific literature to autonomously answer research questions, summarize, and recommend papers with source citations.
PandasAI Python MIT 🟢 active · 2025-10 A conversational data-analysis agent that turns natural-language questions over CSV/SQL/parquet data lakes into executed analysis code and charts.
Paper2Code (PaperCoder) Python Apache-2.0 🟢 active · 2026-03 Multi-agent LLM system that autonomously converts an ML research paper into a faithful, runnable code repository via planning, analysis, and generation stages.
PaperQA2 Python Apache-2.0 🟢 active · 2026-03 Agentic high-accuracy RAG system over full-text scientific literature that autonomously retrieves, ranks, and synthesizes cited answers and literature summaries, with reported superhuman accuracy on QA/contradiction tasks.
RD-Agent Python MIT 🟢 active · 2026-05 Microsoft's R&D automation framework that iteratively proposes hypotheses and implements/evolves them as code, targeting data-driven R&D such as quantitative finance factor/model discovery and ML engineering.
ResearchAgent Python unverified 🟡 maintained · 2025-08 LLM system (NAACL 2025) that iteratively generates research problems, methods, and experiment designs grounded in an academic citation graph, refined by collaborating reviewing agents.
Robin Python Apache-2.0 🟢 active · 2026-04 Multi-agent system (built on Aviary/PaperQA) that automates therapeutics discovery by generating hypotheses, proposing experiments, and analyzing experimental data, demonstrated by identifying a novel dry-AMD drug candidate.
STORM / Co-STORM Python MIT 🟡 maintained · 2025-09 LLM knowledge-curation system that researches a topic via multi-perspective simulated expert conversations and web search to autonomously synthesize a full, Wikipedia-style cited report (Co-STORM adds human-in-the-loop).
SurveyX Python unverified 🟡 maintained · 2026-01 Academic survey-automation system that takes a title and keywords and autonomously retrieves literature and generates a structured, cited survey paper (open-source release is offline-only; full service is hosted).
TableGPT Agent Python Apache-2.0 🟡 maintained · 2025-03 A LangGraph-based pre-built agent for the TableGPT2 model that answers analytical questions and runs code over tabular datasets.
TaskWeaver Python MIT 🔴 dormant · 2026-03 A code-first agent framework that plans and executes data-analytics tasks via generated Python, with stateful code/plugin memory (repo archived March 2026).
The AI Scientist Python AI Scientist Source Code License v1.0 (custom, Responsible-AI based) 🟡 maintained · 2025-12 Fully automated pipeline that generates ML research ideas, writes and runs experiment code, and drafts complete LaTeX papers with an automated reviewer, in machine-learning domains.
The AI Scientist-v2 Python AI Scientist Source Code License v1.0 (custom, Responsible-AI based) 🟡 maintained · 2025-12 Template-free successor to The AI Scientist that uses agentic tree search and an experiment-manager agent to autonomously produce workshop-level ML papers end-to-end.
The Virtual Lab Python MIT 🟡 maintained · 2025-12 Team of LLM agents (AI PI, domain researchers, scientific critic) that hold structured meetings to autonomously design scientific pipelines, demonstrated by designing new SARS-CoV-2 nanobodies.
Virtual Scientists (VirSci) Python Apache-2.0 🟡 maintained · 2025-07 Multi-agent 'science of science' system (ACL 2025) that simulates teams of scientist agents through team organization and inter/intra-team discussion to autonomously generate and evaluate novel research ideas.

MCP servers (data & stats execution) (48)

Tool Lang License Status Data source / what it serves
Academix Python MIT 🟢 active · 2026-02 Aggregator: OpenAlex, DBLP, Semantic Scholar, arXiv, Crossref
akshare-one MCP Python MIT 🟢 active · 2026-03 AKShare (Chinese stock market data)
Alpha Vantage MCP (calvernaz) Python Apache-2.0 🟢 active · 2026-02 Alpha Vantage (stocks, FX, crypto)
Alpha Vantage MCP Server (official) Python MIT 🟢 active · 2026-05 Alpha Vantage (stocks, FX, crypto, fundamentals)
ArXiv MCP Server Python Apache-2.0 🟢 active · 2026-05 arXiv (preprints)
BEA MCP Server (mcp-bea) TypeScript unverified 🟡 maintained · 2026-01 BEA (US Bureau of Economic Analysis, GDP/income)
bioRxiv MCP Server Python unverified 🟡 maintained · 2025-03 bioRxiv (biology preprints)
BLS Labor MCP Server TypeScript NOASSERTION 🟢 active · 2026-06 BLS (US Bureau of Labor Statistics)
Crossref MCP Server (JackKuo666) Python unverified 🟡 maintained · 2025-04 Crossref (DOI metadata, 150M+ works)
Data Commons Agent Toolkit (official MCP) Python Apache-2.0 🟢 active · 2026-06 Google Data Commons (unified public datasets)
Data.gov MCP Server JavaScript MIT 🟡 maintained · 2025-04 Data.gov (US government open data catalog)
doi-mcp (citation verifier) TypeScript unverified 🟢 active · 2026-05 Aggregator: Crossref, OpenAlex, etc. (citation verification by DOI)
Eurostat MCP (ano-kuhanathan) Python MIT 🟡 maintained · 2026-01 Eurostat (EU official statistics)
Eurostat MCP (dcerecedo) Python NOASSERTION 🟢 active · 2026-03 Eurostat (EU official statistics)
FinanceMCP (Tushare + Binance) JavaScript MIT 🟢 active · 2026-05 Tushare (China A-shares, macro) + Binance (crypto)
FRED MCP Server (stefanoamorelli) TypeScript AGPL-3.0 🟢 active · 2026-05 FRED (Federal Reserve Economic Data, 800k+ series)
IMF Data MCP Server Python Apache-2.0 🟢 active · 2026-04 IMF (data.imf.org SDMX API)
Jupyter MCP Server (Datalayer) Python BSD-3-Clause 🟢 active · 2026-05 MCP server for Jupyter that lets an agent execute notebook cells and run Python/code in a live kernel with multimodal output.
MCP-DBLP Python MIT 🟢 active · 2026-04 DBLP (computer-science bibliography)
mcp-fred (cfdude) Python unverified 🟢 active · 2026-03 FRED (Federal Reserve Economic Data)
mcp-stata (tmonk) Python AGPL-3.0 🟢 active · 2026-05 Lightweight Stata MCP server that executes commands, inspects data, retrieves stored r()/e() results, and views graphs in a chat interface.
mcptools (Model Context Protocol for R) R NOASSERTION 🟢 active · 2026-03 Posit's official R package that turns a running R session into an MCP server (and client) so agents can execute R code and call R functions as tools.
Nasdaq Data Link MCP Server Python MIT 🟡 maintained · 2025-10 Nasdaq Data Link / Quandl (alternative + financial time series)
OECD MCP Server TypeScript MIT 🟢 active · 2026-04 OECD (SDMX, 5,000+ datasets)
OpenAlex MCP (reetp14) TypeScript MIT 🟡 maintained · 2025-07 OpenAlex (scholarly works, authors, institutions)
OpenAlex Research MCP JavaScript MIT 🟢 active · 2026-05 OpenAlex (240M+ scholarly works)
OpenEcon Data MCP Server Python NOASSERTION 🟢 active · 2026-05 Aggregator: FRED, World Bank, IMF, Eurostat, BIS, UN Comtrade (330K indicators)
Paper Search MCP Python MIT 🟢 active · 2026-05 Aggregator: arXiv, PubMed, bioRxiv, Semantic Scholar, OpenAlex, Crossref, CORE, dblp, etc.
paper-distill-mcp Python AGPL-3.0 🟢 active · 2026-03 Scholarly sources (paper search/curation)
PubMed MCP Server (cyanheads) TypeScript Apache-2.0 🟢 active · 2026-06 PubMed + Europe PMC + Unpaywall (biomedical literature/full text)
PubMed MCP Server (JackKuo666) Python MIT 🟡 maintained · 2025-05 PubMed (35M+ biomedical citations)
pubmed-search-mcp (u9401066) Python NOASSERTION 🟢 active · 2026-05 Aggregator: PubMed, Europe PMC, CORE, OpenAlex (biomedical)
RMCP (R MCP Server) Python MIT 🟡 maintained · 2025-12 MCP server exposing 50+ R statistical-analysis tools (regression, econometrics, time series, ML) backed by CRAN packages for AI agents.
SEC EDGAR MCP Python AGPL-3.0 🟢 active · 2026-05 SEC EDGAR (US public-company filings, XBRL financials)
Semantic Scholar MCP Server (JackKuo666) Python unverified 🟡 maintained · 2025-03 Semantic Scholar (200M+ papers, citations)
Simple PubMed MCP (andybrandt) Python MIT 🟢 active · 2026-03 PubMed (biomedical literature)
Stata MCP (hanlulong) Python MIT 🟢 active · 2026-04 Stata MCP extension for VS Code, Cursor and Antigravity that executes Stata commands and do-files from an AI assistant.
Stata MCP (SepineTam / mcp-for-stata) Python AGPL-3.0 🟢 active · 2026-06 MCP server that lets an LLM agent write and run Stata regressions and econometric do-files for paper replication and hypothesis testing (repo renamed to mcp-for-stata).
StatsPAI MCP Server Python MIT 🟢 active · 2026-06 Agent-native causal inference and econometrics toolkit (DiD, IV, RDD, synth, DML, Bayesian, causal discovery) exposed as an MCP server with 900+ estimator tools.
Tushare MCP (buuzzy) Python MIT 🟢 active · 2026-02 Tushare (China A-shares financial data)
Unpaywall MCP Server TypeScript MIT 🟢 active · 2026-04 Unpaywall (open-access full text by DOI)
US Census Bureau Data API MCP (official) TypeScript CC0-1.0 🟢 active · 2026-03 US Census Bureau Data API (ACS, demographics)
US Government Open Data MCP TypeScript MIT 🟢 active · 2026-04 40+ US government APIs (Treasury, FRED, Congress, FDA, CDC, FEC)
World Bank Data360 MCP (official) Python NOASSERTION 🟢 active · 2026-06 World Bank Data360 (development indicators, 200+ countries)
World Bank Open Data MCP (anshumax) Python unverified 🟡 maintained · 2025-08 World Bank Open Data API
Yahoo Finance MCP (Alex2Yang97) Python MIT 🟢 active · 2026-03 Yahoo Finance (via yfinance)
yfinance MCP (narumiruna) Python MIT 🟢 active · 2026-06 Yahoo Finance (via yfinance)
Zotero MCP Python MIT 🟢 active · 2026-05 Zotero (personal reference library)

Benchmarks & datasets (9)

Tool Lang License Status What it does
ACIC Competition data (aciccomp) R unverified 🔴 dormant · 2020-07 R packages with the data-generating processes and simulated datasets (with known ground-truth effects) from the 2016/2017 Atlantic Causal Inference Conference competitions.
bnlearn Bayesian Network Repository R unverified 🟡 maintained · 2025 Curated collection of reference Bayesian networks (ASIA, ALARM, HEPAR2, etc.) with known structure, in multiple formats; a standard ground-truth benchmark for structure-learning evaluation.
CausalBench Python Apache-2.0 🟡 maintained · 2025-06 GSK benchmark suite (with curated single-cell perturbation datasets) for evaluating network/causal-graph inference methods from gene-perturbation data.
causaldata R · Python · Stata unverified 🟡 maintained · 2024-11 R/Python/Stata packages providing the example datasets (LaLonde, NSW, etc.) used in the textbooks The Effect, Causal Inference: The Mixtape, and What If.
CEVAE datasets (IHDP) Python unverified 🔴 dormant · 2020-07 Reference repo bundling the widely-cited IHDP (Infant Health and Development Program) semi-synthetic benchmark CSVs with known ground-truth treatment effects used across ITE papers.
JustCause Python MIT 🔴 dormant · 2020-03 Python framework providing standard causal-inference benchmark datasets (IHDP, IBM ACIC) plus synthetic-data generation and baseline comparison for evaluating ITE methods.
RealCause Python MIT 🔴 dormant · 2021-03 Realistic causal-inference benchmark that fits generative models to real data (LaLonde PSID/CPS, Twins) to produce samples with known ground-truth treatment effects.
Tübingen Cause-Effect Pairs unverified 🟡 maintained · 2023 Standard benchmark of ~100 bivariate cause-effect pairs with ground-truth causal direction for evaluating pairwise causal-discovery methods (Mooij et al. 2016).
WhyNot Python MIT 🔴 dormant · 2020-06 Python sandbox of dynamic simulators with known ground-truth causal effects for stress-testing causal-inference and sequential decision-making methods.

Inclusion ≠ endorsement. Licenses/activity were verified during curation but change over time; confirm upstream before relying on a tool in a high-trust context. To propose a tool, see README.md.
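
Because activity stamps are point-in-time snapshots, a status label can be re-derived from a row's last-activity stamp before relying on it. The sketch below is a minimal illustration, not the logic of scripts/build-tools-catalog.py: the function names and the 183-day and 730-day cut-offs are assumptions that approximate the legend at the top of this catalog (active ≈ within ~6 months, maintained ≈ within ~2 years, dormant ≈ older). Some labels in the catalog also reflect other signals, such as an archived repository, which this sketch does not model.

```python
from datetime import date

# Assumed cut-offs approximating the catalog legend (~6 months, ~2 years).
ACTIVE_DAYS = 183
MAINTAINED_DAYS = 730


def parse_last_activity(stamp: str) -> date:
    """Parse a 'YYYY-MM' or 'YYYY' stamp; a missing month defaults to January."""
    parts = [int(p) for p in stamp.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else 1
    return date(year, month, 1)


def classify_status(last_activity: str, today: date) -> str:
    """Return 'active', 'maintained', or 'dormant' from the stamp's age in days."""
    age_days = (today - parse_last_activity(last_activity)).days
    if age_days <= ACTIVE_DAYS:
        return "active"
    if age_days <= MAINTAINED_DAYS:
        return "maintained"
    return "dormant"


if __name__ == "__main__":
    # Hypothetical check date; pass date.today() for a live re-check.
    check_date = date(2026, 6, 15)
    for stamp in ("2026-05", "2024-12", "2019-07"):
        print(stamp, classify_status(stamp, check_date))
```

Passing the check date explicitly keeps the result reproducible. Rows without a stamp cannot be classified this way and still need a manual look upstream.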