Curated, license- and maintenance-aware index of software tools for automated empirical research and causal inference, distinct from the agent skills under `../skills/`. Source of truth: `tools.json`. Rebuild with `python3 scripts/build-tools-catalog.py`.
334 tools across 6 categories.
| Category | Count |
|---|---|
| Causal-inference & treatment-effect libraries | 31 |
| Econometrics & quasi-experimental libraries | 170 |
| Causal discovery / structure learning | 25 |
| Autonomous research & data-science agents | 51 |
| MCP servers (data & stats execution) | 48 |
| Benchmarks & datasets | 9 |

| By language | Tools | By maintenance | Tools | By license | Tools |
|---|---|---|---|---|---|
| Python | 164 | 🟢 active | 181 | permissive (MIT/BSD/Apache/…) | 183 |
| R | 109 | 🟡 maintained | 94 | copyleft (GPL/AGPL/LGPL/CeCILL) | 104 |
| Stata | 53 | 🔴 dormant | 59 | unverified / unmapped | 39 |
| TypeScript | 16 | | | proprietary / non-OSI / custom | 8 |
| Julia | 11 | | | | |
| C++ | 6 | | | | |
| Java | 3 | | | | |
| JavaScript | 3 | | | | |

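
The by-language counts above sum to 365, more than the 334 tools, which suggests that a multi-language tool (for example `R · C++`) is counted once under each of its languages. A minimal sketch of such a tally follows. The field names `category` and `langs` and the three sample records are assumptions for illustration; the real `tools.json` schema may differ.

```python
from collections import Counter

# Hypothetical records: field names are illustrative, not the real tools.json schema.
tools = [
    {"name": "DoWhy",
     "category": "Causal-inference & treatment-effect libraries",
     "langs": ["Python"]},
    {"name": "grf",
     "category": "Causal-inference & treatment-effect libraries",
     "langs": ["R", "C++"]},
    {"name": "rdrobust",
     "category": "Econometrics & quasi-experimental libraries",
     "langs": ["R", "Python", "Stata"]},
]


def tally(records):
    """Count tools per category (once each) and per language (once per language listed)."""
    by_category = Counter(r["category"] for r in records)
    by_language = Counter(lang for r in records for lang in r["langs"])
    return by_category, by_language


by_category, by_language = tally(tools)
print(sum(by_category.values()))  # one count per tool
print(sum(by_language.values()))  # one count per (tool, language) pair
```

With these sample records the category total equals the number of tools, while the language total is larger because `grf` and `rdrobust` each appear under several languages.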

`last_activity` and `stars_approx` are point-in-time snapshots from the curation pass (see `README.md` for caveats). Status: 🟢 active ≈ commit within ~6 months · 🟡 maintained ≈ within ~2 years · 🔴 dormant ≈ older.

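
The status rule above can be sketched as a small function. This is an illustration under stated assumptions, not the catalog's build script: the names `classify_status` and `months_between`, the `YYYY-MM` input format, and the reference date are hypothetical, and the cutoffs (6 and 24 months) are a literal reading of thresholds the text itself marks as approximate.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def classify_status(last_activity, as_of):
    """Map a 'YYYY-MM' last-activity string to a status label.

    Assumed cutoffs: <= 6 months is active, <= 24 months is maintained,
    anything older is dormant.
    """
    year, month = (int(part) for part in last_activity.split("-"))
    age = months_between(date(year, month, 1), as_of)
    if age <= 6:
        return "active"
    if age <= 24:
        return "maintained"
    return "dormant"


# Hypothetical reference date standing in for the curation pass.
as_of = date(2026, 6, 1)
for stamp in ("2026-05", "2024-12", "2019-08"):
    print(stamp, classify_status(stamp, as_of))
```

Rows that carry only a year (such as `2025`) or no date at all would need separate handling, and labels assigned during curation may not match these cutoffs exactly.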

Causal-inference & treatment-effect libraries

| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| Ananke | Python | Apache-2.0 | 🟡 maintained · 2023-12 | Python package for causal inference using graphical models (DAGs, ADMGs, chain graphs) supporting nonparametric identification and semiparametric estimation under unmeasured confounding. |
| bartCause | R | GPL-2.0-or-later | 🟢 active · 2025-12 | R package for causal inference using Bayesian Additive Regression Trees (BART), fitting response/treatment models to estimate ATE/ATT/ITE. |
| bcf | R · C++ | unverified | 🟡 maintained · 2023-01 | Reference R implementation of Bayesian Causal Forests (Hahn-Murray-Carvalho) for heterogeneous treatment-effect estimation with regularized treatment-effect priors. |
| CATENets | Python | BSD-3-Clause | 🟡 maintained · 2023-08 | sklearn-style JAX/PyTorch implementations of neural-network CATE estimators including TARNet, CFRNet, DragonNet, SNet, FlexTENet, and NN meta-learners. |
| causal-curve | Python | MIT | 🟡 maintained · 2024-05 | Python package for estimating causal dose-response curves (continuous-treatment effects) from observational data with confidence intervals. |
| CausalImpact | R | Apache-2.0 | 🟢 active · 2026-03 | Google's R package estimating the causal effect of an intervention on a time series using a Bayesian structural time-series counterfactual model. |
| Causalinference | Python | BSD-3-Clause | 🟡 maintained · 2025-06 | Classic Python package for treatment-effect estimation via propensity-score estimation, trimming, subclassification, matching, weighting, and least-squares. |
| causallib | Python | Apache-2.0 | 🟢 active · 2026-05 | IBM's scikit-learn-style package for estimating causal effects from observational data via IPW, standardization, doubly-robust (AIPW), and matching estimators. |
| CausalLift | Python | BSD-2-Clause | 🔴 dormant · 2019-08 | T-learner-based uplift modeling package for identifying which customers to treat, usable with both A/B-test and observational data. |
| CausalML | Python | Apache-2.0 | 🟢 active · 2026-05 | Uber's uplift modeling and causal ML toolkit providing CATE/ITE estimation via S/T/X/R meta-learners, uplift trees/forests, and tree-based treatment selection. |
| CausalPy | Python | Apache-2.0 | 🟢 active · 2026-05 | PyMC Labs package for Bayesian (and OLS) causal inference in quasi-experimental designs including synthetic control, interrupted time series, difference-in-differences, regression discontinuity, and instrumental variables. |
| causalToolbox | R | GPL-3.0 | 🔴 dormant · 2021 | R toolbox for heterogeneous treatment effects implementing S/T/X/M/DR meta-learners with honest random forests and BART base learners (now mirrored at forestry-labs/causalToolbox). |
| CausalTune | Python | Apache-2.0 | 🟡 maintained · 2024-12 | AutoML library for automated tuning and out-of-sample (energy-score) selection of causal estimators wrapping EconML/DoWhy via FLAML. |
| DoubleML (Python) | Python | BSD-3-Clause | 🟢 active · 2026-05 | Object-oriented implementation of the double/debiased machine learning framework on top of scikit-learn for partially linear, IV, and interactive regression models. |
| DoubleML (R) | R | MIT | 🟢 active · 2026-05 | R implementation of the double/debiased machine learning framework built on the mlr3 ecosystem for orthogonal-score estimation of treatment effects. |
| DoWhy | Python | MIT | 🟢 active · 2025-11 | End-to-end Python causal inference library that models assumptions as a causal graph and provides a four-step identify/estimate/refute API with refutation-based robustness tests. |
| EconML | Python | MIT | 🟢 active · 2026-06 | Microsoft ALICE project package for estimating heterogeneous treatment effects (CATE) from observational data using double machine learning, orthogonal/causal forests, DRLearner, DeepIV and meta-learners. |
| grf | R · C++ | GPL-3.0 | 🟢 active · 2026-04 | Generalized Random Forests for nonparametric heterogeneous treatment-effect estimation (causal forests), including IV, multi-arm, and survival forests with honest confidence intervals. |
| ltmle | R | GPL-2.0 | 🟡 maintained · 2023-04 | R package for longitudinal targeted maximum likelihood estimation (and IPTW/G-computation) of treatment/censoring-specific mean outcomes and marginal structural models. |
| MendelianRandomization | R | GPL-2.0-or-later | 🟢 active · 2024-04 | CRAN R package implementing many summary-data Mendelian randomization methods (IVW, MR-Egger, median, mode, contamination-mixture, cML, debiased IVW) for causal effect estimation. |
| metalearners | Python | BSD-3-Clause | 🟢 active · 2025-06 | QuantCo's library for CATE estimation with S/T/X/R/DR meta-learners featuring sound cross-fitting, multi-treatment support, and SHAP/optuna integrations. |
| policytree | R · C++ | MIT | 🟢 active · 2026-02 | R package learning optimal shallow decision-tree treatment policies via doubly-robust empirical welfare maximization using grf scores. |
| pylift | Python | BSD-2-Clause | 🔴 dormant · 2022-11 | Wayfair's uplift modeling package implementing the Transformed Outcome method with uplift evaluation/visualization tools (repository archived). |
| scikit-uplift | Python | MIT | 🔴 dormant · 2022-08 | scikit-learn-style uplift modeling package providing solo-model/two-model/class-transformation approaches plus uplift metrics and visualizations. |
| stochtree | Python · R · C++ | MIT | 🟢 active · 2026-05 | Stochastic tree ensembles (BART/XBART/BCF) in R and Python for supervised learning and Bayesian Causal Forest treatment-effect estimation. |
| tfcausalimpact | Python | Apache-2.0 | 🟡 maintained · 2025-01 | Python port of Google's CausalImpact built on TensorFlow Probability for Bayesian structural time-series intervention analysis. |
| tmle (R) | R | BSD-3-Clause | 🟢 active · 2025-08 | Gruber & van der Laan's R package for targeted maximum likelihood estimation of ATE/ATT/ATC for a binary point treatment with SuperLearner-based nuisance estimation. |
| tmle3 | R | GPL-3.0 | 🔴 dormant · 2021-03 | Generalized targeted learning (TMLE) framework from the tlverse providing a unified interface for estimating a range of causal target parameters. |
| TwoSampleMR | R | MIT | 🟢 active · 2026-05 | R package for two-sample Mendelian randomization using GWAS summary data, interfacing the IEU OpenGWAS database with IVW, MR-Egger, median and mode estimators. |
| UpliftML | Python | Apache-2.0 | 🔴 dormant · 2022-12 | Booking.com's scalable uplift modeling package with PySpark/H2O implementations of metalearners, uplift random forests, and retrospective/constrained estimation. |
| zEpid | Python | MIT | 🟡 maintained · 2022-10 | Epidemiology analysis package with causal inference estimators including IPTW/AIPW, g-formula (parametric and Monte Carlo), and TMLE. |

Econometrics & quasi-experimental libraries

| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| access | Python | BSD-3-Clause | 🟢 active · 2025-12 | Classical and novel spatial accessibility-to-services measures (floating catchment, gravity, RAAM) within the PySAL ecosystem. |
| admetan | Stata | GPL-3.0-only | 🔴 dormant · 2019-02 | Stata module providing comprehensive aggregate-data meta-analysis and forest plots; deprecated since 2020 in favor of metan/ipdmetan. |
| AER | R | GPL-2.0-or-later | 🟢 active · 2026-02 | Applied Econometrics with R companion package providing IV (ivreg), tobit, count and other econometric estimators and datasets. |
| allsynth | Stata | GPL-3.0 | 🟡 maintained | Wrapper around synth automating bias-correction, in-space placebo inference and stacked multi-unit synthetic control (Wiltshire). |
| anesrake | R | GPL-2.0-or-later | 🔴 dormant · 2018-04 | Implements ANES-style iterative raking to weight survey data to known target population margins with automatic variable selection. |
| ARDL | R | GPL-3.0-only | 🟢 active · 2026-05 | Builds ARDL and unrestricted/restricted error-correction models and runs the Pesaran-Shin-Smith (2001) bounds test for cointegration. |
| augsynth | R | MIT | 🟡 maintained · 2024 | Augmented synthetic control method (and multisynth for staggered adoption) that de-biases SCM when pre-treatment fit is imperfect. |
| AutoregressiveModels.jl | Julia | MIT | 🟡 maintained · 2024-04 | Julia toolkit for vector autoregressions with OLS estimation and structural impulse-response computation with bootstrap confidence bands. |
| autumn | R | MIT | 🟡 maintained · 2024-01 | Performs fast, tidy-friendly iterative proportional fitting (raking) to generate survey weights matching target population distributions. |
| bacondecomp | R | MIT | 🔴 dormant · 2020-01 | Goodman-Bacon decomposition of two-way fixed-effects DiD estimates into their underlying 2x2 comparison weights. |
| balance | Python | MIT | 🟢 active · 2026-06 | Workflow and methods (IPW, raking, post-stratification) for adjusting biased samples to infer about a target population. |
| bayesmeta | R | GPL-2.0-or-later | 🟡 maintained · 2025-08 | Bayesian random-effects meta-analysis and meta-regression, returning posterior and predictive distributions and shrinkage estimates. |
| binscatter | Stata | unverified | 🟡 maintained | Generates binned scatterplots to visualize conditional means / OLS relationships in Stata. |
| binsreg | R · Python · Stata | GPL-3.0 | 🟢 active · 2026-05 | Binscatter least-squares, quantile and GLM regression with valid confidence bands and shape-restriction tests. |
| boottest | Stata | GPL-3.0 | 🟢 active · 2026-04 | Fast wild bootstrap (null-imposed) and score bootstrap for cluster-robust inference with few clusters in Stata. |
| brms | R | GPL-2.0-only | 🟢 active · 2026-06 | Fits Bayesian generalized (non-)linear multivariate multilevel models via Stan, widely used as the multilevel-regression engine for MRP. |
| BVAR | R | GPL-3.0-only | 🟡 maintained · 2024-02 | Hierarchical Bayesian VAR estimation with Giannone-Lenza-Primiceri conjugate-prior selection, computing impulse responses, forecasts and FEVD. |
| bvarsv | R | GPL-2.0-or-later | 🔴 dormant · 2015-10 | Bayesian estimation of a time-varying-parameter VAR with stochastic volatility (Primiceri 2005) for posterior predictive densities and impulse responses. |
| cem | R | GPL-2.0 | 🔴 dormant · 2022-09 | Coarsened exact matching for reducing imbalance between treatment and control groups in observational data. |
| clubSandwich | R | GPL-3.0 | 🟢 active · 2026-05 | Cluster-robust (CR2) variance estimators with small-sample corrections and Satterthwaite/Wald hypothesis tests. |
| cobalt | R | GPL-2.0-or-later | 🟢 active · 2026-05 | Balance tables and love-plots for samples preprocessed by matching, weighting or subclassification. |
| coefplot | Stata | MIT | 🟢 active · 2025-08 | Plots coefficients/confidence intervals from estimation results or matrices (widely used for event-study graphs). |
| compute.es | R | GPL-2.0-or-later | 🟢 active · 2026-01 | Converts a wide range of test statistics into effect sizes (d, g, r, z', OR) with variances, CIs, and p-values for meta-analysis. |
| csdid (Python) | Python | MIT | 🟢 active · 2025 | Python port of the Callaway & Sant'Anna group-time ATT estimator for staggered DiD. |
| csdid (Stata) | Stata | MIT | 🟢 active · 2025 | Stata implementation of Callaway & Sant'Anna group-time ATTs with panel and repeated cross-section support (Rios-Avila). |
| csdid2 | Stata | MIT | 🟢 active · 2025 | Faster all-Mata reimplementation of csdid (Callaway-Sant'Anna staggered DiD) with extended functionality. |
| designmatch | R | GPL-2.0-or-later | 🟢 active · 2026-02 | Constructs matched samples that are balanced and representative by design via mixed-integer programming (cardinality/optimal matching). |
| did | R | GPL-2.0 | 🟢 active · 2025-12 | Implements Callaway & Sant'Anna group-time average treatment effects for staggered difference-in-differences with multiple periods. |
| did2s | R | MIT | 🟢 active · 2026-03 | Two-stage difference-in-differences estimator (Gardner 2021) robust to heterogeneous treatment effects under staggered adoption. |
| did_imputation | Stata | GPL-3.0 | 🟡 maintained · 2024 | Borusyak, Jaravel & Spiess imputation estimator and event-study plotting (did_imputation/event_plot) for staggered DiD in Stata. |
| didimputation | R | MIT | 🟡 maintained · 2024 | Imputation-based DiD estimator of Borusyak, Jaravel & Spiess (2021/2024) for staggered treatment timing. |
| DIDmultiplegt | R · Stata | MIT | 🟢 active · 2026-02 | de Chaisemartin & D'Haultfoeuille heterogeneity-robust DiD estimators for multiple groups, periods and non-binary treatments (original version). |
| DIDmultiplegtDYN | R · Stata | MIT | 🟢 active · 2026-05 | Dynamic (event-study) heterogeneity-robust DiD estimator allowing treatments that switch on and off multiple times. |
| differences | Python | GPL-3.0 | 🟢 active · 2026-04 | Difference-in-differences estimation in Python (Callaway-Sant'Anna and related estimators) for staggered adoption with heterogeneous effects. |
| dmetar | R | GPL-3.0-only | 🟡 maintained · 2025-05 | Companion package of helper functions for the 'Doing Meta-Analysis in R' guide, extending meta, metafor, and netmeta with diagnostics and visualizations. |
| drdid (Stata) | Stata | MIT | 🟢 active · 2025 | Doubly-robust difference-in-differences estimators (Sant'Anna & Zhao 2020) for Stata; the building block for csdid. |
| dynamac | R | GPL-2.0-or-later | 🔴 dormant · 2022-11 | Estimates single-equation ARDL/error-correction models, dynamically simulates and plots their responses, and tests for cointegration (Jordan & Philips). |
| ebal | R | GPL-2.0-or-later | 🟢 active · 2026-04 | Entropy balancing reweighting so covariate moments match user-specified targets in observational studies (Hainmueller). |
| Econometrics.jl | Julia | ISC | 🟡 maintained · 2024-12 | General econometrics package for Julia covering panel models, IV and discrete-choice estimators. |
| esc | R | GPL-3.0-only | 🔴 dormant · 2023-09 | Computes effect sizes and their variances (d, g, r, OR, etc.) from diverse reported statistics for use in meta-analysis. |
| esda | Python | BSD-3-Clause | 🟢 active · 2026-03 | Exploratory spatial data analysis: global and local autocorrelation (Moran's I, Geary, Getis-Ord, local Moran/LISA) for continuous and binary areal data. |
| estimatr | R | MIT | 🟡 maintained · 2025-02 | Fast design-based OLS/IV estimators (lm_robust, iv_robust, difference_in_means) with robust and cluster-robust standard errors. |
| estout (esttab) | Stata | MIT | 🟢 active · 2026-04 | Produces publication-quality regression tables (esttab/estout) exportable to LaTeX, RTF, HTML and CSV. |
| etwfe | R | MIT | 🟢 active · 2026-03 | Extended two-way fixed effects (Wooldridge) DiD via saturated cohort-by-time interactions plus marginal-effects aggregation. |
| eventstudyinteract | Stata | MIT | 🟡 maintained · 2023 | Sun & Abraham interaction-weighted event-study estimator robust to heterogeneous treatment effects under staggered timing. |
| eventstudyr | R | MIT | 🟢 active · 2026-04 | Estimates and plots linear panel event-study models following Freyaldenhoven et al., including sup-t bands and pre-trend tests. |
| fect | R | MIT | 🟢 active · 2026-05 | Counterfactual estimators for causal panel analysis (two-way FE, interactive fixed effects, matrix completion) with diagnostic tests. |
| FixedEffectModels.jl | Julia | MIT | 🟢 active · 2026-04 | Estimates linear models with high-dimensional fixed effects and instrumental variables in Julia (reghdfe/fixest analog). |
| fixest | R | GPL-3.0 | 🟢 active · 2026-05 | Fast and user-friendly estimation of OLS, GLM and IV models with multiple high-dimensional fixed effects, with built-in clustered/robust inference and event-study tooling. |
| ftools | Stata | MIT | 🟢 active · 2026-01 | Fast Mata-based data manipulation backend (collapse/merge/egen) that powers reghdfe and other Stata commands. |
| fwildclusterboot | R | GPL-3.0 | 🔴 dormant · 2023-07 | Fast wild cluster bootstrap inference for OLS/IV with few clusters (R port of boottest); archived on CRAN, source remains on GitHub. |
| GeoDa | C++ | GPL-3.0-only | 🟢 active · 2025-09 | Cross-platform desktop GUI for exploratory spatial data analysis, LISA mapping, spatial weights and basic spatial regression on lattice data. |
| giddy | Python | BSD-3-Clause | 🟢 active · 2025-12 | Geospatial distribution dynamics: spatial Markov chains, rank/mobility and directional LISA analysis of longitudinal spatial data. |
| GLFixedEffectModels.jl | Julia | MIT | 🟢 active · 2026-03 | Estimates GLMs (logit, Poisson, etc.) with high-dimensional fixed effects in Julia (ppmlhdfe analog). |
| gsynth | R | MIT | 🟢 active · 2026-03 | Generalized synthetic control imputing counterfactuals via interactive fixed-effects models, supporting multiple treated units and staggered timing. |
| gtools | Stata | MIT | 🟡 maintained · 2024-06 | C-plugin accelerated versions of common Stata data commands (collapse, egen, reshape, pctile) used in large-panel workflows. |
| HonestDiD | R | MIT | 🟢 active · 2026-04 | Robust inference and sensitivity analysis for DiD/event-study designs under relaxations of the parallel-trends assumption (Rambachan & Roth). |
| ipdmetan | Stata | GPL-3.0-only | 🔴 dormant · 2022-10 | Stata module for two-stage individual-participant-data meta-analysis with subgroup and forest-plot support. |
| ipfn | Python | MIT | 🟡 maintained · 2024-05 | Implements N-dimensional iterative proportional fitting to adjust a data matrix so its margins match specified target totals. |
| ipfraking | Stata | GPL-3.0-only | 🔴 dormant · 2018-05 | Stata module performing iterative proportional fitting (raking) to calibrate complex survey weights to control totals with trimming and diagnostics. |
| ivmodel | R | GPL-2.0 | 🔴 dormant · 2023-04 | IV estimation with weak-instrument-robust inference (AR, CLR), power and sensitivity analysis for a single endogenous regressor. |
| ivreg | R | GPL-2.0-or-later | 🟢 active · 2026-03 | Instrumental-variables (2SLS/2SM/2SMM) regression with weak-instrument and endogeneity diagnostics. |
| ivreg2 | Stata | GPL-3.0 | 🟡 maintained · 2024-08 | Extended IV/2SLS/LIML/GMM estimation with weak-instrument and overidentification diagnostics (Baum, Schaffer & Stillman). |
| ivreghdfe | Stata | MIT | 🟢 active · 2025-12 | Combines ivreg2 and reghdfe to run IV/2SLS/GMM regressions with many high-dimensional fixed effects. |
| kmatch | Stata | MIT | 🟢 active · 2026-02 | Multivariate-distance and propensity-score matching with entropy balancing, IPW, CEM and regression adjustment. |
| lfe | R | Apache-2.0 | 🟡 maintained · 2025-02 | Estimates linear models with multiple high-dimensional group fixed effects (and IV) by transforming away factors before OLS. |
| libpysal | Python | BSD-3-Clause | 🟢 active · 2026-01 | Core PySAL components: spatial weights construction, computational geometry, graphs, and I/O underpinning the spatial-econometrics stack. |
| linearmodels | Python | NCSA | 🟢 active · 2025-10 | Panel (fixed/random effects), IV/2SLS-GMM, system and asset-pricing estimators missing from statsmodels. |
| localprojections | Python | MIT | 🔴 dormant · 2023-09 | Implements Jordà (2005) local-projection impulse responses for single-entity time series and panel data, including threshold/state-dependent variants. |
| LocalProjections.jl | Julia | MIT | 🟡 maintained · 2024-04 | Julia implementation of local-projection methods for impulse-response estimation, including lag-augmented and smoothed local projections. |
| locproj | Stata | GPL-3.0-only | 🟢 active · 2026-02 | Stata (SSC) command estimating linear and nonlinear local-projection IRFs for time-series and panel data, supporting IV and quantile-regression variants (Ugarte Ruiz). |
| lpirfs | R | GPL-2.0-or-later | 🟢 active · 2025-12 | Estimates linear and nonlinear (state-dependent) impulse responses via Jordà (2005) local projections for time-series and panel data, with identified-shock and IV options. |
| marginaleffects | R | GPL-3.0 | 🟢 active · 2026-02 | Computes predictions, marginal effects/slopes, comparisons and marginal means with delta-method or simulation inference for 100+ model classes. |
| MatchIt | R | GPL-2.0-or-later | 🟢 active · 2025-05 | Unified interface to nearest-neighbor, optimal, full, genetic and coarsened-exact matching for covariate balance in observational studies. |
| meta (R) | R | GPL-2.0 | 🟢 active · 2026-05 | Standard meta-analysis methods including fixed/random-effects models, meta-regression, bias tests, and forest/funnel plots. |
| meta (Stata) | Stata | proprietary | 🟢 active · 2026-06 | Stata's built-in meta suite for fixed/random-effects meta-analysis, meta-regression, forest/funnel plots, and small-study-effect tests. |
| metabias | Stata | GPL-3.0-only | 🔴 dormant · 2010-12 | Stata module testing for small-study effects / funnel-plot asymmetry (Egger, Begg, Harbord tests) in meta-analysis. |
| metafor | R | GPL-2.0-or-later | 🟢 active · 2026-05 | Comprehensive R package for conducting meta-analyses, including effect-size computation, fixed/random/mixed-effects models, moderators, and forest/funnel plots. |
| metan | Stata | GPL-3.0-only | 🟡 maintained · 2024-07 | Comprehensive Stata module for fixed- and random-effects meta-analysis of binary, continuous, or generic effect estimates with flexible forest plots. |
| metareg | Stata | GPL-3.0-only | 🔴 dormant · 2009-01 | Stata module performing random-effects meta-regression on study-level summary data with permutation-test p-values. |
| metaSEM | R | GPL-2.0-or-later | 🟢 active · 2026-05 | Conducts meta-analysis via structural equation modeling (using OpenMx/lavaan), including fixed/random-effects and meta-analytic SEM on correlation matrices. |
| mgwr | Python | BSD-3-Clause | 🟡 maintained · 2024-01 | Calibration, inference and prediction for (multiscale) geographically weighted regression across GLM families with model diagnostics. |
| modelsummary | R | GPL-3.0 | 🟢 active · 2026-02 | Publication-quality regression and summary tables (and coefficient plots) for many model classes in multiple output formats. |
| mvmeta | Stata | GPL-3.0-only | 🔴 dormant · 2022-04 | Stata module for multivariate random-effects meta-analysis and meta-regression on point estimates, variances, and covariances. |
| netmeta | R | GPL-2.0-or-later | 🟢 active · 2026-05 | Frequentist network meta-analysis for simultaneously comparing multiple treatments across studies, with inconsistency assessment and network graphs. |
| network | Stata | GPL-3.0-only | 🔴 dormant · 2018-04 | Stata module for network (mixed-treatment-comparison) meta-analysis using contrast-based multivariate meta-regression with inconsistency checks. |
| optmatch | R | MIT | 🟡 maintained · 2024-09 | Optimal bipartite matching using minimum-cost flow for distance/propensity-score matched designs. |
| outreg2 | Stata | unverified | 🔴 dormant · 2014-08 | Produces formatted regression-output tables for Word/Excel/LaTeX from Stata estimation results (Roy Wada). |
| panelView | R | MIT | 🟡 maintained · 2024-06 | Visualizes treatment status, missingness and outcome dynamics for panel/DiD datasets. |
| plm | R | GPL-2.0-or-later | 🟢 active · 2025-11 | Comprehensive panel-data econometrics toolkit with fixed/random effects estimators, robust covariances and panel diagnostic tests. |
| ppmlhdfe | Stata | MIT | 🟢 active · 2026-01 | Poisson pseudo-maximum-likelihood regression with multiple high-dimensional fixed effects and robust separation handling. |
| PracTools | R | GPL-3.0-only | 🟢 active · 2026-01 | Tools and datasets for designing complex survey samples, computing sample sizes, and constructing/weighting survey samples. |
| pretrends | R | MIT | 🟡 maintained · 2024 | Computes the power of pre-trends tests and visualizes detectable violations of parallel trends in event studies. |
| psmatch2 | Stata | unverified | 🔴 dormant · 2018-02 | Mahalanobis and propensity-score matching with common-support graphing and covariate-imbalance testing (Leuven & Sianesi). |
| psychmeta | R | GPL-3.0-or-later | 🟡 maintained · 2024-06 | Psychometric meta-analysis toolkit for bare-bones and artifact-corrected meta-analysis of correlations and d-values (Hunter-Schmidt methods). |
| puniform | R | GPL-2.0-or-later | 🟢 active · 2025-12 | Publication-bias-correcting meta-analysis methods (p-uniform / p-uniform*) based on the distribution of conditional p-values. |
| pyfixest | Python | MIT | 🟢 active · 2026-04 | Fast high-dimensional fixed-effects OLS/IV/Poisson regression in Python following fixest syntax, with clustered and wild-bootstrap inference. |
| PyMARE | Python | MIT | 🟡 maintained · 2025-04 | Python meta-analysis and regression engine providing mixed-effects meta-regression estimators and effect-size combination. |
| pysal | Python | BSD-3-Clause | 🟢 active · 2026-01 | Meta-package bundling the Python Spatial Analysis Library submodules (libpysal, esda, spreg, mgwr, giddy, etc.) for spatial analysis and econometrics. |
| PySVAR | Python | unverified | 🟡 maintained · 2024-06 | Small Python package for SVAR estimation and impulse responses across recursive (Cholesky), sign-restriction and optimization-based identification schemes. |
| pysyncon | Python | MIT | 🟡 maintained · 2025-01 | Python implementation of classic, robust, augmented and penalized synthetic control plus synthetic DiD. |
| pysynthdid | Python | Apache-2.0 | 🔴 dormant · 2023 | Python implementation of the synthetic difference-in-differences (SDID) estimator. |
| PythonMeta | Python | GPL-3.0-only | 🔴 dormant · 2021-11 | Python module for meta-analysis in evidence-based-medicine systematic reviews, with fixed/random-effects pooling and forest/funnel plots. |
| quantipy3 | Python | MIT | 🟢 active · 2026-04 | Python 3 survey-data processing and analysis toolkit handling multiple-choice data, metadata, and case weighting (including raking). |
| rddensity | R · Python · Stata | GPL-3.0 | 🟢 active · 2025 | Manipulation (density-discontinuity) testing for RD designs using local polynomial density estimators (McCrary-style sorting test). |
| rdlocrand | R · Python · Stata | GPL-3.0 | 🟢 active · 2026-05 | Local-randomization methods for estimation, inference and window selection in regression discontinuity designs. |
| rdmulti | R · Python · Stata | GPL-3.0 | 🟡 maintained · 2025 | RD estimation and inference with multiple cutoffs or multiple running variables/scores. |
| rdpower | R · Python · Stata | GPL-3.0 | 🟡 maintained · 2025 | Power, sample-size and minimum-detectable-effect calculations for regression discontinuity designs. |
| rdrobust | R · Python · Stata | GPL-3.0 | 🟢 active · 2026-05 | Estimation, robust bias-corrected inference and plotting for sharp/fuzzy regression discontinuity designs via local polynomials. |
| reghdfe | Stata | MIT | 🟢 active · 2026-01 | Linear regression with multiple high-dimensional fixed effects and clustered/robust standard errors in Stata. |
| RegressionTables.jl | Julia | MIT | 🟢 active · 2025-10 | Generates publication-quality regression tables (esttab/stargazer analog) for Julia models. |
| regsensitivity | Stata | MIT | 🟡 maintained | Regression sensitivity analysis (Masten & Poirier breakdown frontiers) quantifying robustness to omitted-variable bias. |
| rgeoda | R | GPL-2.0-or-later | 🟢 active · 2026-02 | R interface to libgeoda/GeoDa for ESDA, LISA spatial autocorrelation, spatial clustering and regionalization. |
| RoBMA | R | GPL-3.0-only | 🟢 active · 2026-06 | Robust Bayesian model-averaged meta-analysis that adjusts for publication bias via selection models and PET-PEESE ensembles. |
| robumeta | R | GPL-2.0-only | 🔴 dormant · 2023-03 | Robust variance estimation (RVE) meta-regression with large- and small-sample estimators for dependent effect sizes without distributional assumptions. |
| rstanarm | R | GPL-3.0-only | 🟢 active · 2026-06 | Bayesian applied regression modeling with Stan using familiar R formula syntax, commonly used to fit the multilevel models in MRP. |
| S2sls | R | GPL-2.0-or-later | 🔴 dormant · 2016-08 | Minimal package fitting a spatial-lag instrumental-variable regression by spatial two-stage least squares. |
| samplics | Python | MIT | 🟢 active · 2026-03 | Design-based analysis of complex survey data covering sample selection, weighting/calibration, estimation, and small area estimation. |
| sampling | R | GPL-2.0-or-later | 🟡 maintained · 2025-07 | Provides survey sampling selection algorithms and calibration/weight estimators including variance estimation for complex designs. |
| sandwich | R | GPL-2.0-or-later | 🟡 maintained · 2024-09 | Object-oriented model-robust covariance matrix estimators (HC, HAC, clustered, panel-corrected). |
| scpi | R · Python · Stata | MIT | 🟢 active · 2025 | Estimation, prediction-interval inference and graphics for synthetic control (scest/scpi), including multiple treated units and staggered adoption. |
| sdid | Stata | GPL-3.0 | 🟢 active · 2025 | Synthetic difference-in-differences estimation with inference and graphics for Stata (Arkhangelsky et al. 2021). |
| sensemakr | R · Python · Stata | GPL-3.0 | 🟡 maintained · 2024-07 | Sensitivity analysis to unobserved confounders for OLS via robustness values and contour plots (Cinelli & Hazlett). |
| spaMM | R | CeCILL-2.0 | 🟢 active · 2026-04 | Fits mixed-effect models with spatially correlated random effects (geostatistical and Markov-random-field GLMMs) via Laplace/h-likelihood approximations. |
| SpatialDependence.jl | Julia | MIT | 🟢 active · 2025-12 | Julia package for spatial weights matrices, spatial-autocorrelation tests (global/local Moran, Geary, Getis-Ord, LISA) and choropleth ESDA. |
| spatialEco | R | GPL-3.0-only | 🟢 active · 2026-05 | Utilities for spatial data manipulation, sampling and modelling including autologistic models, spatial smoothing and landscape/point-pattern metrics. |
| spatialreg | R | GPL-2.0-only | 🟢 active · 2026-03 | Estimates spatial cross-sectional lattice/areal models (SAR, SEM, SAC, Durbin) by maximum likelihood, spatial 2SLS and GMM following Cliff-Ord and Kelejian-Prucha. |
| spdep | R | GPL-2.0-or-later | 🟢 active · 2026-05 | Builds spatial weights matrices from contiguities/distances and computes spatial-autocorrelation tests (Moran's I, Geary's C, Getis-Ord, local LISA). |
| spglm | Python | BSD-3-Clause | 🔴 dormant · 2023-10 | Sparse-compatible generalized linear models (Gaussian, Poisson, logistic) serving as the estimation base for PySAL's spint and GWR modules. |
| sphet | R | GPL-2.0-only | 🟡 maintained · 2024-12 | Fits Cliff-Ord spatial autoregressive models with heteroskedastic innovations via GMM/IV, including spatial HAC standard errors. |
| spint | Python | BSD-3-Clause | 🔴 dormant · 2020-09 | Calibrates gravity-type spatial interaction models (unconstrained and production/attraction-constrained Poisson) via entropy maximization. |
| splm | R | GPL-2.0-only | 🟡 maintained · 2023-12 | Maximum-likelihood and GM estimation plus diagnostic testing of fixed/random-effects econometric models for spatial panel data (Millo & Piras). |
| spmoran | R | GPL-2.0-or-later | 🟡 maintained · 2024-12 | Estimates Moran-eigenvector spatial/spatio-temporal regression models with spatially varying coefficients for Gaussian and non-Gaussian data. |
| sppack (spreg/spivreg/spmat) | Stata | GPL-3.0-only | 🔴 dormant · 2018-12 | Community Stata (SSC) precursor to official Sp by Drukker, Peng, Prucha & Raciborski: builds spatial-weighting matrices (spmat) and fits SAR/SEM/SAC by ML and GS2SLS (spreg, spivreg). |
| spreg | Python | BSD-3-Clause | 🟢 active · 2026-05 | PySAL spatial econometric regression: OLS/2SLS with spatial lag and error (SAR/SEM/SARAR/Durbin), GM/ML estimators, panel and regimes models. |
| spsur | R | GPL-3.0-only | 🟢 active · 2025-09 | Tests and estimates spatial Seemingly Unrelated Regression (SUR-SLM/SEM/SDM/SLX) systems by maximum likelihood and three-stage least squares. |
| sptotal | R | GPL-2.0-or-later | 🔴 dormant · 2023-09 | Finite-population block kriging to predict totals and weighted sums from spatially autocorrelated sample data (Ver Hoef 2008). |
| sreweight | Stata | GPL-3.0-only | 🔴 dormant · 2014-01 | Stata module that reweights survey microdata to external aggregate totals using Deville-Särndal calibration methods. |
| srvyr | R | GPL-2.0-or-later | 🟢 active · 2026-03 | Provides dplyr-like syntax for computing summary statistics on complex survey data by wrapping the survey package. |
| stackedev | Stata | unverified | 🟡 maintained | Stacked event-study estimator (Cengiz et al.) that builds clean cohort-vs-never-treated stacks to avoid bad TWFE comparisons. |
| staggered | R | unverified | 🟢 active · 2025-12 | Efficient estimators (Roth & Sant'Anna) for difference-in-differences settings with randomized/as-good-as-random treatment timing. |
| Stata lpirf / ivlpirf | Stata | proprietary | 🟢 active · 2026-01 | Official Stata (18+) commands estimating Jordà local-projection impulse-response functions, with ivlpirf adding instrumental-variables identification. |
| Stata sp (spregress/spxtregress/spivregress) | Stata | proprietary | 🟢 active · 2026-01 | Official Stata Sp suite fitting cross-sectional and panel spatial autoregressive models (SAR/SEM/SAC, with endogenous covariates) by ML and GS2SLS. |
| Stata var / svar / varbasic | Stata | proprietary | 🟢 active · 2026-01 | Official Stata time-series suite estimating reduced-form and structural VARs (var, svar, varbasic) with IRF/FEVD via the irf subsystem. |
| statsmodels | Python | BSD-3-Clause | 🟢 active · 2025-12 | General-purpose statistical modeling library (OLS/GLM, robust/clustered SE, panel and time-series tools); a foundation rather than a quasi-experimental-specific package. |
| survey | R | GPL-2.0-or-later | 🟢 active · 2026-02 | Analysis of complex survey samples including design-based summary statistics, generalized linear models, calibration and raking of survey weights. |
| svars | R | MIT | 🟡 maintained · 2025-10 | Data-driven identification of structural VARs (changes in volatility, GARCH, independent-component analysis, non-Gaussian ML) with IRFs and bootstrap inference. |
| svy | Stata | proprietary | 🟢 active · 2026-06 | Stata's built-in survey-data prefix and estimators that account for sampling weights, stratification, and clustering in complex survey designs. |
| svyweight | R | GPL-3.0-only | 🟢 active · 2026-03 | Quickly and flexibly applies rake weighting to survey data, extending the survey package's weighting interface to correct for non-response. |
| Synth | R | GPL-2.0-or-later | 🟢 active · 2026-04 | Classic synthetic control method (Abadie, Diamond & Hainmueller) for comparative case studies with a single treated unit. |
| synth | Stata | unverified | 🟡 maintained | Original Stata implementation of the synthetic control method (Abadie, Diamond & Hainmueller). |
| synth_runner | Stata | unverified | 🔴 dormant · 2017-08 | Automates running synth across treated units/placebos to perform inference and produce synthetic-control plots. |
| SynthControl.jl | Julia | MIT | 🟡 maintained · 2024-02 | Pure-Julia synthetic control and synthetic difference-in-differences estimators (beta). |
| synthdid | R | BSD-3-Clause | 🟡 maintained · 2024 | Reference R implementation of the synthetic difference-in-differences (SDID) estimator of Arkhangelsky et al. (2021). |
| synthdid.py | Python | MIT | 🟡 maintained · 2025 | Python port of synthetic DiD supporting SDID/SC/DiD estimators with bootstrap, placebo and jackknife inference. |
| SyntheticControlMethods | Python | Apache-2.0 | 🔴 dormant · 2023 | Python package for classic and Differenced (robust) synthetic control estimation with placebo-based inference. |
| tsDyn | R | GPL-2.0-or-later | 🟡 maintained · 2024-10 | Nonlinear and regime-switching time-series models including linear VAR/VECM and threshold TVAR/TVECM with associated cointegration tests. |
| varexternalinstrument | R | MIT | 🔴 dormant · 2019-07 | Identifies VAR impulse responses using a high-frequency external instrument (proxy-SVAR / Gertler-Karadi), extending models fit with the vars package. |
| vars | R | GPL-2.0-or-later | 🟡 maintained · 2024-03 | Estimation, lag selection, diagnostics, forecasting, Granger causality, IRFs and FEVD for VAR models plus SVAR and SVEC estimation (Pfaff). |
| VARsignR | R | GPL-3.0-only | 🔴 dormant · 2015-12 | Identifies structural shocks in Bayesian VARs via sign restrictions (Uhlig rejection and penalty, Rubio-Ramírez QR, Fry-Pagan median target). |
| Vcov.jl | Julia | unverified | 🟢 active · 2026-03 | Provides robust and clustered variance-covariance estimators as a backend for Julia regression packages. |
| VectorAutoregressions.jl | Julia | MIT | 🔴 dormant · 2022-06 | Julia VAR/BVAR/FAVAR estimation with IRF identification (Cholesky, long-run, sign restrictions) and asymptotic/bootstrap confidence bands. |
| weakiv | Stata | unverified | 🔴 dormant | Weak-instrument-robust tests and confidence sets (AR, CLR, K) for IV/probit/tobit models (Finlay, Magnusson & Schaffer). |
| weightipy | Python | MIT | 🟢 active · 2026-02 | A modern, lightweight RIM (iterative raking) library for weighting survey/people data; a successor derived from quantipy's weighting code. |
| WeightIt | R | GPL-2.0-or-later | 🟢 active · 2026-04 | Generates balancing weights (propensity scores, entropy balancing, CBPS, energy balancing) for binary, multi-category and continuous treatments. |
| weightr | R | GPL-2.0-or-later | 🔴 dormant · 2019-07 | Estimates the Vevea and Hedges (1995) weight-function model to assess and correct for publication bias in meta-analysis. |
| weights | R | GPL-2.0-or-later | 🟡 maintained · 2025-06 | Computes weighted descriptive statistics and tests (weighted correlations, t-tests, chi-squared) plus weighted graphics for survey data. |
| wildboottest | Python | MIT | 🟡 maintained · 2024-08 | Fast wild cluster bootstrap algorithms for inference on OLS coefficients in Python. |
| WildBootTests.jl | Julia | unverified | 🟡 maintained | Julia engine for fast wild (cluster) bootstrap tests and confidence sets, used as the backend for boottest and fwildclusterboot. |
| xsmle | Stata | unverified | 🔴 dormant · 2017-01 | Stata (SSC) command estimating fixed/random-effects spatial panel models (SAR, SEM, Durbin, dynamic) by quasi-maximum likelihood with direct/indirect/total effects (Belotti, Hughes & Piano Mortari). |

Causal discovery / structure learning

| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| AVICI | Python | MIT | 🟡 maintained · 2025-02 | Amortized variational inference for causal structure learning (NeurIPS 2022), predicting causal graphs directly from data via a trained neural network. |
| benchpress | Python · R | GPL-2.0 | 🟢 active · 2026-05 | Snakemake workflow to run, develop and benchmark causal-discovery/structure-learning algorithms across many libraries (bnlearn, pcalg, causal-learn, gCastle, Tetrad, etc.) with data generators and metrics. |
| bnlearn (Python) | Python | MIT | 🟢 active · 2026-03 | Independent Python package (built on pgmpy) for Bayesian network structure learning, parameter learning, inference and sampling. |
| bnlearn (R) | R | GPL-2.0-or-later | 🟡 maintained · 2025-08 | Widely used R package for Bayesian network structure learning (constraint-based, score-based, hybrid), parameter learning and inference. |
| Causal Discovery Toolbox (CDT) | Python | MIT | 🟡 maintained · 2025-10 | Python package for graph and pairwise causal discovery, bridging to R packages (pcalg, bnlearn) and providing deep-learning-based methods. |
| causal-cmd | Java | unverified | 🟢 active · 2026-03 | Command-line interface wrapping the Tetrad causal-discovery algorithms for running searches on data files from a shell. |
| causal-learn | Python | MIT | 🟢 active · 2026-06 | Comprehensive Python library of classic and state-of-the-art causal discovery algorithms (PC, FCI, GES, LiNGAM, Granger, etc.) for learning causal structure from observational data. |
| causaldag | Python | BSD-3-Clause | 🔴 dormant · 2023 | Python package for creating, manipulating and learning causal DAGs, including GSP/IGSP permutation-based and interventional structure-learning algorithms. |
| CausalDisco | Python | BSD-3-Clause | 🔴 dormant · 2023-11 | Python package of baseline causal-discovery algorithms and analytics tools (varsortability, sortnregress) for benchmarking structure learning. |
| CausalNex | Python | Apache-2.0 | 🟡 maintained · 2024-06 | Python library for learning Bayesian network structure (NOTEARS-based) and reasoning about causal relationships for decision-making. |
| causica | Python | MIT | 🟡 maintained · 2024-12 | Microsoft's deep-learning library for end-to-end causal discovery and inference, including the DECI (Deep End-to-end Causal Inference) model. |
| DAGMA | Python | Apache-2.0 | 🟡 maintained · 2024-01 | Python package learning DAGs via continuous optimization using an M-matrix log-determinant acyclicity characterization (DAGMA). |
| dodiscover | Python | MIT | 🟢 active · 2026-05 | PyWhy's experimental causal discovery package providing constraint-based and other global structure-learning algorithms with a scikit-learn-style API. |
| gCastle | Python | Apache-2.0 | 🟢 active · 2026-06 | Python causal structure learning toolbox emphasizing gradient-based methods (NOTEARS, GraN-DAG, etc.) plus data simulators and SHD/F1 evaluation metrics. |
| gimme | R | GPL-2.0-or-later | 🟢 active · 2026-03 | R package (Group Iterative Multiple Model Estimation) that recovers group- and individual-level directed contemporaneous/lagged network structure from time series via unified SEM search. |
| LiNGAM | Python | MIT | 🟢 active · 2026-05 | Python package implementing the LiNGAM family (ICA-LiNGAM, DirectLiNGAM, VAR-LiNGAM, RCD, etc.) for causal discovery in linear non-Gaussian models. |
| NOTEARS | Python | Apache-2.0 | 🟢 active · 2026-05 | Reference implementation of NO TEARS, casting DAG structure learning as a continuous optimization with a smooth acyclicity constraint. |
| pcalg | R | GPL-2.0-or-later | 🟡 maintained · 2024-09 | Canonical R package for graphical-model causal structure learning (PC, FCI, RFCI, GIES) and causal effect estimation (IDA). |
| pgmpy | Python | MIT | 🟢 active · 2026-06 | Python toolkit for probabilistic graphical models with Bayesian network structure learning (PC, Hill-Climb, etc.), parameter learning, inference and causal reasoning. |
| py-tetrad | Python · Java | MIT | 🟢 active · 2026-05 | Python interface (via JPype) exposing the Java Tetrad causal-discovery algorithms in Python workflows. |
| pyAgrum / aGrUM | Python · C++ | LGPL-3.0-or-MIT | 🟢 active · 2026-01 | C++/Python library for probabilistic graphical models (Bayesian networks) with structure learning and causal do-calculus support. |
| pywhy-graphs | Python | MIT | 🟢 active · 2026-05 | NetworkX-compliant causal graph data structures (ADMG, PAG, CPDAG) underpinning the PyWhy causal-discovery ecosystem. |
| Tetrad | Java | GPL-3.0 | 🟢 active · 2026-06 | Long-running Java toolkit and GUI for causal discovery and graphical-causal-model search, the reference implementation of many constraint- and score-based algorithms. |
| tigramite | Python | GPL-3.0 | 🟢 active · 2026-01 | Python package for causal discovery in time series via the PCMCI/PCMCI+/LPCMCI family of conditional-independence-based algorithms. |
| typed-DAG (t-DAG) | Python | Apache-2.0 | 🔴 dormant · 2023-07 | Reference implementation of causal discovery with typed directed acyclic graphs, integrating variable-type knowledge into structure learning. |

Autonomous research & data-science agents

| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| Agent Laboratory | Python | MIT | 🟡 maintained · 2025-08 | End-to-end autonomous research workflow with literature-review, experimentation, and report-writing phases (and AgentRxiv shared-preprint collaboration) to turn a human research idea into a paper plus code. |
| Agentic Data Scientist | Python | MIT | 🟢 active · 2026-05 | An adaptive multi-agent framework (Google ADK + Claude Agent SDK) that separates planning from execution with continuous validation to complete end-to-end data-science tasks. |
| AI Data Science Team | Python | MIT | 🟢 active · 2025-12 | A library of specialized LLM agents (data cleaning, EDA, feature engineering, SQL, H2O AutoML, visualization) orchestrated by a supervisor to automate common data-science tasks. |
| AI-Researcher | Python | unverified | 🟡 maintained · 2025-10 | Fully autonomous research system (NeurIPS 2025) that runs the whole pipeline from literature review and idea generation through algorithm implementation to manuscript writing, primarily for AI/ML research. |
| AIDE (aideml) | Python | MIT | 🟢 active · 2026-05 | Tree-search ML-engineering agent that autonomously drafts, debugs, and benchmarks code to maximize a user-defined metric, reaching strong Kaggle/MLE-bench performance. (Overlaps with the data-science agent bucket.) |
| Auto-Analyst | TypeScript · Python | MIT | 🟢 active · 2026-05 | A modular multi-agent AI data-scientist platform (DSPy-based) automating cleaning, statistical analysis, scikit-learn modeling, and Plotly visualization. |
| Auto-Deep-Research | Python | MIT | 🟡 maintained · 2025-02 | A cost-efficient open Deep Research alternative (built on the AutoAgent framework) that autonomously gathers and synthesizes web information; strong on GAIA. |
| AutoGluon Assistant (MLZero) | Python | Apache-2.0 | 🟢 active · 2026-03 | A multi-agent system that transforms raw multimodal data (tabular, image, text, audio) into trained ML solutions end-to-end with zero human intervention, using MCTS-guided code generation over AutoGluon. |
| AutoKaggle | Python | Apache-2.0 | 🟡 maintained · 2024-12 | A multi-agent framework with five cooperating agents that autonomously complete Kaggle tabular competitions across six pipeline phases. |
| AutoMind | Python | MIT | 🟢 active · 2025-10 | An adaptive, knowledge-grounded data-science agent using an expert knowledge base plus agentic tree search to build ML pipelines (beats AIDE on MLE-bench). |
| AutoResearchClaw | Python | MIT | 🟢 active · 2026-06 | Self-reinforcing 23-stage autonomous research pipeline (literature discovery, multi-agent hypothesis debate, sandboxed self-healing experiments, peer review, LaTeX export) that turns an idea into a conference-ready paper. |
| AutoSurvey | Python | unverified | 🔴 dormant · 2025-02 | NeurIPS 2024 method that automatically writes comprehensive literature surveys via retrieval, parallel subsection drafting by specialized LLMs, and iterative refinement with automated evaluation. |
| Aviary | Python | Apache-2.0 | 🟢 active · 2026-06 | Gymnasium-style framework of language-agent environments for challenging scientific tasks (literature QA, DNA manipulation, protein engineering) used to build and train autonomous research agents. |
| Biomni | Python | Apache-2.0 | 🟢 active · 2025-10 | A general-purpose autonomous biomedical research agent combining LLM reasoning, retrieval-augmented planning, and code execution over a large library of biomedical tools. |
| ChemCrow | Python | MIT | 🔴 dormant · 2024-03 | An LLM agent augmented with chemistry tools (RDKit, paper-qa, reaction/retrosynthesis databases) that autonomously solves reasoning-intensive chemistry tasks. |
| Coscientist | Python | Apache-2.0 (Commons Clause) | 🔴 dormant | An LLM-driven autonomous lab agent (from the Nature paper) that plans, designs, and optimizes chemical experiments and synthesis. |
| Curie | Python | Apache-2.0 | 🟡 maintained · 2025-09 | AI agent framework for rigorous, automated scientific experimentation that handles the full hypothesis-to-analysis loop (experiment design, environment setup, execution, analysis) with reproducibility guarantees. |
| CycleResearcher | Python | unverified | 🟢 active · 2026-03 | Open-source ecosystem of trained models (CycleResearcher + CycleReviewer) that iteratively generate research papers and improve them via automated peer review, focused on ML research. |
| Data Formulator | TypeScript · Python | MIT | 🟢 active · 2026-05 | An AI tool with data-loading, exploration, and chart-style-refinement agents that transform and visualize data via a blend of UI interactions and natural language. |
| Data-Copilot | Python | MIT | 🔴 dormant · 2023 | An LLM agent that self-designs interface tools then dispatches them to autonomously query, process, analyze, and visualize (financial) data. |
| data-to-paper | Python | MIT | 🟡 maintained · 2025-07 | Multi-agent system that goes from a raw dataset and research goal to a verifiable, data-traceable scientific paper, emphasizing reproducibility in data-driven (e.g. biomedical/clinical) research. |
| DataMind | Python | Apache-2.0 | 🟢 active · 2026-06 | An open data-synthesis + agent-training recipe yielding generalist data-analytic LLMs (DataMind-7B/14B) that do multi-step, code-based reasoning over CSV/Excel/SQLite. |
| deep-research (dzhng) | TypeScript | MIT | 🟢 active · 2026-04 | A compact open-source deep-research agent that recursively searches, scrapes, and reasons over the web to produce reports, tracking goals across iterations. |
| DeepAnalyze | Python | MIT | 🟢 active · 2026-03 | An agentic LLM (DeepAnalyze-8B) that autonomously runs the end-to-end data-science pipeline from raw structured/semi-structured/unstructured data to analyst-grade research reports. |
| DeepEye | Python · TypeScript | Apache-2.0 | 🟢 active · 2026-05 | A production-ready 'self-driving' data agent system that autonomously orchestrates multi-step workflows to produce dashboards, analytical reports, and data videos from heterogeneous data. |
| DS-Agent | Python | unverified | 🔴 dormant · 2024 | An ICML'24 data-science agent that uses case-based reasoning over Kaggle expert knowledge to iteratively build and train ML models across tabular/text/time-series. |
| freephdlabor | Python | MIT | 🟢 active · 2026-05 | Customizable multi-agent framework (ManagerAgent orchestrating Ideation/Experiment/Writeup agents) for building personalized systems that run continuous autonomous research toward publication-grade reports. |
| GPT Researcher | Python · TypeScript | Apache-2.0 | 🟢 active · 2026-05 | Autonomous deep-research agent that plans sub-questions, scrapes and aggregates many web/local sources, and synthesizes a long-form cited research report. (Also relevant to the data-science/deep-research bucket.) |
| Jupyter AI | Python | BSD-3-Clause | 🟢 active · 2026-04 | A JupyterLab extension (v3) connecting agentic AI models to notebooks so they can read/write files, run code, and act via a built-in MCP server for data work. |
| LIDA | Python | MIT | 🔴 dormant · 2024-03 | An LLM agent that automatically summarizes data, generates analysis goals, and writes/executes/edits visualization code (treating viz as code) across grammars. |
| MetaGPT (Data Interpreter / SELA) | Python | MIT | 🟢 active · 2026-01 | Multi-agent framework whose Data Interpreter (and SELA tree-search AutoML extension) agent plans, writes, and self-debugs code to solve data-analysis, ML, and modeling tasks. |
| MLE-Agent | Python | MIT | 🔴 dormant · 2024-10 | An AI companion that autonomously builds ML/AI baselines and end-to-end solutions (incl. Kaggle) with integrated arXiv/paper search. |
| MLR-Copilot | Python | unverified | 🟡 maintained · 2025-03 | Machine-learning research assistant framework where LLM agents autonomously generate research ideas from papers and implement/execute the corresponding experiments. |
| Open Deep Research (LangChain) | Python | MIT | 🟢 active · 2025-08 | A configurable, fully open-source deep-research agent (LangGraph-based) that works across many model/search providers; ranks on Deep Research Bench. |
| Open Deep Research (nickscamara/Firecrawl) | TypeScript | Apache-2.0 | 🟡 maintained · 2025-02 | An open Deep Research clone that reasons over large amounts of web data extracted via Firecrawl to generate research analyses. |
| Open Interpreter | Python | AGPL-3.0 | 🔴 dormant · 2024-10 | A natural-language code-execution agent that runs Python/shell locally to plot, clean, and analyze datasets (and general computer tasks), with human approval of generated code. |
| OpenResearcher | Python | Apache-2.0 | 🔴 dormant · 2024-10 | AI research-assistant platform that uses retrieval-augmented generation over scientific literature to autonomously answer research questions, summarize, and recommend papers with source citations. |
| PandasAI | Python | MIT | 🟢 active · 2025-10 | A conversational data-analysis agent that turns natural-language questions over CSV/SQL/parquet data lakes into executed analysis code and charts. |
| Paper2Code (PaperCoder) | Python | Apache-2.0 | 🟢 active · 2026-03 | Multi-agent LLM system that autonomously converts an ML research paper into a faithful, runnable code repository via planning, analysis, and generation stages. |
| PaperQA2 | Python | Apache-2.0 | 🟢 active · 2026-03 | Agentic high-accuracy RAG system over full-text scientific literature that autonomously retrieves, ranks, and synthesizes cited answers and literature summaries; its authors report superhuman accuracy on QA/contradiction tasks. |
| RD-Agent | Python | MIT | 🟢 active · 2026-05 | Microsoft's R&D automation framework that iteratively proposes hypotheses and implements/evolves them as code, targeting data-driven R&D such as quantitative finance factor/model discovery and ML engineering. |
| ResearchAgent | Python | unverified | 🟡 maintained · 2025-08 | LLM system (NAACL 2025) that iteratively generates research problems, methods, and experiment designs grounded in an academic citation graph, refined by collaborating reviewing agents. |
| Robin | Python | Apache-2.0 | 🟢 active · 2026-04 | Multi-agent system (built on Aviary/PaperQA) that automates therapeutics discovery by generating hypotheses, proposing experiments, and analyzing experimental data, demonstrated by identifying a novel dry-AMD drug candidate. |
| STORM / Co-STORM | Python | MIT | 🟡 maintained · 2025-09 | LLM knowledge-curation system that researches a topic via multi-perspective simulated expert conversations and web search to autonomously synthesize a full, Wikipedia-style cited report (Co-STORM adds human-in-the-loop). |
| SurveyX | Python | unverified | 🟡 maintained · 2026-01 | Academic survey-automation system that takes a title and keywords and autonomously retrieves literature and generates a structured, cited survey paper (open-source release is offline-only; full service is hosted). |
| TableGPT Agent | Python | Apache-2.0 | 🟡 maintained · 2025-03 | A LangGraph-based pre-built agent for the TableGPT2 model that answers analytical questions and runs code over tabular datasets. |
| TaskWeaver | Python | MIT | 🔴 dormant · 2026-03 | A code-first agent framework that plans and executes data-analytics tasks via generated Python, with stateful code/plugin memory (repo archived March 2026). |
| The AI Scientist | Python | AI Scientist Source Code License v1.0 (custom, Responsible-AI based) | 🟡 maintained · 2025-12 | Fully automated pipeline that generates ML research ideas, writes and runs experiment code, and drafts complete LaTeX papers with an automated reviewer, in machine-learning domains. |
| The AI Scientist-v2 | Python | AI Scientist Source Code License v1.0 (custom, Responsible-AI based) | 🟡 maintained · 2025-12 | Template-free successor to The AI Scientist that uses agentic tree search and an experiment-manager agent to autonomously produce workshop-level ML papers end-to-end. |
| The Virtual Lab | Python | MIT | 🟡 maintained · 2025-12 | Team of LLM agents (AI PI, domain researchers, scientific critic) that hold structured meetings to autonomously design scientific pipelines, demonstrated by designing new SARS-CoV-2 nanobodies. |
| Virtual Scientists (VirSci) | Python | Apache-2.0 | 🟡 maintained · 2025-07 | Multi-agent 'science of science' system (ACL 2025) that simulates teams of scientist agents through team organization and inter/intra-team discussion to autonomously generate and evaluate novel research ideas. |

MCP servers (data & stats execution)

| Tool | Lang | License | Status | Data source / what it serves |
|---|---|---|---|---|
| Academix | Python | MIT | 🟢 active · 2026-02 | Aggregator: OpenAlex, DBLP, Semantic Scholar, arXiv, Crossref |
| akshare-one MCP | Python | MIT | 🟢 active · 2026-03 | AKShare (Chinese stock market data) |
| Alpha Vantage MCP (calvernaz) | Python | Apache-2.0 | 🟢 active · 2026-02 | Alpha Vantage (stocks, FX, crypto) |
| Alpha Vantage MCP Server (official) | Python | MIT | 🟢 active · 2026-05 | Alpha Vantage (stocks, FX, crypto, fundamentals) |
| ArXiv MCP Server | Python | Apache-2.0 | 🟢 active · 2026-05 | arXiv (preprints) |
| BEA MCP Server (mcp-bea) | TypeScript | unverified | 🟡 maintained · 2026-01 | BEA (US Bureau of Economic Analysis, GDP/income) |
| bioRxiv MCP Server | Python | unverified | 🟡 maintained · 2025-03 | bioRxiv (biology preprints) |
| BLS Labor MCP Server | TypeScript | NOASSERTION | 🟢 active · 2026-06 | BLS (US Bureau of Labor Statistics) |
| Crossref MCP Server (JackKuo666) | Python | unverified | 🟡 maintained · 2025-04 | Crossref (DOI metadata, 150M+ works) |
| Data Commons Agent Toolkit (official MCP) | Python | Apache-2.0 | 🟢 active · 2026-06 | Google Data Commons (unified public datasets) |
| Data.gov MCP Server | JavaScript | MIT | 🟡 maintained · 2025-04 | Data.gov (US government open data catalog) |
| doi-mcp (citation verifier) | TypeScript | unverified | 🟢 active · 2026-05 | Aggregator: Crossref, OpenAlex, etc. (citation verification by DOI) |
| Eurostat MCP (ano-kuhanathan) | Python | MIT | 🟡 maintained · 2026-01 | Eurostat (EU official statistics) |
| Eurostat MCP (dcerecedo) | Python | NOASSERTION | 🟢 active · 2026-03 | Eurostat (EU official statistics) |
| FinanceMCP (Tushare + Binance) | JavaScript | MIT | 🟢 active · 2026-05 | Tushare (China A-shares, macro) + Binance (crypto) |
| FRED MCP Server (stefanoamorelli) | TypeScript | AGPL-3.0 | 🟢 active · 2026-05 | FRED (Federal Reserve Economic Data, 800k+ series) |
| IMF Data MCP Server | Python | Apache-2.0 | 🟢 active · 2026-04 | IMF (data.imf.org SDMX API) |
| Jupyter MCP Server (Datalayer) | Python | BSD-3-Clause | 🟢 active · 2026-05 | MCP server for Jupyter that lets an agent execute notebook cells and run Python/code in a live kernel with multimodal output. |
| MCP-DBLP | Python | MIT | 🟢 active · 2026-04 | DBLP (computer-science bibliography) |
| mcp-fred (cfdude) | Python | unverified | 🟢 active · 2026-03 | FRED (Federal Reserve Economic Data) |
| mcp-stata (tmonk) | Python | AGPL-3.0 | 🟢 active · 2026-05 | Lightweight Stata MCP server that executes commands, inspects data, retrieves stored r()/e() results, and views graphs in a chat interface. |
| mcptools (Model Context Protocol for R) | R | NOASSERTION | 🟢 active · 2026-03 | Posit's official R package that turns a running R session into an MCP server (and client) so agents can execute R code and call R functions as tools. |
| Nasdaq Data Link MCP Server | Python | MIT | 🟡 maintained · 2025-10 | Nasdaq Data Link / Quandl (alternative + financial time series) |
| OECD MCP Server | TypeScript | MIT | 🟢 active · 2026-04 | OECD (SDMX, 5,000+ datasets) |
| OpenAlex MCP (reetp14) | TypeScript | MIT | 🟡 maintained · 2025-07 | OpenAlex (scholarly works, authors, institutions) |
| OpenAlex Research MCP | JavaScript | MIT | 🟢 active · 2026-05 | OpenAlex (240M+ scholarly works) |
| OpenEcon Data MCP Server | Python | NOASSERTION | 🟢 active · 2026-05 | Aggregator: FRED, World Bank, IMF, Eurostat, BIS, UN Comtrade (330K indicators) |
| Paper Search MCP | Python | MIT | 🟢 active · 2026-05 | Aggregator: arXiv, PubMed, bioRxiv, Semantic Scholar, OpenAlex, Crossref, CORE, dblp, etc. |
| paper-distill-mcp | Python | AGPL-3.0 | 🟢 active · 2026-03 | Scholarly sources (paper search/curation) |
| PubMed MCP Server (cyanheads) | TypeScript | Apache-2.0 | 🟢 active · 2026-06 | PubMed + Europe PMC + Unpaywall (biomedical literature/full text) |
| PubMed MCP Server (JackKuo666) | Python | MIT | 🟡 maintained · 2025-05 | PubMed (35M+ biomedical citations) |
| pubmed-search-mcp (u9401066) | Python | NOASSERTION | 🟢 active · 2026-05 | Aggregator: PubMed, Europe PMC, CORE, OpenAlex (biomedical) |
| RMCP (R MCP Server) | Python | MIT | 🟡 maintained · 2025-12 | MCP server exposing 50+ R statistical-analysis tools (regression, econometrics, time series, ML) backed by CRAN packages for AI agents. |
| SEC EDGAR MCP | Python | AGPL-3.0 | 🟢 active · 2026-05 | SEC EDGAR (US public-company filings, XBRL financials) |
| Semantic Scholar MCP Server (JackKuo666) | Python | unverified | 🟡 maintained · 2025-03 | Semantic Scholar (200M+ papers, citations) |
| Simple PubMed MCP (andybrandt) | Python | MIT | 🟢 active · 2026-03 | PubMed (biomedical literature) |
| Stata MCP (hanlulong) | Python | MIT | 🟢 active · 2026-04 | Stata MCP extension for VS Code, Cursor and Antigravity that executes Stata commands and do-files from an AI assistant. |
| Stata MCP (SepineTam / mcp-for-stata) | Python | AGPL-3.0 | 🟢 active · 2026-06 | MCP server that lets an LLM agent write and run Stata regressions and econometric do-files for paper replication and hypothesis testing (repo renamed to mcp-for-stata). |
| StatsPAI MCP Server | Python | MIT | 🟢 active · 2026-06 | Agent-native causal inference and econometrics toolkit (DiD, IV, RDD, synth, DML, Bayesian, causal discovery) exposed as an MCP server with 900+ estimator tools. |
| Tushare MCP (buuzzy) | Python | MIT | 🟢 active · 2026-02 | Tushare (China A-shares financial data) |
| Unpaywall MCP Server | TypeScript | MIT | 🟢 active · 2026-04 | Unpaywall (open-access full text by DOI) |
| US Census Bureau Data API MCP (official) | TypeScript | CC0-1.0 | 🟢 active · 2026-03 | US Census Bureau Data API (ACS, demographics) |
| US Government Open Data MCP | TypeScript | MIT | 🟢 active · 2026-04 | 40+ US government APIs (Treasury, FRED, Congress, FDA, CDC, FEC) |
| World Bank Data360 MCP (official) | Python | NOASSERTION | 🟢 active · 2026-06 | World Bank Data360 (development indicators, 200+ countries) |
| World Bank Open Data MCP (anshumax) | Python | unverified | 🟡 maintained · 2025-08 | World Bank Open Data API |
| Yahoo Finance MCP (Alex2Yang97) | Python | MIT | 🟢 active · 2026-03 | Yahoo Finance (via yfinance) |
| yfinance MCP (narumiruna) | Python | MIT | 🟢 active · 2026-06 | Yahoo Finance (via yfinance) |
| Zotero MCP | Python | MIT | 🟢 active · 2026-05 | Zotero (personal reference library) |

Benchmarks & datasets

| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| ACIC Competition data (aciccomp) | R | unverified | 🔴 dormant · 2020-07 | R packages with the data-generating processes and simulated datasets (with known ground-truth effects) from the 2016/2017 Atlantic Causal Inference Conference competitions. |
| bnlearn Bayesian Network Repository | R | unverified | 🟡 maintained · 2025 | Curated collection of reference Bayesian networks (ASIA, ALARM, HEPAR2, etc.) with known structure, available in multiple formats; the standard ground-truth benchmark for structure-learning evaluation. |
| CausalBench | Python | Apache-2.0 | 🟡 maintained · 2025-06 | GSK benchmark suite (with curated single-cell perturbation datasets) for evaluating network/causal-graph inference methods from gene-perturbation data. |
| causaldata | R · Python · Stata | unverified | 🟡 maintained · 2024-11 | R/Python/Stata packages providing the example datasets (LaLonde, NSW, etc.) used in the textbooks The Effect, Causal Inference: The Mixtape, and Causal Inference: What If. |
| CEVAE datasets (IHDP) | Python | unverified | 🔴 dormant · 2020-07 | Reference repo bundling the widely-cited IHDP (Infant Health and Development Program) semi-synthetic benchmark CSVs with known ground-truth treatment effects used across ITE papers. |
| JustCause | Python | MIT | 🔴 dormant · 2020-03 | Python framework providing standard causal-inference benchmark datasets (IHDP, IBM ACIC) plus synthetic-data generation and baseline comparison for evaluating ITE methods. |
| RealCause | Python | MIT | 🔴 dormant · 2021-03 | Realistic causal-inference benchmark that fits generative models to real data (LaLonde PSID/CPS, Twins) to produce samples with known ground-truth treatment effects. |
| Tübingen Cause-Effect Pairs | — | unverified | 🟡 maintained · 2023 | Standard benchmark of ~100 bivariate cause-effect pairs with ground-truth causal direction for evaluating pairwise causal-discovery methods (Mooij et al. 2016). |
| WhyNot | Python | MIT | 🔴 dormant · 2020-06 | Python sandbox of dynamic simulators with known ground-truth causal effects for stress-testing causal-inference and sequential decision-making methods. |

Inclusion ≠ endorsement. Licenses/activity were verified during curation but change over time; confirm upstream before relying on a tool in a high-trust context. To propose a tool, see README.md.
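
As a rough illustration of the "confirm upstream" advice, the sketch below parses rows in the five-column layout used by the tables in this catalog and flags the ones that most plausibly need a manual check. The helper names `parse_row` and `needs_upstream_check` are hypothetical (they are not part of scripts/build-tools-catalog.py), and the flagging rule (license cell is `unverified` or `NOASSERTION`, or status is dormant) is an assumption rather than a policy stated by the catalog.

```python
import re

# License cells that the catalog itself marks as not confirmed.
UNCONFIRMED_LICENSES = {"unverified", "NOASSERTION"}

# Status cells look like "🟢 active · 2026-05"; the date part may be absent.
STATUS_RE = re.compile(r"(active|maintained|dormant)(?:\s*·\s*(\d{4}(?:-\d{2})?))?")


def parse_row(line):
    """Parse one five-column catalog row; return None for header, separator, or non-table lines."""
    text = line.strip()
    if not text.startswith("|"):
        return None
    cells = [cell.strip() for cell in text.strip("|").split("|", 4)]
    if len(cells) != 5 or cells[0] == "Tool" or set(cells[0]) <= {"-"}:
        return None
    match = STATUS_RE.search(cells[3])
    return {
        "tool": cells[0],
        "langs": [lang.strip() for lang in cells[1].split("·")],
        "license": cells[2],
        "state": match.group(1) if match else None,
        "last_activity": match.group(2) if match else None,  # None when the row has no date
        "summary": cells[4],
    }


def needs_upstream_check(row):
    """Assumed rule: flag unconfirmed licenses and dormant projects for manual review."""
    return row["license"] in UNCONFIRMED_LICENSES or row["state"] == "dormant"


# Three rows copied from the tables above, preceded by a header and separator.
SAMPLE = """
| Tool | Lang | License | Status | What it does |
|---|---|---|---|---|
| synth | Stata | unverified | 🟡 maintained | Original Stata implementation of the synthetic control method (Abadie, Diamond & Hainmueller). |
| spint | Python | BSD-3-Clause | 🔴 dormant · 2020-09 | Calibrates gravity-type spatial interaction models (unconstrained and production/attraction-constrained Poisson) via entropy maximization. |
| spreg | Python | BSD-3-Clause | 🟢 active · 2026-05 | PySAL spatial econometric regression: OLS/2SLS with spatial lag and error (SAR/SEM/SARAR/Durbin), GM/ML estimators, panel and regimes models. |
"""

rows = [row for row in map(parse_row, SAMPLE.strip().splitlines()) if row is not None]
flagged = [row["tool"] for row in rows if needs_upstream_check(row)]
print(flagged)
```

Because tools.json is the source of truth, a real workflow would more likely filter that file directly; the row parser is used here only because the table layout is the one structure visible in this document.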