Leverageshap Docs by FabianK-Dev · Pull Request #542 · mmschlk/shapiq

FabianK-Dev · 2026-06-08T12:07:39Z

Motivation and Context

Public API Changes

No Public API changes
Yes, Public API changes (Details below)

How Has This Been Tested?

Checklist

The changes have been tested locally.
Documentation has been updated (if the public API or usage changes).
An entry has been added to CHANGELOG.md (if relevant for users).
The code follows the project's style guidelines.
I have considered the impact of these changes on the public API.

Forward-looking spec for the 3 new SV approximators (LeverageSHAP, PolySHAP, OddSHAP). Approximator classes are looked up dynamically by name, so the file auto-skips classes that have not yet been registered in shapiq.approximator. As each implementation lands, the corresponding parametrizations activate.

- Interface conformance (always required): index='SV', n_players, max_order/min_order, values shape and dtype, interaction_lookup.
- Numerical convergence vs ExactComputer (xfail strict=False): atol schedule by budget percentage.
- Determinism: same (n, random_state, budget, game) -> bit-identical output.

75 tests, all currently SKIP on main. Will activate as classes land.
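The name-based lookup described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the test file's actual code: `lookup_by_name` is a hypothetical helper, and a `SimpleNamespace` stands in for `shapiq.approximator` so the snippet runs without shapiq installed. A real test would presumably call `pytest.skip` for every name that ends up in `missing`.

```python
from types import SimpleNamespace

# Names of the three new approximators, taken from the commit message.
NEW_APPROXIMATOR_NAMES = ["LeverageSHAP", "PolySHAP", "OddSHAP"]


def lookup_by_name(namespace, names):
    """Split names into classes found on `namespace` and names still missing.

    Hypothetical helper: `namespace` plays the role of shapiq.approximator.
    """
    found, missing = {}, []
    for name in names:
        cls = getattr(namespace, name, None)
        if cls is None:
            missing.append(name)  # not registered yet -> test would be skipped
        else:
            found[name] = cls
    return found, missing


class LeverageSHAP:
    """Placeholder class so that exactly one name resolves in this demo."""


# Stand-in module in which only LeverageSHAP has landed so far.
fake_module = SimpleNamespace(LeverageSHAP=LeverageSHAP)
found, missing = lookup_by_name(fake_module, NEW_APPROXIMATOR_NAMES)
print("active:", list(found), "skipped:", missing)
```

With this shape, a parametrization becomes active as soon as its class is registered, without editing the test file.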

Honors the cross-method testing platform promised to the tutor: unified harness covering every SV approximator in shapiq (the existing 11, such as KernelSHAP, SVARM, Permutation*, ProxySPEX, ..., and the 3 new ones from this project) instead of only the new line-up. Approximator list is sourced dynamically from shapiq.approximator.SV_APPROXIMATORS (canonical registry) plus the 3 new project names, deduplicated. Future shapiq additions land in the harness automatically.

Split into two scopes:
* test_interface_conformance: strict shape/dtype/index/lookup contract from the API spec. Applied ONLY to the 3 new approximators (the contract is ours; existing methods have different default output conventions like ProxySPEX defaulting to FBII and max_order=n).
* test_numerical_convergence_vs_exact + test_determinism: apply to ALL SV approximators. Cross-method comparison against ExactComputer ground truth on identical SOUM games. xfail with strict=False so methods that do converge surface as XPASS; methods still under development surface as XFAIL.

Two robustness helpers:
* _construct_or_skip: tries (n=, index='SV', max_order=1, random_state=) first (covers multi-index methods like SPEX, ProxySPEX, ProxySHAP, MSRBiased, kADDSHAP), then falls back to minimal signature for SV-only methods (KernelSHAP, OwenSamplingSV).
* _safe_approximate: skips on ValueError raised by approximators that explicitly refuse a regime (e.g. SPEX 'Insufficient budget to compute the transform' at low budgets).

Results: 10 passed, 95 skipped, 90 xfailed, 23 xpassed. The 23 xpassed are existing shapiq SV methods that converge cleanly at full budget on small SOUM, a useful baseline for the upcoming benchmark report.
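The two robustness helpers can be pictured with the following sketch. Everything here is hypothetical: the two classes are stand-ins that only mimic the constructor shapes mentioned in the commit message (a multi-index method defaulting to FBII and max_order=n, and an SV-only method with a minimal signature), and the helper names carry a `_sketch` suffix to make clear they are not the functions in the test file. Where the real helpers skip a test, this version returns `None`.

```python
class MultiIndexStandIn:
    """Hypothetical multi-index approximator: defaults to FBII and max_order=n."""

    def __init__(self, n, index="FBII", max_order=None, random_state=None):
        self.n = n
        self.index = index
        self.max_order = n if max_order is None else max_order
        self.random_state = random_state


class SVOnlyStandIn:
    """Hypothetical SV-only approximator with a minimal constructor."""

    def __init__(self, n, random_state=None):
        self.n = n
        self.index = "SV"
        self.max_order = 1
        self.random_state = random_state


def construct_or_skip_sketch(cls, n, random_state):
    """Try the explicit SV signature first, then the minimal one."""
    try:
        return cls(n=n, index="SV", max_order=1, random_state=random_state)
    except TypeError:
        # Signature mismatch: the class does not accept index/max_order.
        # Caveat: this also swallows a TypeError raised inside __init__.
        return cls(n=n, random_state=random_state)


def safe_approximate_sketch(approximate, budget):
    """Return None (stand-in for a skip) when a budget regime is refused."""
    try:
        return approximate(budget)
    except ValueError:
        return None


def refuses_low_budget(budget):
    """Toy approximate() that rejects small budgets, as some methods do."""
    if budget < 10:
        raise ValueError("Insufficient budget")
    return [0.0] * 3


multi = construct_or_skip_sketch(MultiIndexStandIn, n=5, random_state=0)
sv_only = construct_or_skip_sketch(SVOnlyStandIn, n=5, random_state=0)
print(multi.index, multi.max_order, sv_only.index, sv_only.max_order)
```

Forcing `index='SV'` and `max_order=1` in the first attempt is what keeps multi-index methods comparable with SV-only ones in the same sweep.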

Drop-in framework that any teammate can merge into their feature branch to run head-to-head benchmarks against ExactComputer across every SV approximator in shapiq, then plot the standard SHAP-literature metric curves. No source files are modified: adds a top-level benchmark/ package, a single test file, and a small in-place test-helper sys.path hook. Does not touch pyproject.toml or any other upstream config.

Files added:
* benchmark/__init__.py: makes the runner a proper Python package so invocation is 'python -m benchmark.performance'.
* benchmark/_discovery.py: single source of truth for SV approximator discovery + SV-mode construction. Holds:
  - PROJECT_APPROXIMATOR_NAMES: LeverageSHAP, PolySHAP, PolySHAPKAdd / Partial / Prior, OddSHAP.
  - _SV_CONSTRUCT_OVERRIDES: per-class kwargs for non-standard constructors (PolySHAP variants need max_order / n_explanation_terms / q_prior).
  - construct_for_sv(): three-stage construction (override -> explicit SV signature -> minimal signature), returning (estimator, exc) so the caller can report the most informative exception. A ValueError from inside a matched signature wins over a TypeError from a signature mismatch.
  - safe_approximate(): catches ValueError and RuntimeError so sparse approximators that refuse a budget regime (SPEX, ProxySPEX, ...) skip the cell cleanly instead of crashing.
* benchmark/performance.py: CLI runner that consumes _discovery, sweeps (method, game, budget, seed), records every cell in a long-format CSV, and emits one PNG per (game, metric) plus a runtime PNG. Seven metrics chosen from the union of LeverageSHAP, PolySHAP, OddSHAP and shapiq.benchmark.metrics literature: MSE / MAE / SSE / SAE / Precision@5 / Precision@10 / KendallTau. Includes a '--check' interface-probe mode that prints a constructibility table without running a sweep.
* benchmark/README.md: usage doc covering merge workflow, --check, sweep CLI, output layout, CSV format, metric definitions, plot conventions, and notes on the multi-index approximators that need explicit (index='SV', max_order=1).

Files modified:
* tests/shapiq/tests_unit/tests_approximators/test_approximators_vs_exact.py: now imports the shared helpers from benchmark._discovery via a tightly-scoped sys.path hook at the top of the file. Picks up the ValueError-priority construction and the RuntimeError-catch that the test file previously did not have. Interface conformance is now applied to the project's six new approximator names (LeverageSHAP, PolySHAP + 3 variants, OddSHAP), so Matthias's PolySHAP variants are no longer silently skipped by the contract check.

Verified locally:
* pytest test_approximators_vs_exact.py: 10 passed, 170 skipped, 87 xfailed, 26 xpassed. No failures.
* python -m benchmark.performance --check: surfaces all 17 method names (11 existing on main + 6 project additions) correctly.
* Drop-in compatibility verified by temporary merge into all three feature branches (oddshap_approximator, leverageSHAP, PolySHAP): clean merge in each, --check picks up the local approximator.
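For readers unfamiliar with the metric names listed above, here is a dependency-free sketch of how they are commonly defined. These definitions are assumptions based on general usage, not copied from benchmark/performance.py or its README; in particular, ranking players by absolute value for Precision@k and using the tie-agnostic tau-a variant of Kendall's tau are choices made only for this illustration.

```python
from itertools import combinations


def sse(est, truth):
    """Sum of squared errors."""
    return sum((e - t) ** 2 for e, t in zip(est, truth))


def sae(est, truth):
    """Sum of absolute errors."""
    return sum(abs(e - t) for e, t in zip(est, truth))


def mse(est, truth):
    """Mean squared error."""
    return sse(est, truth) / len(truth)


def mae(est, truth):
    """Mean absolute error."""
    return sae(est, truth) / len(truth)


def top_k(values, k):
    """Indices of the k entries with the largest absolute value."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return set(order[:k])


def precision_at_k(est, truth, k):
    """Fraction of the true top-k players recovered by the estimate."""
    return len(top_k(est, k) & top_k(truth, k)) / k


def kendall_tau(est, truth):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(truth)), 2):
        sign = (est[i] - est[j]) * (truth[i] - truth[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(truth) * (len(truth) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy Shapley vectors: the estimate swaps the ranks of players 1 and 2.
truth = [4.0, 3.0, 2.0, 1.0]
est = [4.0, 2.0, 3.0, 1.0]
print(mse(est, truth), mae(est, truth), precision_at_k(est, truth, 2))
print(round(kendall_tau(est, truth), 4))
```

The error metrics measure how far the values are off, while Precision@k and Kendall's tau only look at the induced ranking, which is why a benchmark typically reports both kinds.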


…use its core claim was not reliable

With n = 6, a budget of 100 is above 2^n = 64, so the implementation enters the full-budget/exact regime. In that regime, the result should be identical no matter which seed you use, so asserting that different seeds must differ is false and will fail even though the code is correct. => I lowered the budget to budget=20
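The arithmetic behind this fix can be checked with a toy model. The sketch below is not LeverageSHAP: it draws coalitions uniformly without replacement as a simple stand-in for the real sampling scheme, and the `budget >= 2**n` threshold is an assumption about where the exact regime begins. It only illustrates why a seed cannot matter once the budget covers all 2^n coalitions.

```python
import itertools
import random


def sampled_coalitions(n_players, budget, seed):
    """Toy stand-in: which coalitions get evaluated for a given budget and seed."""
    all_coalitions = list(itertools.product([False, True], repeat=n_players))
    if budget >= len(all_coalitions):
        # Exact regime (assumed threshold): every coalition is evaluated,
        # so the seed is never consulted.
        return set(all_coalitions)
    # Stochastic regime: the seed decides which subset is drawn.
    rng = random.Random(seed)
    return set(rng.sample(all_coalitions, budget))


n = 6
exact_a = sampled_coalitions(n, budget=100, seed=0)
exact_b = sampled_coalitions(n, budget=100, seed=1)
stochastic = sampled_coalitions(n, budget=20, seed=0)

print(2 ** n, len(exact_a), exact_a == exact_b, len(stochastic))
```

With budget=100 both seeds evaluate the same 64 coalitions, so a seed-variability assertion can only hold in the stochastic regime, for example at budget=20.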


mmschlk · 2026-06-10T08:39:07Z

@FabianK-Dev, What is different from this PR to #524? I would like to bundle all of the leverage shap implementation together into the Leverageshap PR and not have these separate here.

FabianK-Dev · 2026-06-10T09:05:21Z

> @FabianK-Dev, What is different from this PR to #524? I would like to bundle all of the leverage shap implementation together into the Leverageshap PR and not have these separate here.

Hello @mmschlk. Thanks for reviewing. The intention behind this PR was to separate task 4 from PR #524, so that we can work on e.g. the benchmark and discussion file (as requested in task 4 of the project description) without bloating #524 with more commits. That way you don't have to re-review incoming commits that aren't part of tasks 1-3.

> I would like to bundle all of the leverage shap implementation together into the Leverageshap PR and not have these separate here.

No problem! If you want, I can continue working on task 4 in #524. In that case, feel free to close this PR. 👍

mmschlk · 2026-06-10T18:38:08Z

Ah I see. Or actually I assume I see, I do not really see a "Task 4" in the PR you linked. 😅

So I will only look at the other PR first, and once that one is merged, take a look at this one. You can then merge/rebase from the other PR branch into this one. :)

FabianK-Dev · 2026-06-10T18:59:33Z

> Ah I see. Or actually I assume I see, I do not really see a "Task 4" in the PR you linked. 😅

Great catch! I just opened this as an empty placeholder/draft PR for now. I will push the actual discussion file and anything related to task 4 here later.

> So I will only look at the other PR first, and once this got merged take a look at this one. You can merge/rebase from the other PR branch then into this. :)

Sounds great, thanks, will do!

42logos and others added 30 commits May 11, 2026 02:46

feat/gitignore: Add benchmark/results/* to .gitignore

6bede21

leverageSHAP sceleton

2447992

feat/SG-20/leverageshap: Add budget validation, initialize seen_coali…

67400f8

…tions set, Z_list and probs_list and create all-true and all-false coalitions

feat/SG-20/leverageshap: Add leverage score sampling loop

80f4248

feat/SG-20/leverageshap: Run uv run pre-commit run --all-files and fi…

f02e5ed

…x pre-commit errors

feat/SG-20/leverageshap: Add comments to document and explain code

e4d2ff6

implementation of _solver function + first tests

0f8a940

feat/SG-20/leverageshap: Add reference for Lemma 3.2

6a1bb98

feat/SG-22/testing: Add basic test_reproducibility test => Tests for …

44ba153

…determinism on LeverageSHAP()

feat/SG-22/testing: Improve test_reproducibility(): Split up into dif…

bfa15ee

…ferent game variables to avoid access counters interfering; Also compare metadata

feat/SG-22/testing: Test whether different seeds of identical dummy g…

aecacdf

…ames produce (slightly) different outputs

feat/SG-22/testing: Test whether approximation error decreases with i…

109d14a

…ncreased budget

feat/SG-22/testing: Add comments to make test more understandable

2fbabae

Fix: DRY principle applied

030aa00

fix: solve_regression for rank-deficient matrices

1337b00

feat/SG-22/testing: Add tests for exact matches with ExactComputer an…

30acd7b

…d tiny-n edge case and add comments to document and explain the test

feat/SG-22/testing: Split test_reproducibility_different_seeds up int…

058e5d7

…o test_exact_regime_seed_independence and test_stochastic_regime_seed_variability

feat/SG-22/testing: Add test for pairing trick variance reduction in …

e09f1f0

…stochastic regime

feat/SG-22/testing: Add test_leverageshap_vs_kernelshap_mean_error, t…

7267e63

…est_exact_matches_multiple_small_games, test_null_player_axiom and test_minimal_budget_sweep

reproduceability test and changes necessary to actually reproduce pap…

4cb7602

…er results

feat/SG-22/testing: Update test_empirical_convergence_rate to use hig…

6ff9ad3

…her n to avoid minimal floating errors

feat/gitignore: Add benchmark/results/* to .gitignore

097de27

feat/Add DISCUSSION.md (WIP)

0d87b9c

feat/SG-69/refactor: Refactor leverageshap.py to move solve method in…

a772261

…to base regression class

feat/SG-69/refactor: handle NaN/Inf values in regression calculations

3a06f5a

feat/SG-22/testing: Add a list of different fixed seeds and evaluate …

9fed440

…using @pytest.mark.parametrize("seed", DIVERSE_SEEDS) to prevent accidental overfitting on fixed seeds

FabianK-Dev and others added 4 commits June 8, 2026 13:57

feat/SG-22/testing: Fix test by expecting game.access_counter <= 2**n

3a48458

additional unit tests

bcfc038

chore/leverageshap: remove WIP notebook DISCUSSION.md file and WIP 1-…

45b0035

…page-summary file that are not ready for review, yet

Revert "chore/leverageshap: remove WIP notebook DISCUSSION.md file an…

6ede9b5

…d WIP 1-page-summary file that are not ready for review, yet" This reverts commit 45b0035.

github-project-automation Bot added this to shapiq development Jun 8, 2026

FabianK-Dev mentioned this pull request Jun 8, 2026

Implement LeverageSHAP approximator #524

Open

7 tasks

Merge remote-tracking branch 'origin/wu/conformance-test' into levera…

4ac5580

…geshap-docs

mmschlk moved this to 📋 Backlog in shapiq development Jun 10, 2026

FabianK-Dev wants to merge 35 commits into mmschlk:main from FabianK-Dev:leverageshap-docs


4 participants
