Smooth terms #198
Open
BerriJ wants to merge 2 commits into
Implementation complete. All 871 tests pass (14 new), ruff clean. Summary of what was built:

**Phase 1+3: `splines.py` (new)**

- `make_bspline_knots` / `make_bspline_basis`: equidistant-knot B-spline basis via `scipy.interpolate.BSpline.design_matrix`, with exact linear extrapolation beyond the boundary knots (mgcv `bs="ps"` behavior)
- `make_difference_penalty`: Frobenius-normalized difference penalty
- `make_centering_constraint`: sum-to-zero identifiability via the QR null space of the weighted basis column means (…)
- `make_demmler_reinsch_transform`: fixed-at-fit reparameterization giving an identity Gramian and a diagonal penalty
- `PSplineTermState` and `PSplineTerm(Term)` with `feature`, `n_splines=20`, `degree=3`, `diff_order=2`, `lambda_=None` (grid selection) or fixed, `ic`, `forget`, `knot_padding=0.05`; immutable `update` returning new instances; `edf_` / `lambda_selected_` diagnostics; EDF-based IC selection using `calculate_effective_training_length`

**Phase 2: solver & method**

- `online_coordinate_descent_quadratic` (runs CD on …); `online_coordinate_descent_quadratic_path` with `get_start_beta` warm starts
- `QuadraticPenaltyPath(EstimationMethod)`: path-based, geometric λ-grid scaled by `lambda_`, reuses `init_gram` / `update_gram` with forgetting; registered as `"quadratic_penalty"` in the factory

**Verification: `test_terms_pspline.py`**

- partition of unity
- linear extrapolation
- penalty annihilates degree-$(q{-}1)$ polynomials, plus bandedness
- CD ≡ direct solve
- $S=0$ ≡ plain CD (exact)
- warm-started path correctness
- edf monotone in λ, with bounds
- large-λ polynomial limit
- batch-vs-online Gram exactness
- full estimator integration on heteroskedastic $y \sim N(\sin 2x, e^{-0.5+0.3x})$, including `update`

One notable deviation from the plan: plain CD on the raw centered basis is too ill-conditioned across the λ-path (descending warm starts hit max-iterations and return wrong solutions). I added the Demmler–Reinsch reparameterization, fixed at fit time, so that the Gramian becomes the identity and the penalty becomes diagonal. CD then converges fast and exactly at every λ, while online Gram updates remain exact because the transform is a fixed linear map of the basis. Docs were added in `terms_and_features.md`, along with a runnable example in `pspline_terms.py`.
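For reviewers who want to see why the reparameterization makes every λ on the path trivial to solve, here is a minimal NumPy sketch of a Demmler–Reinsch-style transform. It is an illustration only and not the code in `splines.py`: the function name `demmler_reinsch_transform`, the random design matrix `B`, and the unnormalized second-difference penalty are all assumptions made for the example.

```python
import numpy as np


def demmler_reinsch_transform(gram, penalty):
    """Hypothetical helper: return (T, d) such that
    T.T @ gram @ T = I and T.T @ penalty @ T = diag(d)."""
    L = np.linalg.cholesky(gram)       # gram = L @ L.T, requires gram to be SPD
    L_inv = np.linalg.inv(L)
    M = L_inv @ penalty @ L_inv.T      # penalty expressed in the whitened basis
    M = 0.5 * (M + M.T)                # symmetrize against round-off
    d, U = np.linalg.eigh(M)           # M = U @ diag(d) @ U.T
    T = L_inv.T @ U
    return T, np.clip(d, 0.0, None)    # penalty is PSD, so clip tiny negatives


# Toy problem (assumed data): random full-rank design, second-difference penalty.
rng = np.random.default_rng(0)
n, k = 200, 8
B = rng.normal(size=(n, k))
y = rng.normal(size=n)
D = np.diff(np.eye(k), n=2, axis=0)    # (k-2, k) second-difference operator
S = D.T @ D
G = B.T @ B

T, d = demmler_reinsch_transform(G, S)

lam = 3.0
# In the transformed coordinates the penalized normal equations are diagonal:
# (I + lam * diag(d)) @ gamma = T.T @ B.T @ y
gamma = (T.T @ (B.T @ y)) / (1.0 + lam * d)
beta = T @ gamma

# Reference: direct solve of (G + lam * S) @ beta = B.T @ y
beta_direct = np.linalg.solve(G + lam * S, B.T @ y)

# Effective degrees of freedom follow directly from the diagonal penalty.
edf = np.sum(1.0 / (1.0 + lam * d))
```

Because `T` depends only on the Gramian and penalty at fit time, it stays a fixed linear map afterwards; in this sketch the transformed system is already diagonal, so any coordinate-wise solver would reach the solution in a single sweep.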
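The penalty checks in the verification list (annihilation of degree-$(q{-}1)$ polynomials and bandedness) can be reproduced in a few lines. The sketch below uses a hypothetical `difference_penalty` helper and assumes that "Frobenius-normalized" means dividing $D_q^\top D_q$ by its Frobenius norm; the actual `make_difference_penalty` may differ.

```python
import numpy as np


def difference_penalty(k, order=2):
    """Hypothetical stand-in: order-q difference penalty D.T @ D,
    scaled to unit Frobenius norm (an assumed normalization)."""
    D = np.diff(np.eye(k), n=order, axis=0)   # (k - order, k) difference operator
    S = D.T @ D
    return S / np.linalg.norm(S, "fro")


k, q = 10, 2
S = difference_penalty(k, q)
idx = np.arange(k, dtype=float)

# Coefficient vectors that are polynomials of degree < q in the index
# lie in the null space of the penalty.
null_residuals = [np.abs(S @ idx**p).max() for p in range(q)]

# A degree-q polynomial is penalized (not annihilated).
quadratic_residual = np.abs(S @ idx**q).max()

# Bandedness: entries further than q from the diagonal are exactly zero.
i, j = np.indices(S.shape)
outside_band = S[np.abs(i - j) > q]
```

The null space of dimension `q` is also what gives the lower bound on the effective degrees of freedom as λ grows.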