
AmirNoori68/kan-review

kan-review

A structured companion to our KAN review paper.

We welcome corrections, discussions, and new contributions. The updates below come from recent communications with researchers and newly released studies.
If you notice any missing or misattributed references, kindly contact amir_noori@hkbu.edu.hk so they can be added in the next GitHub update and preprint revision.

Quick Nav

  1. Citation
  2. Kolmogorov Superposition Theorem (KST) and Its Refinement Toward Neural Networks
  3. Review and Survey Papers on KANs
  4. Representative Repositories
  5. Bridging KANs and MLPs
  6. Basis Functions
  7. Accuracy Improvement
  8. Efficiency Improvement
  9. Sparsity & Regularization
  10. Convergence & Scaling Laws

1. Citation

Paper and repository reference information:

@article{GuideToKAN,
  title   = {A practitioner's guide to {Kolmogorov--Arnold} networks},
  author  = {Noorizadegan, Amir and Wang, Sifan and Ling, Leevan and Dominguez-Morales, Juan P.},
  journal = {Computer Science Review},
  volume  = {62},
  pages   = {100991},
  year    = {2026},
  issn    = {1574-0137},
  doi     = {10.1016/j.cosrev.2026.100991},
  url     = {https://www.sciencedirect.com/science/article/pii/S1574013726000997}
}

2. Kolmogorov Superposition Theorem (KST) and Its Refinement Toward Neural Networks

More detailed explanations and citations are provided in our KAN review paper.

| Year | Reference | Key Contribution |
| --- | --- | --- |
| 1900 | Hilbert | Poses Hilbert's 13th problem |
| 1956 | Kolmogorov | Preliminary idea of superpositions; first hint toward the theorem |
| 1957 | Arnol'd | First explicit 3-variable construction (9 terms); counterexample to Hilbert 13 |
| 1957 | Kolmogorov | Full Kolmogorov Superposition Theorem; first general $n$-dimensional proof |
| 1958 | Arnol'd | Supplies missing lemmas; completes Kolmogorov’s proof |
| 1962 | Lorentz | Simplified canonical form with a single outer function |
| 1965 | Sprecher | First single universal inner function |
| 1967 | Fridman | Shows universal inner functions can be taken Lipschitz-1 |
| 1980 | de Figueiredo | First network-like interpretation; block diagram + learned outer function (Chebyshev basis) |
| 1987 | Hecht–Nielsen | First explicit neural mapping theorem based on KST |
| 1989 | Girosi–Poggio | First rigorous critique: inner functions must be non-smooth; outer functions non-parametric |
| 1989 | Frisch et al. | First computational implementation of Lorentz form; iterative outer-function learning |
| 1991 | Kurková | First approximation-theoretic reinterpretation; relates network size to modulus of continuity |
| 1992 | Kurková | Two-hidden-layer sigmoidal approximants; universal inner weights |
| 1993 | Sprecher | Single universal inner function valid for all input dimensions |
| 1993 | Nakamura et al. | First fully constructive version with guaranteed accuracy |
| 1994 | Nees | First piecewise-linear inner maps with geometric error decay; constructive algorithm |
| 1996 | Sprecher | First executable version of inner function with verified separation property |
| 1997 | Sprecher | Explicit constructive algorithm for the outer functions |
| 2002 | Köppen | Corrected continuous monotone inner function; first training-ready KST inner map |
| 2003 | Igelnik–Parikh | Kolmogorov Spline Network (KSN): trainable spline-based inner/outer functions |
| 2009 | Braun–Griebel | First correct constructive KST; repairs Sprecher’s scheme |
| 2019 | Actor–Knepley | Proves $C^1$ inner functions impossible; smoothness obstruction |
| 2024 | Liu | Introduces KAN, the first deep architecture inspired by the Kolmogorov–Arnold representation |
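
For orientation, the classical representation that this timeline refines can be written as below. This is the standard textbook form, not a formula quoted from the table; the symbol names $\Phi_q$ and $\phi_{q,p}$ are notation chosen here.

```latex
% Kolmogorov superposition: any continuous f on the unit cube is a finite
% sum of compositions of continuous univariate functions.
f(x_1,\dots,x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\Big(\sum_{p=1}^{n} \phi_{q,p}(x_p)\Big),
\qquad f \in C\big([0,1]^n\big).
```

Here the outer functions $\Phi_q$ depend on $f$, while the inner functions $\phi_{q,p}$ do not. The Lorentz entry in the timeline corresponds to replacing all $\Phi_q$ by one shared outer function $\Phi$.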

3. Review and Survey Papers on KANs

Title Paper
A Practitioner's Guide to Kolmogorov-Arnold Networks Noorizadegan
The first two months of Kolmogorov-Arnold Networks (KANs): A survey of the state-of-the-art Dutta
KAT to KANs: A review of Kolmogorov–Arnold Networks and the neural leap forward Basina
Scientific machine learning with Kolmogorov–Arnold Networks Faroughi
Kolmogorov-Arnold Networks: Overview of Architectures and Use Cases Essahraui
Kolmogorov–Arnold Networks for interpretable and efficient function approximation Andrade
Scalable and interpretable function-based architectures: A survey of Kolmogorov–Arnold Networks Beatrize
Convolutional Kolmogorov–Arnold Networks: A survey Kilani
Convolutional Kolmogorov–Arnold Networks Bodner
A survey on Kolmogorov–Arnold Networks Somvanshi
Kolmogorov-Arnold Networks: A Critical Assessment of Claims, Performance, and Practical Viability Hou

4. Representative Repositories (for regression, function approximation, and PDE solving)

Repository Description
.../pykan Official PyKAN for “KAN” and “KAN 2.0”.
.../Gaussian-KAN Pure Gaussian RBF-KAN implementation, focusing on Gaussian basis functions and scale-parameter effects.
.../PU-GKAN Partition-of-Unity Gaussian KAN implementation, using normalized Gaussian basis functions.
.../pinn_learnable_activation Compares various KAN bases vs. MLP on PDEs.
.../torchkan Simplified PyTorch KAN with multiple variants.
.../awesome-kan Curated list of KAN resources, projects, and papers.
.../Deep-KAN Spline-KAN examples and PyPI package.
.../RBF-KAN Gaussian RBF-based KAN implementation.
.../KANbeFair Fair benchmarking of KANs vs. MLPs.
.../efficient-kan Efficient PyTorch implementation of KAN.
.../jaxKAN JAX-based KAN with grid extension support.
.../fast-kan FastKAN using RBFs for acceleration.
.../faster-kan Uses reflectional switch activations.
.../LKAN Lightweight KAN variants and experiments.
.../neuromancer (fbkans branch) Partition of unity (FBKAN) for PDE solving.
.../relu_kan Minimal ReLU-KAN example.
.../MatrixKAN Matrix-parallelized KAN implementation.
.../PowerMLP MLP-type network with KAN-level expressiveness.
.../FourierKAN Fourier-based KAN layer.
.../FusedFourierKAN Optimized FourierKAN with fused GPU kernels.
.../fKAN Fractional KAN using Jacobi functions.
.../rKAN Rational KAN (Padé/Jacobi rational designs).
.../CVKAN Complex-valued KANs.
.../SincKAN Sinc-based KAN for PINN applications.
.../ChebyKAN Chebyshev polynomial-based KAN.
.../OrthogPolyKANs Orthogonal polynomial-based KAN implementations.
.../kaf_act RFF-based activation library.
.../KAF Kolmogorov–Arnold Fourier Networks.
.../HRKAN Higher-order ReLU-KANs.
.../KINN PIKAN for solid mechanics PDEs.
.../KAN_PointNet_CFD Jacobi-based KAN for CFD predictions.
.../FKAN-GCF FourierKAN-GCF for graph filtering.
.../KKANs_PIML Kurkova-KANs combining MLP with basis functions.
.../MLP-KAN MLP-augmented KAN activations.
.../kat Kolmogorov–Arnold Transformer.
.../FAN Fourier Analysis Network (FAN).
.../Basis_Functions Polynomial bases for KANs.
.../Wav-KAN Wavelet-based KANs.
.../qkan Quantum-inspired KAN variants and pruning.
.../KAN-Converge Additive & hybrid KANs for convergence-rate experiments.
.../BSRBF_KAN Combines B-spline and RBF bases.
.../Bayesian-HR-KAN Bayesian higher-order ReLU-KANs with uncertainty quantification.
.../Legend-KINN Legendre polynomial–based KAN for efficient PDE solving.
.../DeepOKAN Deep Operator Network based on KAN.
.../LeanKAN A memory-efficient Kolmogorov–Arnold Network.
.../SPIKANs Separation of variables to decompose high-dimensional PDEs into smaller KANs.
openkan.org Features a non-spline KAN trained via Newton–Kaczmarz.
.../Anant-Net High-dimensional PDE solver with tensor sweeps.
.../RGA-KANs Deep cPIKANs with variance-preserving initialization.
.../lmkan Lookup-based KAN for fast high-dimensional mappings.
.../KAN_Initialization_Schemes Initialization schemes for spline-based KANs.
.../mlp-kan KAN vs. MLP for PDEs in DeepONet/GNS frameworks.
.../KANQAS_code KANQAS: KAN for quantum architecture search.
.../pkan Probabilistic KAN via divisive data re-sorting.
.../spikans Separable PIKAN (SPIKAN) for high-dimensional PDEs.

5. Bridging KANs and MLPs

Brief result Paper , Code
Equivalence: ReLU^k MLP ↔ B-spline KAN. Wang
Piecewise-linear KAN = ReLU MLP. Schoots
Adaptive spline KANs mimic MLPs with data-driven capacity. Actor
NTK view: richer KAN bases reduce spectral bias vs MLP. Gao
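
The piecewise-linear correspondence listed above can be illustrated with a small sketch. It assumes a uniform grid and degree-1 (hat) B-splines; the function names `hat` and `pl_edge` are hypothetical and not taken from any of the cited papers. Each hat function is written as a combination of three shifted ReLUs, so a piecewise-linear KAN edge becomes a one-hidden-layer ReLU expression.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def hat(x: float, center: float, h: float) -> float:
    """Degree-1 B-spline supported on [center - h, center + h], built from three ReLUs."""
    return (relu(x - center + h) - 2.0 * relu(x - center) + relu(x - center - h)) / h


def pl_edge(x: float, knots: list[float], coeffs: list[float]) -> float:
    """Piecewise-linear edge function sum_i c_i * hat_i(x) on a uniform grid (assumed)."""
    h = knots[1] - knots[0]
    return sum(c * hat(x, g, h) for g, c in zip(knots, coeffs))


# The edge interpolates its coefficients linearly between neighboring knots.
value = pl_edge(0.25, [0.0, 0.5, 1.0], [1.0, 3.0, 2.0])
```

Because every term is a ReLU of an affine function of `x`, the same edge could be stored as weights of a small ReLU layer; the sketch only demonstrates the identity and makes no claim about how the cited works construct it.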

6. Basis Functions

| Name | Support | Equation | Grid | Type | Paper , Code |
| --- | --- | --- | --- | --- | --- |
| B-spline | Local | $\sum_n c_nB_n(x)$ | Yes | B-spline | Liu & Liu , Code & Actor & Basina & Coffman & Guo & Kalesh & Gao & Zeng & Khedr & Lei & Li & Lin & Pal & Howard & Jacob , Code & Aghaei & Patra & Ranasinghe & Rigas , Code & Shuai & Wang & Zhang & Raffel & Schoots & Wang & Wang & Xu & Shen & Yang & Howard , Code & Code & Gong & Guo & Lee & Mallick & Sen |
| Chebyshev | Global | $\sum_k c_kT_k(\tanh x)$ | No | Chebyshev + tanh | Sidharth , Code & Code & Yang & Mahmoud & Guo & Faroughi & Yu , Code & Rigas 2025 , Code |
| Stabilized Chebyshev | Global | $\tanh\big(\sum_k c_kT_k(\tanh x)\big)$ | No | Chebyshev + linear head | Daryakenari |
| Chebyshev (grid) | Global | $\sum_k c_kT_k\Big(\tfrac{1}{m}\sum_i \tanh(w_i x+b_i)\Big)$ | Yes | Chebyshev + tanh | Toscano , Code |
| ReLU-KAN | Local | $\sum_i w_iR_i(x)$ | Yes | Squared ReLU | Qiu , Code |
| HRKAN | Local | $\sum_i w_i\big[\mathrm{ReLU}(x)\big]^m$ | Yes | Polynomial ReLU | So , Code |
| Adaptive ReLU-KAN | Local | $\sum_i w_iv_i(x)$ | Yes | Adaptive ReLU | Rigas , Code |
| fKAN (Jacobi) | Global | $\sum_n c_nP_n(x)$ | No | Jacobi | Aghaei , Code |
| rKAN (Padé/Jacobi) | Global | $\dfrac{\sum_i a_iP_i(x)}{\sum_j b_jP_j(x)}$ | No | Rational + Jacobi | Aghaei , Code |
| Jacobi-KAN | Global | $\sum_i c_iP_i(\tanh x)$ | No | Jacobi + tanh | Kashefi , Code & Shukla & Xiong & Zhang , Code |
| FourierKAN | Global | $\sum_k a_k\cos(kx)+b_k\sin(kx)$ | No | Fourier | Xu , Code & Code & Guo & Jiang , Code |
| KAF | Global | $\alpha\mathrm{GELU}(x)+\sum_j \beta_j\psi_j(x)$ | No | Random Fourier + GELU | Zhang , Code |
| Gaussian + residual | Local | $\sum_i w_i \exp\!\Big(-\big(\tfrac{x-g_i}{\varepsilon}\big)^2\Big) + w_b \rho(x)$ | Yes | Gaussian RBF with SiLU | Li , Code & Lee & Abueidda , Code & Koenig , Code & Ta , Code & Buhler & Zhang |
| Gaussian | Local | $\sum_i w_i \exp\!\Big(-\big(\tfrac{x-g_i}{\varepsilon}\big)^2\Big)$ | Yes | Pure Gaussian RBF | Noorizadegan , Code |
| Partition of Unity Gaussian | Local | $\sum_i w_i \dfrac{\exp\!\Big(-\big(\tfrac{x-g_i}{\varepsilon}\big)^2\Big)}{\sum_j \exp\!\Big(-\big(\tfrac{x-g_j}{\varepsilon}\big)^2\Big)}$ | Yes | Partition of Unity Gaussian RBF | Noorizadegan , Code |
| RSWAF-KAN | Local | $\sum_i w_i\left(s_i-\tanh^2\big(\tfrac{x-c_i}{h_i}\big)\right)$ | Yes | Switch ($\tanh^2$) | Code |
| CVKAN | Local | $\sum_{u,v} w_{uv}\exp\big(-\lvert z-g_{uv}\rvert^2\big)$ | Yes | Complex Gaussian | Wolff , Code & Che |
| BSRBF-KAN | Local | $\sum_i a_i B_i(x)+\sum_j b_j\exp\big(-\tfrac{(x-g_j)^2}{\varepsilon^2}\big)$ | Yes | B-spline + Gaussian | Ta , Code |
| Wav-KAN | Local | $\sum_{j,k} c_{j,k}\psi\big(\tfrac{x-u_{j,k}}{s_j}\big)$ | No | Wavelet | Bozorgasl , Code & Patra & Pratyush & Seydi & Meshir 2025 |
| FBKAN | Local | $\sum_j \omega_j(x)K_j(x)$ | Yes | PoU + B-spline | Howard , Code |
| SincKAN | Global | $\sum_{i=-N}^{N} c_i\mathrm{sinc}\left(\frac{\pi}{h}(x - i h)\right)$ | Yes | Sinc | Yu , Code |
| Poly-KAN | Global | $\sum_i w_iP_i(x)$ | No | Polynomial | Seydi , Code & Attouri 2025 |
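
As a rough illustration of three rows of this table (Gaussian, Partition of Unity Gaussian, and Chebyshev + tanh), the sketch below evaluates the basis values for a scalar input using only the Python standard library. The function names and the scalar, loop-based style are assumptions made for readability; the listed repositories use vectorized tensor code and may differ in details such as grid placement and scaling.

```python
import math


def gaussian_basis(x: float, centers: list[float], eps: float) -> list[float]:
    """exp(-((x - g_i) / eps)^2) for each grid center g_i."""
    return [math.exp(-((x - g) / eps) ** 2) for g in centers]


def pou_gaussian_basis(x: float, centers: list[float], eps: float) -> list[float]:
    """Gaussians normalized to sum to one (partition of unity).

    Assumes x is close enough to the grid that the sum does not underflow to zero.
    """
    phi = gaussian_basis(x, centers, eps)
    total = sum(phi)
    return [p / total for p in phi]


def chebyshev_tanh_basis(x: float, degree: int) -> list[float]:
    """T_0..T_degree evaluated at tanh(x) via T_{k+1} = 2 t T_k - T_{k-1}."""
    t = math.tanh(x)
    values = [1.0, t]
    for _ in range(2, degree + 1):
        values.append(2.0 * t * values[-1] - values[-2])
    return values[: degree + 1]


def edge(weights: list[float], basis_values: list[float]) -> float:
    """One KAN edge: a weighted sum of basis values."""
    return sum(w * b for w, b in zip(weights, basis_values))
```

The `tanh` squashing maps any real input into $(-1, 1)$, the natural domain of the Chebyshev polynomials, which is why no grid is needed for that row.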

7. Accuracy Improvement

7.1 Physics & Loss Design

Brief result Paper , Code
Physics-informed KAN (cPIKAN): residual attention, entropy-viscosity. Shukla
KAN-PINN for strongly nonlinear PDEs (actuator deflection). Zhang
Attention-guided KAN with NSE residuals + BC losses. Yang
Residual physics + sparse regression (variable-coeff. PDEs). Guo
Self-scaled residual reweighting (ssRBA). Toscano , Code
Augmented-Lagrangian PINN–KAN (learnable multipliers). Zhang
Velocity–vorticity loss for turbulence reconstruction. Toscano
Fractional/integro-diff. operators in KAN. Aghaei
Physics-informed KAN for high-index DAEs (dual-network structure). Lou
Holomorphic KAN for elliptic PDEs; trains only on boundary conditions. Calafà , Code

7.2 Adaptive Sampling & Grids

Brief result Paper , Code
Multilevel knots (coarse→fine) for nested spline spaces. Actor
Free-knot KAN (trainable knots via cumulative softmax). Actor
Grid extension with optimizer state transition. Rigas , Code
Residual-adaptive sampling (RAD). Rigas , Code
Multi-resolution sampling schedule for cPIKAN. Yang
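
One way to read the "trainable knots via cumulative softmax" entry is sketched below: unconstrained parameters are turned into positive interval widths by a softmax, and a running sum of those widths gives strictly increasing knots on a fixed interval. This is an assumption about the general idea only; the function name `free_knots` and the endpoint handling are hypothetical and may not match the cited paper.

```python
import math


def free_knots(theta: list[float], a: float = 0.0, b: float = 1.0) -> list[float]:
    """Map m unconstrained parameters to m + 1 increasing knots on [a, b]."""
    shift = max(theta)                      # subtract the max for numerical stability
    exps = [math.exp(t - shift) for t in theta]
    total = sum(exps)
    knots = [a]
    acc = 0.0
    for e in exps:
        acc += e / total                    # softmax weight = relative interval width
        knots.append(a + (b - a) * acc)
    knots[-1] = b                           # guard against rounding at the right endpoint
    return knots


uniform = free_knots([0.0, 0.0, 0.0, 0.0])  # equal parameters give a uniform grid
```

Since every softmax weight is positive, the ordering of the knots is preserved for any real-valued `theta`, so a gradient-based optimizer can move knots without an explicit sorting or projection step.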

7.3 Domain Decomposition

Brief result Paper , Code
Finite-basis KAN (FBKAN) with PoU blending of local KANs. Howard , Code
Temporal subdomains to improve NTK conditioning. Faroughi
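
The partition-of-unity blending $\sum_j \omega_j(x)K_j(x)$ can be sketched in one dimension as follows. The squared raised-cosine window and the names `window` and `pou_blend` are assumptions chosen for illustration; the local models here are plain Python callables standing in for trained local KANs.

```python
import math
from typing import Callable, Sequence


def window(x: float, mu: float, sigma: float) -> float:
    """Compactly supported bump centered at mu with half-width sigma (assumed shape)."""
    z = (x - mu) / sigma
    if abs(z) >= 1.0:
        return 0.0
    return (1.0 + math.cos(math.pi * z)) ** 2


def pou_blend(
    x: float,
    centers: Sequence[float],
    sigma: float,
    local_models: Sequence[Callable[[float], float]],
) -> float:
    """Blend local models with normalized windows so the weights sum to one.

    Assumes x is covered by at least one window (otherwise the sum is zero).
    """
    raw = [window(x, mu, sigma) for mu in centers]
    total = sum(raw)
    return sum(w / total * model(x) for w, model in zip(raw, local_models))
```

A quick sanity check of the construction: if every local model is the same function, the blend reproduces that function exactly wherever the windows overlap the point, because the normalized weights sum to one.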

7.4 Function Decomposition

Brief result Paper , Code
Multi-fidelity KAN (freeze LF, learn HF linear + nonlinear heads). Howard , Code
Separable PIKAN (sum of products of 1D KAN factors). Jacob , Code
KAN-SR: recursive simplification for symbolic discovery. Buhler
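
The "sum of products of 1D factors" structure can be sketched for two variables as below. The factors are arbitrary callables standing in for 1D KANs, and the names `separable_eval` and `separable_grid` are hypothetical. The grid version shows the practical benefit: each factor is evaluated only on its own 1D grid, and the 2D field is assembled from those values.

```python
from typing import Callable, Sequence

Factor = Callable[[float], float]


def separable_eval(fx: Sequence[Factor], fy: Sequence[Factor], x: float, y: float) -> float:
    """u(x, y) = sum_r f_r(x) * g_r(y)."""
    return sum(f(x) * g(y) for f, g in zip(fx, fy))


def separable_grid(
    fx: Sequence[Factor], fy: Sequence[Factor], xs: Sequence[float], ys: Sequence[float]
) -> list[list[float]]:
    """Evaluate u on the tensor grid xs x ys using only 1D factor evaluations."""
    fx_vals = [[f(x) for x in xs] for f in fx]  # shape: rank x len(xs)
    fy_vals = [[g(y) for y in ys] for g in fy]  # shape: rank x len(ys)
    rank = len(fx_vals)
    return [
        [sum(fx_vals[r][i] * fy_vals[r][j] for r in range(rank)) for j in range(len(ys))]
        for i in range(len(xs))
    ]
```

For a grid with `len(xs) * len(ys)` points, the factor networks are called only `rank * (len(xs) + len(ys))` times, which is the source of the cost savings in higher dimensions.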

7.5 Hybrid / Ensemble & Data

Brief result Paper , Code
MLP–KAN mixture of experts. He , Code
Parallel KAN ∥ MLP branches with learnable fusion. Xu
KKAN: per-dim MLP features + explicit basis expansion. Toscano , Code

7.6 Sequence / Attention Hybrids

Brief result Paper , Code
FlashKAT: group-rational KAN blocks in Transformers. Raffel
GINN-KAN: interpretable growth + KAN in PINNs. Ranasinghe
KAN-ODE: KAN as $\dot u$ model (adjoint training). Koenig
AAKAN-WGAN: adaptive KAN + GAN for data augmentation. Shen
Attention-KAN-PINN for battery SOH forecasting. Wei
KANQAS: KAN-based Double Deep Q-Network for quantum architecture search. Kundu 2024 , Code

7.7 Discontinuities & Sharp Gradients

Brief result Paper , Code
SincKAN for kinks/boundary layers. Yu , Code
rKAN (rational bases) for asymptotics/jumps. Aghaei , Code
DKAN: $\tanh$ jump gate + spline background. Lei
KINN for singularities/stress concentrations. Wang , Code
Two-phase PINN–KAN for saturation fronts. Kalesh

7.8 Optimization & Adaptive Training

Brief result Paper , Code
Adam/RAdam warmup → (L-)BFGS refinement. Mostajeran & Daryakenari & Zeng
Hybrid optimizers for sharp fronts. Kalesh
Bayesian hyperparameter tuning for KANs. Lin
Bayesian PINN–KAN (variational + KL) for UQ. Giroux , Code
NTK perspective: conditioning ↔ convergence. Faroughi

8. Efficiency Improvement

8.1 Parallelism, GPU, and JAX Engineering

Brief result Paper , Code
ReLU^m activations replace splines (CUDA-friendly). Qiu , Code
Spline→matmul CUDA kernels (GEMM fusion). Qiu , Code & So , Code
Matrix B-spline evaluation fused on GPU. Coffman , Code
Dual-matrix merge + trainable RFF for scaling. Zhang , Code
Custom GPU backward for KAN attention blocks. Raffel , Code
Parallel KAN ∥ MLP branches (stream/layer parallelism). Xu , Code
Domain decomposition parallelism (multi-GPU, PoU/separable). Shukla & Howard , Code & Jacob , Code
JAX/XLA acceleration: jit, vmap, pmap, fusion. Daryakenari & Rigas , Code
lmKANs: multivariate spline lookup tables, CUDA-friendly. Pozdnyakov 2025 , Code

8.2 Matrix Optimization & Parameter-Efficient Bases

Brief result Paper , Code
ReLU-power vs B-splines: fewer params, vectorized polynomials. Qiu , Code & Qiu , Code & So , Code
Orthogonal polynomials with cheap recurrences. Shukla & Guo & Mostajeran & Mostajeran & Wang , Code
Compact RBF bases (local Gaussians). Lin & Koenig , Code
Wavelets for multi-resolution and sparse coeffs. Patra
Dual-matrix + RFF compression to cut memory traffic. Zhang , Code
Sparsity regularization (ℓ1/group) with pruning. Guo
Hierarchical channel-wise refinement (shared params). Actor
DEKAN: connectivity via Differential Evolution. Li
Mix spectral (derivatives) + spatial (coeffs) sparsity for operators. Lee
Tensor sweeps + selective differentiation for scalable high-D PDEs. Sidharth , Code
Operator-aware spectral–spatial mixing for near-diagonal matvecs. Lee

9. Sparsity & Regularization

9.1 ℓ1 sparsity with entropy balancing

Brief result Paper , Code
Layerwise ℓ1 on edge activations + entropy balance. Liu , Code
EfficientKAN: direct ℓ1 on weights (simple, practical). EfficientKAN
Sparse symbolic discovery with ℓ1 + entropy. Wang
PDE KAN: ℓ1 + smoothness penalty to denoise coefficients. Guo
Post-training pruning with layerwise ℓ1. Koenig , Code
KAN-SR: magnitude + entropy at subunit level (+ℓ1 on bases). Buhler
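
A minimal sketch of an ℓ1-plus-entropy penalty for one layer is given below. It assumes each edge has already been summarized by a nonnegative magnitude (for example, the mean absolute activation over a batch); the function name, the coefficients `mu1` and `mu2`, and the `tiny` stabilizer are hypothetical, and the exact definitions in the cited works may differ.

```python
import math


def l1_entropy_penalty(
    edge_l1: list[float], mu1: float = 1.0, mu2: float = 1.0, tiny: float = 1e-12
) -> float:
    """mu1 * (sum of edge magnitudes) + mu2 * (entropy of their normalized distribution)."""
    total = sum(edge_l1)
    probs = [v / (total + tiny) for v in edge_l1]
    entropy = -sum(p * math.log(p + tiny) for p in probs)
    return mu1 * total + mu2 * entropy


concentrated = l1_entropy_penalty([1.0, 0.0, 0.0, 0.0])   # one dominant edge
spread = l1_entropy_penalty([0.25, 0.25, 0.25, 0.25])     # same total, evenly spread
```

Both inputs have the same ℓ1 total, but the evenly spread layer has higher entropy and so a larger penalty. The entropy term therefore pushes magnitude onto a few edges, which is what makes later pruning effective.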

9.2 ℓ2 weight decay and extensions

Brief result Paper , Code
AAKAN: ℓ2 + temporal smoothing + MI regularizer. Shen
Small ℓ2 (e.g., 1e−5) improves stability in PINNs/DeepOKAN. Shukla & Toscano

9.3 Implicit and dropout-style regularizers

Brief result Paper , Code
Nested activations (e.g., tanh∘tanh) for bounded outputs & smooth grads. Daryakenari
DropKAN: post-activation masking (noise after spline eval). Altarabichi , Code

10. Convergence & Scaling Laws

10.1 Approximation & Sample Complexity

Brief result Paper
Depth-based convergence rate for spline KANs. Wang
Optimal Besov approximation; dimension-free sample complexity. Kratsios 2025
Minimax statistical rates for additive & hybrid KANs; optimal knot scaling. Liu , Code
Generalization bounds via RKHS and coefficient/Lipschitz complexity. Zhang 2024
Lipschitz-controlled layers improve stability and generalization. Li 2025

10.2 Optimization Dynamics & Spectral Bias

Brief result Paper
KANs show reduced spectral bias vs. MLPs; faster high-frequency learning. Wang 2025
Learnable bases widen NTK spectra; trade off between reach and curvature. Farea 2025 , Code
Gradient-flow convergence guarantees for two-layer KANs. Gao
Chebyshev/cPIKAN maintain better NTK conditioning for PDEs. Faroughi 2025
Initialization schemes improve NTK stability. Rigas 2025 , Code

10.3 Empirical Power Laws

Brief result Paper
KAN error follows consistent power-law decay; grid refinement improves accuracy. Liu , Code
Depth/grid refinement matches theoretical convergence trends. Wang
Scaling behavior influenced by optimization, not just expressivity. Kratsios 2025
Minimax results align: grid resolution drives learning efficiency. Liu , Code
Power-law patterns observed across PDE benchmarks. Faroughi 2025
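
To check a power-law trend of the form error ≈ C · G^(−α) in one's own experiments, a least-squares fit in log-log space is usually enough. The sketch below uses only the standard library; the function name `fit_power_law` is hypothetical and the numbers are synthetic values generated from an exact power law, not results from any cited paper.

```python
import math


def fit_power_law(sizes: list[float], errors: list[float]) -> tuple[float, float]:
    """Fit log(error) = log(C) - alpha * log(size); return (C, alpha)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


# Synthetic example: errors generated exactly as 2 * G^(-4) for four grid sizes.
grid_sizes = [5.0, 10.0, 20.0, 50.0]
synthetic_errors = [2.0 * g ** -4 for g in grid_sizes]
C, alpha = fit_power_law(grid_sizes, synthetic_errors)
```

On real training curves the fitted exponent typically depends on which grid sizes are included, since optimization error can flatten the curve at fine grids, so it helps to report the fitting range alongside `alpha`.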

About

A systematic review of Kolmogorov-Arnold Networks that bridges them with MLPs, highlights their parameter-efficient, interpretable edge-basis design, maps the open-source ecosystem, and offers a practical guide to choosing architectures.
