A Lean 4 formalization of the Zhang-Yeung conditional information inequality, a non-Shannon-type bound on Shannon entropy that cannot be derived from the Shannon inequalities alone.
The central object is the Zhang-Yeung delta,
Δ(Z, U | X, Y) := I(Z ; U) - I(Z ; U | X) - I(Z ; U | Y),
and the main theorem (Theorem 3 of Zhang and Yeung 1998) is the upper bound
Δ(Z, U | X, Y) ≤ (1/2) · I(X ; Y) + (1/4) · (I(X ; Z,U) + I(Y ; Z,U)).
The proof rests on the copy lemma, which identifies the inequality as genuinely non-Shannon: no amount of Shannon-type reasoning can replace the copy construction.
| Milestone | Title | State |
|---|---|---|
| M0 | Project scaffolding | done |
| M1 | Delta equational lemmas | done |
| M1.5 | Theorem 2 (standalone conditional result) | done |
| M2 | The copy lemma | done |
| M3 | Theorem 3 (main inequality) | done |
| M4 | Theorem 4 (Shannon-cone separation) | done |
| M5 | Theorem 5 (n+2-variable generalization) | done |
| M6 | Polish and release | in progress |
M1.5 is complete: the public ZhangYeung.theorem2 is fully proved with no sorry. The proof factors into (a) a Shannon-algebra reduction of the target inequality to Δ(Z, U | X, Y) ≤ 0 under the hypotheses I[X:Y] = I[X:Y|Z] = 0, and (b) a non-Shannon closing argument via the [@zhangyeung1997] auxiliary p̃/p̂ PMF construction and Real.sum_mul_log_div_leq. Kaced and Romashchenko [@kaced2013] classify this inequality as essentially conditional -- it fails on the closure of the entropic region, so no Lagrange combination of Shannon-type inequalities and the premises suffices, and the KL route is effectively the only known proof. To our knowledge this is the first formalization of any non-Shannon-type information inequality in any proof assistant.
M2 is complete: the copy lemma of [@zhangyeung1998, §III eqs. 44-45] is formalized as the existential copyLemma, which wraps PFR's ProbabilityTheory.condIndep_copies on the pair ⟨X, Y⟩ conditioned on the shared variable ⟨Z, U⟩ and produces an extended law ν with a second conditionally-independent copy (X₁, Y₁) of (X, Y) over (Z, U). Lemma 2 of the same paper (eq. 45) ships both as the abstract Shannon identity delta_of_condMI_vanishes_eq, which holds whenever a single conditional mutual information vanishes, and as six Zhang-Yeung-flavored corollaries specialized to the copy's projections: copyLemma_delta_transport_Y_to_Y₁ and copyLemma_delta_transport_X_to_X₁ transport Δ between μ and ν; copyLemma_delta_identity_Y₁ and copyLemma_delta_identity_X_X₁ instantiate Lemma 2 at the copy; and copyLemma_delta_le_mutualInfo_Y₁ and copyLemma_delta_le_mutualInfo_X_X₁ combine the identity with nonnegativity of the three conditional-mutual-information terms on its right-hand side to yield the two Δ ≤ I(·;·) inequalities that feed the Shannon chase for Theorem 3.
M3 is complete: Theorem 3 of [@zhangyeung1998, §III eqs. 21-23], the headline Zhang-Yeung inequality and the first known non-Shannon-type linear inequality on Shannon entropies of four random variables, is formalized in all three of the paper's equivalent forms. The public theorems zhangYeung (eq. 21, asymmetric), zhangYeung_dual (eq. 22, the X ↔ Y swap), and zhangYeung_averaged (eq. 23, the averaged symmetric form displayed in this README's header) sit on top of a single private zhangYeung_integer that closes the paper's Shannon chase at the integer-scaled shape 2·I[Z:U] - 3·I[Z:U|X] - I[Z:U|Y] ≤ I[X:Y] + I[X:⟨Z,U⟩]; the three public wrappers route this through the M1 form-conversion lemmas delta_form21_iff, delta_form22_iff, delta_form23_iff, and delta_form23_of_form21_form22. The chase itself takes the two M2 copy-lemma inputs (copyLemma_delta_le_mutualInfo_Y₁ and copyLemma_delta_le_mutualInfo_X_X₁), adds a generic three-way mutual-information identity, invokes data processing for PFR's CondIndepFun on the copy (via the ZhangYeung.condIndepFun_comp helper promoted to ZhangYeung/Prelude.lean as part of M3), and transports the result back to μ through IdentDistrib.mutualInfo_eq.
M4 is complete: Theorem 4 of [@zhangyeung1998, §II eq. 26], the scientific payoff of the Zhang-Yeung inequality, is formalized in its exact closure form. The public theorem4 proves ∃ F ∈ Γ_4, F ∉ closure(Γ_4^*), and theorem4_ge_four lifts this to all n ≥ 4. The proof combines (a) a ℚ-valued counterexample witness F_witness_ℚ on Finset (Fin 4) (the paper's explicit set function on lines 368-377, specialized to a = 1) whose Shannon-cone proof now uses decide for the empty / monotonicity branches and a structural submodularity proof for the exceptional pair behavior; (b) the specific Zhang-Yeung violation at the canonical labeling (2, 3, 0, 1) via the permutation Equiv.swap 0 2 * Equiv.swap 1 3; (c) a set-function bridge zhangYeungHolds_of_entropy that restates M3's zhangYeung at the set-function level over a heterogeneous family X : ∀ i : Fin 4, Ω → S i; and (d) a generic entropic-region layer (entropyFn_n, entropyRegion_n, almostEntropicRegion_n, restrictFirstFour) that packages the witness argument as the literal separation \bar{Γ}_n^* \neq Γ_n. The finite witness theorem is retained as theorem4_finite, and the stronger cone-level separation is retained as shannon_incomplete_ge_four.
M5 is complete: Theorem 5 of [@zhangyeung1998, §III eqs. 27-28], the n + 2-variable Zhang-Yeung generalization, is formalized as the public theorems theorem5 and theorem5_averaged. The proof lifts the M3 copy-lemma chase to a tuple-valued copy of the full Fin n-indexed side-variable family, sums the pairwise Lemma 2 inequalities, controls the resulting mutual-information sum by a private mutualInfo_add_n_way_inequality, and closes by data processing plus tuple-level marginal transports. As part of that landing, ZhangYeung/Prelude.lean now exposes the generic helpers IdentDistrib.condMutualInfo_eq, mutualInfo_add_three_way_identity, and mutualInfo_le_of_condIndepFun alongside condIndepFun_comp.
The living roadmap is in docs/plans/todo/2026-04-15-zhang-yeung-formalization-roadmap.md.
- Finite-alphabet, real-valued probability distributions.
- Built on top of PFR's entropy API (
H[X],I[X:Y],I[X:Y|Z]) from teorth/pfr, pinned inlakefile.toml. - Measurability and finite-range side conditions expressed via Mathlib's
[MeasurableSpace],[Fintype], and[MeasurableSingletonClass]typeclasses. - Deferred: infinite alphabets, computable entropy, upstream push of the copy lemma into PFR or Mathlib.
ZhangYeung.lean— project entrypoint; re-exportsZhangYeung.CopyLemma,ZhangYeung.Delta,ZhangYeung.EntropyRegion,ZhangYeung.Prelude,ZhangYeung.Theorem2,ZhangYeung.Theorem3,ZhangYeung.Theorem4, andZhangYeung.Theorem5.ZhangYeung/Prelude.lean— import surface for PFR's entropy API, plus the generic helpersZhangYeung.condIndepFun_comp,IdentDistrib.condMutualInfo_eq,ZhangYeung.mutualInfo_add_three_way_identity, andZhangYeung.mutualInfo_le_of_condIndepFun.ZhangYeung/Delta.lean— M1 delta quantity and equational lemmas (delta_def,delta_comm_cond,delta_comm_main,delta_self,delta_eq_entropy,delta_form21_iff,delta_form22_iff,delta_form23_iff,delta_form23_of_form21_form22,delta_le_mutualInfo).ZhangYeung/EntropyRegion.lean— generic entropic-region infrastructure for Theorem 4:entropyFn_n, the set-level regionsshannonRegion_n/entropyRegion_n/almostEntropicRegion_n, and the first-four restriction maprestrictFirstFourwith its entropic and almost-entropic transport lemmas.ZhangYeung/Theorem2.lean— M1.5 Zhang-Yeung conditional information inequality (theorem2), a standalone formalization of the first known non-Shannon-type conditional information inequality; factored into a Shannon-algebra reduction (theorem2_shannon_identity) and the KL-divergence core (theorem2_delta_le_zero) with supporting lemmasptilde_sum_eq_one,phat_sum_eq_one,delta_eq_sum_log_ratio,sum_joint_eq_sum_ptilde.ZhangYeung/CopyLemma.lean— M2 Zhang-Yeung copy lemma (§III eqs. 44-45 of the 1998 paper); ships thecopyLemmaexistential (a thin wrapper over PFR'sProbabilityTheory.condIndep_copies), the abstract Lemma 2delta_of_condMI_vanishes_eq, and six copy-projection corollaries (copyLemma_delta_transport_Y_to_Y₁,copyLemma_delta_transport_X_to_X₁,copyLemma_delta_identity_Y₁,copyLemma_delta_identity_X_X₁,copyLemma_delta_le_mutualInfo_Y₁,copyLemma_delta_le_mutualInfo_X_X₁) covering the two delta transports, two delta identities, and twoΔ ≤ I(·;·)inequalities feeding the Shannon chase for Theorem 3.ZhangYeung/Theorem3.lean— M3 Zhang-Yeung inequality (Theorem 3 of the 1998 paper, §III eqs. 21-23):zhangYeung(eq. 21),zhangYeung_dual(eq. 22), andzhangYeung_averaged(eq. 23), all derived from a privatezhangYeung_integerthat closes the paper's Shannon chase on the copy joint and routes through the M1 form-conversion lemmas.ZhangYeung/Theorem4.lean— M4 Shannon incompleteness atn = 4andn ≥ 4(Theorem 4 of the 1998 paper, §II eq. 26): the set-function calculus (I_F,condI_F,delta_F), the cone predicates (shannonCone,zhangYeungAt,zhangYeungHolds), the rational-valued counterexample witness (F_witness_ℚ,F_witness), the finite auxiliarytheorem4_finite, the exact closure theoremstheorem4andtheorem4_ge_four, the sequence-level auxiliarytheorem4_seqClosure, and the stronger cone-level corollaryshannon_incomplete_ge_four.ZhangYeung/Theorem5.lean— M5 Theorem 5, then + 2-variable Zhang-Yeung generalization (1998, §III eqs. 27-28): the tuple-copy projection helpers, the privatemutualInfo_add_n_way_inequalitychain-rule bound, and the public theoremstheorem5andtheorem5_averaged.ZhangYeungTest.lean+ZhangYeungTest/Delta.lean+ZhangYeungTest/EntropyRegion.lean+ZhangYeungTest/Theorem2.lean+ZhangYeungTest/CopyLemma.lean+ZhangYeungTest/Theorem3.lean+ZhangYeungTest/Theorem4.lean+ZhangYeungTest/Theorem5.lean— compile-time API regression tests for each module.
Fresh clone or worktree:
bin/bootstrap-worktree # mandatory: downloads Mathlib + PFR prebuilt artifacts, then buildsDay-to-day commands:
make build # lake build ZhangYeung
make test # lake test (ZhangYeungTest example suite)
make lean-lint # lake lint (batteries/runLinter)
make lint # markdownlint-cli2 + cspell
make check # lint + lean-lint + build + test
make help # show all targetsSee AGENTS.md for the full command reference and CONTRIBUTING.md for the development workflow.
- Lean 4 toolchain, pinned in
lean-toolchain. - Lake (bundled with Lean).
- PFR, pinned by Git revision in
lakefile.toml. PFR supplies the entropy API used throughout. - Mathlib (via PFR), downloaded as prebuilt artifacts by the bootstrap script. Never built from source in CI or in a fresh worktree.
Primary sources and transcriptions live in references/. The paper being formalized and the principal background text:
- Zhang and Yeung (1998). On characterization of entropy function via information inequalities. IEEE Transactions on Information Theory 44(4). DOI: 10.1109/18.681320. (
references/papers/zhangyeung1998.pdf; verified transcription atreferences/transcriptions/zhangyeung1998.md.)
Other closely related texts:
- Zhang and Yeung (1997). A non-Shannon-type conditional inequality of information quantities. IEEE Transactions on Information Theory 43(6). (
references/papers/zhangyeung1997.pdf.) This is the primary reference for the M1.5 Theorem 2 proof via Kullback-Leibler divergence. - Yeung (2008). Information Theory and Network Coding. Springer. DOI: 10.1007/978-0-387-79234-7. (Cross-reference for Theorem 3 and the copy lemma.)
Additional entries are listed in references/README.md.
This formalization is being completed with substantial assistance from Opus 4.6 + 4.7 and GPT 5.4, through claude and opencode, and GitHub Copilot.
Copyright 2026 Christopher Boone.
This repository carries REUSE-style mixed-license coverage:
- Lean code, the
bin/bootstrap-worktreeshell script, and hand-authored infrastructure config (Makefile,lakefile.toml, YAML, JSONC, dotfiles,cspell-words.txt, VS Code editor config) under Apache 2.0. Lean files andbin/bootstrap-worktreecarry inline SPDX headers; the remaining infrastructure/config files are covered by theREUSE.tomlconfig-group annotation. - Project-authored prose (READMEs, agent configs, planning documents, research notes, formalization reviews, this README, the NOTICE) under CC BY 4.0 via the
REUSE.tomlprose-group annotation. CODE_OF_CONDUCT.md, adapted from the Contributor Covenant via the Organization for Ethical Source, under CC BY-SA 4.0 via a dedicatedREUSE.tomloverride annotation.- Generated artifacts (
lake-manifest.json,lean-toolchain) dedicated to the public domain under CC0 1.0 via theREUSE.tomlgenerated-group annotation. - Bundled third-party reference material under
references/papers/**andreferences/transcriptions/**underLicenseRef-Reference-Material. Copyright remains with the original authors and publishers; per-publication attribution (IEEE, original authors) is preserved via per-file.licensesidecars rather than a REUSE.toml glob.
See NOTICE and the combination of per-file SPDX headers plus root REUSE.toml annotations for the authoritative coverage. See https://reuse.software/ for tooling.