All notable changes to ks-xlsx-parser are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Each release lives under a version heading linked to its GitHub compare view at the bottom. Subsections use a fixed set of labels so the log is skimmable:
- Added — new features
- Changed — changes in existing behaviour
- Deprecated — soon-to-be removed features (keep at least one release ahead)
- Removed — removed features
- Fixed — bug fixes
- Security — vulnerability fixes (link to the GHSA advisory)
- Performance — noteworthy perf wins, with numbers
- Docs — user-facing documentation changes
- Internal — refactors, test infra, tooling (only when it affects contributors)
Breaking changes get a ⚠️ BREAKING: prefix and are called out at the top of
the release. Keep entries in the imperative ("add X"), one line each, linking
issues or PRs in parentheses (#123).
- Repository layout: the layout flattened onto `src/` was leaking 13 generic top-level packages (`models`, `utils`, `parsers`, …) into installed wheels and silently dropping `pipeline.py` and `api.py` (setuptools `packages.find` only finds packages, not top-level modules). Users hitting `from ks_xlsx_parser.pipeline import ...` on 0.2.0 from PyPI got `ModuleNotFoundError`. All modules now live under `src/ks_xlsx_parser/`; the wheel's `top_level.txt` contains only `ks_xlsx_parser`. Imports inside the package switched from `from pipeline import` to `from ks_xlsx_parser.pipeline import`. Downstream code that imported the leaked generics (`from models import …`) MUST migrate to `from ks_xlsx_parser.models …`.
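As a rough illustration of the kind of check that catches this class of packaging bug, the sketch below reads `top_level.txt` out of a built wheel and reports any name other than `ks_xlsx_parser`. The helper names `wheel_top_level` and `leaked_packages` are hypothetical and are not taken from the project's scripts; only the standard-library `zipfile` module is used.

```python
import zipfile


def wheel_top_level(wheel_path):
    """Return the names listed in the wheel's *.dist-info/top_level.txt."""
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith(".dist-info/top_level.txt"):
                text = wheel.read(name).decode("utf-8")
                return sorted(
                    line.strip() for line in text.splitlines() if line.strip()
                )
    # Some build backends do not write top_level.txt at all.
    return []


def leaked_packages(wheel_path, expected=("ks_xlsx_parser",)):
    """Top-level names in the wheel that are not in the expected set."""
    return [name for name in wheel_top_level(wheel_path) if name not in expected]
```

An empty result from `leaked_packages` means the wheel exposes only the expected top-level package. It does not prove that every module was included, which is why an import test in a fresh environment is still useful.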
- `scripts/verify_wheel.py`: builds the wheel, installs it in a fresh venv, and asserts the public import surface resolves. Wired into a new `wheel-check` job in `.github/workflows/ci.yml` and a `Verify wheel` step in `release.yml`. Regression guard for the packaging bug above.
- `scripts/triage_recall.py` + `scripts/append_bench_history.py`: turn `failures.ndjson` into a ranked bucket histogram with exemplar failures, and append each benchmark run to `tests/benchmarks/reports/history.jsonl` so recall is tracked commit-over-commit. Goal: text recall@5 > 0.90.
- `eval_retrieval.py --emit-failures`: dumps top-8 ranked chunks per miss with a `failure_bucket` (answer_absent_from_chunks / present_but_ranked_low / wrong_sheet / geometric_no_overlap / …) for triage. Summary JSON gains a `failure_buckets` histogram.
- `Dockerfile.bench` + `.github/workflows/benchmark.yml`: reproducible benchmark image; PR sample run (60 instances), weekly full corpus run.
- `make install-dev` alias and `make wheel-check` / `make bench-track` / `make docker-bench` targets.
- New `bench` optional-dependency group (`sentence-transformers`, `numpy`); only the benchmark needs these.
- `docs/recall-investigation.md` documenting the diagnosis framework and three named hypotheses (chunk-size dilution, formula-expression rendering, range-bookkeeping drift).
- `.claude/skills/recall-failure-triage/SKILL.md`: agent skill that consumes the bucket output and proposes ranked fixes.
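The ranked bucket histogram mentioned above can be pictured with a minimal sketch like the following. It assumes each NDJSON line is a JSON object carrying a `failure_bucket` key, as the entry describes; the function name `bucket_histogram` and the `"unknown"` fallback are illustrative assumptions, not the project's actual code.

```python
import json
from collections import Counter


def bucket_histogram(ndjson_lines):
    """Count failures per bucket and return (bucket, count) pairs, largest first."""
    counts = Counter()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        # Assumption: records without a bucket are grouped under "unknown".
        counts[record.get("failure_bucket", "unknown")] += 1
    return counts.most_common()
```

Passing an open file handle works as well as a list of strings, since both iterate line by line.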
- Dropped `PYTHONPATH=src` from Makefile benchmark targets; the package is now properly installable, so callers don't need it.
- `pyproject.toml`: `packages.find` constrained to `ks_xlsx_parser*`, `py.typed` declared as package data, `xlsx-parser-api` console script updated to `ks_xlsx_parser.api:main`.
- Retired the in-tree `testBench/` corpus. The 1054-workbook stress dataset and `make testbench*` targets are gone; benchmarks now run against the public SpreadsheetBench v0.1 corpus, downloaded on demand to `data/corpora/` (gitignored). See `docs/corpora.md`.
- `testBench/` directory and all bundled real-world / generated workbooks.
- `make testbench-build`, `make testbench`, `make testbench-zip` targets.
- `testbench` job in `.github/workflows/ci.yml`.
- `testBench-vX.Y.Z.zip` release asset from the release workflow.
- `tests/test_testbench_roundtrip.py`, `tests/test_enterprise_scoring.py`, `tests/test_real_world_datasets.py`, `tests/test_cross_validation.py`.
- `scripts/build_testbench.py`, `scripts/generate_enterprise_fixtures.py`.
- `static_xlsx` pytest fixture (the test bench it iterated is gone).
- README, wiki, examples, and contributor docs now point at SpreadsheetBench (`make bench-robust` / `make bench-retrieval`) as the canonical benchmark.
- `examples/demo.py` + `examples/generate_examples.py` now write/read fixtures under `examples/fixtures/` instead of the (removed) `testBench/real_world/`.
Benchmark + retrievability release. Adds a head-to-head benchmark against Docling on the SpreadsheetBench corpus (912 instances, 5,458 xlsx files) and fixes three rendering bugs that were silently torpedoing RAG retrieval. ks-xlsx-parser parses 99.945% of SpreadsheetBench and ties Docling at recall@1 / wins at recall@3 (+2.7 pp) and recall@5 (+1.8 pp), plus 36.9% citation-grade geometric recall (Docling 0%, structurally — no A1 anchors).
- `tests/benchmarks/adapters/docling_adapter.py`: Docling adapter speaking the same NDJSON-worker protocol as `ks_adapter.py` (#TBD).
- `tests/benchmarks/_runner.py`: `docling_runner` factory wired into `vs_hucre.py`'s `--parsers` dispatch.
- `scripts/eval_retrieval.py`: retrieval-recall benchmark over SpreadsheetBench's `(instruction, data_position, answer_position)` triples. Uses `sentence-transformers` (default `BAAI/bge-small-en-v1.5`) and computes geometric overlap + numeric/date/boolean-normalized text-match recall@k. Persistent docling subprocess with hard-kill timeout; PyTorch's table-rec loop holds the GIL through C-land so in-process timeouts don't work.
- `scripts/summarize_retrieval.py`: re-aggregate a `results.ndjson` into `summary.json` / `summary.md` if a long run is interrupted.
- `scripts/download_corpora.sh`: fetches SpreadsheetBench v0.1 (~96 MB tar.gz) into `data/corpora/spreadsheetbench/` (gitignored).
- `tests/benchmarks/README.md`: adapter design notes + benchmark how-to.
- `tests/benchmarks/reports/COMPARISON.md`: head-to-head report incl. methodology, capability matrix, caveats.
- `Makefile`: `bench`, `bench-robust`, `bench-retrieval` targets.
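For readers unfamiliar with the metric, text-match recall@k can be sketched as below. This is a simplified stand-in that uses plain substring matching; the numeric/date/boolean normalization and the geometric-overlap variant described above are not reproduced, and the function name `recall_at_k` is an assumption rather than the script's API.

```python
def recall_at_k(ranked_chunks_per_query, answers, k):
    """Fraction of queries whose answer text appears in one of the top-k chunks.

    ranked_chunks_per_query: one list of chunk strings per query, best first.
    answers: the expected answer string for each query, in the same order.
    """
    if not answers:
        return 0.0
    hits = 0
    for chunks, answer in zip(ranked_chunks_per_query, answers):
        # A query counts as a hit if any of its first k chunks contains the answer.
        if any(answer in chunk for chunk in chunks[:k]):
            hits += 1
    return hits / len(answers)
```

Because a hit at rank 1 is also a hit at rank 3 and 5, recall@k can only stay flat or rise as `k` grows. That is how two parsers can tie at recall@1 and still differ at recall@3.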
- `src/rendering/text_renderer.py`: numeric cells now render the raw value (`1272`) instead of Excel's display-formatted string (`1,272.00`). The display format defeated substring-match retrieval for the most common RAG query shape ("what was the value in 2020?" → user types `1272`).
- `src/rendering/text_renderer.py`: the `[=]` formula marker no longer spuriously inflates a cell past its column width, which used to trigger a sci-notation fallback (`1.272000e+03`) on perfectly small values. Column widths are now computed using the same rendering pipeline that data rows use, so the long-value path only triggers on genuinely-too-wide values.
- `src/rendering/text_renderer.py`: dates render as ISO `YYYY-MM-DD` and drop the spurious `00:00:00` time component on midnight datetimes.
- `src/rendering/text_renderer.py`: embedded newlines inside header cells (e.g. `"租金\n天数"`) collapse to spaces so they don't tear apart the Markdown grid (regression fixed for `租赁收入计提表.xlsx`-class layouts).
- `src/chunking/segmenter.py`: removed `_detect_style_boundaries`. The function split a coherent table into 5 fragments at fill-color band boundaries (year-banding, alternating-row shading), shedding header context from data rows. The connected-components + gap detection already handles real boundaries; fill banding is not a semantic one.
- `src/parsers/cell_parser.py`: `GradientFill` cells no longer crash the sheet parser. Accessing `.patternType` on a `GradientFill` (vs the expected `PatternFill`) raised `AttributeError`, which propagated up and killed every cell on the sheet. We don't model gradients but we no longer drop the sheet because of them (caught by SpreadsheetBench instance `118-8`, 8 sheets / 1,244 cells previously lost).
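The rendering rules in the first, third, and fourth fixes can be summarized with a small sketch. The function `render_cell` below is hypothetical and is not the renderer's real code. In particular, rendering an integral float such as `1272.0` as `1272` is an assumption about what "raw value" means here.

```python
from datetime import date, datetime, time


def render_cell(value):
    """Render a cell value as retrieval-friendly text (illustrative only)."""
    if value is None:
        return ""
    # datetime must be checked before date, because datetime subclasses date.
    if isinstance(value, datetime):
        if value.time() == time(0, 0):
            return value.date().isoformat()  # drop the 00:00:00 component
        return value.isoformat(sep=" ")
    if isinstance(value, date):
        return value.isoformat()
    # Assumption: integral floats are shown without a trailing ".0".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    if isinstance(value, (int, float)):
        return str(value)  # no thousands separators, no fixed decimals
    # Collapse embedded newlines so a header cell stays on one grid row.
    return str(value).replace("\r\n", " ").replace("\n", " ")
```

With these rules a query for `1272` or `2020-01-31` can match the rendered chunk by plain substring search, which is the retrieval shape the fixes target.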
- `tests/benchmarks/_schema.py`: `formulas` is now nullable on `status=ok` records. Parsers that don't model formulas (Docling, Marker) can now emit valid `BenchmarkRecord`s without tripping schema validation. The schema's load-bearing `None` vs `0` distinction is preserved: `None` = "feature not modeled by this parser", `0` = "modeled and observed zero".
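The `None` vs `0` distinction is easy to lose with truthiness checks, since both values are falsy in Python. The sketch below uses a stand-in dataclass that borrows the `BenchmarkRecord` name; its fields other than `formulas` and `status`, and the helper `describe_formulas`, are assumptions and do not reflect the real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkRecord:
    """Stand-in record; the real schema has more fields."""
    parser: str
    status: str
    formulas: Optional[int] = None  # None means the parser does not model formulas


def describe_formulas(record):
    """Report the formula count without conflating None and 0."""
    # An explicit `is None` test is required: `if not record.formulas`
    # would treat "observed zero formulas" the same as "not modeled".
    if record.formulas is None:
        return "not modeled"
    return f"{record.formulas} observed"
```

Aggregations over many records should likewise skip `None` entries rather than summing them as zero, so a parser without formula support is not scored as having found none.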
- `scripts/compare_docling.py`: superseded by the unified `tests/benchmarks/` framework + `eval_retrieval.py`. The old script's `ScoreCard` composite score was structurally biased (formula-preservation gave Docling a 0 by definition while contributing 20/100 points; header-propagation used different proxies for each parser); replaced by parser-agnostic text-match and geometric recall metrics.
- ks-xlsx-parser is now ~5% faster on average parse time on SpreadsheetBench than Docling (251 ms vs 265 ms mean), while producing a richer output (formulas, dependency graph, charts, named ranges, etc.).
- `tests/benchmarks/README.md` (new): methodology + adapter design.
- `tests/benchmarks/reports/COMPARISON.md` (new): head-to-head report.
- README: new "Benchmark — ks-xlsx-parser vs Docling on SpreadsheetBench" section near the top with the headline table.
- `tests/test_rendering.py`: updated `test_numeric_cells_use_scientific_notation_not_truncation` to assert the new raw-numeric rendering (test renamed `test_numeric_cells_render_raw_not_display_formatted`).
- `.gitignore`: `data/corpora/` (downloaded benchmark corpora; can run to several GB).
- `Makefile`: `bench`, `bench-robust`, `bench-retrieval` targets.
0.1.1 — 2026-04-17
First public release. MIT-licensed, open-sourced under the
Knowledge Stack ecosystem. Detailed
announcement: docs/launch/RELEASE_NOTES_v0.1.1.md.
- Public Python package `ks-xlsx-parser` on PyPI; import as `xlsx_parser` or the alias `ks_xlsx_parser`.
- `parse_workbook()` returning a `ParseResult` with `.workbook`, `.chunks`, and `.serializer`: full workbook graph (cells, formulas, merges, tables, charts, CF, DV, named ranges, dependency edges).
- `compare_workbooks()` + `export_importer()` for multi-workbook template alignment and Python-importer generation.
- `StageVerifier` / `VerificationReport` / `ExcellentStage` for pipeline stage-level assertions.
- RAG-ready `ChunkDTO` with `source_uri`, `render_text`, `render_html`, `token_count`, `dependency_summary`, and xxhash64 content hash.
- `testBench/`: 1053-workbook stress corpus (real_world 8 + enterprise 4 + github_datasets 10 + stress/curated 26 + stress/merges 5 + generated 1000). Ships as `testBench-v0.1.1.zip` release asset.
- `scripts/build_testbench.py`: deterministic generator (matrix: 297, combo: 400, adversarial: 300).
- `tests/test_testbench_roundtrip.py`: parallel round-trip gate; 1054/1054 passing in ~70 s.
- FastAPI web server (`xlsx-parser-api`) in the `[api]` extra.
- GitHub Actions: `ci.yml` (test matrix on py3.10/3.11/3.12 × ubuntu/macos + dedicated testBench job) and `release.yml` (wheel + sdist + testBench zip, PyPI Trusted Publishing).
- Community infra: `CODE_OF_CONDUCT.md`, `SECURITY.md`, issue / PR / discussion templates, `FUNDING.yml`, pre-commit config.
- Chunk builder caches `detect_circular_refs()` per workbook instead of re-running it per block. Real 21k-cell financial model: 307 s → 4.6 s (66×).
- Sheet parser iterates openpyxl's `_cells` dict instead of `iter_rows()` over the full bounding box. Workbooks with extreme sparse addresses (e.g. `A1` + `XFD1048576`): 60 s timeout → 135 ms.
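The sparse-address speedup follows from simple arithmetic: a bounding box spanning `A1` to `XFD1048576` covers every cell on the sheet, while the populated set may hold only two entries. The sketch below models a sheet as a plain dict keyed by `(row, column)`; it does not use openpyxl, and both function names are hypothetical.

```python
def bounding_box_size(cells):
    """Number of grid positions a row-by-row scan of the bounding box visits."""
    rows = [row for row, _ in cells]
    cols = [col for _, col in cells]
    return (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)


def iter_populated(cells):
    """Yield only cells that actually exist, in row-major order."""
    for row, col in sorted(cells):
        yield row, col, cells[(row, col)]


# Two populated cells: A1 and XFD1048576 (column XFD is number 16384).
sheet = {(1, 1): "first", (1048576, 16384): "last"}
print(bounding_box_size(sheet))          # 17179869184 positions to scan
print(len(list(iter_populated(sheet))))  # 2 cells to visit
```

Iterating the populated dict keeps the work proportional to the number of stored cells rather than to the sheet's extent.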
- Conditional-formatting rules (`top10`, `uniqueValues`, `duplicateValues`, `containsText`, `aboveAverage`, `belowAverage`) no longer reference a non-existent `dxfId=0` in generated fixtures, so openpyxl can load them back without an `IndexError`.
- `test_formula_cached_values_match` now applies a 15 % threshold for workbooks with known openpyxl `data_only` caching gaps, 5 % everywhere else. See `docs/PARSER_KNOWN_ISSUES.md`.
- New README positioned as "Make XLSX LLM Ready" with architecture diagram, comparison table vs pandas/openpyxl/Docling, vertical-use-case section, Knowledge Stack ecosystem links, and prominent Discord + ⭐ call-to-actions.
- `CONTRIBUTING.md` rewritten with three first-PR paths and Discord as the primary community channel.
- `docs/MAINTAINERS.md`: branch-protection playbook, label script, Discussions categories, PyPI Trusted Publishing setup, release checklist.
- `testBench/README.md`: dataset layout, manifest schema, licensing.
- `docs/launch/`: v0.1.1 release notes + Discord / Twitter / LinkedIn / HN / Reddit / blog announcement drafts.
- Consolidated 53 checked-in `.xlsx` fixtures under a single `testBench/` tree; updated every path reference in tests, scripts, and demos.
- Removed internal-only tooling: Ralph loop scripts, Cursor / Serena agent configs, iteration logs, Knowledge-Stack-internal framing in DESIGN.md.
- Rebranded from `arnav2/XLSXParser` to `knowledgestack/ks-xlsx-parser`; transferred the repo into the `knowledgestack` org and made it public.
- `uv.lock` regenerated after dropping the `[ralph]` extra and adding `pytest-timeout` / `ruff` / `mypy` to `[dev]`.
Private-beta release used inside the Knowledge Stack ecosystem. Not published to PyPI. Superseded by 0.1.1.