A feature release built around two themes: making LWT a real backend for a shell-free mobile client, and a broad, methodical security-hardening pass across the auth, upload, and fetch surfaces. Plus a new source of easy reading material, the Global Digital Library.
Highlights
Shell-free / mobile client
The web shell is no longer required to render the app. The global navbar, the reader chrome (book navigation + audio player), and UI translations are now served from the REST API and rendered client-side, and the review surface and feedback sounds are bundled. A packaged client can choose its server, register and log in in-app, and keep its session alive with proactive token refresh (with a clean 401 teardown). This is what the Lukaisu Android client connects to.
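The session handling described above can be sketched roughly as follows. This is a minimal Python illustration, not the client's actual code: the `Session` and `Token` names, the 60-second refresh margin, and the `refresh` / `send` callbacks are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # Unix timestamp in seconds


class Session:
    """Hypothetical session wrapper: refreshes early, tears down on 401."""

    def __init__(
        self,
        token: Token,
        refresh: Callable[[str], Token],
        margin: float = 60.0,
        now: Callable[[], float] = time.time,
    ) -> None:
        self.token: Optional[Token] = token
        self._refresh = refresh  # assumed to call the server's refresh endpoint
        self._margin = margin    # refresh this many seconds before expiry
        self._now = now

    def request(self, send: Callable[[str], int]) -> int:
        """Send one request with a fresh token and return the HTTP status."""
        if self.token is None:
            raise RuntimeError("not logged in")
        # Proactive refresh: renew shortly before expiry instead of
        # waiting for a request to fail.
        if self._now() >= self.token.expires_at - self._margin:
            self.token = self._refresh(self.token.value)
        status = send(self.token.value)
        if status == 401:
            # Clean teardown: forget the token so the UI can show the login screen.
            self.token = None
        return status
```

Injecting `now`, `refresh`, and `send` keeps the sketch free of any real HTTP client and makes the refresh timing easy to test.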
New text source: the Global Digital Library ("Kids' Library")
Browse and search openly-licensed (CC-BY / CC-BY-SA) children's and early-grade readers — including StoryWeaver content — straight from the New Text page, filling the gap in easy texts that Gutenberg and Wikisource leave. Books import via ePUB extraction, image-only picture books are rejected, and difficulty tiers come from GDL's reading levels. The home page shows beginner-aware GDL suggestions: low-vocabulary readers see the easy books first, advanced readers see them below the classics.
Registration without email + recovery code + captcha
The username is now the unique identity, so sign-up needs only a username + password. Email becomes an optional recovery channel. Email-less accounts get a one-time recovery code, shown once; the /password/recover reset flow rotates the code on use. Registration is protected by a self-hosted ALTCHA proof-of-work captcha (no third-party service, no user puzzle; ALTCHA_ENABLED / ALTCHA_HMAC_KEY), plus a honeypot and submission-timing check.
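For readers unfamiliar with proof-of-work captchas, the idea can be sketched as below. This is a simplified Python illustration in the style of ALTCHA's salted SHA-256 challenge with an HMAC signature; it is not the ALTCHA library's API, and the function names, the `max_number` default, and the dictionary layout are assumptions. Replay protection and challenge expiry are left out.

```python
import hashlib
import hmac
import secrets


def create_challenge(hmac_key: bytes, max_number: int = 50_000) -> dict:
    """Server side: hide a random number behind a salted hash and sign the hash."""
    salt = secrets.token_hex(12)
    number = secrets.randbelow(max_number + 1)
    challenge = hashlib.sha256(f"{salt}{number}".encode()).hexdigest()
    signature = hmac.new(hmac_key, challenge.encode(), hashlib.sha256).hexdigest()
    return {
        "salt": salt,
        "challenge": challenge,
        "signature": signature,
        "max_number": max_number,
    }


def solve(challenge: dict):
    """Client side: brute-force the hidden number (this is the 'work')."""
    for candidate in range(challenge["max_number"] + 1):
        digest = hashlib.sha256(f"{challenge['salt']}{candidate}".encode()).hexdigest()
        if digest == challenge["challenge"]:
            return candidate
    return None


def verify(hmac_key: bytes, salt: str, challenge: str, signature: str, number: int) -> bool:
    """Server side: check the signature is ours and the number matches the hash."""
    expected_signature = hmac.new(hmac_key, challenge.encode(), hashlib.sha256).hexdigest()
    expected_challenge = hashlib.sha256(f"{salt}{number}".encode()).hexdigest()
    return hmac.compare_digest(expected_signature, signature) and hmac.compare_digest(
        expected_challenge, challenge
    )
```

The server never stores the number: the HMAC key (the role ALTCHA_HMAC_KEY plays in the configuration) lets it recognise its own challenges, and the browser does the searching in the background, so the user sees no puzzle.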
StarDict dictionary uploads via archives (#233)
The import form now accepts .zip, .tar.gz, .tar.bz2, .tar.xz, and .tgz containing the StarDict triplet, and FreeDict downloads import directly. Extraction is shared via a new ArchiveExtractor (zip-bomb cap, path-traversal guard, automatic cleanup).
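The three safeguards named above can be sketched for the zip case as follows. This is a Python illustration only, not the project's ArchiveExtractor: the function names, the `UnsafeArchive` exception, and the 50 MiB default cap are assumptions.

```python
import io
import os
import tempfile
import zipfile


class UnsafeArchive(Exception):
    """Raised when an archive fails one of the safety checks."""


def safe_extract_zip(data: bytes, dest: str, max_total_bytes: int = 50 * 1024 * 1024) -> list:
    """Extract a zip into dest, enforcing a size cap and a path-traversal guard."""
    dest_root = os.path.realpath(dest)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [m for m in archive.infolist() if not m.is_dir()]
        # Zip-bomb cap, first pass: reject on the declared uncompressed sizes.
        if sum(m.file_size for m in members) > max_total_bytes:
            raise UnsafeArchive("archive expands beyond the size cap")
        # Path-traversal guard: every target must resolve inside dest_root.
        targets = []
        for member in members:
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise UnsafeArchive(f"unsafe path in archive: {member.filename}")
            targets.append((member, target))
        written = 0
        for member, target in targets:
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(member) as src, open(target, "wb") as out:
                while chunk := src.read(64 * 1024):
                    # Second pass: count real bytes, since declared sizes can lie.
                    written += len(chunk)
                    if written > max_total_bytes:
                        raise UnsafeArchive("archive expands beyond the size cap")
                    out.write(chunk)
            extracted.append(target)
    return extracted


def with_extracted(data: bytes, handler, max_total_bytes: int = 50 * 1024 * 1024):
    """Run handler on the extracted paths; the temp directory is always removed."""
    with tempfile.TemporaryDirectory() as tmp:
        return handler(safe_extract_zip(data, tmp, max_total_bytes))
```

All paths are validated before anything is written, so a single bad entry rejects the whole archive, and the temporary directory is removed whether the handler succeeds or raises.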
Security hardening
A multi-phase audit closed a wide range of issues, each with regression tests:
- XSS: fixed json_encode-into-<script> breakouts (missing JSON_HEX_TAG | JSON_HEX_AMP), DOM sinks in the word popup / tooltips / Glosbe translations, and the addslashes-into-attribute anti-pattern (feed browse, confirm dialogs).
- CSRF: added real CSRF enforcement to the auth POST endpoints (/login, /register, /password/*, email re-verification) and fixed bulk vocabulary / texts actions that posted without a token.
- Auth: open-redirect fix on auth_redirect, timing-safe OAuth state comparison, and invalidation of remember-me + API tokens on password change/reset.
- Authz / IDOR: cross-table ownership guards on dictionaries, feeds, and sentence lookups (languageBelongsToCurrentUser).
- SSRF: outbound fetches (RSS, web/article extractors, Gutenberg, Wiktionary) now route through a central safeHttpGet that disables stream-level redirects and re-validates every hop.
- Uploads: defensive depth for importers (filename sanitization at the boundary, tar list-before-extract with file caps, size caps on subtitle/JSON/CSV imports, BOM/UTF-16 handling, and reliable temp-file cleanup).
- Audio: hardened position save (pagehide + sendBeacon, periodic checkpoint), float precision, Whisper MIME re-validation, and a rate limit on transcription.
- Dependency scans (composer audit, npm audit --omit=dev) report 0 advisories.
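To illustrate the first item in the list, the sketch below shows why the hex-escaping flags matter when JSON is embedded in a script element. It is a Python analogue of what PHP's JSON_HEX_TAG | JSON_HEX_AMP flags do, not the project's code; the `json_for_script` name is an assumption.

```python
import json


def json_for_script(value) -> str:
    """Encode value as JSON that is safe to embed inside a <script> element.

    Mirrors the effect of PHP's JSON_HEX_TAG | JSON_HEX_AMP: the characters
    <, > and & are emitted as \\u003C, \\u003E and \\u0026, so the HTML parser
    never sees a literal "</script>" inside the payload.
    """
    encoded = json.dumps(value)
    # In valid JSON these characters can only occur inside string literals,
    # where a \uXXXX escape decodes back to the same character.
    return (
        encoded.replace("<", "\\u003C")
        .replace(">", "\\u003E")
        .replace("&", "\\u0026")
    )


payload = {"title": "</script><script>alert(1)</script>", "query": "a&b"}
safe = json_for_script(payload)
```

The escaped output still parses to the original value, so the client-side code reading it needs no changes.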
Fixed
- Navbar hamburger hidden under the status/camera bar on edge-to-edge phones — the navbar now respects safe-area insets.
- Login/registration field icons rendering outside the input (PurgeCSS stripped Bulma's .icon.is-left/right; now safelisted).
- Misspelled currentlangage settings key broke the current-language TTS voice and term-translation language context.
- 429 PHP 8.5 deprecation warnings cleared (redundant setAccessible(true) in tests; ord() on a multi-byte char).
- Saving a text with multi-word expressions in multi-user mode 500'd on a binding-misalignment FK violation.
- Saving 2+ tags on a term/text threw on the 20-char cap (Tagify comma-serialization now split in the service layer).
- Multi-word term selection captured inline translation hints (now reads the clean surface form).
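The tag fix above amounts to splitting the serialized value before the length check runs. A minimal Python sketch, assuming the client sends all tags as one comma-separated string; the `split_tags` name and the de-duplication step are assumptions, while the 20-character cap comes from the note above:

```python
MAX_TAG_LENGTH = 20  # per-tag cap mentioned in the fix above


def split_tags(raw: str) -> list:
    """Split a comma-serialized tag string into individual, validated tags."""
    tags = []
    for part in raw.split(","):
        tag = part.strip()
        if not tag or tag in tags:
            continue  # skip empty fragments and duplicates
        if len(tag) > MAX_TAG_LENGTH:
            raise ValueError(f"tag exceeds {MAX_TAG_LENGTH} characters: {tag!r}")
        tags.append(tag)
    return tags
```

Checking the cap per tag rather than on the raw string is what lets two or more short tags through.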
Developer proposals (docs only)
- Single data_hex word identity (#237): replace the TERM<hex> class-as-index with a data_hex attribute.
- Term-status model + FSRS scheduling (#238): collapse the scattered 1–5/98/99 literals onto TermStatus and align review scheduling with Anki/FSRS.
Both are proposals; implementation is deferred.
Full changelog: https://github.com/HugoFara/lwt/blob/main/CHANGELOG.md