Skip to content

[Store][TE] Fix remaining signed-char ::tolower/::toupper UB missed by #2367#2488

Merged
alogfans merged 2 commits into
kvcache-ai:mainfrom
bp-cheng:fix/tolower-ub-remaining-sites
Jun 18, 2026
Merged

[Store][TE] Fix remaining signed-char ::tolower/::toupper UB missed by #2367#2488
alogfans merged 2 commits into
kvcache-ai:mainfrom
bp-cheng:fix/tolower-ub-remaining-sites

Conversation

@bp-cheng

@bp-cheng bp-cheng commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Fixes #2487

Description

Follow-up to #2367, which fixed UB from passing a (possibly signed) char to ::tolower/::toupper via std::transform. Three sibling occurrences on operator/CLI-controlled input were missed and are still present on main:

File Input source
mooncake-transfer-engine/src/transport/tcp_transport/tcp_transport.cpp:606 env MC_TCP_ENABLE_CONNECTION_POOL
mooncake-store/include/utils.h:225 size-string unit suffix (env MC_MMAP_ARENA_POOL_SIZE, global_segment_size CLI flag, config map)
mooncake-store/include/ha/oplog/oplog_store_factory.h:34 HA oplog_store_type config string

Passing a byte > 0x7F to ::tolower/::toupper sign-extends to a negative int, which is UB (the argument must be representable as unsigned char or equal EOF). All three sites take operator/attacker-controllable bytes, not kernel-generated ASCII.

Each is fixed with the exact idiom from #2367[](unsigned char c) { return std::tolower(c); } — and <cctype> is added to the two files where it was only available transitively (utils.h already includes it). Notably utils.h:274 already uses the safe lambda; line 225 was simply overlooked in the original sweep.

Behavior is unchanged for ASCII input; this only removes the UB on non-ASCII bytes.

Module

  • Mooncake Store (mooncake-store)
  • Transfer Engine (mooncake-transfer-engine)

Type of Change

  • Bug fix

How Has This Been Tested?

Test commands:

# Ubuntu 22.04 (aarch64) container, cmake -DWITH_STORE_RUST=OFF -DUSE_HTTP=ON, ninja
ninja

Test results:

  • Unit tests pass
  • Integration tests pass (if applicable)
  • Manual testing done (describe below)

This is the same class of fix as #2367, which shipped with no new test: the change is behavior-preserving for ASCII, and the UB is not portably observable (on glibc the ctype tables span -128..255, so it does not crash there). Validation is compile-only plus the existing suite: a full incremental build succeeds (all three headers/TUs and their dependents recompile cleanly with the added <cctype> includes), and the existing tcp_transport / store tests pass.

Checklist

  • I have performed a self-review of my own code
  • I have formatted my code using ./scripts/code_format.sh
  • I have run pre-commit run --all-files and all hooks pass
  • I have updated the documentation (if applicable)
  • I have added tests to prove my changes are effective
  • For changes >500 LOC: I have filed an RFC issue

(No new test, matching the precedent set by #2367 for this UB class; the fix is behavior-preserving for ASCII and the UB is not portably observable on the target platform.)

AI Assistance Disclosure

  • No AI tools were used
  • AI tools were used (specify below)

Claude Code (Claude Fable 5) located the three remaining sites missed by #2367, applied the established idiom, and verified the build in a local Linux container. I reviewed the changes and am responsible for them.

…kvcache-ai#2367

PR kvcache-ai#2367 fixed UB from passing a (possibly signed) char to ::tolower/
::toupper via std::transform, but three sibling sites on operator/CLI-
controlled input were missed and remain on main:

- mooncake-transfer-engine/src/transport/tcp_transport/tcp_transport.cpp:
  env MC_TCP_ENABLE_CONNECTION_POOL
- mooncake-store/include/utils.h: size-string unit suffix (env
  MC_MMAP_ARENA_POOL_SIZE, global_segment_size flag, config map).
  Line 274 of this file already uses the safe idiom; line 225 was missed.
- mooncake-store/include/ha/oplog/oplog_store_factory.h: HA
  oplog_store_type config string

A byte > 0x7F sign-extends to a negative int, which is UB for these
functions (argument must be representable as unsigned char or equal
EOF). Apply the same lambda idiom as kvcache-ai#2367 and add <cctype> where it
was only available transitively. Behavior is unchanged for ASCII input.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors string case conversion operations across multiple files to use lambdas with unsigned char arguments instead of directly passing ::tolower or ::toupper, preventing undefined behavior. It also adds the necessary <cctype> header inclusions. The review feedback suggests explicitly specifying the return type of these lambdas as char (using -> char) to prevent potential compiler warnings regarding implicit narrowing conversions from int to char under strict compilation flags.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +35 to +36
std::transform(lower.begin(), lower.end(), lower.begin(),
[](unsigned char c) { return std::tolower(c); });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential compiler warnings regarding implicit narrowing conversions (from int to char) under strict warning flags (such as -Wconversion), it is recommended to explicitly specify the return type of the lambda as char.

Suggested change
std::transform(lower.begin(), lower.end(), lower.begin(),
[](unsigned char c) { return std::tolower(c); });
std::transform(lower.begin(), lower.end(), lower.begin(),
[](unsigned char c) -> char { return std::tolower(c); });

Comment thread mooncake-store/include/utils.h Outdated
Comment on lines +225 to +226
std::transform(unit.begin(), unit.end(), unit.begin(),
[](unsigned char c) { return std::toupper(c); });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential compiler warnings regarding implicit narrowing conversions (from int to char) under strict warning flags (such as -Wconversion), it is recommended to explicitly specify the return type of the lambda as char.

Suggested change
std::transform(unit.begin(), unit.end(), unit.begin(),
[](unsigned char c) { return std::toupper(c); });
std::transform(unit.begin(), unit.end(), unit.begin(),
[](unsigned char c) -> char { return std::toupper(c); });

Comment on lines +607 to +608
std::transform(val.begin(), val.end(), val.begin(),
[](unsigned char c) { return std::tolower(c); });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential compiler warnings regarding implicit narrowing conversions (from int to char) under strict warning flags (such as -Wconversion), it is recommended to explicitly specify the return type of the lambda as char.

Suggested change
std::transform(val.begin(), val.end(), val.begin(),
[](unsigned char c) { return std::tolower(c); });
std::transform(val.begin(), val.end(), val.begin(),
[](unsigned char c) -> char { return std::tolower(c); });

@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...gine/src/transport/tcp_transport/tcp_transport.cpp 0.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@bp-cheng

Copy link
Copy Markdown
Contributor Author

The single failing matrix leg — test-wheel-ubuntu (ubuntu-22.04, 3.12) — died on a CI-runner disk-space issue, not on this change:

##[error]No space left on device : '/home/runner/actions-runner/cached/2.335.1/_diag/pages/...log'

It ran out of space during a torch reinstall in the test env. Evidence this is infra-only and unrelated to the diff:

  • The same wheel test suite passed on test-wheel-ubuntu (ubuntu-24.04, 3.10).
  • All build legs are green: build (3.10/3.12), build-flags, build-musa, build-wheel-cu13, the Docker image, and the Ascend build — so the change compiles across every platform/flag combo.
  • The other two wheel legs show as cancelled (matrix fail-fast after the disk-space leg tripped), not as genuine failures.

The change itself is an 8-line, behavior-preserving ::tolower/::toupper UB fix and can't cause disk exhaustion. Could a maintainer re-run the failed job (or apply run-ci) when convenient? Thanks!

…turn type

Address review feedback on kvcache-ai#2488: std::tolower/std::toupper return
int, so the lambdas implicitly narrow back to char when std::transform
writes them to the std::string output iterator. Strict compiler flags
(-Wconversion / -Wnarrowing on some toolchains) warn on this. Pinning
the return type to char makes the narrowing explicit at the lambda
boundary and silences the warning without changing behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@jfeng18 jfeng18 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified the UB and the fix — LGTM with a scope note.

All three sites are genuine signed-char ctype UB: a std::string byte > 0x7F sign-extends to a negative int when passed to ::tolower/::toupper, which is UB (the argument must be representable as unsigned char or be EOF). The unsigned char lambda parameter forces the well-defined conversion first, and the single-arg std::tolower/std::toupper resolves to the <cctype> overload (not the 2-arg <locale> template). <cctype> is correctly added to the two TUs that only had it transitively. Behavior is identical for ASCII, well-defined for non-ASCII.

Scope note (non-blocking): the same UB class still exists in the C-loop form *ch = tolower(*ch) at 7 sites — topology.cpp:398, tent/src/runtime/memory_prober.cpp:377, tent/src/platform/cuda/cuda_probe.cpp:194, tent/src/platform/rocm/rocm_probe.cpp:185, benchmark/te_backend.cpp:53 & :66, benchmark/tent_backend.cpp:260. These all operate on pci_bus_id (a driver-generated PCI BDF string — ASCII, not operator-controllable), so they are outside this PR's threat model; could be a follow-up cleanup for completeness.

@bp-cheng

Copy link
Copy Markdown
Contributor Author

Thanks for the careful review, @jfeng18 — agreed on the scope note. Those 7 C-loop *ch = tolower(*ch) sites all operate on pci_bus_id, which is a driver-generated PCI BDF string (ASCII, not operator-controllable), so they're outside this PR's threat model as you say. Happy to send a separate follow-up PR that converts them for completeness/consistency if you'd like — just let me know and I'll open one.

@jfeng18

jfeng18 commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

A separate follow-up PR for the 7 BDF-loop sites sounds great — keeping this PR scoped to the operator-controllable paths and handling the ASCII-only BDF loops independently is the right split. Thanks!

@bp-cheng

Copy link
Copy Markdown
Contributor Author

Opened the follow-up for the 7 ASCII BDF-loop sites as discussed: #2504. Keeps this PR scoped to the operator-controllable paths. Thanks again for the review!

@alogfans

Copy link
Copy Markdown
Collaborator

Same comment with #2488. I'd like to use a small inline function.

bp-cheng added a commit to bp-cheng/Mooncake that referenced this pull request Jun 16, 2026
…ite casts

Per @alogfans's review on kvcache-ai#2504/kvcache-ai#2488: replace the repeated
static_cast<char>(std::tolower(static_cast<unsigned char>(*ch))) at each
PCI-BDF lowercasing loop with a single inline te_lower() helper added to
common.h. Same well-defined behavior, no more mass typecasts.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@bp-cheng

Copy link
Copy Markdown
Contributor Author

Thanks @alogfans! I introduced the uniform te_lower() helper in #2504 (where the repeated casts — the "mass typecasts" — actually live, all 7 sites in mooncake-transfer-engine).

For this PR, the 3 sites are single std::transform(..., [](unsigned char c){ return std::tolower(c); }) calls — the exact idiom already merged in #2367, and two of them are in mooncake-store rather than the transfer engine. Converting them here would mean adding the same te_lower() to common.h that #2504 already adds (a guaranteed merge conflict between the two PRs) and re-opening this already-approved change. So I'd suggest keeping #2488 on the merged #2367 lambda idiom and landing the shared helper via #2504. Happy to switch the tcp_transport.cpp site over to te_lower() in a tiny follow-up once #2504 merges, if you'd prefer full uniformity there too.

@alogfans alogfans merged commit 945f3e6 into kvcache-ai:main Jun 18, 2026
21 checks passed
alogfans pushed a commit that referenced this pull request Jun 18, 2026
…up to #2367/#2488) (#2504)

* [TE] Fix signed-char tolower UB in PCI BDF lowercasing loops

Follow-up to #2367/#2488 (requested in #2488): the C-style PCI-BDF
lowercasing loops still pass a (possibly signed) char to tolower, which
sign-extends bytes > 0x7F to a negative int — UB, since the argument
must be representable as unsigned char or equal EOF.

Convert the 7 sites (topology.cpp, tent cuda/rocm probes, tent
memory_prober, benchmark te/tent backends) to
static_cast<char>(std::tolower(static_cast<unsigned char>(*ch))),
matching the idiom from #2367, and add/normalize <cctype> includes.

The strings are driver-generated PCI bus ids (ASCII), so this is a
correctness/consistency cleanup; behavior is identical for ASCII.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* [TE] Address review: use a uniform te_lower() helper instead of per-site casts

Per @alogfans's review on #2504/#2488: replace the repeated
static_cast<char>(std::tolower(static_cast<unsigned char>(*ch))) at each
PCI-BDF lowercasing loop with a single inline te_lower() helper added to
common.h. Same well-defined behavior, no more mass typecasts.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* [TE] Move te_lower() to a lightweight char_util.h to fix tent build

The previous commit put te_lower() in common.h, but pulling common.h into
the TENT TUs collides with tent/common/types.h, which defines
LOCAL_SEGMENT_ID as a macro `#define LOCAL_SEGMENT_ID (0ull)` while
common.h declares `const static int LOCAL_SEGMENT_ID = 0;` — the macro
expands inside the declaration and breaks the build (build-flags CI).

Put te_lower() in a new minimal header char_util.h that only pulls in
<cctype>, and include it at all 7 sites instead of common.h. No collision
with the tent macro, still a single uniform helper per @alogfans's review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* [TE] Name the helper to_lower() to match the std::tolower convention

Rename te_lower() -> to_lower() in char_util.h and the 7 call sites. The
repo has no existing named lowercase helper to reuse, and to_lower reads
naturally alongside std::tolower.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Store][TE] Three signed-char ::tolower/::toupper UB sites missed by #2367

4 participants