JPEG Lossless: out-of-range predictor reconstructions clamped instead of wrapped modulo 65536 (T.81 H.2.1) — samples saturate to max

### Description

When decoding JPEG Lossless (process 14, SOF3), out-of-range predictor reconstructions are **clamped to the sample range** instead of being **wrapped modulo 65536** as ITU T.81 requires (§H.2.1: "the prediction is calculated modulo 65536"; the reconstructed value is then reduced to the declared precision). Any sample whose `prediction + diff` falls outside `[0, 2^P - 1]` saturates to the maximum sample value, and because the bad value feeds subsequent predictions, the corruption propagates to neighboring samples as well.

This is not a synthetic edge case: Philips and Samsung ultrasound DICOM encoders (transfer syntax `1.2.840.10008.1.2.4.70`, 8-bit, predictor 1) routinely emit diffs that rely on the modulo wrap — e.g. encoding sample value 128 from prediction 52 as diff −180 rather than +76. In a sample of 174 real ultrasound instances from such modalities, 172 decoded incorrectly: essentially every pixel ≥ 128 came back as 255, visually blowing out the image to white.

Worked example: prediction 52, decoded diff −180 → T.81 gives `(52 − 180) mod 65536 = 65408`, low 8 bits → **128**. pylibjpeg-libjpeg returns **255**.

Reproduced on pylibjpeg-libjpeg 2.4.0 and 2.3.0 (macOS arm64, Python 3.11). GDCM, imagecodecs (`ljpeg_decode`), and an independent hand-written T.81 decoder all agree on the expected output for both the real-world files and the repro below, bit-for-bit.

### Minimal reproducer

Fully synthetic 4×6 8-bit grayscale codestream (87 bytes, no patient data) whose positive diffs are encoded as their mod-256-equivalent negative values — a legal encoding that exercises the wrap:

```python
import base64
import numpy as np
from libjpeg import decode

stream = base64.b64decode(
    "/9j/wwALCAAEAAYBAREA/8QAGAABAQEBAQAAAAAAAAAAAAAABAUGBwj/2gAIAQEAAQAA"
    "5n5L8tTPO/Psr538E43ivDy+fvAHgDz54Cw/G+K+N/Mcn//Z"
)

expected = np.array(
    [
        [52, 128, 219, 200, 64, 255],
        [10, 130, 140, 90, 240, 128],
        [0, 127, 128, 129, 254, 1],
        [200, 100, 250, 50, 150, 128],
    ],
    dtype=np.uint8,
)

got = decode(stream)
print(got)
np.testing.assert_array_equal(got, expected)
```

Actual output on 2.4.0:

```
[[ 52 255 255 255 255 255]
 [ 10 255 255 255 255 255]
 [  0 127 255 255 255 255]
 [255 255 255 255 255 255]]
```

Note the propagation: row 0 contains the true value 64, but because it is predicted from an already-saturated neighbor it also comes back as 255.

<details>
<summary>Generator used to build the codestream (for fuzzing more cases)</summary>

```python
def encode_sv1_with_wrap_diffs(arr):
    """8-bit single-component JPEG Lossless SV1 (predictor 1); every positive
    diff is emitted as its mod-256-equivalent negative value so that
    prediction + diff lands outside [0, 255] and the decoder must wrap."""
    height, width = arr.shape
    diffs = []
    for n in range(height * width):
        row, col = divmod(n, width)
        if n == 0:
            prediction = 128
        elif col == 0:
            prediction = int(arr[row - 1, 0])
        else:
            prediction = int(arr[row, col - 1])
        diff = int(arr[row, col]) - prediction
        if diff > 0:
            diff -= 256
        diffs.append(diff)

    def ssss_of(diff):
        return 0 if diff == 0 else int(abs(diff)).bit_length()

    categories = sorted({ssss_of(d) for d in diffs})
    counts = [0] * 16
    for i in range(len(categories)):
        counts[i] += 1
    code_for_category = {}
    code = 0
    for i, category in enumerate(categories):
        code_for_category[category] = (i + 1, code)
        code = (code + 1) << 1

    bits = []

    def put(value, n_bits):
        for b in range(n_bits - 1, -1, -1):
            bits.append((value >> b) & 1)

    for diff in diffs:
        category = ssss_of(diff)
        length, code = code_for_category[category]
        put(code, length)
        if category:
            put(diff if diff > 0 else diff + (1 << category) - 1, category)
    while len(bits) % 8:
        bits.append(1)
    scan = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i : i + 8]:
            byte = (byte << 1) | bit
        scan.append(byte)
        if byte == 0xFF:
            scan.append(0x00)

    sof = bytes([8]) + height.to_bytes(2, "big") + width.to_bytes(2, "big")
    sof += bytes([1, 0x01, 0x11, 0x00])
    dht = bytes([0x00]) + bytes(counts) + bytes(categories)
    sos = bytes([1, 0x01, 0x00, 0x01, 0x00, 0x00])
    stream = b"\xff\xd8"
    stream += b"\xff\xc3" + (len(sof) + 2).to_bytes(2, "big") + sof
    stream += b"\xff\xc4" + (len(dht) + 2).to_bytes(2, "big") + dht
    stream += b"\xff\xda" + (len(sos) + 2).to_bytes(2, "big") + sos
    stream += bytes(scan) + b"\xff\xd9"
    return stream
```

</details>

### Expected behaviour

Decoded samples match a conformant T.81 decoder: reconstruction modulo 65536, then truncation to the declared precision (GDCM and imagecodecs both produce the `expected` array above).

### Environment

- pylibjpeg-libjpeg 2.4.0 (also reproduced on 2.3.0)
- pylibjpeg 2.1.0, pydicom 3.0.2 (bug also reproduces with `libjpeg.decode` directly, no pydicom involved)
- Python 3.11, macOS 15 arm64

Happy to provide more real-world DICOM samples privately if useful (the originals contain PHI, hence the synthetic repro).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

JPEG Lossless: out-of-range predictor reconstructions clamped instead of wrapped modulo 65536 (T.81 H.2.1) — samples saturate to max #90

Description

Minimal reproducer

Expected behaviour

Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

JPEG Lossless: out-of-range predictor reconstructions clamped instead of wrapped modulo 65536 (T.81 H.2.1) — samples saturate to max #90

Description

Description

Minimal reproducer

Expected behaviour

Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions