Description
When decoding JPEG Lossless (process 14, SOF3), out-of-range predictor reconstructions are clamped to the sample range instead of being wrapped modulo 65536 as ITU T.81 requires (§H.2.1: "the prediction is calculated modulo 65536"; the reconstructed value is then reduced to the declared precision). Any sample whose prediction + diff falls outside [0, 2^P - 1] saturates to the maximum sample value, and because the bad value feeds subsequent predictions, the corruption propagates to neighboring samples as well.
This is not a synthetic edge case: Philips and Samsung ultrasound DICOM encoders (transfer syntax 1.2.840.10008.1.2.4.70, 8-bit, predictor 1) routinely emit diffs that rely on the modulo wrap — e.g. encoding sample value 128 from prediction 52 as diff −180 rather than +76. In a sample of 174 real ultrasound instances from such modalities, 172 decoded incorrectly: essentially every pixel ≥ 128 came back as 255, visually blowing out the image to white.
Worked example: prediction 52, decoded diff −180 → T.81 gives (52 − 180) mod 65536 = 65408, low 8 bits → 128. pylibjpeg-libjpeg returns 255.
Reproduced on pylibjpeg-libjpeg 2.4.0 and 2.3.0 (macOS arm64, Python 3.11). GDCM, imagecodecs (ljpeg_decode), and an independent hand-written T.81 decoder all agree on the expected output for both the real-world files and the repro below, bit-for-bit.
Minimal reproducer
Fully synthetic 4×6 8-bit grayscale codestream (87 bytes, no patient data) whose positive diffs are encoded as their mod-256-equivalent negative values — a legal encoding that exercises the wrap:
import base64
import numpy as np
from libjpeg import decode
stream = base64.b64decode(
"/9j/wwALCAAEAAYBAREA/8QAGAABAQEBAQAAAAAAAAAAAAAABAUGBwj/2gAIAQEAAQAA"
"5n5L8tTPO/Psr538E43ivDy+fvAHgDz54Cw/G+K+N/Mcn//Z"
)
expected = np.array(
[
[52, 128, 219, 200, 64, 255],
[10, 130, 140, 90, 240, 128],
[0, 127, 128, 129, 254, 1],
[200, 100, 250, 50, 150, 128],
],
dtype=np.uint8,
)
got = decode(stream)
print(got)
np.testing.assert_array_equal(got, expected)
Actual output on 2.4.0:
[[ 52 255 255 255 255 255]
[ 10 255 255 255 255 255]
[ 0 127 255 255 255 255]
[255 255 255 255 255 255]]
Note the propagation: row 0 contains the true value 64, but because it is predicted from an already-saturated neighbor it also comes back as 255.
Generator used to build the codestream (for fuzzing more cases)
def encode_sv1_with_wrap_diffs(arr):
"""8-bit single-component JPEG Lossless SV1 (predictor 1); every positive
diff is emitted as its mod-256-equivalent negative value so that
prediction + diff lands outside [0, 255] and the decoder must wrap."""
height, width = arr.shape
diffs = []
for n in range(height * width):
row, col = divmod(n, width)
if n == 0:
prediction = 128
elif col == 0:
prediction = int(arr[row - 1, 0])
else:
prediction = int(arr[row, col - 1])
diff = int(arr[row, col]) - prediction
if diff > 0:
diff -= 256
diffs.append(diff)
def ssss_of(diff):
return 0 if diff == 0 else int(abs(diff)).bit_length()
categories = sorted({ssss_of(d) for d in diffs})
counts = [0] * 16
for i in range(len(categories)):
counts[i] += 1
code_for_category = {}
code = 0
for i, category in enumerate(categories):
code_for_category[category] = (i + 1, code)
code = (code + 1) << 1
bits = []
def put(value, n_bits):
for b in range(n_bits - 1, -1, -1):
bits.append((value >> b) & 1)
for diff in diffs:
category = ssss_of(diff)
length, code = code_for_category[category]
put(code, length)
if category:
put(diff if diff > 0 else diff + (1 << category) - 1, category)
while len(bits) % 8:
bits.append(1)
scan = bytearray()
for i in range(0, len(bits), 8):
byte = 0
for bit in bits[i : i + 8]:
byte = (byte << 1) | bit
scan.append(byte)
if byte == 0xFF:
scan.append(0x00)
sof = bytes([8]) + height.to_bytes(2, "big") + width.to_bytes(2, "big")
sof += bytes([1, 0x01, 0x11, 0x00])
dht = bytes([0x00]) + bytes(counts) + bytes(categories)
sos = bytes([1, 0x01, 0x00, 0x01, 0x00, 0x00])
stream = b"\xff\xd8"
stream += b"\xff\xc3" + (len(sof) + 2).to_bytes(2, "big") + sof
stream += b"\xff\xc4" + (len(dht) + 2).to_bytes(2, "big") + dht
stream += b"\xff\xda" + (len(sos) + 2).to_bytes(2, "big") + sos
stream += bytes(scan) + b"\xff\xd9"
return stream
Expected behaviour
Decoded samples match a conformant T.81 decoder: reconstruction modulo 65536, then truncation to the declared precision (GDCM and imagecodecs both produce the expected array above).
Environment
- pylibjpeg-libjpeg 2.4.0 (also reproduced on 2.3.0)
- pylibjpeg 2.1.0, pydicom 3.0.2 (bug also reproduces with
libjpeg.decode directly, no pydicom involved)
- Python 3.11, macOS 15 arm64
Happy to provide more real-world DICOM samples privately if useful (the originals contain PHI, hence the synthetic repro).
Description
When decoding JPEG Lossless (process 14, SOF3), out-of-range predictor reconstructions are clamped to the sample range instead of being wrapped modulo 65536 as ITU T.81 requires (§H.2.1: "the prediction is calculated modulo 65536"; the reconstructed value is then reduced to the declared precision). Any sample whose
prediction + difffalls outside[0, 2^P - 1]saturates to the maximum sample value, and because the bad value feeds subsequent predictions, the corruption propagates to neighboring samples as well.This is not a synthetic edge case: Philips and Samsung ultrasound DICOM encoders (transfer syntax
1.2.840.10008.1.2.4.70, 8-bit, predictor 1) routinely emit diffs that rely on the modulo wrap — e.g. encoding sample value 128 from prediction 52 as diff −180 rather than +76. In a sample of 174 real ultrasound instances from such modalities, 172 decoded incorrectly: essentially every pixel ≥ 128 came back as 255, visually blowing out the image to white.Worked example: prediction 52, decoded diff −180 → T.81 gives
(52 − 180) mod 65536 = 65408, low 8 bits → 128. pylibjpeg-libjpeg returns 255.Reproduced on pylibjpeg-libjpeg 2.4.0 and 2.3.0 (macOS arm64, Python 3.11). GDCM, imagecodecs (
ljpeg_decode), and an independent hand-written T.81 decoder all agree on the expected output for both the real-world files and the repro below, bit-for-bit.Minimal reproducer
Fully synthetic 4×6 8-bit grayscale codestream (87 bytes, no patient data) whose positive diffs are encoded as their mod-256-equivalent negative values — a legal encoding that exercises the wrap:
Actual output on 2.4.0:
Note the propagation: row 0 contains the true value 64, but because it is predicted from an already-saturated neighbor it also comes back as 255.
Generator used to build the codestream (for fuzzing more cases)
Expected behaviour
Decoded samples match a conformant T.81 decoder: reconstruction modulo 65536, then truncation to the declared precision (GDCM and imagecodecs both produce the
expectedarray above).Environment
libjpeg.decodedirectly, no pydicom involved)Happy to provide more real-world DICOM samples privately if useful (the originals contain PHI, hence the synthetic repro).