LuizScarlet/AEIC

Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder

Asymmetric Extreme Image Codec for Real-Time Encoding!

Tianyu Zhang, Dong Liu, Chang Wen Chen

University of Science and Technology of China, The Hong Kong Polytechnic University


📝 Overview

  1. Ultra-low bitrate image compression (<0.05 bpp) is increasingly critical in bandwidth-constrained, computation-limited encoding scenarios such as edge devices.
  2. We show that ultra-low bitrates allow for shallow encoders and propose the Asymmetric Extreme Image Compression (AEIC) framework, which pursues encoding simplicity and decoding quality simultaneously. Specifically, AEIC:
    • Outperforms advanced methods in rate-distortion-perception performance.
    • Encodes in real time, reaching 35.8 FPS at 1080p.
    • Maintains competitive decoding speed compared to existing methods.
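To put the figures above in concrete terms, the following sketch converts a bitrate in bits per pixel into a bitstream size, and a throughput in FPS into a per-image time budget. The function names are hypothetical helpers written for this illustration and are not part of the AEIC code base.

```python
def bitstream_budget_bytes(width: int, height: int, bpp: float) -> float:
    """Size in bytes of a bitstream coded at `bpp` bits per pixel."""
    return width * height * bpp / 8


def frame_time_ms(fps: float) -> float:
    """Average time available per image, in milliseconds, at a given throughput."""
    return 1000.0 / fps


if __name__ == "__main__":
    # A 1920x1080 image at the 0.05 bpp ceiling quoted above.
    print(f"{bitstream_budget_bytes(1920, 1080, 0.05):.0f} bytes")
    # Time budget per image at 35.8 FPS.
    print(f"{frame_time_ms(35.8):.1f} ms per image")
```

At 0.05 bpp a 1080p image fits in roughly 13 KB, and 35.8 FPS leaves about 28 ms of encoding time per image.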

⌛ Updates

[TODO] Pack the remaining code ...
[2026/04/06] Release training code for AEIC-ME.
[2026/03/11] Release pretrained checkpoints for inference.
[2026/03/10] Results on benchmarks are now available, see results/.
[2026/02/26] Initial release of this repo.

😍 Performance

  1. Rate-Perception performance:

  2. Rate-Distortion performance:

  3. Visual performance:

  4. Practical coding latency (ms) on two kinds of GPUs and image resolutions. Both encoding and decoding times include autoregressive entropy coding with the entropy model. The best results are highlighted in bold, and the best results among ultra-low bitrate codecs are underlined. "OOM" means out of memory. We also report the 🔴 [encoding FPS] for AEIC models:

  5. Complexity in parameters (M) and MACs (K) per pixel:

⚙ Installation

conda create -n aeic python=3.10
conda activate aeic
pip install -r requirements.txt

⚡ Inference

Step 1: Prepare your datasets for inference

<PATH_TO_DATASET>/*.png

In our paper, we adopt the following test datasets:

Step 2: Download pretrained checkpoints

  1. Download SD-Turbo and VAE Decoder from Hugging Face.
  2. Download AEIC checkpoints. We provide 2 variants:
    • AEIC-ME: Moderate encoder variants.
    • AEIC-SE: Shallow encoder variants for real-time encoding.

Step 3: Build the entropy coding engine

sudo apt-get install cmake g++
cd src
mkdir build
cd build
cmake ../cpp -DCMAKE_BUILD_TYPE=Release  # use Debug instead for a debug build
make -j

Step 4: Inference for AEIC models

Please modify the paths in compress.sh, then run bash compress.sh:

# Set --codec_type to "AEIC-SE" or "AEIC-ME" and point --codec_path at the matching checkpoint.
# Append --use_practical_entropy_coding to write practical bitstreams with the entropy coder.
python src/compress.py \
    --sd_path="<PATH_TO_SD_TURBO>/sd-turbo" \
    --img_path="<PATH_TO_DATASET>/Kodak" \
    --rec_path="<PATH_TO_SAVE_OUTPUTS>/rec" \
    --bin_path="<PATH_TO_SAVE_OUTPUTS>/bin" \
    --codec_type="AEIC-SE" \
    --codec_path="<PATH_TO_AEIC>/AEIC_SE_ft2.pkl" \
    --vae_decoder_path="<PATH_TO_VAE_DECODER>/halfDecoder.ckpt"

Notes:

  • The default inference settings enable --use_tiled_vae and --use_tiled_unet for the best reconstruction quality. For faster decoding, consider disabling the tiling options in src/my_utils/testing_utils.
  • To produce practical bitstreams with the entropy coder, enable --use_practical_entropy_coding.
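Once practical bitstreams have been written, the achieved bitrate can be checked against the source image. The sketch below is only an illustration: it reads the image size from the PNG header and divides the bitstream size in bits by the pixel count. The helper names are hypothetical, and the sketch makes no assumption about how files are named inside --bin_path, so both paths are passed explicitly.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path: Path) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    # Width and height are stored as big-endian 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def bits_per_pixel(bitstream: Path, image: Path) -> float:
    """bpp = 8 * bitstream size in bytes / number of pixels."""
    width, height = png_size(image)
    return 8 * bitstream.stat().st_size / (width * height)
```

For example, a 1536-byte bitstream for a 768x512 image corresponds to 0.03125 bpp.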

Step 5: Evaluation (Optional)

Run bash eval_folders.sh to compute reconstruction metrics with src/evaluate.py. Please make sure --recon_dir and --gt_dir are specified:

python src/evaluate.py \
    --gt_dir="<PATH_TO_DATASET>/Kodak/" \
    --recon_dir="<PATH_TO_SAVE_OUTPUTS>/rec/"
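Before running the evaluation, it can help to confirm that the two folders line up. The following sketch assumes that each reconstruction is saved as a PNG with the same filename as its source image; that naming convention is an assumption made for this example and is not confirmed by the repository. The function name is hypothetical.

```python
from pathlib import Path


def unmatched_images(gt_dir: str, recon_dir: str) -> tuple[list[str], list[str]]:
    """Return (names missing from recon_dir, names found only in recon_dir)."""
    gt = {p.name for p in Path(gt_dir).glob("*.png")}
    recon = {p.name for p in Path(recon_dir).glob("*.png")}
    return sorted(gt - recon), sorted(recon - gt)
```

Two empty lists mean every ground-truth image has a reconstruction with a matching name and there are no stray files.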

🔥 Training

Step 1: Prepare your datasets for training

Our training data includes:

  • Flickr2K: Contains 2560 2K-resolution images.
  • DIV2K Training Set: Contains 800 2K-resolution images.
  • CLIC: Contains 585 (CLIC 2020 Training) + 41 (CLIC 2020 Validation) + 60 (CLIC 2021 Test) 2K-resolution images.
  • The first 10K images from LSDIR.

We use h5py to organize training data. To construct a .hdf5 training file, please refer to src/my_utils/build_h5.py.

Step 2: Train AEIC-ME (Moderate Encoder)

We perform lightweight training on at most four RTX 3090 (24 GB) GPUs. Consider adjusting batch_size and gradient accumulation for faster or better training performance.
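When changing batch_size or the number of GPUs, the effective batch size per optimizer step is the product of the per-GPU batch size, the GPU count, and the number of accumulation steps. The sketch below assumes standard data-parallel training; the helper names and the numbers in the comments are illustrative and are not taken from the repository's configs.

```python
def effective_batch_size(per_gpu_batch: int, num_gpus: int, accum_steps: int) -> int:
    """Samples contributing to each optimizer step under data parallelism."""
    return per_gpu_batch * num_gpus * accum_steps


def accum_steps_for(target: int, per_gpu_batch: int, num_gpus: int) -> int:
    """Smallest accumulation step count that reaches at least `target` samples."""
    per_step = per_gpu_batch * num_gpus
    return -(-target // per_step)  # ceiling division


if __name__ == "__main__":
    # Illustrative values only: moving from 4 GPUs to 1 GPU at the same
    # per-GPU batch size needs 4x the accumulation steps.
    print(accum_steps_for(32, per_gpu_batch=2, num_gpus=4))
    print(accum_steps_for(32, per_gpu_batch=2, num_gpus=1))
```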

  1. Pretrain a base model with relaxed bitrates: bash pretrain.sh
    Note: You may skip pretraining by using our pretrained AEIC_ME_pretrain.pkl.

  2. Finetune towards target bitrates with GAN: bash finetune.sh
    Note: Adjust base.lambda_rate in config/finetune_AEIC_ME.yaml to reach different ultra-low bitrates.

📖 Citation

If you find this work helpful, please consider citing us. Thanks! 🥰

@InProceedings{Zhang_2026_CVPR,
    author    = {Zhang, Tianyu and Liu, Dong and Chen, Chang Wen},
    title     = {Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {12118-12128}
}

@InProceedings{Zhang_2025_ICCV,
    author    = {Zhang, Tianyu and Luo, Xin and Li, Li and Liu, Dong},
    title     = {StableCodec: Taming One-Step Diffusion for Extreme Image Compression},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {17379-17389}
}

📓 License

This work is licensed under the MIT license.

🥰 Acknowledgement

This work is implemented based on StableCodec. During development, we drew inspiration primarily from shallow-ntc, AdcSR and PocketSR. Thanks for their great work!

✉️ Contact

If you have any questions, please feel free to drop me an email:

  • zhangtianyu[at]mail.ustc.edu.cn
