Schindler-EPFL-Lab/SEAR

SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction

This project jointly estimates the camera poses of RGB and thermal images.

Hugging Face | arXiv

Install

Clone this repo and VGGT:

git clone https://github.com/Schindler-EPFL-Lab/SEAR.git
cd SEAR
git clone https://github.com/facebookresearch/vggt.git

Install with uv:

uv sync --all-extras

Train the model

Download the VGGT-1B checkpoint.

To train our model, run this script:

python sear/scripts/train_sear.py --thermal-vggt.vggt-path /path/to/vggt/weights.pth
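A missing or mistyped checkpoint path is an easy way to lose time at startup, so a small wrapper can check the file before launching training. The sketch below is only an illustration: the script path and the `--thermal-vggt.vggt-path` flag come from the command above, while `check_weights`, `build_train_command`, and `launch` are hypothetical helper names that are not part of this repository.

```python
import subprocess
import sys
from pathlib import Path

# Script path and flag name are copied from the training command above.
TRAIN_SCRIPT = "sear/scripts/train_sear.py"
VGGT_PATH_FLAG = "--thermal-vggt.vggt-path"


def check_weights(weights_path: str) -> None:
    """Raise early if the VGGT checkpoint file does not exist (hypothetical helper)."""
    if not Path(weights_path).is_file():
        raise FileNotFoundError(f"VGGT checkpoint not found: {weights_path}")


def build_train_command(weights_path: str) -> list[str]:
    """Return the training command as an argument list (hypothetical helper)."""
    return [sys.executable, TRAIN_SCRIPT, VGGT_PATH_FLAG, weights_path]


def launch(weights_path: str) -> None:
    """Validate the checkpoint path, then run the training script from the repo root."""
    check_weights(weights_path)
    subprocess.run(build_train_command(weights_path), check=True)
```

Calling `launch("/path/to/vggt/weights.pth")` from the repository root would then run the same command as above, but fail immediately with a clear error if the checkpoint is not where you expect it.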

Ablation studies can be run by using the other aggregator types found in sear/ablation_models/possible_aggregators.py.

Models can be evaluated after training with sear/scripts/eval/ablation_vggt.py.

To run the evaluation, see the tutorials for camera pose and point cloud, relative camera pose from two views, and dependence on thermal ratio.

Training Data

Our training dataset is a combination of the following datasets:

We provide a compilation of all training datasets, as well as our own dataset.

See the Dataset documentation for details of the data processing.

Cite us

@misc{skorokhodov2026searsimpleefficientadaptation,
      title={SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction},
      author={Vsevolod Skorokhodov and Chenghao Xu and Shuo Sun and Olga Fink and Malcolm Mielle},
      year={2026},
      eprint={2603.18774},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.18774},
}
