When Fast Fourier Transform Meets Transformer for Image Restoration

SFHformer is an image restoration framework that integrates the Fast Fourier Transform (FFT) with a Transformer architecture to handle restoration tasks such as dehazing, deraining, and deblurring. It analyzes degradation from a frequency perspective and uses a dual-domain hybrid structure for multi-scale receptive fields, in which the spatial domain handles local modeling and the frequency domain handles global modeling. Experiments on thirty-one datasets across ten restoration tasks show that SFHformer outperforms existing methods while balancing performance, parameter count, and computational cost.
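One way to see why the frequency domain suits global modeling is the convolution theorem: an element-wise product of two spectra equals a circular convolution whose kernel can span the whole signal, so every output sample can depend on every input sample. The sketch below checks this on a short 1-D signal. It is only an illustration, not the SFHformer implementation: the function names, the sample values, and the naive O(n^2) DFT (used here as a stand-in for an FFT so the example needs no third-party library) are all assumptions made for this example.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative stand-in for an FFT)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        # The inverse transform carries the 1/n normalization.
        out.append(total / n if inverse else total)
    return out


def circular_convolve(signal, kernel):
    """Spatial-domain circular convolution with a kernel as long as the signal."""
    n = len(signal)
    return [
        sum(signal[j] * kernel[(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def frequency_filter(signal, kernel):
    """Multiply the two spectra element-wise, then transform back."""
    spectrum = [s * k for s, k in zip(dft(signal), dft(kernel))]
    return [value.real for value in dft(spectrum, inverse=True)]


# Hypothetical sample data; the kernel is nonzero everywhere, i.e. "global".
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
kernel = [0.3, 0.2, 0.1, 0.05, 0.05, 0.05, 0.1, 0.15]

spatial = circular_convolve(signal, kernel)
spectral = frequency_filter(signal, kernel)
max_diff = max(abs(a - b) for a, b in zip(spatial, spectral))

# Perturb a single input sample and record which outputs move.
perturbed = list(signal)
perturbed[0] += 1.0
changed = [
    abs(a - b) > 1e-6
    for a, b in zip(frequency_filter(perturbed, kernel), spectral)
]

print("max difference between the two routes:", max_diff)
print("every output affected by one input sample:", all(changed))
```

A small spatial kernel, by contrast, only mixes neighboring samples, which is why a hybrid design can leave local detail to the spatial branch.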

Official implementation.

Xingyu Jiang, Xiuhui Zhang, Ning Gao, Yue Deng
School of Astronautics, Beihang University, Beijing, China

Thanks for your interest in our work. We will continue to optimize our code. If you have any other questions, please feel free to raise them in the issues, and we will try our best to address them.

- May 20, 2025: SWFormer ("Image Restoration via Multi-domain Learning"), our extension of SFHformer, is available at https://arxiv.org/pdf/2505.05504. GitHub code: https://github.com/deng-ai-lab/SWFormer.
- Apr 11, 2025: We released some visualizations of the dataset in the Visual result section.
- Mar 27, 2025: We released the pre-trained weights for ITS and OTS, together with the test code, in the dehazing folder.
- Oct 17, 2024: The training code is now open and our paper is available here.
- Jul 25, 2024: Paper accepted at ECCV 2024.

Abstract: Natural images can suffer from various degradation phenomena caused by adverse atmospheric conditions or unique degradation mechanisms. Such diversity makes it challenging to design a universal framework for different kinds of restoration tasks. Instead of exploring the commonality across different degradation phenomena, existing image restoration methods focus on modifying the network architecture under limited restoration priors. In this work, we first review various degradation phenomena from a frequency perspective as a prior. Based on this, we propose an efficient image restoration framework, dubbed SFHformer, which incorporates the Fast Fourier Transform mechanism into the Transformer architecture. Specifically, we design a dual-domain hybrid structure for multi-scale receptive field modeling, in which the spatial domain and the frequency domain focus on local modeling and global modeling, respectively. Moreover, we design unique positional coding and frequency dynamic convolution for each frequency component to extract rich frequency-domain features. Extensive experiments on thirty-one restoration datasets covering ten restoration tasks, such as deraining, dehazing, deblurring, desnowing, denoising, super-resolution, and underwater/low-light enhancement, demonstrate that SFHformer surpasses state-of-the-art approaches and achieves a favorable trade-off between performance, parameter size, and computational cost.

Experiments are performed for the following image restoration tasks: image dehazing, image deraining, image desnowing, image denoising, image super-resolution, single-image motion deblurring, defocus deblurring, image raindrop removal, low-light image enhancement, and underwater image enhancement.

- Deraining Datasets: Rain200L/Rain200H, DDN-Data, DID-Data Train, DID-Data Test, SPA-Data, Raindrop
- Dehazing Datasets: ITS, OTS, O-HAZE, NH-HAZE, DENSE-HAZE, SOTS
- Low-light Enhancement Datasets: LOLv1, LOLv2, FiveK
- Motion Deblur Datasets: Motion Blur GoPro/HIDE/RealBlur-R/RealBlur-J
- Defocus Deblur Datasets: DPDD
- Desnowing Datasets: CSD, SRRS, Snow100K
- Underwater Enhancement Datasets: UIEB, LSUI
- Denoise Datasets: SIDD
- Super-resolution Datasets: DIV2K, Set5, Set14, B100, Urban100, Manga109

- Low-light Enhancement Datasets: LOLv2-r, LOLv2-s
- Motion Deblur Datasets: GoPro

For more details, see the supplementary material here.

Here is the BibTeX citation for the paper:

@inproceedings{jiang2024fast,
  title={When Fast Fourier Transform Meets Transformer for Image Restoration},
  author={Jiang, Xingyu and Zhang, Xiuhui and Gao, Ning and Deng, Yue},
  booktitle={European Conference on Computer Vision},
  pages={381--402},
  year={2024},
  organization={Springer}
}

Part of our code is based on DehazeFormer and Restormer. Thanks for their awesome work.

If your submitted issue has not been noticed or you have further questions, please contact jxy33zrhd@buaa.edu.cn.