Underwater Image Super-Resolution Using Frequency-Domain Enhanced Attention Network

Underwater image super-resolution (SR) is a challenging task because underwater images usually contain severely blurred details, color distortion, and low contrast. Although numerous deep learning-based methods have been developed to solve these problems, they suffer from large parameter counts and heavy computation. To address this gap, we propose a frequency-domain enhanced attention network (FEAN), built from a series of frequency-enhanced attention modules (FEAM), for accurate underwater SR. Specifically, we start by utilizing a Gaussian filter to decompose the features into high and low frequencies and pass them to the FEAM. Then, in the high-frequency path, we propose a multi-scale attention enhancement block (MAEB) to extract rich image texture information, while in the low-frequency path, we perform a simple convolutional operation to adjust the brightness and contrast of the image. Further, we devise a channel attention fusion block (CAFB) to integrate the enhanced high and low-frequency features and strengthen the representational capability of the network. Finally, we employ two convolutions to further modulate the features on the high-frequency path for effective color bias correction and detail enhancement. Experimental results show that our FEAN performs better than other underwater SR methods on the USR-248 dataset, with PSNR values of 29.97 dB, 26.23 dB, and 23.99 dB at scale factors of $\times 2$, $\times 4$, and $\times 8$, respectively.


I. INTRODUCTION
Underwater image super-resolution (SR) plays a crucial role in various fields such as scientific research and resource development. However, the harsh underwater imaging environment can lead to a series of problems such as reduced contrast, color distortion, scattering effects, and insufficient brightness in underwater images. Therefore, we need to seek ways to improve the quality of these images to meet our development needs. In our research, we focus on image SR to address these issues.
The associate editor coordinating the review of this manuscript and approving it for publication was Md. Moinul Hossain.
Image SR technology utilizes the internal information and statistical characteristics of images to restore lost high-resolution (HR) details from low-resolution (LR) images. Image SR has widespread applications in fields such as medical imaging [1], [2], remote sensing [3], [4], and surveillance [5], [6], [7]. Classical image SR methods are usually based on techniques such as signal processing, interpolation, and filtering. However, because they rely heavily on manually designed rules, they lack the ability to learn autonomously and exhibit limited effectiveness in dealing with diverse image processing needs. Recently, in the domain of SR, convolutional neural networks (CNN) [8], [9], [10], [11], [12], [13], [14], [15], residual networks (ResNet) [16], [17], [18], [19], [20], [21], and generative adversarial networks (GAN) [22], [23], [24], [25], [26] have become major research trends. Dong et al. [9] used a three-layer convolutional network to fit nonlinear maps, obtaining good performance compared to traditional SR methods. Subsequently, Lim et al. [17] adopted a residual network and removed the batch normalization layer [27] to construct a deeper network and achieve significant SR accuracy improvement. Xu et al. [20] combined the ideas of residual connections and dense connections, enabling the network to capture image features from different levels and perspectives, which helps to improve the diversity and richness of features. Jiang et al. [21] designed a hierarchical dense residual block (HDB) to reduce the complexity and number of parameters introduced by dense connections. Zhang et al. [10] constructed an attention mechanism to automatically adjust the interdependence between feature channels.
Although the above methods perform well in natural image SR tasks, they do not fully consider the unique characteristics of the underwater environment, such as color casts, color artifacts, and blurring. Accordingly, a wide variety of underwater SR networks have emerged to counter these adverse effects. Islam et al. [28] presented deep simultaneous enhancement and super-resolution (Deep SESR) for underwater SR reconstruction and enhancement. Wang et al. [29] proposed a progressive frequency-interleaved network (PFIN) that uses a frequency-domain progressive mechanism to learn different information. Ren et al. [30] presented a novel U-Net-based reinforced Swin-Convs Transformer (URSCT) for simultaneous enhancement and SR. Chen et al. [31] constructed a progressive attentional learning (PAL) method for the underwater SR task to learn a nonlinear mapping from LR images to HR images. Sharma et al. [32] used the traditional pixel-wise and feature-based cost function to propose a novel framework, called Deep WaveNet, improving the spatial resolution of degraded underwater images. Similarly, Chen et al. [33] established a range-dependency learning network (RDLN) to model short and long-range dependency of multiscale features.
Although some progress has been made in underwater image SR, existing methods still face some challenges. On the one hand, most SR models have large parameter scales and high memory overheads, making it difficult to apply these models directly to underwater robots. On the other hand, few methods have used frequency domain-based strategies to adjust the high and low-frequency information of an image, which hinders accurate reconstruction of the edges and texture of an image.
To address the aforementioned limitations, we propose a lightweight and effective frequency-domain enhanced attention network (FEAN) for underwater image SR reconstruction. Our FEAN architecture consists of multiple frequency-enhanced attention modules (FEAM), which can explore high and low-frequency components of the image to generate more delicate features. To be specific, we first utilize Gaussian filtering to decompose the features into high and low-frequency branches and process them separately using different operations. Then, we devise a multi-scale attention enhancement block (MAEB) to capture realistic textures in the high-frequency branch, while leveraging a simple convolution operation to enhance the low-frequency branch. Finally, we propose a channel attention fusion block (CAFB) to fuse and modulate frequency domain information according to the importance of features. Experimental results demonstrate that our FEAN exceeds mainstream SR approaches with fewer parameters and computations.
To summarize, our contributions can be highlighted in three main aspects:
• We propose a lightweight FEAN for underwater SR tasks. Thanks to the network backbone FEAM, supported by the MAEB and CAFB, our FEAN achieves excellent performance with low resource consumption.
• MAEB captures information at different scales and recalibrates features to focus on valuable characteristics, improving the representation of high-frequency branches and reconstructing the edge and texture of underwater images.
• CAFB aggregates high and low-frequency information effectively, and assigns different weights to different levels of information, jointly regulating the impact of information in the frequency domain.

II. RELATED WORK
A. UNDERWATER IMAGE SR
With the development of deep learning, image SR technology has become a hot research topic in computer vision. Unlike natural images, underwater images are subject to complex lighting conditions and environments, which tend to cause undesirable effects such as color deviation and detail blurring. Therefore, to further improve image reconstruction quality, plenty of works have been developed for underwater image SR tasks. In 2020, the first publicly available dataset, USR-248 [24], was introduced for underwater SR tasks. Meanwhile, researchers proposed two fully-convolutional deep residual networks, named SRDRM and SRDRM-GAN, obtaining good performance and visual effects.
Deep SESR adopted the multi-scale strategy to mitigate the local distortion problem in underwater scenes. URSCT [30] integrates the Swin Transformer into the U-Net model, thus enhancing its global feature capture capability and improving the overall reconstruction results. As more and more underwater datasets become available, Aghelan [34] employed a pre-training and transfer learning strategy to fine-tune Real-ESRGAN [35] using the USR-248 and UFO-120 datasets, generating higher-resolution and better-quality underwater images. RDLN [33] reduces the number of parameters of the network by introducing a channel-splitting module in multi-scale convolution. At the same time, it makes full use of the different properties of CNN and Transformer to learn the short-range and long-range dependencies to generate high-quality high-resolution images. Although many effective design methods have been developed to solve underwater SR tasks, existing methods neglect the interaction of frequency information, resulting in poor image reconstruction.

B. ATTENTION-BASED NETWORK
The attention mechanism is an effective optimization tool that focuses on important features to improve computational efficiency and is widely used in SR tasks. Zhang et al. [10] designed a residual channel attention block to learn inter-channel statistics, boosting the discriminative ability of the network. Additionally, the residual attention module (RAM) [36] and channel-wise and spatial feature modulation (CSFM) [37] combined channel attention and spatial attention to utilize the interdependencies between channels and spatial features. Thanks to this advantage, the attention mechanism has also been introduced in underwater SR tasks to enhance the network reconstruction efficiency. For example, an attention-guided multipath cross-convolution neural network (AMPCNet) [38] introduced a channel attention mechanism in the field of underwater SR to assign more significant weights to high-frequency information, enhancing the edge and texture information in the image. In Deep WaveNet [32], the authors used the convolutional block attention module to adaptively regulate the flow of channel-specific information in the network. PFIN [29] devised a global spatial attention block to capture meaningful context information by learning a group of weights for removing distorted information. Fu et al. [39] devised nonlocal attention and channel attention mechanisms to mine and enhance more detailed features. These attention mechanism designs can capture valuable feature statistics, but they do not adequately consider multi-scale feature information and frequency information, which is not conducive to high-quality underwater image reconstruction. Inspired by the advantages of the attention mechanism, we design MAEB to strengthen the high-frequency features and recover finer features. Meanwhile, we propose a lightweight CAFB to explore the interaction of information in different frequency domains, improving the representational capability of the network.

III. PROPOSED METHOD
The overall architecture of the proposed FEAN is illustrated in Fig. 1. The network can be divided into three stages: (1) Stage 1 focuses on extracting shallow features from the input image, denoted as $X \in \mathbb{R}^{H \times W \times 3}$. (2) Stage 2 is specifically designed for frequency separation and enhancement. (3) Stage 3 aims to rebuild HR underwater images, and the final output is denoted as $S \in \mathbb{R}^{rH \times rW \times 3}$. Here, $H$ and $W$ represent the height and width of the input image, respectively, and $r$ is the scaling factor.
Stage 1: As investigated in [40], we utilize a single convolutional layer to extract the shallow feature $F_0$ from the LR input $X$:
$$F_0 = H_{SFE}(X)$$
Here, $H_{SFE}(\cdot)$ denotes the shallow feature extraction convolution with a $3 \times 3$ kernel size. The extracted shallow features $F_0$ are then passed on to the next stage for further processing.
Stage 2: Non-uniform degradation in underwater scenes can seriously affect the quality of image restoration, so a model with rich feature representation capabilities is needed. Therefore, we explore different frequency domain information to enhance the texture details and color correction of the image. We use Gaussian filtering to separate the shallow features into a high-frequency component $F^{0}_{HH}$ and a low-frequency component $F^{0}_{LL}$. Gaussian filtering is effective in suppressing high-frequency noise when smoothing a signal or image, which improves the signal-to-noise ratio. As a linear filter, it is also usually fast to compute. We then use $K$ FEAMs to extract finer features; a detailed description of FEAM is provided in Section III-B. Each FEAM refines the high and low-frequency features passed on by the previous module, and the output features of the $k$-th FEAM are fed to the next one. After the step-by-step extraction by the FEAMs, the extracted high and low-frequency features are aggregated and smoothed using a $1 \times 1$ convolutional layer. In this aggregation, $[\cdot, \cdot]$ represents the concatenation operation along the channel dimension, $f_{1 \times 1}(\cdot)$ denotes a $1 \times 1$ convolutional layer, $k_1$ and $k_2$ represent the learnable scaling factors, and $F_{assemble}$ refers to the aggregated features.
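The frequency separation in Stage 2 can be illustrated with a minimal, dependency-free sketch. It assumes that the low-frequency component is a Gaussian-blurred copy of a single-channel feature map and that the high-frequency component is the residual (input minus blur). The text does not state how the high-frequency part is derived, so this residual form, the kernel size of 5, `sigma = 1.0`, the border replication, and all function names are illustrative assumptions rather than the authors' implementation.

```python
import math


def gaussian_kernel_1d(size=5, sigma=1.0):
    """Return a normalized 1-D Gaussian kernel (size and sigma are assumed values)."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Filter every row with the kernel, replicating border pixels."""
    half = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - half, 0), width - 1)  # clamp index at the borders
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*img)]


def split_frequencies(img, size=5, sigma=1.0):
    """Split one H x W map into (high, low) parts.

    low  = separable Gaussian blur (rows, then columns)
    high = img - low   (assumed residual definition)
    """
    kernel = gaussian_kernel_1d(size, sigma)
    low = transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))
    high = [[p - l for p, l in zip(row, low_row)] for row, low_row in zip(img, low)]
    return high, low
```

Under this residual definition the two components always sum back to the input, so no information is discarded by the split itself; a perfectly flat map ends up entirely in the low-frequency branch.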
Stage 3: This stage is used for underwater image reconstruction, where the refined features are upsampled to the target HR image $S$. Following [41], we use two upsampling operators, $H_{UP0}$ and $H_{UP1}$. This approach merges low-resolution image features with the high and low-frequency features, thus mitigating the loss of information. Both $H_{UP0}(\cdot)$ and $H_{UP1}(\cdot)$ contain a $3 \times 3$ convolution and a sub-pixel convolution.
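The rearrangement step of a sub-pixel convolution can be sketched as follows. This is only an illustration of the depth-to-space shuffle that follows the $3 \times 3$ convolution; the channel-first layout, the channel ordering, and the function name `pixel_shuffle` are assumptions made for this example and are not taken from the paper.

```python
def pixel_shuffle(features, r):
    """Rearrange C*r*r channels of size H x W into C channels of size rH x rW.

    features: list of channels, each an H x W list of lists (channel-first, assumed).
    r:        integer upscaling factor.
    """
    channels = len(features)
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    h, w = len(features[0]), len(features[0][0])
    out_channels = channels // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(out_channels)]
    for c in range(out_channels):
        for i in range(r):          # vertical sub-pixel offset
            for j in range(r):      # horizontal sub-pixel offset
                src = features[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out
```

For example, with $r = 2$ a feature tensor with 12 channels of size $2 \times 3$ becomes a 3-channel map of size $4 \times 6$, which mirrors how an $H \times W$ feature map is turned into an $rH \times rW$ RGB output.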

A. MODEL LEARNING
To ensure a fair comparison with state-of-the-art competing methods, we employ the $L_1$ loss function, which helps preserve intricate textures and local structures. The training dataset consists of LR images and their corresponding HR images, and the goal of training FEAN is to minimize the $L_1$ loss between the reconstructed images and their HR counterparts.
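The $L_1$ objective described above is commonly written in the following form. The symbols are assumptions made for illustration ($N$ training pairs, $X_i$ an LR image, $Y_i$ its HR counterpart, and $H_{FEAN}$ the network with parameters $\Theta$) and may differ from the notation of the original typeset equation.

```latex
% Assumed notation: N pairs (X_i, Y_i); H_FEAN is the network with parameters \Theta.
\begin{equation*}
  \mathcal{L}_1(\Theta) \;=\; \frac{1}{N} \sum_{i=1}^{N}
  \bigl\lVert H_{FEAN}(X_i; \Theta) - Y_i \bigr\rVert_1
\end{equation*}
```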

B. FREQUENCY ENHANCED ATTENTION MODULE (FEAM)
To achieve higher-quality underwater images, the network needs to explore different frequency domain information of the image. In this way, the network works well for color bias correction, contrast enhancement, and detail restoration. To this end, we propose FEAM, which mainly consists of MAEB and CAFB, to explore different frequency properties of underwater images. As depicted in Fig. 1(a), the Gaussian filter decomposes features into high-frequency and low-frequency paths, and the frequency domain information is then processed in a heterogeneous manner. In the high-frequency path, we devise MAEB to further enhance the high-frequency features, while in the low-frequency path, we utilize a $1 \times 1$ convolution and a $3 \times 3$ convolution for cost-effective feature extraction. The low-frequency information of an image mainly reflects the parts of the image that change slowly. Using small convolutional kernels filters local regions, which reduces the high-frequency noise in the image and adjusts its brightness and contrast. Subsequently, the different frequency features are passed to the CAFB for further fusion and enhancement. More importantly, we adopt two convolutions to preserve the original spatial information and then modulate the features in the high-frequency branch; we refer to this as the spatial modulation block (SMB). In this process, $H^{k}_{SMB}(\cdot)$ represents two $3 \times 3$ convolutional operations in the $k$-th SMB, and $H^{k}_{MAEB}(\cdot)$ and $H^{k}_{CAFB}(\cdot)$ denote the $k$-th MAEB module and the $k$-th CAFB module, respectively. $f_{3 \times 3}(\cdot)$ and $f_{1 \times 1}(\cdot)$ denote convolutions with $3 \times 3$ and $1 \times 1$ kernel sizes. $F^{k'}_{HH} \in \mathbb{R}^{H \times W \times C}$ represents the high-frequency features enhanced by MAEB, while $F^{k''}_{HH} \in \mathbb{R}^{H \times W \times C}$ denotes the high-frequency features adjusted in CAFB. $F^{k'}_{LL} \in \mathbb{R}^{H \times W \times C}$ denotes the augmented low-frequency feature. For brevity, the activation function and residual representation are omitted.
Multi-scale Attention Enhancement Block (MAEB) distills richer texture features in the high-frequency branch to support high-quality image reconstruction, as depicted in Fig. 1(b). MAEB achieves a more comprehensive focus on high-frequency features by considering feature information at different scales to better preserve image details. Firstly, we adopt two convolutions with different dilation rates to capture two different features $F^{k-1}_{d=5} \in \mathbb{R}^{H \times W \times C}$ and $F^{k-1}_{d=3} \in \mathbb{R}^{H \times W \times C}$, expanding the learning of contextual cues. Here, we employ two $3 \times 3$ convolutional operations with dilation rates $d$ of 3 and 5, respectively. This configuration effectively expands the receptive field without introducing additional parameters or computational complexity. Then we fuse the multi-scale features and employ the Sigmoid operation to generate distinct attention weights $\alpha$. Next, the attention weights recalibrate features at different scales to further highlight information-rich regions. Finally, we integrate these features using different convolution operations, and a residual connection is introduced to deliver the flow of information. In this process, $\sigma$ denotes the Sigmoid function, and $f^{d=3}_{3 \times 3}$ and $f^{d=5}_{3 \times 3}$ represent the dilated convolutions with dilation rates of 3 and 5, respectively. For brevity, the residual connections are omitted.
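The claim that dilation enlarges the receptive field at no parameter cost can be checked with a small calculation. The sketch below uses the standard relation for the spatial extent of a dilated kernel, $k_{eff} = k + (k - 1)(d - 1)$; the function names and the choice of 64 input and output channels (matching $C = 64$) are assumptions for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels, out_channels, kernel_size):
    """Number of weights in a 2-D convolution (bias excluded); independent of dilation."""
    return in_channels * out_channels * kernel_size * kernel_size


# A 3 x 3 kernel with the dilation rates used in MAEB.
for d in (3, 5):
    extent = effective_kernel_size(3, d)
    weights = conv_weight_count(64, 64, 3)
    print(f"dilation {d}: covers {extent} x {extent} pixels with {weights} weights")
```

With a $3 \times 3$ kernel, dilation rates of 3 and 5 cover $7 \times 7$ and $11 \times 11$ neighbourhoods respectively, while the weight count stays that of an ordinary $3 \times 3$ convolution.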
Channel Attention Fusion Block (CAFB) aims to merge the high and low-frequency information to enhance the representation of important features and reduce the impact of noise. By integrating the frequency domain information, CAFB focuses more effectively on the high and low-frequency features of key regions, which improves the network's ability to reconstruct the details of degraded underwater images. Initially, we concatenate the frequency domain information and employ a $1 \times 1$ convolution to obtain fused features. Then we compute feature statistics on each channel of the feature maps by employing average pooling. Subsequently, we enhance the feature statistics through two $3 \times 3$ convolutions. Next, Gumbel Softmax is applied to yield the channel attention weights $\beta$. Gumbel Softmax introduces a more flexible and differentiable strategy, enabling the model to handle discrete random variables more effectively. In CAFB, this method enables the model to selectively focus on high-frequency and low-frequency features. Meanwhile, by introducing continuity in discrete selection, the model maintains differentiability, allowing for the use of backpropagation for training. Finally, the channel attention weights recalibrate the features on the high and low-frequency paths respectively, resulting in the ultimate feature representation. Through this process, we effectively integrate various information, thereby elevating the feature expression capacity. In this process, $w(\cdot)$ represents the Gumbel Softmax function and $avg(\cdot)$ represents the average pooling operation.
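The two operations named here, $avg(\cdot)$ and $w(\cdot)$, can be sketched in plain Python. The following is a generic Gumbel Softmax over a vector of logits together with per-channel average pooling; it skips the two $3 \times 3$ convolutions, and the temperature `tau`, the clamping constant, and the function names are assumptions rather than details given in the paper.

```python
import math
import random


def channel_means(features):
    """Global average pooling: one mean per channel of a channel-first feature list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample soft weights that sum to 1 by adding Gumbel noise to the logits.

    tau is the temperature (assumed value): lower values push the
    weights towards a one-hot selection.
    """
    rng = rng or random.Random()
    eps = 1e-12  # keeps the uniform sample strictly inside (0, 1)
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical usage: pooled statistics of two channels turned into weights.
stats = channel_means([[[1.0, 3.0]], [[2.0, 6.0]]])
weights = gumbel_softmax(stats, tau=1.0, rng=random.Random(111))
```

Because the noise is added before a softmax rather than an argmax, the resulting weights remain differentiable with respect to the logits, which is the property the paragraph above relies on.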

IV. EXPERIMENTS
In this section, we begin by describing the datasets used in the experiments and providing implementation details for network training. Subsequently, we assess the performance of our proposed network on two publicly available datasets. We then examine the contributions of individual components within our proposed network.

A. DATASET AND EXPERIMENTAL SETUP
In this research, we employ publicly accessible underwater image datasets for training our network, specifically the USR-248 dataset [24] and the UFO-120 dataset [28]. The USR-248 dataset is a comprehensive collection of high-resolution underwater images and their corresponding low-resolution pairs. It includes LR images of sizes 320 × 240, 160 × 120, and 80 × 60, corresponding to downsampling factors of ×2, ×4, and ×8, while the HR images are of size 640 × 480. A total of 1060 RGB image pairs are used for training and validation, along with an additional 248 test images. The UFO-120 dataset comprises 1500 paired images for training and an additional 120 images for testing. The HR images in this dataset have dimensions of 640 × 480, while the LR images come in sizes of 320 × 240, 213 × 160, and 160 × 120, corresponding to downsampling factors of ×2, ×3, and ×4, respectively. The performance of image SR is assessed using three criteria: the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM), and the underwater image quality measure (UIQM).
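Of the three criteria, PSNR is the simplest to state in code. The sketch below uses the standard definition $10 \log_{10}(MAX^2 / MSE)$ on flattened pixel lists; the 8-bit peak value of 255 and the function name are assumptions, and the exact evaluation protocol used for the reported numbers (color space, cropping) is not specified in the text.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Since the measure is logarithmic in the mean squared error, a gain of 3 dB corresponds to roughly halving the error.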
We employ the Adam optimizer to minimize the objective function in our network, with optimizer parameters $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\varepsilon = 10^{-8}$. The initial learning rate is set to $10^{-3}$ and is halved every 200 epochs. To accommodate memory limitations [42], each batch is composed of 32 LR patches of size 60 × 60 for the SR task. The number of FEAMs is set to $K = 4$ and the number of feature channels is set to $C = 64$. The random seed is set to 111 throughout our experiments to facilitate reproducibility. Our model is implemented using the PyTorch framework and executed on an NVIDIA RTX 3060 GPU.
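The step schedule described above can be written as a one-line function. This sketch assumes zero-based epoch counting, so the first halving takes effect at epoch 200; the function name and the default arguments simply restate the values in the text.

```python
def learning_rate(epoch, base_lr=1e-3, step=200, gamma=0.5):
    """Learning rate after halving every `step` epochs (epochs counted from 0, assumed)."""
    return base_lr * gamma ** (epoch // step)


# Print the rate at a few representative epochs.
for epoch in (0, 199, 200, 400, 600):
    print(epoch, learning_rate(epoch))
```

In a PyTorch training loop the same behaviour is usually obtained with a step-based scheduler rather than a hand-written function.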

B. EXPERIMENTAL EVALUATION ON THE USR-248 DATASET
We compare our proposed FEAN with several SR networks, including SRCNN [9], VDSR [16], DSRCNN [43], EDSRGAN [17], SRGAN [22], ESRGAN [23], LatticeNet [44], SRDRM [24], SRDRM-GAN [24], AMPCNet [38], PAL [31], and RDLN [33]. The experimental results can be seen in Table 1. In comparison to popular SR works, our FEAN shows competitive results on all scale factors. Compared to SR methods designed for natural images, such as LatticeNet, our method boosts PSNR, SSIM, and UIQM by up to 2.19%, 18.5%, and 4.88%, respectively. This is because our FEAN takes into account the interaction between different frequency information, making it better suited to underwater image recovery. In addition, our FEAN obtains the best or second-best performance compared to underwater SR methods. For example, our network yields gains of 0.16 dB in PSNR and 0.02 in SSIM over AMPCNet at the large scale factor ×8, even with fewer computations. Further, RDLN adopts the Transformer structure to improve reconstruction accuracy, but its evaluation metrics are much lower than those of our proposed FEAN. This indicates that our proposed network does not require complex mechanisms to improve the results, but exploits the proposed FEAM to explore and utilize different frequency characteristics to achieve better performance. Moreover, we present visual comparisons on the USR-248 dataset in Fig. 2. Our FEAN yields better visual effects, and its recovered images show sharper texture and detail information. AMPCNet and PAL show severe blurring artifacts. In contrast, our proposed FEAN can effectively solve these issues by exploring frequency domain cues.
VOLUME 12, 2024. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

D. ABLATION STUDY
To explicitly exhibit how our proposed components improve restoration results, we remove MAEB, CAFB, and SMB in turn and retrain the network; the experimental results are shown in Table 3. When FEAN is without MAEB, the performance drops dramatically, and the PSNR, SSIM, and UIQM are reduced by 0.46 dB, 0.022, and 0.04, respectively.

1) ABLATION OF PROPOSED CAFB IN FEAN
We compare the performance of the Squeeze-and-Excitation block (SEB) [48] and CAFB modules on the USR-248 dataset. The experimental results are presented in Table 4. Compared with the FEAN model using SEB, the FEAN model with CAFB presents comparable results at scale factor ×2; however, its PSNR improves by 0.012 dB and 0.02 dB at scale factors ×4 and ×8, respectively.

2) ABLATION OF PROPOSED GAUSSIAN FREQUENCY SEPARATION BLOCK IN FEAN
We compare FEAN with a variant in which the Gaussian frequency separation module is removed, using the USR-248 dataset. As seen from the experimental results in Table 4, the FEAN model with Gaussian frequency separation performs better in both PSNR and SSIM at different scale factors.

3) ABLATION OF PROPOSED MAEB WITH VARYING DILATION RATES IN FEAN
We conduct comparative experiments on the dilated convolutions of MAEB with dilation rates set to 2 and 4, compared to the previous settings of 3 and 5, and the detailed results are shown in Table 4. MAEB with dilation rates of 3 and 5 exhibits superior performance while maintaining the same number of parameters. We believe that adopting a larger dilation rate helps to capture a wider range of high-frequency features, thus enhancing the details of underwater image SR.
In addition, as demonstrated in Fig. 4, we present the average gray-scale feature maps with and without MAEB enhancement on the high-frequency branch at different stages.We can see that the features enhanced by MAEB have sharper and finer texture details.
Although the networks with and without CAFB have smaller gains in SSIM and UIQM, they differ by 0.14 dB in PSNR.This is because CAFB can effectively gather high and low-frequency features to restore abundant texture characteristics.Fig. 5 illustrates the heatmap of the high and low-frequency branches with and without CAFB processing for the first and fourth phases.The first row indicates the high-frequency branches and the second row indicates the low-frequency branches.As Fig. 5 (c) and (d) demonstrate, the model processed with CAFB better enhances the frequency domain information, exhibiting more positive responses at the target object.
Moreover, FEAN augmented with SMB improves both PSNR and UIQM, the latter by 0.06, while increasing the parameter count by only 80K. The convergence results for different components are shown in Fig. 6, where it can be seen that FEAN without SMB converges more slowly. These results reveal the validity and rationality of the proposed components, which jointly contribute to the performance and computational efficiency.

E. EFFECTIVENESS OF MULTI-MODAL OBJECTIVE FUNCTION
We investigate the influence of the loss function on the underwater SR task, as shown in Table 5. We conduct experiments with three loss functions, namely, L1, L2, and L1+L2. The experimental results show that the L1 loss function performs well in terms of PSNR and SSIM, leading L2 by 0.172 dB and 0.025, respectively. Additionally, L1 outperforms L1+L2 in both PSNR (with a lead of 0.179 dB) and SSIM (with a lead of 0.024). Note that although the L1 loss function is superior in PSNR and SSIM, it falls short of L1+L2 on the UIQM index. This is probably because the L1 loss function focuses on minimizing the absolute error during optimization and ignores some image features related to human visual perception.

F. MODEL COMPLEXITY
To exhibit the computation efficiency of the proposed method, we evaluate it in terms of the number of parameters, FLOPs, average time, and computational efficiency, respectively. We select some representative methods and measure their inference time on the USR-248 dataset, all executed on the same device with an RTX 3060. As Table 6 displays, the average time per image of our FEAN is 0.04671 s (21.41 fps), which is faster than most popular underwater approaches. As analyzed in Section IV-B, the quantitative results show that our proposed method obtains better restoration accuracy with appropriate computational complexity and superior computational efficiency. This is because our FEAN is composed of a sequence of FEAMs that keep the entire network light enough, facilitating fast inference. Therefore, we can conclude that our proposed method strikes a good balance between network reconstruction performance and computational efficiency.
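The relation between the reported per-image time and frame rate is a simple reciprocal, which the following sketch makes explicit. The function name is an assumption, and only the 0.04671 s figure is taken from the text.

```python
def frames_per_second(seconds_per_image):
    """Convert an average per-image inference time into frames per second."""
    if seconds_per_image <= 0:
        raise ValueError("time per image must be positive")
    return 1.0 / seconds_per_image


# Reported average inference time per image for FEAN at scale factor x4.
fps = frames_per_second(0.04671)
print(round(fps, 2))
```

Rounded to two decimals this reproduces the 21.41 fps quoted alongside the timing.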

V. CONCLUSION
In this work, we propose a lightweight and effective frequency-domain enhanced attention network (FEAN) for underwater SR. To be specific, frequency-enhanced attention modules (FEAM), composed of a multi-scale attention enhancement block (MAEB) and a channel attention fusion block (CAFB), act as the backbone of the network, inferring rich texture details in a coarse-to-fine manner. MAEB can capture more useful information in the high-frequency branch to facilitate the reconstruction of HR images. Further, CAFB is proposed to aggregate high and low-frequency information, which can achieve color deviation correction and detail enhancement. Extensive experiments show that our FEAN attains better performance with less model complexity than mainstream underwater SR methods. However, the diversity of underwater environments and limited training data limit the generalization ability of the model. In the future, we will focus on physical model learning to build a network that can adapt to a variety of imaging conditions, and we will construct a large-scale dataset of real underwater images to improve the performance of super-resolution models for underwater images.

FIGURE 1. The overall pipeline of our proposed frequency-domain enhanced attention network (FEAN) is implemented by sequential stacking of FEAMs. (a) The architecture of the multi-scale attention enhancement block (MAEB). (b) The architecture of the channel attention fusion block (CAFB).

TABLE 2. Quantitative results on the UFO-120 dataset with scale factors of ×2, ×3, and ×4 for underwater image SR. Bold indicates the best performance.

TABLE 4. Ablation studies of proposed components on the USR-248 dataset. "FEAN w/o Gaussian" denotes the model that does not use Gaussian filtering for frequency separation, "FEAN with MAEB(2,4)" denotes the model with MAEB dilation rates set to 2 and 4, and "FEAN with SEB" denotes the model that uses SEB instead of CAFB.

FIGURE 4. Visualized feature maps of FEAN with and without MAEB using a grayscale colormap. (a)-(d) show feature maps of the high-frequency branch without MAEB; (e)-(h) show feature maps of the high-frequency branch with MAEB.

FIGURE 5. Visualized feature maps of FEAN without and with CAFB. (a) and (b) show the heatmaps on the high and low-frequency branches without CAFB; (c) and (d) show the heatmaps on the high and low-frequency branches with CAFB.

FIGURE 6. Convergence analysis of different components on UFO-120 with scale factor ×4, where the x-axis is the number of epochs, the y-axis in (a) is the PSNR value, and the y-axis in (b) is the SSIM value.

TABLE 3. Ablation studies of proposed components on the UFO-120 dataset with scale factor ×4.

TABLE 5. Ablation studies showing the effectiveness of different loss functions on the USR-248 dataset with scale factor ×4.

TABLE 6. Computation efficiency of representative methods with scale factor ×4 on the USR-248 dataset.