Loop Residual Attention Network for Automatic Segmentation of COVID-19 Chest X-Ray Images

As COVID-19 continues to put pressure on the global healthcare industry, using artificial intelligence to analyze chest X-rays (CXR) has become an effective way to diagnose the virus and treat patients. Although many studies have made significant progress in COVID-19 detection, accurately segmenting infected regions with variable locations and scales from COVID-19 CXR images remains challenging. Therefore, this paper proposes a novel framework for COVID-19 CXR image segmentation. Specifically, a loop residual module is designed to cyclically extract feature information during encoder-decoder splicing, avoiding the loss of complex semantic information in network computation. At the same time, an absolute position information coding block is proposed to strengthen the position information of feature pixels. Moreover, a hybrid attention module is designed to establish semantic associations between channels and multi-scale spaces. A better feature representation is formed by fusing location and scale information to alleviate the impact of variable infection regions on segmentation performance. Extensive experiments are conducted on the public COVID-19 CXR dataset COVID-QU-Ex, and the results show that our network outperforms other networks in COVID-19 segmentation and is robust.


I. INTRODUCTION
COVID-19 is an acute respiratory infectious disease caused by the coronavirus. Most infected people have fever, cough, fatigue, and headaches, and severe patients experience symptoms such as pneumonia, acute respiratory syndrome, and renal failure [1]. At the same time, the virus continues to mutate, producing variants such as Delta, Lambda, and the recently emerged Omicron. The mutant strains are stealthier and more infectious, posing a greater threat to people's health. Therefore, how to quickly and accurately detect infected patients and take corresponding isolation and treatment measures is very important for defeating the epidemic.
Due to the strong latency and occult nature of COVID-19, the current reverse transcriptase-polymerase chain reaction (RT-PCR) technology cannot detect the virus in a timely and accurate manner [2], [3]. Artificial intelligence technology based on neural networks has demonstrated powerful capabilities in multiple tasks in recent years [4]. At the same time, the development of translational medicine bridges the gap between basic research and clinical applications and builds a bridge for artificial intelligence to understand COVID-19 [5], [6]. Therefore, deep learning-based COVID-19 CXR image analysis technology has become an effective way to assist medical staff in diagnosing COVID-19 pneumonia [7].
Recently, many researchers have made significant progress in detecting pneumonia using chest X-ray images. Reference [8] proposed a multi-task framework (Covid-MANet) that first segments the lungs and then performs the segmentation, localization, and severity assessment of the infected area. Reference [9] designs a Convolutional Support Estimation Network (CSEN) that fuses representation-based classification methods into neural networks to classify patient CXR images. A TransUNet is proposed in [10], which uses a Transformer encoder to encode feature image patches as input sequences for extracting global context information. However, the splicing of high and low semantics in these networks is relatively simple; the semantics cannot be well integrated, easily leading to the loss of pixel features. Therefore, when handling multi-focal, low-contrast, and unclear COVID-19 CXR images, as shown in Fig. 1(a), it is difficult for the network to obtain sufficient feature information. Reference [11] proposed a patch-based deep neural network structure that can be stably trained with smaller datasets, with classification results determined by voting over multiple patches. In Reference [12], human-machine collaboration was used to construct the COVID-QU-Ex dataset, and various classical networks were used to complete the detection, localization, and quantification of COVID-19. References [13] and [14] use transfer learning and pre-trained models to detect and classify COVID-19. However, most of the above studies ignore the impact of absolute location information on COVID-19 detection and cannot sufficiently establish the semantic dependencies between infected regions under large-scale changes. Therefore, when handling highly similar CXR images with very different infected areas, as shown in Fig. 1(b), the network cannot maintain high sensitivity to location information.
At the same time, the network also cannot cope with the interference that large-scale changes in the infected area cause to detection performance, as shown in Fig. 1(c). This paper proposes a novel COVID-19 pneumonia segmentation network, LRA-Net, based on absolute location information to solve the above problems. Specifically, for the problem of missing feature pixels, we design a loop residual block (LRB) to narrow the semantic gap. Unlike the previous single-branch, single-layer skip-connection residual structure, this structure has multiple branches, considering skip connections across multiple layers as well as reverse skip connections. Skip connections across multiple layers enable convolutional layers to receive semantic features from farther away; that is, the feature input of a convolutional layer contains not only the input and output of the previous layer but also the inputs and outputs of more distant layers. The reverse skip connection concatenates the feature information in reverse order and then feeds it into the convolutional layer, which makes the capture of feature information more comprehensive. The whole module acts as a bridge connection layer between the encoder and decoder. It not only buffers feature semantics and promotes encoder-decoder feature fusion but also obtains a richer feature representation.
Aiming at the problem of sensitive location information, we propose an absolute position module (APM) to enhance position information. This module adopts a brute-force encoding form, inserting two position-encoding layers of depth 1 that encode the two-dimensional position information of the pixel features before convolution. Through this operation, Cartesian absolute position information [15] is fused into the pixel features, which strengthens the representation of absolute pixel positions and lets the convolution perceive the global spatial information of the feature.
Aiming at the problem of large-scale changes in infected regions, we design a mixed attention module (MAB) to capture the interdependencies between scale changes. Specifically, the module is placed behind encoder layers 2, 3, 4, and 5, combining channel attention and multi-scale spatial attention blocks. First, a channel attention mechanism is established to extract channel relationships. The result is then used as the input of the spatial attention block, and the interdependence between multi-scale spaces is established by splicing features after pyramid pooling. In this way, information interaction between the channel and the multi-scale space is completed, reducing the interference of multi-scale changes in the infected area on network performance.
The main contributions of this paper are summarized as follows:
1) A loop residual module is designed to alleviate the semantic gap between encoders and decoders and to promote the fusion of their semantic features. At the same time, the stepped cross-layer skip connections of the module itself enrich the semantic information of the encoder and enhance its representation of pixel features.
2) An absolute position information module is proposed to enhance the network's representation of position information. By constructing an encoding layer with two-dimensional spatial information, the feature pixels are encoded with two-dimensional position information before participating in the network's computation. This not only avoids the dilution of position information during pooling and downsampling but also enhances the ability of convolution to perceive pixel spatial locations.
3) A hybrid attention module is designed to capture more complex multi-scale lesion information. First, cascaded channel and spatial attention blocks establish the interdependence of channels and spaces. Second, in the spatial attention part, pyramid pooling is used for feature extraction, which strengthens the association between the channel attention mechanism and non-local spatial context information.
The remainder of this paper is organized as follows: Section II briefly reviews related work, Section III describes our proposed LRA-Net network model structure, Section IV conducts an extensive experimental evaluation of the model, and Section V concludes the paper.

II. RELATED WORKS
A. CHEST X-RAY TECHNOLOGY
Chest X-ray is a commonly used radiological diagnostic technique for pulmonary disease [16], [17], [18]. Typically, the lungs of COVID-19 patients show symptoms such as ground-glass opacity [9], bilateral radiological abnormalities [19], cloudy surrounding air, and diffuse air space disease [20]. CXR technology can easily obtain image information about the lungs, heart, and bones. Therefore, since the COVID-19 outbreak, more and more researchers have explored the use of CXR technology to diagnose lung infections in COVID-19 patients. A hybrid deep learning framework (VDSNet) with a VGG16 pre-trained model, data augmentation, and spatial transformation was proposed in [21], which achieved 73% validation accuracy on the NIH CXR image dataset. Reference [22] proposed a capsule network-based framework (COVID-CAPS) to classify four classes of CXR images containing COVID-19, achieving 95.7% accuracy. A novel deep learning neural network architecture (COVID-Net) was proposed in [23], which introduced a lightweight Projection-Extend-Projection-Extend (PEPX) design and reached a detection accuracy of 98.9% on COVID-19 CXR images. Reference [24] proposed a sequence region generation network (SRGNet) to simultaneously localize and segment infected regions of COVID-19 CXR images. Reference [25] proposed an end-to-end network (ReCovNet) that can discriminate CXR images of 14 lung diseases and provided a benchmark dataset, QATA-COVID-19, for CXR detection. In conclusion, more and more studies have shown that chest X-ray technology provides an effective way of detecting pneumonia.

B. COVID-19 IMAGE SEGMENTATION
Accurate segmentation of COVID-19 infected areas is essential for diagnosis, treatment, and epidemic prevention and control. Many researchers have used deep learning network models to complete the task of pneumonia segmentation. Reference [26] proposed a dual-branch combinatorial network (DCN) for the segmentation and classification of pneumonia. Reference [27] proposed a multi-scale dilated convolutional network (MSDC-Net), which uses dilated convolution to capture contextual semantic information at different scales and aggregates features learned in multiple modules during downsampling to improve segmentation accuracy. Reference [28] proposed a multi-net model (MultiR-Net) that combines COVID-19 classification and lesion segmentation, fusing features between two subnetworks through a reverse attention mechanism and an iterative training strategy. Reference [29] proposed a deep neural network (DNN) framework with multi-dimensional kernels and dilated residual blocks in the encoding process to obtain variable receptive fields in feature extraction. Reference [30] proposed a collaborative learning framework for assessing infection severity in COVID-19 patients. Reference [31] proposes Inf-Net, which fuses edge and region information to establish the relationship between them, and introduces a semi-supervised segmentation system to alleviate the lack of data. Reference [32] proposes a noise-resistant segmentation framework (COPLE-Net) with two interrelated adaptive mechanisms to improve performance when handling noisy labels.

C. ENCODER-DECODER NETWORK STRUCTURE
The encoder-decoder is a classical medical semantic segmentation network structure. Through feature extraction in the encoder stage and feature recovery in the decoder stage, a symmetrical network framework is formed. Reference [33] proposed the classical U-Net network structure for biomedical image segmentation. Reference [34] improved the Attention U-Net model with Tversky loss, adding multi-scale input and weighted multi-level predictions as the loss. Reference [35] proposed an interactive attention refinement network (attention RefNet) to enhance segmentation feature information by using channel and spatial attention blocks (SCA). The UNet++ model proposed by [36] reduces the semantic gap between the encoder and decoder through a series of nested dense skip connections. The UNet3+ model structure proposed by [37] adds a full-size skip connection and a classification guidance module on top of UNet++, producing more accurate segmentation results. A ResBCDU-Net network is proposed in [38], which replaces the encoder stage with ResNet-34 and uses Bidirectional Convolutional Long Short-Term Memory (BConvLSTM) as the concatenated channel module to improve segmentation accuracy. In conclusion, the encoder-decoder structure is still an effective biomedical image segmentation structure.

III. PROPOSED METHOD
CXR images have very high inter-class similarities and intra-class differences [22]. Location information is needed to assist the network in accurately identifying and segmenting the pixels in the infected area. Therefore, the location information of each pixel is vital for diagnosing COVID-19 pneumonia. Most segmentation models based on the encoder-decoder structure focus on improving feature-extraction performance in the encoding stage [27], [28], [29]. This paper proposes a novel encoder-decoder segmentation network based on absolute position information (LRA-Net), as shown in Fig. 2. The network integrates the absolute spatial position information coding layer in the decoder stage, which encodes the positions of the pixels and strengthens the absolute position information of the feature pixels. Cascading attention modules in each convolution layer builds multi-scale spatial attention and channel interdependencies. Moreover, the loop residual structure reduces the information difference between high and low semantics.

A. LOOP RESIDUAL BLOCK
The encoder-decoder structure obtains richer features through the concatenation of low-level and high-level semantic information, as in the classic U-Net [33] and its variants [34], [36], and SegNet [42]. However, directly splicing features leads to a semantic gap in the network, degrading segmentation performance [44].
This paper proposes a loop residual block (LRB) to weaken the semantic gap between the encoder and decoder. The first two convolutional layers of the LRB use skip connections to form cascaded residual convolutional layers. The whole is divided into three branches: upper, middle, and lower. The upper branch is a forward connection: the input is first spliced with the output of the first residual convolutional layer, and the result is then spliced with the output of the second residual convolutional layer to obtain C1. The lower branch is a reverse connection: the output of the second residual convolutional layer is first spliced with the output of the first residual convolutional layer, and the result is then spliced with the input to obtain C2:

C1 = Cat(Cat(I, O1), O2) (1)
C2 = Cat(Cat(O2, O1), I) (2)

where I represents the input of the residual structure, O1 is the output of the first residual block, O2 is the output of the second residual block, and Cat(·) denotes feature splicing.
Then C1 and C2 are spliced again with the output O2 of the second residual block to get the final C:

C = Cat(C1, C2, O2) (3)

C is fed into the last convolution module, and the output O is finally obtained through a 1 × 1 convolution:

O = f1×1(C) (4)

where f1×1(·) represents a 1 × 1 convolution operation. Through the above method, the network obtains richer semantic information while retaining the pixel features of the encoder stage. At the same time, it avoids the semantic gap of direct splicing, making the semantic fusion between the encoder and the decoder milder.
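As an illustration, the splicing scheme above can be sketched in PyTorch. The channel widths, normalization, and activation choices below are assumptions for a self-contained example, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class LoopResidualBlock(nn.Module):
    """Sketch of the loop residual block (LRB): two residual convolutional
    layers whose outputs are spliced forward (C1) and in reverse (C2),
    fused with O2, and compressed by a final 1x1 convolution."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.conv2 = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        # C = Cat(C1, C2, O2) carries 7*ch channels: C1 and C2 each splice
        # three ch-wide maps, and O2 adds one more.
        self.fuse = nn.Conv2d(7 * ch, ch, 1)

    def forward(self, x):
        o1 = self.conv1(x) + x        # first residual convolutional layer
        o2 = self.conv2(o1) + o1      # second residual convolutional layer
        c1 = torch.cat([x, o1, o2], dim=1)  # upper branch: forward splice
        c2 = torch.cat([o2, o1, x], dim=1)  # lower branch: reverse splice
        c = torch.cat([c1, c2, o2], dim=1)
        return self.fuse(c)           # final 1x1 convolution
```

Note that the reverse branch differs from the forward branch only in channel ordering; the point of the sketch is to show how the multi-branch splice feeds the closing convolution.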

B. ABSOLUTE POSITION MODULE
Related studies have tried to add absolute location information to networks to enhance their performance. References [39] and [40] proposed that location information is usually learned implicitly through zero padding, which allows CNNs to encode absolute location information. The studies in [41] showed that CNNs exploit absolute spatial position information through filters that respond specifically to particular absolute positions. However, in many cases, such absolute position information may be diluted or lost after global average pooling and global max pooling.
In order to deal with the dilution of position information in the COVID-19 CXR image segmentation process, this paper proposes an absolute position information module (APM).
APM adopts the most direct approach: for the image features input to the network, it encodes the pixel position information in the horizontal and vertical directions using two encoding layers of depth 1:

F'(m, n) = Fapm(F(m, n), m/H, n/W), 0 ≤ m < H, 0 ≤ n < W

where Fapm represents the encoding function, H and W refer to the height and width of the feature map, respectively, and m and n refer to the horizontal and vertical position information given to the feature map. The encoded feature information then undergoes a convolution operation. In this way, the location information of the infected area is perceived more clearly in subsequent network computation. At the same time, an ASPP module is used at the bottom connection between the encoder and decoder, which further strengthens the network's acquisition of lesion information at different scales.
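A minimal PyTorch sketch of this coordinate-encoding idea follows. Concatenating normalized row and column indices as the two depth-1 encoding layers is an assumption here; the paper does not specify the exact normalization:

```python
import torch
import torch.nn as nn

class AbsolutePositionModule(nn.Module):
    """Sketch of the APM: append two depth-1 coordinate maps (normalized
    row and column indices) to the feature map before convolution, so the
    convolution sees absolute Cartesian positions."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + 2, out_ch, 3, padding=1)

    def forward(self, x):
        b, _, h, w = x.shape
        # Depth-1 encoding layers: m/H along rows, n/W along columns.
        rows = torch.linspace(0, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        cols = torch.linspace(0, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        return self.conv(torch.cat([x, rows, cols], dim=1))
```

This is essentially a CoordConv-style design: the convolution can now distinguish identical textures at different absolute positions.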

C. MIXED ATTENTION BLOCK
The self-attention mechanism [43] is widely used in various fields of deep learning to improve the discriminative ability of network models [28], [32]. Reference [35] proposed an interactive segmentation network that combines features at various levels using channel and spatial attention. Reference [45] proposed a network combined with a soft attention mechanism to improve the extraction of information on the relative positions of pixels. However, the attention mechanisms above often ignore the connection between channels and multi-scale spaces and are insufficient for information extraction under large-scale changes. This paper proposes a module that integrates channel attention and multi-scale spatial attention (MAB) to improve the accuracy of pneumonia segmentation under large-scale changes. The module is mainly divided into two parts: a channel attention block and a spatial attention block. Given an original feature P ∈ R^(C×H×W), reshape it to P ∈ R^(C×N), where N = H × W is the number of spatial position pixels. Then compute P P^T and obtain the channel attention map A ∈ R^(C×C) through Softmax:

A = Softmax(P P^T)

where Amn represents the influence of channel n on channel m. Multiplying A and P gives a result in R^(C×N); reshaping it into R^(C×H×W) and multiplying by the learnable factor φ yields the final result Qc ∈ R^(C×H×W):

Qc = φ · reshape(A P) + P

Qc indicates that the feature of each channel is the weighted sum of the original feature and the other channels, thus enhancing the interdependence between the semantics of different channels. Next, a pyramid pooling operation is performed on Qc, and the pooled results are spliced to obtain F = Cat(Qc^x). A 1 × 1 convolution compresses F in the channel dimension to obtain Fc, which is then multiplied element-wise with Qc to obtain Ys:

Ys = f1×1(Cat(Qc^x)) ⊗ Qc

where x ∈ {1, 3, 5, 7} represents the kernel sizes in pyramid pooling, f1×1(·) represents a 1 × 1 convolution compressing the channels, and ⊗ denotes element-wise multiplication.
Pooling with convolution kernels of different sizes obtains spatial information at different scales. Therefore, Ys indicates that each spatial pixel is a weighted combination of pixels at different scales, establishing a connection with the channels.
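The two halves of the module can be sketched in PyTorch as follows. The residual connection in the channel branch and the element-wise fusion in the spatial branch are assumptions consistent with the description above, not a verified reimplementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Channel half of the mixed attention block: A = Softmax(P P^T),
    Qc = phi * (A P) + P, with phi a learnable scalar initialized to 0."""
    def __init__(self):
        super().__init__()
        self.phi = nn.Parameter(torch.zeros(1))

    def forward(self, p):
        b, c, h, w = p.shape
        flat = p.view(b, c, -1)                                    # C x N, N = H*W
        attn = torch.softmax(flat @ flat.transpose(1, 2), dim=-1)  # C x C map
        q = (attn @ flat).view(b, c, h, w)
        return self.phi * q + p

class MultiScaleSpatial(nn.Module):
    """Spatial half: pyramid pooling with kernel sizes {1, 3, 5, 7},
    spliced and compressed by a 1x1 conv, then fused with Qc element-wise."""
    def __init__(self, ch):
        super().__init__()
        self.compress = nn.Conv2d(4 * ch, ch, 1)

    def forward(self, q):
        # Stride-1 pooling with "same" padding keeps every scale aligned.
        pooled = [F.avg_pool2d(q, k, stride=1, padding=k // 2) for k in (1, 3, 5, 7)]
        f = self.compress(torch.cat(pooled, dim=1))
        return f * q
```

Cascading the two blocks (channel first, then spatial) mirrors the order described in Section I.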

D. SEGMENTATION LOSS FUNCTION
The segmentation of COVID-19 CXR images mainly determines the class of each pixel [43]. In this paper, pixels must be divided into foreground (infection) and background (normal), so the binary cross-entropy loss is used as the supervision constraint of this segmentation network:

L = -(1/N) Σi [yi log(pi) + (1 − yi) log(1 − pi)]

where yi is the label of the sample, pi is the probability of being predicted as an infected area, and (1 − pi) is the probability of being predicted as a normal area.
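This is the standard per-pixel binary cross-entropy, which can be sketched directly from the formula (the epsilon clamp is an implementation detail added here for numerical safety):

```python
import math

def bce_loss(labels, probs):
    """Binary cross-entropy over pixels: labels are the ground-truth y_i,
    probs are the predicted infected-class probabilities p_i."""
    eps = 1e-7  # numerical floor to avoid log(0)
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)
```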

IV. EXPERIMENTS

A. DATA AND PREPROCESSING
This paper mainly uses the public COVID-QU-Ex dataset collected by researchers at Qatar University [6]. This dataset has 11,956 COVID-19 images, including 2,913 images with infection labels, and all images are 256 × 256 pixels. Using these 2,913 CXR images, we completed the entire COVID-19 infection segmentation experiment. We also performed lung segmentation experiments to verify the model's effectiveness on simpler tasks. Table 1 shows the division of the dataset used in this paper. During training, we also use affine transformations (rotation, flipping, scaling, etc.) for data augmentation.
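A minimal sketch of paired augmentation that keeps the infection mask aligned with the image follows. The actual pipeline's transform set and parameters (rotation angles, scaling factors) are not specified, so random flips and 90-degree rotations stand in here:

```python
import numpy as np

def augment(image, mask, rng):
    """Apply the same random flip and 90-degree rotation to a CXR image
    and its infection mask so the pair stays aligned."""
    if rng.random() < 0.5:
        image, mask = image[:, ::-1], mask[:, ::-1]  # horizontal flip
    k = rng.integers(0, 4)                           # 0, 90, 180, or 270 degrees
    return np.rot90(image, k).copy(), np.rot90(mask, k).copy()
```

The key design point is that every geometric transform is applied jointly; augmenting the image without the mask would corrupt the supervision signal.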

B. EVALUATION METRICS
We quantitatively evaluate the model's performance at the pixel level using a confusion matrix. First, the pixels in the infected area are marked as positive, and the background pixels are marked as negative. Then the following elements are counted: the number of pixels correctly predicted as the positive class (TP), the number of pixels correctly predicted as the negative class (TN), the number of pixels incorrectly predicted as the positive class (FP), and the number of pixels incorrectly predicted as the negative class (FN). Finally, we evaluate the model's performance using the following metrics: Accuracy, Precision, Recall, F1-score, and MIoU. The mathematical definitions of these evaluation metrics are as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Accuracy here is the ratio of correctly classified pixels to the overall pixels.

Precision = TP / (TP + FP)

Precision here refers to the proportion of pixels predicted as infected that are actually infected.

Recall = TP / (TP + FN)

Recall here refers to the proportion of actually infected pixels that are predicted as infected.

F1 = 2 × Precision × Recall / (Precision + Recall)

F1 here is the harmonic mean of precision and recall. It is often used to measure overall performance when both high precision and high recall are required.

MIoU = (1/2) × [TP / (TP + FP + FN) + TN / (TN + FN + FP)]

To evaluate the reliability of the model in all aspects, Mean Intersection over Union (MIoU) is also introduced to evaluate the overlap between the actual segmentation mask and the predicted segmentation mask.
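These definitions can be computed directly from the four confusion-matrix counts:

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Pixel-level metrics from the confusion matrix. MIoU averages the
    IoU of the infected (positive) and background (negative) classes."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    miou = 0.5 * (tp / (tp + fp + fn) + tn / (tn + fn + fp))
    return accuracy, precision, recall, f1, miou
```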

C. IMPLEMENTATION DETAILS
The experiments in this paper are conducted with an Intel Xeon Silver 4210 CPU @ 2.20 GHz, 64 GB RAM, an Intel Xeon Gold 6142 CPU @ 2.67 GHz, and a 10.5-GB NVIDIA GeForce RTX 3080. The experimental language is Python 3.8, and all models were executed in PyTorch 1.10. During training, considering computational efficiency and memory consumption, we used the Adam optimizer with β1 = 0.9 and β2 = 0.999. The initial learning rate was set to 0.0001, and the number of epochs was set to 100. An adaptive learning rate decay strategy was adopted: after every 10 epochs, if the validation loss does not decrease, the learning rate decays to 0.1 times its original value. The batch size was set to 8 with a weight decay of 0.0002, and early stopping and gradient clipping were used to prevent overfitting. Finally, the trained model weights are tested on the test set, and the corresponding evaluation metrics are obtained.
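The reported optimizer and learning-rate schedule roughly correspond to the following PyTorch configuration. The stand-in model is purely illustrative, and the plateau scheduler is an assumed reading of "decay to 0.1 times if the validation loss does not decrease for 10 epochs":

```python
import torch

# Stand-in module in place of the full LRA-Net.
model = torch.nn.Conv2d(1, 1, 3, padding=1)

# Adam with beta1=0.9, beta2=0.999, lr=1e-4, weight decay 2e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), weight_decay=2e-4)

# Multiply lr by 0.1 when validation loss stalls for 10 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=10)

for epoch in range(3):      # 100 epochs in the paper; shortened here
    val_loss = 1.0          # placeholder for the epoch's validation loss
    scheduler.step(val_loss)
```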

D. INFECTION SEGMENTATION RESULTS
To evaluate the effectiveness of the proposed LRA-Net on COVID-19 CXR image segmentation, we compare it with eight state-of-the-art methods. The comparison networks include the classic FCN [46] and DeeplabV3 [47], U-Net [33] and U-Net++ [36] commonly used in the medical field, the lightweight MiniSeg-Net [48], the attention-rich Attention-Unet [34] and CE-Net [49], and the multi-functional COPLE-Net [32]. The above networks are all open source, and to maintain rigor, we use the same training parameters and evaluation methods for all networks. Table 2 shows the segmentation metrics of the above networks on 583 test COVID-19 images. Because MiniSeg-Net is a lightweight deep learning model, its segmentation performance is the worst among all networks, reaching an MIoU of only 77.50%. Even after FCN and DeeplabV3 use the ResNet50 pre-trained model as the backbone, their MIoU only slightly exceeds 80%, while our proposed LRA-Net does not use any pre-training. U-Net++ improves the way U-Net splices high and low semantic information, raising segmentation performance to an MIoU of 81.24%; but since LRA-Net mixes multiple feature modules, its performance still leads. Attention-Unet and CE-Net use deep supervision to optimize the network, further improving segmentation performance; on Precision, CE-Net achieves the best value of 89.42%. In the future, deep supervision will also be an important direction for further improving LRA-Net's segmentation performance. Overall, LRA-Net performs well on the evaluation metrics, achieving 96.17%, 89.24%, 90.12%, 89.24%, and 81.94% on Accuracy, Precision, Recall, F1-Score, and MIoU, respectively, leading on all metrics except Precision.
Compared with the best metrics among the other eight models, our network improves Accuracy, Recall, F1-Score, and MIoU by 0.08%, 0.02%, 0.4%, and 0.57%, respectively; compared with the basic U-Net, the five metrics improve by 0.24%, 1.49%, 0.19%, 1.03%, and 1.47%, respectively. Overall, our network is effective for the COVID-19 segmentation task. Fig. 3 shows the qualitative evaluation results of all models on the test set. We selected test images of pneumonia with different degrees of infection for a comprehensive analysis. The infected area of mild patients is generally small and concentrated; the infected area of severe patients spreads over the entire lung region; and the infected area of moderate patients generally shows a complex irregular contour. These characteristics fit well with the medical interpretation of the lesion process. For CXR images with mild, small-area infection, all networks except the lightweight MiniSeg-Net segment well. Nevertheless, the second row shows that the segmentation of our network and CE-Net is ahead of the others. The fourth and fifth rows of Fig. 3 show that for severe large-area infections, our network's segmentation is significantly better than that of the other networks. This is because MAB obtains more feature information at different scales in the encoder stage, and APM strengthens the absolute position information, enhancing sensitivity to feature position and scale. For complex infection areas, shown in rows 6, 7, and 8 of Fig. 3, no network segments very well due to the large irregularity of the features and the high contour complexity; but relatively speaking, our network's segmentation performance is still better than the others'.

E. LUNG SEGMENTATION RESULTS
In this section, to evaluate the adaptability of our network, we conduct lung segmentation experiments on healthy and non-COVID-19 CXR image datasets. The segmentation results are compared with several other representative networks, and the evaluation results are shown in Tables 3 and 4, respectively.
As can be seen from Tables 3 and 4, our model still has an advantage in the simpler lung segmentation task. Specifically, in the healthy lung segmentation task, although Precision and MIoU are lower than COPLE-Net's, Accuracy, Recall, and F1-Score achieve the best values of 98.73%, 98.31%, and 98.23%, respectively. In the non-COVID-19 lung segmentation task, apart from Recall, the Accuracy, Precision, F1-Score, and MIoU all achieve the best results of 98.44%, 97.65%, 97.58%, and 95.27%, respectively. Fig. 4 shows visualizations of different models on the lung segmentation task. Since all networks achieve good metrics on lung segmentation, the visual difference is not obvious, but relatively speaking, our network still holds a slight lead. Overall, this demonstrates the adaptability of our network to simpler tasks.

F. ABLATION ANALYSIS
To explore the influence of the main modules of LRA-Net on segmentation performance, we conduct an ablation study on the COVID-19 segmentation task. The experimental results are shown in Table 5. Compared with the baseline model, the constructed APM layer and the designed LRB and MAB each improve the network's performance. The APM layer encodes the absolute position of each pixel before the convolution operation, which assists the pneumonia segmentation task, which is highly sensitive to position information. The loop residual function of the LRB module alleviates the significant feature gap caused by directly splicing the encoder and decoder and promotes the fusion of high and low semantics. The cascaded attention modules in MAB establish dependencies among channels and between channels and multi-scale spaces. At the same time, because convolution kernels of different sizes are used to pool the features passing through the channel branch, the feature information in different scale spaces is enriched and the feature representation ability of the model is enhanced. In general, the segmentation performance of LRA-Net is improved by fusing the functions of the above modules.

G. VISUAL ANALYSIS
In order to show the details of feature changes during network training more clearly, we visualize the feature maps of some layers and compare them with the corresponding layers of the classic U-Net. Fig. 5 shows the visualization of layers E1 and E2 in the encoder stage, layers D3 and D4 in the decoder stage, and the 1 × 1 convolutional and sigmoid layers. On the whole, the network features become more abstract through the encoder stage and recover more clearly in the decoder stage. Carefully comparing our network with U-Net, our network has more prominent semantics and more obvious outlines in the decoder stage, and the output of the sigmoid layer is more accurate. This is reflected not only in the channels shown in Fig. 5 but also in the other channels. This is mainly because our network adopts the LRB module when combining the encoder features, which buffers the semantic difference while extracting richer semantic information. At the same time, the APM module strengthens the location information and establishes the association between features and absolute spatial locations. The MAB module establishes the interdependence of channel attention and multi-scale spatial attention, making pixel classification more accurate and feature outlines clearer.
We use Grad-CAM [50] to visualize the network as a heatmap to observe the locations of the attention regions during network learning. As shown in Fig. 6(a), our network is more accurate and reliable when focusing on the infected areas of COVID-19 images. In contrast, the other networks focus on more features of non-infected regions, such as bones and background. For non-COVID-19 and normal lung images, a COVID-19 segmentation network should not find COVID-19 features; nevertheless, the other networks still acquire some features, as shown in Fig. 6(b) and Fig. 6(c). This shows that the other networks are not specific to detecting COVID-19 and learn additional features. Relatively speaking, our segmentation network did not pick up COVID-19 features on normal and non-COVID-19 images, which illustrates the robustness and specificity of our network in focusing on pneumonia features.
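For reference, Grad-CAM heatmaps of the kind shown in Fig. 6 can be reproduced with a small hook-based sketch. The target reduction and layer choice below are assumptions; the paper relies on the implementation of [50]:

```python
import torch
import torch.nn as nn

def grad_cam(model, layer, x, target_fn):
    """Minimal Grad-CAM: capture the chosen layer's activations and
    gradients via hooks, weight each channel by its average gradient,
    and ReLU the weighted sum into a normalized coarse heatmap.
    `target_fn` reduces the model output to the scalar being explained."""
    acts, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    model.zero_grad()
    target_fn(model(x)).backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = torch.relu((weights * acts[0]).sum(dim=1))   # weighted channel sum
    return cam / (cam.max() + 1e-8)                    # scale into [0, 1]
```

For a segmentation network, summing the output logits over the infected-class map is one plausible choice of `target_fn`.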

H. WEAKNESSES AND FUTURE PROSPECTS OF THE MODEL
The segmentation model proposed in this article has achieved good results on the COVID-19 segmentation task, but it still has room for improvement. First, the addition of multiple functional modules increases the computational load of the network, so we need to continue studying how to simplify each functional module to reduce the model's computational cost. Second, the performance improvement of our model on the lung segmentation task is not obvious, mainly because lung segmentation is a relatively mature task on which most networks already obtain good metrics. Finally, while enhancing the absolute position information helps improve our segmentation model's performance, we find that inserting APMs at inappropriate positions actually degrades performance. Therefore, whether absolute position information provides a good boost for other tasks is still worth exploring.

V. CONCLUSION
This paper proposes LRA-Net, a segmentation model for COVID-19 based on absolute location information. First, a loop residual block (LRB) is designed to alleviate the semantic difference between the encoder and decoder. Second, the absolute position information module (APM) is proposed, which enables the convolution process to perceive the global spatial information of pixel features. Finally, a mixed attention module (MAB) is designed to establish the interdependence between channels and multi-scale spaces. This structure correlates pixel information with location and multi-scale space for a better feature representation. The method can effectively alleviate the interference caused by complex features and scale changes. The effectiveness and robustness of our model are verified through extensive experiments. In the future, we plan to further explore robust and efficient pneumonia segmentation networks to facilitate the development of computer-aided diagnostic techniques.

APPENDIX
The code for this article is implemented in Python and is available from the following GitHub repository: https://github.com/ThirteenYue/LRANet