CNN-Based Image Quality Classification Considering Quality Degradation in Bridge Inspection Using an Unmanned Aerial Vehicle

Key information for the maintenance and diagnosis of structures, including bridges, can be obtained by processing digital images acquired by an unmanned aerial vehicle (UAV). However, low-quality images caused by problems such as UAV movement, the inspection environment, and camera parameters make digital image processing difficult and can lead to inappropriate structural evaluation. Therefore, the structural inspection procedure requires an image quality assessment method that accounts for the deterioration of inspection images. In this study, a new image quality assessment (IQA) method using a convolutional neural network (CNN) is proposed in consideration of the various degradation factors that may occur in structural inspection images. The first stage presents a method to obtain consistent quality against the various interference factors that may degrade inspection images. Adjusting the camera parameters minimizes the degradation of the inspection image. Subsequently, low- and high-quality images are distinguished according to the proposed image acquisition method. The second stage is the classification of the inspection dataset using the CNN-based image quality classifier, which is trained on data classified according to quality. Experimental validation of the proposed method shows that its results are similar to those of the Human Visual System (HVS), i.e., subjective quality classification, and that inspection images can be classified more accurately and with a shorter processing time.


I. INTRODUCTION
Inspection of infrastructure, including bridges, using UAVs equipped with vision sensors has been a major research topic in the field of maintenance and structural health monitoring (SHM) over the past few years [1], [2], [3]. The best-known advantage of UAV-based inspection is that remote control and accessibility are better than in conventional visual inspection, allowing large bridges to be inspected in a short time and at a low cost. By using UAVs for inspection, it is possible to reduce the safety accidents that occur in hazardous work environments [4]. In addition, the image processing of inspection data provides objective results, in contrast to subjective structural assessment that depends on the skill level of the inspector [5]. The main purpose of a structural inspection is to evaluate the condition of each structural element by updating past assessment reports with newly identified changes, to ensure that the asset is safe or meets service requirements. Although there is sufficient motivation to use UAVs for efficient inspection and objective assessment reports, the problem of visually identifying damage still needs to be addressed [6]. More advanced inspection and monitoring technologies for UAV-based bridge inspection appear to be needed to completely replace human-based methods.
The associate editor coordinating the review of this manuscript and approving it for publication was Gerardo Flores.
The most important task in UAV-based bridge inspection of structures is to properly detect and quantify damage in the obtained images [7]. The UAV acquires images of the entire area or a specific area at risk of damage while moving through the three-dimensional space of the structure to be inspected. If a large bridge structure is inspected, the number of images obtained can range from at least several thousand to tens of thousands. Damage to be identified in these images also includes cracks, spalling, efflorescence, and exposed rebar, and of course, can be present in combination. Multiple detections of damage in such enormous image data is almost impossible manually and should be automated as much as possible.
Numerous specific studies have been conducted to automatically detect damage in images. Conventional image processing techniques include filters [8], morphological analysis [9], and statistical methods [10]. These conventional methods have some obvious limitations. Proper detection is difficult in images where noise cannot be removed properly, such as rough surfaces, and multiple damage detection is impossible. Also, image processing techniques for a large number of high-resolution inspection images are not suitable as they require a lot of processing time. As parallel operation using a graphic processing unit (GPU) was developed in the computer vision field, studies on models for damage detection and classification using convolutional network-based deep learning algorithms were carried out [11], [12], [13]. The damage detection model using deep learning extracts similarly recognized results by continuously learning and adjusting the features of labels from image datasets with preguided labels. Deep learning-based algorithms can automatically detect damage in less time than traditional digital image processing with reasonable architecture and better optimization methods. Cha et al. proposed a crack image classification model based on CNN [14]. Deep learning-based algorithms have been validated as suitable for image-based damage detection compared to conventional image processing algorithms. Hoskere et al. developed a CNN model for multiclass detection of six types of damage including cracking, spalling, corrosion, etc [15]. A detection model consisting of two parallel networks can classify multiple types of damage classes simultaneously in civil infrastructure inspection images. In addition, the performance of deep learning-based damage detection reaches a level above the human level with high reliability and accuracy [16], [17]. 
Damage detection in images has become a mainstream research topic among the various processes required for the automation and practical application of UAV-based monitoring of civil structures.
In terms of performance and efficiency, vision-based monitoring of damage to civil structures holds great promise, but some problems need to be solved before it can completely replace human inspectors. Among them, an important problem facing UAV-based inspection technology is obtaining image data of consistently high quality during the inspection process. Previous studies have indicated that the use of degraded images can directly and adversely affect the outcome of the damage detection step, limiting the extended application of UAVs [18], [19]. Also, Lee et al. showed that high-quality images could improve deep learning-based damage detection performance by revealing cracks that were undetectable in degraded images [20]. Conversely, this means that damage may not be adequately detected in degraded images. The damage detection result may therefore differ depending on the quality level of the acquired image, which is directly related to the structural condition assessment. In contrast to acquiring images in a stationary, ideal situation, image quality in UAV systems can be degraded by various environmental factors such as motion, wind, self-vibration, and illumination. Likewise, improper camera performance and internal parameter settings (sensor characteristics, exposure time, etc.) can be degrading factors. In UAV-based bridge inspection, quality deterioration such as blur, poor illuminance, and defocus may occur in combination with various interference factors, and damage may go undetected because important pixel information is lost. Typically, during image acquisition, the pilot only roughly checks the images while controlling the UAV. The image quality can therefore depend on the skill and proficiency of the pilot, and there is no proper quality assessment method.
Inspection of structures that requires accurate assessment of damage in images may incur additional costs for re-acquiring images perceived as low quality, and low-quality images may even go unrecognized because of the absence of image quality assessment methods.
IQA is the evaluation and quantification of the quality level of images using various processing algorithms and indicators, similar to the human visual recognition process. So far, most IQA methods and research have been developed around optical imaging [21], [22], [23], [24], [25]. However, the direct use of these methods on inspection images is somewhat limited, for two reasons. The first is that the level of quality to be recognized differs for each specific field, such as inspection images. As a specific example, an inspection image acquired in a lightless environment, such as under the deck of a bridge, is inevitably relatively underexposed and noisy, but if the damage detection algorithm can extract the information clearly, it is inappropriate to perceive the image as low quality. The second is that the deterioration of the inspection image can be attributed to certain factors (i.e., blur, exposure, focus). In general, IQA algorithms can be divided according to whether a reference image is used. First, full-reference image quality assessment (FR-IQA) methods, such as the error visibility method [26], structural similarity [27], and the information theoretical method [28], evaluate quality by comparing relative similarity or correlation with a high-quality reference image. However, it is contradictory to require a distortion-free reference image in order to evaluate the quality of inspection images in bridge inspection using UAV. In contrast, no-reference image quality assessment (NR-IQA) estimates degraded quality in a way similar to the human visual system without any reference image. For inspection images, several researchers have proposed NR-IQA methods to identify low-quality images. Duque et al. proposed quality parameters including sharpness and entropy for inspection images of a glued-laminated timber arch bridge acquired by UAV [29]. Jung et al. proposed a method to identify blurry inspection images using gray-intensity variation (SGV) parameters [6]. Lee et al. proposed a local blur map-based quality evaluation metric through the discrete wavelet transform (DWT) [20]. The method using the local blur map was validated to perform close to the classification result of the human visual system, in comparison with existing IQA metrics, on inspection images of the pier and deck of a concrete bridge. However, a common disadvantage of most IQA methods is that, although the quality degradation caused by a single distortion can be adequately evaluated, it is difficult to evaluate multiple distortions at once. Consequently, for actual inspection images that contain multiple distortions, adequate quality metrics and thresholds for evaluation cannot be established.
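As an illustration of the kind of no-reference indicator mentioned above, the following minimal sketch computes the Shannon entropy of the gray-level histogram of an image patch. It is not the implementation used in [29]; the function name `gray_entropy` and the toy patches are assumptions made only for this example.

```python
import math
from collections import Counter

def gray_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    # Sum p * log2(1/p) over the gray levels that actually occur.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical patches: a uniform patch and a patch with 64 distinct gray levels.
flat_patch = [128] * 64
textured_patch = list(range(0, 256, 4))

print(gray_entropy(flat_patch), gray_entropy(textured_patch))
```

A uniform patch carries no gray-level information, whereas a patch with many equally frequent gray levels reaches the maximum entropy for its size. A single indicator of this kind responds to one type of degradation only, which is the limitation discussed in the paragraph above.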
This study proposes a method to obtain consistent levels of images in bridge inspection using UAV, and then introduces a CNN-based IQA learning framework that can effectively evaluate the quality on datasets with multiple distortions.
The first aim is to present acquisition conditions of a consistent level so that low quality does not occur, by analyzing images with multiple distortions in bridge inspection using UAV. Typical quality degradations include motion blur, over- and underexposure, and out-of-focus. These image distortions can be avoided by adjusting the speed of the UAV and the shutter speed, ISO sensitivity, and aperture of the camera. The second aim of this study is to build a CNN-based IQA classifier using the inspection image dataset classified according to distortion. The proposed IQA method is motivated by the success of CNN-based deep learning classifiers for blind image quality assessment in the field of computer vision [30], [31], [32], [33], [34]. The performance of the CNN-based classifier specialized for inspection images is evaluated relative to the results of various quality evaluation indicators. The study also shows that using images classified according to the evaluation results is more appropriate for the inspection results of the structure.
This study is organized as follows. Section II describes a method of acquiring a consistent level of data from structural inspection using UAV and CNN-based IQA. In Section III, experimental validation and discussion of the proposed method are performed using actual structural image data. Section IV covers the conclusion.

II. PROPOSED METHODOLOGY FOR CNN-BASED IQA
This section summarizes the overall process for image acquisition with consistent quality levels and CNN-based IQA in bridge inspection with UAV. Figure 1 represents a comprehensive process established to assess the quality of images in structural inspection images. In the first stage, the appropriate parameters are determined when the UAV acquires images for structural inspection. Among the variables that are determined here, external conditions include UAV speed and illumination, and internal factors of a camera include aperture, shutter speed, and ISO. The conditions required to obtain a high-quality image are determined according to motion blur, exposure problems, and out-of-focus, which are typical deterioration in quality that can occur in structure inspection images. Finally, the data is classified according to the image quality level, which is used for training the second stage, the CNN-based IQA model.
In the second stage, the CNN-based IQA model is trained on datasets manually classified by quality. High quality means sharp images, and low quality consists of blurred, overexposed, and underexposed images, each separated by annotation. In the dataset, 90% of the images under each annotation are used for training, and the rest are kept as validation data. The prepared training data are fed to feature learning, and the model is evaluated according to quality on the validation set. Finally, the fine-tuned CNN model is applied for assessment when a new inspection dataset is given as input. From the result, low-quality images can be isolated from the whole dataset.

A. STAGE 1: APPROPRIATE BRIDGE INSPECTION IMAGE ACQUISITION CONDITIONS CONSIDERING QUALITY DETERIORATION
Motion blur, which is typically observed, is a deterioration in which the boundary line becomes vague when there is an object movement during image data acquisition. This can be caused by the vibration or moving speed of the UAV in the structural inspection procedure and is also closely related to the shutter speed of the camera. Damage, such as cracks, is not detected properly when inspection data with ambiguous overall boundaries of the image is acquired due to motion blur. Out-of-focus is a phenomenon that occurs in poorly focused inspection images, and similar to motion blur, the boundaries appear blurred. Another cause of degradation, underexposure, occurs mainly in dark skies and in environments such as the base of the bridge, and the entire pixel is darkened due to the lack of light provided to the image sensor. Conversely, overexposure is a problem of image quality degradation due to the overall bright pixels as too much light is provided to the image sensor. Figure 2 shows the deterioration of the actual bridge inspection images.
For data analysis according to quality, not only high quality inspection images but also low quality images such as motion blur, out-of-focus, underexposure, and overexposure were acquired for real bridge structures. The actual bridges for which data acquisition was performed are J and H bridges in Chungcheongnam-do, South Korea, as shown in figure 3. For UAVs to obtain inspection images, Inspire 2 manufactured by DJI Technology Co., Ltd. was used, and the details of the used camera (Zenmuse X7) and lens are shown in Table 1. In the degradation of quality, variables (i.e., UAV speed, shutter speed, aperture, ISO) are assigned to suggest appropriate image acquisition conditions. The image according to each variable is obtained by repeatedly inspecting the pier, the bottom and the side of the deck of the target structure. In subsequent sections, the appropriate image acquisition conditions are introduced through image analysis for variables in the case of quality distortion.

1) SETTINGS TO AVOID MOTION BLUR
Motion blur is a smearing of sharp detail caused by the integration of illuminance changes due to movement within the exposure period, and it is a complex problem that can be spatially distorted, non-linear, and localized [35]. Two factors that affect motion blur are the speed of the UAV and the shutter speed. For a UAV acquiring inspection images, motion blur causes a loss of pixel information when the UAV's speed is too fast for the shutter speed. Figure 4 is an example of inspection images according to UAV speed and shutter speed. Figure 4(a) compares images of a bridge pier obtained at different UAV speeds with a fixed shutter speed. In the image on the left, motion blur occurred because the movement was too fast for the shutter speed, and the boundary lines are smeared. On the other hand, the image on the right is sharp because the shutter speed was sufficient for the movement of the UAV. This result indicates that an appropriately fast shutter speed should be selected at high UAV speeds to suppress motion blur. Figure 4(b) shows the comparison according to shutter speed for a fast-moving UAV. Similar to the result of (a), blurring of the boundary lines due to motion blur is observed in the image on the left. The image on the right was taken with a relatively fast shutter speed at the same UAV speed. In this result, no motion blur occurred, but the image is relatively dark. In photography, a fast shutter speed leads to an underexposed image because the amount of light received by the image sensor is inevitably low. Therefore, shutter speed is related to the underexposure issue discussed in later sections.
If the shutter speed is fast, motion blur due to the movement of the UAV can be removed, but the shutter speed cannot be increased unconditionally because the image becomes dark. To determine the effect on motion blur, images were obtained at shutter speeds from 1/100 s to 1/10000 s at low (1 m/s), high (2.4 m/s and 3 m/s), and very high (4 m/s) UAV speeds. The critical shutter speeds observed in the experimental data are 1/100 s, 1/1600 s, 1/2000 s, and 1/8000 s, respectively, depending on the UAV speed. Figure 5 shows the area of concern for motion blur by comparing shutter speed against UAV speed. Motion blur can occur if image acquisition is performed under conditions within this region. Also, as mentioned before, a shutter speed that is too fast will cause underexposed images, so an appropriate value should be selected according to the UAV speed. Therefore, in the structural inspection procedure, it is necessary to prevent motion blur by determining an appropriate shutter speed according to the speed of the UAV in operation.
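The relationship between UAV speed and critical shutter speed reported above can be sketched as a simple lookup. This is only an illustration: it assumes that each critical value is the longest exposure time that still avoids motion blur at that speed, and it conservatively uses the next higher tested speed for intermediate speeds. The names `CRITICAL_EXPOSURE`, `max_exposure_time`, and `motion_blur_risk` are hypothetical.

```python
from fractions import Fraction

# Critical shutter speeds reported in the text: UAV speed (m/s) -> exposure time (s).
CRITICAL_EXPOSURE = {
    1.0: Fraction(1, 100),
    2.4: Fraction(1, 1600),
    3.0: Fraction(1, 2000),
    4.0: Fraction(1, 8000),
}

def max_exposure_time(uav_speed):
    """Longest exposure time assumed to avoid motion blur (conservative lookup)."""
    for speed in sorted(CRITICAL_EXPOSURE):
        if uav_speed <= speed:
            return CRITICAL_EXPOSURE[speed]
    raise ValueError("UAV speed is beyond the tested range (4 m/s)")

def motion_blur_risk(uav_speed, exposure_time):
    """True if the exposure is longer than the critical value for this speed."""
    return exposure_time > max_exposure_time(uav_speed)

print(motion_blur_risk(3.0, Fraction(1, 1000)))  # exposure longer than 1/2000 s
print(motion_blur_risk(2.0, Fraction(1, 2000)))  # exposure shorter than 1/1600 s
```

The lookup only covers the speeds tested in the experiment, so it raises an error rather than extrapolating beyond 4 m/s.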

2) SETTINGS TO AVOID OUT-OF-FOCUS
When inspecting structures using UAVs, low-quality images are obtained if the depth of field is insufficient and the focus falls on the background rather than on the inspection object. The depth of field is controlled by four factors. First, the shorter the focal length of the camera lens, the deeper the depth of field, and the longer the focal length, as in a telephoto lens, the shallower it is. The distance between the camera and the subject, which is the focus distance, also affects the depth of field. Depending on the optical characteristics, the out-of-focus problem is usually prominent in images obtained at close range. Similarly, the distance between the subject and the background results in more pronounced defocus for objects located farther away. The most important factor of all is the state of the aperture. The state of the aperture is usually indicated by the F-number, with smaller values indicating a wider opening. A smaller F-number is more susceptible to focus problems because of the shallow depth of field.
In the inspection image acquisition experiment using a UAV, only the aperture value was varied as an out-of-focus related quality factor. This is because images were acquired with a fixed 35 mm focal length lens and a constant working distance of 3 m in the experimental environment. Figure 6 shows possible focusing problems depending on the aperture value. The smaller the aperture value, the shallower the depth of field, so the image must be precisely focused. On the other hand, very large aperture values (narrow apertures) avoid out-of-focus problems through a very deep depth of field. Using an F-number of 5.6 to 8, taking into account the lens and the working distance needed to detect microscopic damage such as cracks, avoids focus problems. However, as with a fast shutter speed, less light is received at a narrow aperture, which can cause underexposed images. High-quality images are obtained if the correct depth of field and focus are achieved over the range of distances between the camera and the inspection area. Lenses and inspection environments, which affect the depth of field, vary widely. Therefore, to avoid quality problems related to out-of-focus, it is very important to always focus well on the inspection area, and it is necessary to select an appropriate lens and aperture value by paying attention to the working environment and the corresponding depth of field.
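The effect of the F-number on depth of field for the reported setup (35 mm lens, 3 m working distance) can be estimated with the standard thin-lens depth-of-field approximation. The following is a minimal sketch, not the procedure used in the experiment. The circle of confusion `coc_mm=0.02` is an assumed value, and the function name `depth_of_field` is hypothetical.

```python
def depth_of_field(focal_mm, f_number, distance_mm, coc_mm=0.02):
    """Near and far limits of acceptable sharpness (mm), thin-lens approximation.

    coc_mm is the circle of confusion; 0.02 mm is an assumption, not a value
    taken from the paper.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = distance_mm * (hyperfocal - focal_mm) / (hyperfocal + distance_mm - 2 * focal_mm)
    if distance_mm >= hyperfocal:
        far = float("inf")  # everything beyond the near limit is acceptably sharp
    else:
        far = distance_mm * (hyperfocal - focal_mm) / (hyperfocal - distance_mm)
    return near, far

# Settings from the text: 35 mm lens, 3 m working distance, several F-numbers.
for n in (2.8, 5.6, 8.0):
    near, far = depth_of_field(35.0, n, 3000.0)
    print(f"F{n}: near {near:.0f} mm, far {far:.0f} mm, depth {far - near:.0f} mm")
```

Under these assumptions the depth of field grows as the F-number increases, which is consistent with the observation that narrow apertures reduce out-of-focus problems at the cost of less light.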

3) SETTINGS TO AVOID EXPOSURE PROBLEMS
This section discusses the underexposure and overexposure that may appear in inspection images in bridge inspections using UAVs. Naturally, exposure issues are determined by the light available in the image acquisition environment. A key factor in exposure is the illumination condition, a quality-related factor not previously discussed. If the illumination level is the same, the factors that affect exposure are shutter speed, aperture, and ISO. As described in the previous sections, shutter speeds that are too fast and narrow apertures limit the amount of light provided to the image sensor, causing underexposure problems. Under these conditions, ISO, the sensitivity setting of the image sensor to light, can be used to adjust the exposure of the image. At a high ISO sensitivity, a bright image can be obtained even with a small amount of light. However, at ISO values that are too high, noise can appear in the image. Figure 7 shows image data of structures according to ISO and exposure values under fixed shutter speed and aperture conditions. Here, the exposure value (EV) is a quantified value of the illumination conditions, and the higher the value, the brighter the environment. The images in the first column are from the inside of the bridge box girder. Proper images could not be obtained even at the highest ISO values because the illumination was too low. Therefore, in an environment in which light is very insufficient, it is necessary to obtain an image of appropriate quality by adjusting the shutter speed and aperture. The images in the second column were obtained from the bottom of the bridge deck, and at the highest ISO, images of adequate quality were obtained. Noise due to high ISO may occur in the inspection image, but it may not need to be considered in image processing because the image remains relatively clear. The images in columns 3 and 4 are data from a typical illumination environment.
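The trade-off between shutter speed, aperture, and ISO described above can be made concrete with the standard photographic exposure value relation, EV = log2(N^2 / t) - log2(ISO / 100), where N is the F-number and t is the exposure time in seconds. The sketch below is illustrative only. It gives the ISO 100 referenced EV for which a combination of settings yields a nominal exposure, which is an assumption about how the EV in Figure 7 relates to camera settings, and the function name `exposure_value` is hypothetical.

```python
import math

def exposure_value(f_number, exposure_time, iso=100):
    """Scene EV (referenced to ISO 100) matched by the given camera settings."""
    return math.log2(f_number ** 2 / exposure_time) - math.log2(iso / 100)

# Hypothetical settings: F8 at 1/2000 s, with increasing ISO.
for iso in (100, 400, 1600, 6400):
    print(iso, round(exposure_value(8.0, 1 / 2000, iso), 2))
```

Each fourfold increase in ISO lowers the matched EV by two stops, which is why raising ISO helps in dark areas such as the bottom of the deck but cannot compensate indefinitely when the shutter speed and aperture stay fixed.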
The experimental results show that, in order to obtain data of consistent quality with respect to the exposure problem, it is necessary to adjust the camera parameters according to the degree of illumination determined by the inspection environment.
The second stage aims at CNN-based IQA modeling that classifies inspection images according to quality, similar to human perception. Considering the complexity of the inspection image content and the variety of degradation factors, we adopted a quality evaluation method using deep neural networks. The CNN-based deep neural network learns high-quality and low-quality features from images to improve classification performance. A trained CNN-based IQA model classifies each inspection image according to its quality based on the learned features. The following section details the CNN-based IQA modeling process for inspection images.
VOLUME 11, 2023

1) DATA PREPARATION
CNN-based deep neural networks need to be trained on varied data to perform classification on new inputs. For instance, when a new underexposed image is supplied, a CNN-based classifier trained on images with motion blur and exposure issues should predict the right class. From this perspective, the training of CNNs depends on well-annotated data, which is directly related to the performance of the classifier. The training inspection images in this study were obtained at various quality levels in the stage 1 experiment. The dataset comprises sharp, motion-blurred, overexposed, and underexposed images, and is used to train a CNN-based IQA classifier. A total of 11,990 images of UAV inspection data are used for training the CNN-based deep neural network.
The total data consists of 5,870 sharp images, 2,581 motion blur images, 2,153 underexposed images, and 1,386 overexposed images. Focus-related deterioration was excluded from the quality evaluation classification because there were not enough out-of-focus datasets. Similar to human perception, no data augmentation techniques except rotation were used to account for the quality level of the overall image. Also, 10% of the data for each quality factor is used as validation data in training.
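The per-class 90/10 split described above can be sketched as follows, using the class counts reported in the text. The random shuffling, the seed, the rounding rule, and the function name `stratified_split` are assumptions made for illustration. The paper does not state how the validation images were selected.

```python
import random

# Class counts reported in the text (11,990 images in total).
CLASS_COUNTS = {
    "sharp": 5870,
    "motion_blur": 2581,
    "underexposed": 2153,
    "overexposed": 1386,
}

def stratified_split(class_counts, val_fraction=0.1, seed=0):
    """Hold out val_fraction of each class; items are (label, image_index) pairs."""
    rng = random.Random(seed)
    train, val = [], []
    for label, n in class_counts.items():
        indices = list(range(n))
        rng.shuffle(indices)
        n_val = round(n * val_fraction)
        val += [(label, i) for i in indices[:n_val]]
        train += [(label, i) for i in indices[n_val:]]
    return train, val

train_set, val_set = stratified_split(CLASS_COUNTS)
print(len(train_set), len(val_set))
```

Splitting within each class keeps the class proportions of the validation set equal to those of the training set, which matters here because sharp images make up almost half of the data.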

2) OVERALL ARCHITECTURE FOR INSPECTION IMAGE QUALITY ASSESSMENT
The construction of a deep neural network using the VGG-16 [37] architecture for evaluating image quality is shown in Figure 8. In this architecture, the number of training parameters is reduced by using small filters, so the network can be made deep and provides high accuracy with a simple structure. Convolution, max pooling, fully connected, and softmax layers are all included in the overall structure, and the layers are connected in accordance with their respective functions. To incorporate quality information, the network is trained iteratively on the image features of the dataset.
The convolutional layer is the core of deep learning and serves as a filter for extracting image features. As the image is transmitted, these convolution layers form an activation map in which features are integrated. Activation maps with reduced spatial dimensions as data passes through the convolution layer retain only information about image features. Furthermore, the convolution layer is used in conjunction with rectified linear unit (ReLU) [38], an activation function that provides nonlinearity to perform a complex classification.
The pooling layer once more extracts feature information from the activation map passed through the convolutional layer. In this process, the features of the image become more concentrated, and unnecessary noise can be removed. The image features aggregated by the earlier layers are passed to a fully connected layer. The features, flattened into a one-dimensional vector, are used by the fully connected layer for image classification. Finally, the softmax function outputs the most likely class from the image features passed up to the fully connected layer. A model that passes data through these layers and learns invariant image features can potentially perform classification similarly to the way the human visual system perceives the real world.
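The layer sequence described above can be traced with a short sketch that follows the feature-map shape through the five VGG-16 convolution stages and then applies a softmax to example class scores. The 224 x 224 input size is the standard VGG-16 setting and is assumed here, since the input size used in the study is not stated. The logits and the class order are hypothetical.

```python
import math

# VGG-16 convolutional stages: (number of 3x3 conv layers, output channels).
VGG16_STAGES = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def feature_map_shapes(input_size=224):
    """Shape (channels, height, width) after each stage's 2x2 max pooling."""
    size, shapes = input_size, []
    for _, channels in VGG16_STAGES:
        size //= 2  # padded 3x3 convolutions keep the size; pooling halves it
        shapes.append((channels, size, size))
    return shapes

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

shapes = feature_map_shapes()
channels, height, width = shapes[-1]
print(shapes, channels * height * width)  # flattened feature vector length

# Hypothetical logits for (sharp, motion blur, underexposed, overexposed).
probs = softmax([2.0, 0.5, -1.0, -0.5])
print(probs.index(max(probs)))
```

The flattened vector from the last stage is what the fully connected layers receive, and the class with the largest softmax probability is reported as the quality label.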

3) TRAINING THE CNN-BASED IMAGE QUALITY ASSESSMENT AND RESULTS
The CNN-based IQA proposed in this study is trained through transfer learning. For quality classification of inspection data, a VGG-16 network pre-trained on the ImageNet dataset [39] is used as a fixed image feature extractor. As shown in Figure 8, the weights of all layers except the last layer, which is connected to the fully connected layer, are fixed, and fine-tuning is performed on the trainable layer. Transfer learning has the advantage of reusing a pre-trained, high-performance image feature extractor, which allows the model to generalize on relatively small datasets. To minimize the loss function, stochastic gradient descent is used with a learning rate that decays from 0.002 and a momentum of 0.9. Table 2 shows the loss and classification accuracy at each training epoch with these hyperparameters. As training progresses, both the loss and the accuracy gradually improve on the training and validation datasets. As shown in the table, the 3rd epoch gave the highest accuracy on the training dataset. On the other hand, the trained model showed the highest accuracy on the validation dataset at the 10th epoch. Since the training set accuracies obtained in the two epochs are similar, the model obtained in the 10th epoch is used for the subsequent experimental validation. Figure 9 shows the results of the 10th-epoch CNN-based IQA classifier model classifying sharp, motion-blurred, and exposure-degraded images in the validation dataset.
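The optimizer settings reported above (initial learning rate 0.002, momentum 0.9) can be illustrated with a minimal stochastic gradient descent update on a toy one-dimensional loss. This sketch is not the training code of the study. The inverse-time decay schedule, the `decay` value, and the function name `sgd_momentum` are assumptions, because the text does not specify how the learning rate decays.

```python
def sgd_momentum(grad_fn, w0, lr0=0.002, momentum=0.9, decay=1e-3, steps=2000):
    """Classical momentum SGD with a decaying learning rate on a scalar weight."""
    w, velocity = w0, 0.0
    for t in range(steps):
        lr = lr0 / (1.0 + decay * t)  # hypothetical inverse-time decay
        velocity = momentum * velocity - lr * grad_fn(w)
        w += velocity
    return w

# Toy loss (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = sgd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

In transfer learning with a fixed feature extractor, the same update would be applied only to the weights of the trainable layer, while the pre-trained weights stay unchanged.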

III. EXPERIMENTAL VALIDATION
In this section, an experimental validation of the performance of the CNN-based IQA classifier is performed. The datasets used for validation are images acquired by UAVs inspecting bridges. These experimental validation datasets were not involved in model training. The first case of the experimental validation of the CNN-based quality classification model is the comparison of image contrast. Then, a second experimental verification is performed using motion blur-degraded images acquired from the new bridge for performance comparison. To evaluate the performance of CNN-based quality [...] to quality using a CNN-based classifier, and a performance comparison is performed with scores from conventional NR-IQA.
Figure 10 shows the inspection image dataset used to evaluate exposure issues in the primary validation experiment. In the E and H bridges in Figure 3, the UAV acquired the piers and the bottom and side of the deck as inspection areas. The dataset consists of images that can actually be obtained while the UAV inspects the bridge. While some images have enough quality to detect visible damage, others have poor technical settings that result in an exposure that is either too low or too high. For example, the crack damage included in image #14 is sufficiently visible to the human eye, but in image #5, even if there is actual damage, it cannot be detected because of an exposure problem. The dataset of 16 exposure-related images is applied directly, without preprocessing, to CNN-based image quality classification, as well as to MOS-based subjective image quality classification and conventional NR-IQA.

2) COMPARISON OF CLASSIFICATION RESULTS FOR VALIDATION 1
Subjective classification was conducted using experimental quality scores based on the mean opinion score (MOS) in order to compare the performance of the CNN-based quality classifier. Through surveys, 10 experts who carry out visual inspection or use images to assess structural damage assigned quality ratings to the image materials using the categories bad (1), poor (2), fair (3), good (4), and excellent (5). Since the perceived degree of image quality distortion may vary from person to person, the scores subjectively assigned by several people were combined. To classify each image from the subjective evaluation results, high scores are classified as high quality and low scores as low quality, using a score of 3 (fair) as the threshold. Table 3 provides a classification comparison between the proposed CNN-based IQA results and the MOS-based subjective assessment. Conflicting classification results of the two approaches are indicated in bold in the image material number. The only difference was in image #6, acquired from the side of the bridge deck. Image #6 was subjectively evaluated as having an underexposure problem due to the dominant dark pixel area. It seems to have been classified as high quality because the CNN-based IQA approach does not perceive it as a serious deterioration in comparison to other underexposed images. On the other hand, for the remaining 15 image materials, the CNN-based classification results are all consistent with the subjective results. It can be seen that the proposed quality classification method provides excellent prediction accuracy even on UAV inspection images not used in training. By comparing with the subjective classification results, the following conclusions can be drawn. First, the proposed CNN-based quality classification method performs very well regardless of the component for areas where damage detection is required, such as the piers and the side and bottom of the deck.
In addition, it is possible to effectively distinguish images that experts judged to be unavailable for damage detection due to contrast. It could be concluded that the proposed method is reliable for identifying quality in inspection images with exposure issues.
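The MOS-based labeling rule described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the example ratings, and the treatment of a MOS exactly equal to 3 (counted here as high quality) are assumptions, since the text does not specify how a borderline score is handled.

```python
from statistics import mean


def classify_by_mos(ratings, threshold=3.0):
    """Return (mos, label) for one image from per-expert ratings on the 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = mean(ratings)  # mean opinion score over all experts
    # Assumption: a MOS exactly equal to the threshold counts as high quality.
    label = "high" if mos >= threshold else "low"
    return mos, label


# Hypothetical ratings from 10 experts for two images (not data from the study).
example_ratings = {
    "img_a": [4, 4, 5, 3, 4, 4, 5, 4, 3, 4],
    "img_b": [2, 1, 2, 3, 2, 2, 1, 2, 2, 3],
}
for name, ratings in example_ratings.items():
    mos, label = classify_by_mos(ratings)
    print(f"{name}: MOS={mos:.1f} -> {label} quality")
```

Averaging before thresholding means that a single dissenting expert does not change the label unless the image is already close to the "fair" boundary.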
The performance of the proposed CNN-based classifier is evaluated by comparison with the MOS-based subjective classification results. A confusion matrix is used in the statistical analysis to determine the effectiveness of the classification. Figure 11 shows the confusion matrix constructed from the classification results in Table 3, with the MOS-based subjective classification results taken as the true classes. Table 4 shows the performance metrics of the CNN-based quality classification computed from the confusion matrix. Compared with subjective quality classification, which requires considerable time and manpower but is considered accurate, the proposed CNN-based method shows high concordance in its classification results. The statistical analysis of the CNN-based image quality classification performance provides the following insights. High-performance classification can be conducted at a low cost compared with the resources required for MOS-based subjective classification. In addition, quality distortion factors such as underexposure and overexposure can be evaluated. As a result, the proposed method can be effectively applied to classify the quality of inspection images in which contrast is fundamentally distorted.
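As an illustration of how metrics can be derived from a binary confusion matrix with the MOS-based labels as true classes, the sketch below computes accuracy, precision, recall, and F1 score. The choice of these four metrics, the use of "low" as the positive class, and the label lists are assumptions for illustration; the values of Table 4 are not reproduced here. The hypothetical labels mimic a 16-image set with a single disagreement in which the subjective label is low and the CNN label is high.

```python
def binary_metrics(true_labels, pred_labels, positive="low"):
    """Compute confusion-matrix counts and common metrics for two-class labels."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = fn = tn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero for empty classes.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels: 8 low and 8 high images by MOS, one low image missed by the CNN.
mos_labels = ["low"] * 8 + ["high"] * 8
cnn_labels = ["low"] * 7 + ["high"] * 9
print(binary_metrics(mos_labels, cnn_labels))
```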
This section discusses the results of the proposed CNN-based IQA and a conventional NR-IQA approach for exposure-related image degradation. NR-IQA methods extract features that represent the quality of an image and measure the degree of distortion. However, these methods cannot be applied directly to quality classification, because it is ambiguous which score should separate good images from bad ones. Therefore, the CNN-based quality classification results are compared indirectly, according to the quality distortion scores of the inspection images. For this comparison, the Perception-based Image Quality Evaluator (PIQUE) [40], which is known as a blind image quality evaluation method for general distortions including exposure, was adopted as the NR-IQA method. The PIQUE score is computed as
$$\mathrm{PIQUE} = \frac{\left(\sum_{k=1}^{N_{SA}} D_{sk}\right) + C_1}{N_{SA} + C_1} \qquad (1)$$
where $D_{sk}$ is the amount of distortion assigned to the $k$-th block, $N_{SA}$ is the number of spatially active blocks in the image, and $C_1$ is a positive constant included to prevent numerical instability. A higher distortion score means lower image quality. Figure 12 shows the NR-IQA (PIQUE) distortion scores for the distortion-related images together with the classification results of the subjective and CNN-based methods. Image #16 was evaluated as the most distorted by the NR-IQA method, and image #14 as the highest quality. In agreement with the classification results, lower quality images generally received higher scores. It should be noted that image #6 was evaluated as low quality only by the CNN-based method. The distortion score of image #6 is high compared with those of the other images, so the NR-IQA method also rates it as problematic. This supports the conclusion that the CNN-based method classifies inspection image quality properly. However, some problems were identified in the NR-IQA results. First, image #12 was evaluated as excessively distorted compared with the classification results, and image #8 was evaluated as not severely distorted even though it was. This indicates that the existing NR-IQA method does not generalize well to inspection images. Where scores are particularly high or low, NR-IQA seems to measure quality appropriately. However, its limitation appears in the middle score range, from image #12 to image #8. As the amount of test data increases, large errors may occur in this middle region. Therefore, the proposed CNN-based classification method is more appropriate than conventional NR-IQA for evaluating inspection images.
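The indirect comparison discussed above can be made concrete with a small sketch: given a distortion score per image (higher means worse) and a binary quality label per image, it searches for the single score threshold that agrees best with the labels and reports the images that still disagree. The function name and all numbers are hypothetical and are not the scores of Figure 12; the point is only that disagreements tend to sit in the middle of the score range, where no threshold separates the classes cleanly.

```python
def best_threshold_agreement(scores, labels):
    """Find the distortion-score threshold that best reproduces binary labels.

    scores: dict image_id -> distortion score (higher = more distorted)
    labels: dict image_id -> "low" or "high" quality
    Returns (threshold, agreement, mismatched_ids).
    """
    if set(scores) != set(labels):
        raise ValueError("scores and labels must cover the same images")
    if len(scores) < 2:
        raise ValueError("at least two images are required")
    ids = sorted(scores, key=scores.get)
    values = [scores[i] for i in ids]
    # Candidate thresholds: midpoints between consecutive sorted scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        mismatched = [
            i for i in ids
            if ("low" if scores[i] > t else "high") != labels[i]
        ]
        agreement = 1 - len(mismatched) / len(ids)
        if best is None or agreement > best[1]:
            best = (t, agreement, mismatched)
    return best


# Hypothetical scores and labels for six images (not data from the study).
scores = {1: 20.0, 2: 25.0, 3: 40.0, 4: 45.0, 5: 60.0, 6: 70.0}
labels = {1: "high", 2: "high", 3: "low", 4: "high", 5: "low", 6: "low"}
threshold, agreement, mismatched = best_threshold_agreement(scores, labels)
print(threshold, round(agreement, 3), mismatched)
```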

B. CASE 2: INSPECTION IMAGES WITH MOTION BLUR ISSUE
1) DATA PREPARATION
Since none of the existing inspection data adequately contained both motion-blurred and sharp images, data were acquired at a new bridge. Figure 13 shows the G bridge, located in Chungcheongbuk-do, South Korea, which was used for data collection. The target bridge, with a span of 65 m, is a massive structure with a steel box and a prestressed concrete substructure. The inspection area is a specific area of the bottom of the deck. Inspection data were collected using a UAV at a working distance of 3 to 4 m. The images were acquired in an oblique orientation because the camera-mounted gimbal could move only up to 40 degrees. The validation dataset for assessing blur quality was obtained by adjusting the UAV speed and the shutter speed. Figure 14 shows the 36 inspection images used to evaluate motion blur in the second validation experiment. The images were acquired with the bottom of the deck of the G bridge as the inspection area. In some images, damage is difficult to detect because of motion blur, and quantification that requires accuracy is problematic. Figure 15 shows an example of a sharp image (#17) and a blurry image (#34) from the dataset. The quality degradation due to motion blur is clearly perceptible in the enlarged image patches. While the pixels in the image on the left appear generally clear, the enlarged patches in the image on the right appear highly blurred. The dataset of 36 images is applied directly, without preprocessing, to the CNN-based image quality classification, the MOS-based subjective image quality classification, and classical NR-IQA.

3) COMPARISON OF CLASSIFICATION RESULTS FOR VALIDATION 2
In the second experimental validation, subjective classification based on MOS quality scores was performed on the validation images to provide a reference for the performance of the CNN-based quality classifier on motion blur. In a questionnaire survey, 10 experts rated the quality of each image with respect to how difficult human visual inspection or image-based damage detection would be. Based on the combined opinions of the experts, images were classified as high or low quality under the same conditions as in validation 1 (see the validation 1 section). Table 5 compares the classification by the proposed CNN-based IQA with the MOS-based subjective evaluation on the motion blur dataset. Image numbers for which the two classification results conflict are shown in bold. Figure 16 compares the images that were classified as high quality by only one of the two methods. For the images (#8, #23) that only the CNN-based method rated as high quality, the experts recognized that motion blur was present. In fact, the zoomed-in image patches reveal a slight degree of motion blur. The CNN-based method appears to have classified these images as high quality because the effect of motion blur on them is small compared with the other images. On the other hand, image #22 was classified as low quality by the proposed method, even though the experts judged it to be of sufficient quality. This is likely because motion blur affects the left patch, while the lines remain distinct in the right patch. From the inspector's perspective, the quality seems to have been judged good enough to detect crack damage, which appears in a line-like shape. In the remaining 21 inspection images, the CNN-based classification and the MOS-based subjective classification results are consistent.
For the dataset containing motion blur, the following conclusions can be drawn by comparing the proposed method with the subjective classification results. Images whose MOS-based subjective scores are close to the classification criterion receive slightly conflicting quality classifications. In other words, errors in the CNN-based quality classification occurred for images with slight motion blur. However, most of the images that experts judged to be difficult to use for damage detection because of motion blur can be classified effectively. This suggests that the proposed method is sufficiently applicable to identifying quality degradation caused by motion blur.
In the second validation, the CNN-based image quality classification performance is evaluated for inspection images degraded by motion blur. As in the previous validation, the reference for comparison is the MOS-based subjective classification. Figure 17 shows the confusion matrix constructed from the classification results for the motion blur dataset in Table 5, with the MOS-based subjective classification results taken as the true classes. Table 6 lists the performance metrics of the CNN-based quality classification for motion blur computed from the confusion matrix. The MOS-based subjective classification, which required the same amount of time and manpower as in the previous validation case, and the proposed image quality classification show similar results. This means that inspection images distorted by motion blur can be classified inexpensively and accurately with the proposed method.
This section discusses the results of the proposed CNN-based IQA and conventional NR-IQA approaches for image degradation related to motion blur. The CNN-based quality classification results are compared indirectly, according to the quality scores of the inspection images containing motion blur. As in validation case 1, the comparison is indirect because NR-IQA scores are not suitable for direct classification. For NR-IQA of optical images, metrics such as NIQE [41], BRISQUE [24], SSEQ [42], and SGV are generally used. The SGV approach [6], which specializes in motion blur detection, is employed as the comparative metric in this study. The SGV score, which measures the degree of motion blur from the edge characteristics of an image, is defined in Equation (2), where $SGV_k$ is the quality score of the $k$-th inspection image, $N$ and $M$ are the numbers of vertical and horizontal pixels in the image, respectively, and $G_k(i, j)$ is the gray intensity of the pixel located at $(i, j)$ in the $k$-th image. The measure evaluates motion blur from the change in gray intensity around every pixel in the image.
The SGV value of each image, as determined by Equation (2), serves as the quality score in this case. A higher score indicates better data quality. Figure 18 presents the NR-IQA (SGV) quality scores together with the classification results of the subjective and CNN-based methods for the inspection images containing motion blur. According to the SGV score, image #17 was evaluated as the best quality image and image #9 as the lowest quality image. In agreement with the CNN-based and MOS-based quality classification results, images classified as high quality generally have high scores. However, inconsistent evaluations appear in the middle of the SGV score distribution, from image #16 to image #29. As discussed for the classification results in the previous section, this reflects the difficulty of evaluating the quality of images that are only slightly affected by motion blur. It should be noted that image #16 was recognized as low quality in both classification results but has a high SGV score. Similarly, image #29 is classified as high quality but has a low SGV score. In Figure 19, the image patches for both cases show that the SGV scores are erroneous. The SGV-based NR-IQA method is generally consistent with the classification results in the high and low score ranges. In the middle score range, however, its evaluation does not match the quality level. This shows that the conventional NR-IQA method does not generalize well to quality classification of inspection images.
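To illustrate why a score based on local gray-intensity changes decreases under motion blur, the sketch below uses a generic squared-gradient sharpness measure as a stand-in. It is explicitly not the SGV formula of Equation (2), which is not reproduced here; the function names, the box-blur model of motion blur, and the synthetic stripe pattern are all assumptions chosen for a self-contained example.

```python
def squared_gradient_score(gray):
    """Mean squared intensity difference between each pixel and its right and lower neighbors.

    gray: list of rows, each a list of gray intensities.
    Generic sharpness proxy (higher = sharper); NOT the paper's SGV definition.
    """
    rows = len(gray)
    cols = len(gray[0]) if rows else 0
    total = 0.0
    count = 0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # horizontal neighbor
                total += (gray[i][j + 1] - gray[i][j]) ** 2
                count += 1
            if i + 1 < rows:  # vertical neighbor
                total += (gray[i + 1][j] - gray[i][j]) ** 2
                count += 1
    return total / count if count else 0.0


def horizontal_box_blur(gray, width=5):
    """Crude model of horizontal motion blur: average each pixel with its row neighbors."""
    half = width // 2
    blurred = []
    for row in gray:
        blurred.append([
            sum(row[max(0, j - half): j + half + 1])
            / len(row[max(0, j - half): j + half + 1])
            for j in range(len(row))
        ])
    return blurred


# Synthetic vertical stripes standing in for sharp line-like features such as cracks.
sharp = [[255 if (j // 2) % 2 else 0 for j in range(16)] for _ in range(8)]
blurred = horizontal_box_blur(sharp, width=5)
print(squared_gradient_score(sharp), squared_gradient_score(blurred))
```

Because blur spreads each edge over several pixels, the squared differences between neighbors shrink, so the blurred pattern scores lower than the sharp one.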

C. DISCUSSION
On the two validation datasets, which were not used for training, the CNN-based IQA model proposed in this study effectively classified images degraded by exposure problems and motion blur. For comparative validation, a MOS-based expert subjective quality classification was adopted, and the proposed method showed very high agreement with this human subjective classification. In addition, the results of the two classification methods were compared with the quality scores of conventional NR-IQA methods. An NR-IQA method is an algorithm that evaluates and scores the quality of the content in an image, so it cannot be used directly to verify classification performance. When the NR-IQA results were arranged in order of score, they were similar to those of the proposed method and the subjective evaluation in the ranges of very good or very poor quality. However, for the images in the middle score range, there was a large difference from the classification results. In other words, when the degree of degradation in an inspection image is small, the NR-IQA score often gives an inappropriate evaluation relative to the subjective quality classification. In contrast, the proposed CNN-based quality classification results mostly matched the subjective classification results.
Although existing NR-IQA methods can evaluate the degree of image degradation, they are inappropriate for the classification of low-quality images, which is the main purpose of this study. At present, therefore, classifying data according to quality requires subjective evaluation of the images. Subjective evaluation can clearly distinguish degraded images with high accuracy. However, it is not suitable for processing a large amount of data such as inspection images, since it is unreasonable for a human to manually classify the quality of numerous inspection images. In contrast, the proposed CNN-based image quality classification method can evaluate degradation in 5.7 seconds per frame for high-resolution images of 3840 × 2160 pixels and identify problematic images. In the experimental results, the proposed method approached the subjective image classification results with high accuracy. However, there are challenges to overcome regarding data processing time. For thousands of inspection images, the proposed method requires considerable time, about 16 hours. In addition, with the development of cameras and image sensors, the size of usable images is gradually increasing. It is therefore necessary to reduce the processing time required for large datasets through model compression and parameter reduction.
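The processing-time figures above can be checked with simple arithmetic: at 5.7 seconds per image, 16 hours corresponds to roughly ten thousand images. The helper names below are illustrative, and the calculation assumes strictly sequential processing at a constant time per image.

```python
def total_processing_hours(n_images, seconds_per_image=5.7):
    """Total sequential processing time in hours for n_images."""
    return n_images * seconds_per_image / 3600.0


def images_processed_in(hours, seconds_per_image=5.7):
    """Number of whole images that can be processed sequentially in the given time."""
    return int(hours * 3600 // seconds_per_image)


print(f"{total_processing_hours(10_000):.1f} h for 10,000 images")
print(f"{images_processed_in(16)} images in 16 h")
```

Under the same assumption, halving the per-image time through model compression would halve the total time for a dataset of a given size.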
There are also clear advantages to using only high-quality images for bridge assessment rather than data of uncertain quality. Figure 20 shows the stitching results for the second experimental validation dataset using only high-quality images (a) and all images (b). When only high-quality images are used, the overall modeling result is insufficient because of the limited number of images available for stitching. When all images are used without classification by quality, a better stitching output is produced. However, the low-quality images used in the image processing may affect the stitching results. The degradation caused by the low-quality images can be seen in magnified image patches of the same areas of the two stitching results. Blurred lines are visible in image patch #1 of Figure 20(a), but the effect of blur is removed in (b), where only clean images are used.
The inspection image appears blurry overall in image patch #2 of Figure 20(a), but in (b) this effect appears to have been removed. Based on these results, it appears that some of the degraded images contained in a large amount of data have an adverse influence on the stitching or 3D modeling of the inspection results. Furthermore, damage that goes undetected because of low-quality images can result in inadequate structural assessment. To address this issue, it is important to evaluate the quality of the inspection data and identify the location of the low-quality data.

IV. CONCLUSION
A strategy for consistently acquiring UAV-based bridge inspection datasets and a method for classifying degraded images using a CNN-based IQA classifier were proposed in this paper. The overall framework consists of two stages. In the first stage, a methodology was developed for obtaining images of consistent quality, based on analysis of data acquired on actual bridge structures under several types of degradation such as motion blur, out-of-focus, overexposure, and underexposure. Inspection data were collected under varying external environments and camera internal parameters related to degradation, and the variable values (e.g., UAV speed, shutter speed, aperture, ISO) were identified for the images in which degradation occurred. By using specific variable values, the acquisition of low-quality images can be prevented. These values are also used to classify the training data for the CNN-based IQA. In the second stage, a CNN-based IQA classifier trained on the dataset classified according to quality was described. A pre-trained VGG-16 was fine-tuned in order to properly extract features related to degradation. The overall performance of the trained CNN-based IQA model in classifying images according to their quality was validated through experimental analysis. A comprehensive evaluation of the proposed model on inspection image data with motion blur and exposure problems confirmed that its results are consistent with the MOS-based subjective classification by inspection experts. In addition, the CNN-based quality evaluation showed high performance in an indirect comparison with the PIQUE- and SGV-based NR-IQA methods. In other words, the CNN-based IQA model can learn image features for quality classification well, just as humans can recognize quality degradation.
The CNN-based IQA method has the advantage of performing the image processing for quality evaluation in less time than conventional methods, and better image processing results for stitching or 3D modeling can also be achieved by isolating degraded images. In terms of data evaluation, the CNN-based IQA method is considered an important strategy in UAV-based bridge inspection systems because it automatically classifies images that are inappropriate for structural assessment during the inspection process.
However, the proposed methodology can be further advanced in terms of practical bridge inspection using UAVs. In this study, a classification method covering various types of quality degradation was proposed, but a technique that can evaluate the level of each degradation still needs to be developed. With such a technique, a low-quality image could be reconstructed through an image enhancement method to finally obtain a high-quality image. The training data of the proposed CNN-based IQA model considered the most frequently occurring degradations; however, other issues, such as structures obscured by shadows or obstacles, should additionally be addressed to evaluate structural inspection images more completely.