IL-YOLO: An Efficient Detection Algorithm for Insulator Defects in Complex Backgrounds of Transmission Lines

Insulators play a pivotal role in power transmission lines, and the timely detection of defects in insulators is crucial to prevent potentially catastrophic consequences in terms of human lives and property. This paper proposes an insulator defect detection algorithm, named Insulator Lack-You Only Look Once (IL-YOLO), addressing the limitations observed in existing research concerning the complex background and multi-target challenges in insulator detection. The IL-YOLO algorithm focuses on detecting insulator defects within the intricate background of power transmission lines. To enhance its functionality, we propose three improved modules. Firstly, the Insulator Lack-Global Attention Mechanism (IL-GAM) addresses issues such as the mutual influence of weights and loss of detailed information in the original module. Secondly, the Insulator Lack-C3 (IL-C3) module is designed to emphasize key information while preserving feature extraction and fusion. Lastly, the Insulator Lack-SPPFCSPC (IL-SPPFCSPC) module enhances attention to both key and global information while extracting effective information from multi-scale features. Experimental results demonstrate that IL-YOLO achieves a detection accuracy of 91.2%, marking a 3.6% improvement compared to the YOLOv5 algorithm. Furthermore, precision improves by 0.5%, recall increases by 6.3%, and the F1 score sees a boost of 3.8%. Notably, IL-YOLO achieves a frame rate of 90 frames per second (FPS), showcasing its capacity for real-time detection. Additional experiments affirm IL-YOLO’s accuracy in completing insulator defect detection tasks in both general and complex backgrounds, highlighting its substantial advantages in addressing complex background and multi-target challenges.


I. INTRODUCTION
As of the end of 2022, the total length of 220 kV and above transmission lines in China had reached 794,000 kilometers [1]. The continuous growth of the Chinese power industry has led to a gradual expansion in the scale of transmission lines. Insulators, integral components of transmission lines, are prone to damage from prolonged exposure to outdoor conditions, resulting in defects such as the one depicted in Fig 1. Statistically, over 75% of global power grid accidents are attributed to insulator defects annually [2], posing a significant threat to the secure and stable operation of power grids. Therefore, the effective detection of insulator defects holds substantial practical importance.
The associate editor coordinating the review of this manuscript and approving it for publication was Gaetano Zizzo.
Historically, the detection of insulator defects relied on manual inspection. However, with the emergence of deep learning in the field, object detection algorithms have progressively replaced traditional manual inspections. In contrast to manual inspections, which are costly and time-consuming, object detection algorithms can efficiently and safely accomplish the task of insulator defect detection. This transition signifies a noteworthy advancement in the insulator detection capabilities of the power industry. Presently, object detection algorithms are broadly categorized into single-stage detection algorithms and two-stage detection algorithms [3]. Single-stage detection algorithms predominantly include the You Only Look Once (YOLO) series [4], [5], [6], [7], [8], and the Single Shot Multi-Box Detector (SSD) [9]. Conversely, two-stage detection algorithms are primarily represented by the Region-based Convolutional Neural Network (R-CNN) series [10], [11], [12]. Previous research has predominantly concentrated on enhancing the detection accuracy of two-stage detection algorithms, as evidenced in the works of Shuang et al. [13], Ou et al. [14], and Wang et al. [15].
To address the challenge of relatively slow detection speed, single-stage detection algorithms have been introduced. Sadykova et al. [16], building upon the YOLOv2 network, incorporated data augmentation tools to prevent overfitting of training data, resulting in improved recognition of insulators covered with ice, snow, water, and similar substances. Liu et al. [17] combined YOLOv3 with CSPDarknet53, introducing the CIOU Loss and K-means++ clustering algorithm to enhance the detection accuracy of insulator defects, albeit at the cost of a significant reduction in detection speed. Bao et al. [18], utilizing YOLOv5 as the base network architecture, introduced the CA attention mechanism module in the Backbone and incorporated the Bi-FPN network structure, thereby improving the detection accuracy of insulator defects. However, the model introduced an excessive number of parameters, hindering edge deployment. Wang et al. [19], by introducing Darknet53 to replace the original backbone of YOLOv4, aimed to enhance the detection accuracy of insulators but focused only on insulator detection, neglecting the broader issue of insulator defect detection. Han et al. [20], in designing the D-CSPDarknet53 network to replace the YOLOv4 backbone, incorporated the Shuffle Attention (SA) mechanism into the feature fusion network and introduced a new detection head to enhance the recognition capability of insulator defects. However, there remains room for further optimization in terms of both detection accuracy and speed. Huang et al. [21], achieving model lightweighting by pruning redundant layers from YOLOv5, introduced an adaptive attention module between adjacent residual modules to enhance the network's feature learning capability. Nevertheless, further improvement is required in terms of model detection accuracy. Yi et al. [22], building upon YOLOv5, introduced GSConv, designed the VoV-GSCSP module and MaECA attention mechanism module, optimized the loss function in SPPF, and introduced the SIoU loss function. However, there is still potential for further optimization and enhancement in model accuracy.
In general, deep learning has showcased substantial potential in the realm of insulator defect detection. However, current research encounters challenges in addressing complex backgrounds and multi-target issues in insulator detection. This paper adopts YOLOv5 as the foundational model, which, in comparison to other versions, achieves a more favorable balance between detection accuracy and speed, with a model size conducive to edge deployment. Building upon this foundation, the paper introduces a detection algorithm named Insulator Lack-You Only Look Once (IL-YOLO), specifically designed to tackle the task of insulator defect detection within the intricate background of power transmission lines. The introduction of this algorithm brings forth novel perspectives and methods to overcome existing challenges in insulator detection. The key contributions of this study are summarized as follows.
a) To address the issue of low saliency in detecting faulty targets within complex backgrounds, which often leads to challenges in accurate detection, the introduced Insulator Lack-Global Attention Mechanism (IL-GAM) attention module is incorporated into the backbone. This augmentation enhances the model's ability to recognize defect positions, intensifies its focus on crucial information, and enhances the extraction of key features effectively during the early and middle stages of the network. Consequently, this augmentation contributes to an improvement in the model's overall detection accuracy.
b) To address the challenge of information loss during the feature extraction and fusion process, the proposed Insulator Lack-C3 (IL-C3) module is introduced to replace the original C3 module in the network. This module intensifies the emphasis on target information and mitigates the likelihood of gradient vanishing issues, consequently reducing instances of missed and false detections. This modification results in a more compact model size while concurrently enhancing its capacity to extract pertinent information pertaining to insulator defects.
c) To address the multi-target challenge associated with insulator defects, the proposed Insulator Lack-SPPFCSPC (IL-SPPFCSPC) module replaces the original SPPF module. This novel module, while extracting effective information from multi-scale features, enhances attention to both key and global information. This improvement enables the algorithm to adeptly classify and extract information from multi-target images, thereby further augmenting the algorithm's overall detection performance and accuracy.

II. METHOD
In this section, we introduce the improvement strategies and the overall architecture of the proposed algorithm. All of the proposed modules are designed to be plug-and-play.

A. IL-GAM
In the task of insulator defect detection, we encounter two primary challenges. Firstly, insulators are frequently positioned in complex background environments, including other transmission lines and buildings, which introduce an abundance of background information in terms of texture, color, and shape. This complexity poses a challenge to the defect detection task. Secondly, the use of long-distance photography, due to the impracticality of capturing images at close range, may lead to difficulties in extracting characteristic information about the defect's position, thereby affecting the accuracy of defect detection.
In recent years, various attention mechanisms have continually surfaced, showcasing significant progress in performance. These encompass Squeeze-and-Excitation (SE) [23], Convolutional Block Attention Module (CBAM) [24], Coordinate Attention (CA) [25], Normalization-based Attention Module (NAM) [26], and others. While these attention mechanisms have achieved noteworthy results in enhancing model performance, they generally concentrate on information from only two dimensions. In contrast, the recently introduced Global Attention Mechanism (GAM) [27] can comprehensively utilize information from all three dimensions, thereby further enhancing model performance.
However, when addressing insulator defect detection tasks in complex backgrounds, GAM still exhibits certain limitations. These include the influence of channel attention weight on the output of spatial attention weight and the excessive blurring of information for small targets, leading to a reduction in defect detection accuracy. Recognizing these challenges, this paper introduces an optimized and enhanced attention mechanism named Insulator Lack-Global Attention Mechanism (IL-GAM). The overall module design is depicted in Fig 2. In the initial stage, the input feature map F_1 is simultaneously fed into both the channel attention module and the spatial attention module. This simultaneous processing is implemented to avoid mutual interference of weights, enhance the exchange and integration of global information, and consequently improve computational efficiency. The outputs from these modules are then summed, and the resulting sum undergoes sigmoid activation to obtain the final attention weights. Subsequently, these weights are element-wise multiplied with the input feature map F_1. The resulting product is then added to the residual structure, preserving the richness of information related to small targets and reinforcing the representational capacity of the output feature map F_2. The formula is expressed as (1) and (2), where:
• M_F represents the mixed-domain attention weight.
• M_c represents the channel attention weight.
• M_s represents the spatial attention weight.
• σ represents the sigmoid activation function.
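Equations (1) and (2) did not survive in this copy of the text. Based only on the prose description above (sum the two branch outputs, apply the sigmoid, multiply element-wise with F_1, then add the residual), one plausible reading is sketched below. The equation numbering and the use of ⊗ for element-wise multiplication are assumptions, and the authors' typeset form may differ.

```latex
\begin{align}
M_F &= \sigma\left(M_c + M_s\right) \tag{1} \\
F_2 &= M_F \otimes F_1 + F_1 \tag{2}
\end{align}
```

In this reading, the residual term F_1 in (2) is what preserves the small-target detail that a purely multiplicative attention weight could suppress.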

1) CHANNEL ATTENTION MODULE
In the Channel Attention Module, the Mish activation function [28] is employed instead of the conventional ReLU [29]. The functions are compared in Fig 3. Mish demonstrates greater flexibility in managing negative values, facilitating information propagation and mitigating gradient saturation issues. Simultaneously, it maintains smoothness to enhance the efficacy of gradient descent. This enhancement equips IL-GAM with improved feature extraction capabilities, especially in the context of complex image backgrounds. The Channel Attention Module initiates by transforming the dimensions of the input feature map F_1. Subsequently, it employs a Multi-Layer Perceptron (MLP) for information propagation, followed by dimension reverse transformation. Finally, the processed result undergoes sigmoid activation. The Mish function is utilized for information processing within the MLP. The structural details are depicted in Fig 4. The formula is expressed as (3), where:
• M_c represents the channel attention weight.
• per represents the dimension transformation operation.
• MLP represents the fully connected layer.
• σ represents the sigmoid activation function.
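To make the Mish versus ReLU comparison above concrete, the following minimal sketch evaluates both functions on a few scalar inputs. It uses the standard definition Mish(x) = x · tanh(softplus(x)); the helper names are illustrative and are not taken from the paper's implementation.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Standard Mish definition: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    # ReLU zeroes every negative input.
    return max(x, 0.0)


# Mish keeps a small negative response where ReLU outputs exactly zero,
# and approaches the identity for large positive inputs.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  mish={mish(x):+.4f}")
```

For negative inputs ReLU returns zero (and passes no gradient), whereas Mish returns a small non-zero value, which is the behaviour the text refers to as greater flexibility in managing negative values.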
14534 VOLUME 12, 2024 Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

2) SPATIAL ATTENTION MODULE
The Spatial Attention Module primarily employs convolution for processing. Two convolutional kernels of size 7 × 7 are utilized. By executing dimension reduction followed by dimension augmentation, spatial dimension information is extracted and fused to enhance attention to spatial details. Finally, the sigmoid function is applied for further processing. The structural details are illustrated in Fig 5. The formula is expressed as (4), where:
• M_s represents the spatial attention weight.
• σ represents the sigmoid activation function.
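Equations (3) and (4) are likewise missing from this copy. Going only by the descriptions of the two attention branches (dimension transformation, MLP, reverse transformation and sigmoid for the channel branch; two 7 × 7 convolutions that first reduce and then restore the dimension, followed by a sigmoid, for the spatial branch), a hedged reconstruction could read as follows. The symbols per^{-1}, f^{↓} and f^{↑} are notation introduced here for illustration, and any normalization or activation between the two convolutions is omitted because the text does not specify it.

```latex
\begin{align}
M_c &= \sigma\left(\mathrm{per}^{-1}\left(\mathrm{MLP}\left(\mathrm{per}(F_1)\right)\right)\right) \tag{3} \\
M_s &= \sigma\left(f^{\uparrow}_{7\times 7}\left(f^{\downarrow}_{7\times 7}(F_1)\right)\right) \tag{4}
\end{align}
```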

B. IL-C3
In YOLOv5, the C3 module showcases capabilities in feature extraction and fusion. However, for the specific task of insulator defect detection, this module's design presents certain limitations. Insulator defects typically manifest small target characteristics, and the feature information within the defect area often resembles the surrounding background. The original C3 module processes information exclusively through 1 × 1 and 3 × 3 convolutions. When confronted with small target characteristics and areas exhibiting high similarity, this structure may result in the loss of detailed information, thereby impacting the accurate extraction of insulator defects.
To mitigate the previously mentioned issues, this paper introduces the Insulator Lack-C3 (IL-C3) module, drawing inspiration from the core concepts of ResNet [30], GSConv [31], and SE [23]. This enhancement is designed to more effectively address small target features, boost the extraction capability of key information, and enhance the model's performance in the task of insulator defect detection. The structure is illustrated in Fig 6.
The process begins with a 1 × 1 convolutional layer performing dimension reduction on the input feature map, thereby reducing the computational complexity of the model. The dimension-reduced feature map is then fed into the Insulator Lack-BottleneckX (IL-BottleneckX) module, specifically designed for feature extraction to intensify attention to crucial information. Subsequently, the output of the IL-BottleneckX module is concatenated with the original input feature map, aiming to fuse the high-level semantic information extracted and the low-level features from the original input. Finally, a 1 × 1 convolutional layer is employed to adjust the channel number and obtain the output feature map. The formula is expressed as (5).
The IL-BottleneckX module consists of two sub-structures, IL-Bottleneck1 and IL-Bottleneck2, illustrated in Fig 7 and Fig 8. Both sub-structures primarily utilize 1 × 1 convolutional layers and 3 × 3 GSConv for dimension reduction and feature extraction. Subsequently, the SE attention mechanism is employed to focus on crucial information related to insulator defects. Finally, a 1 × 1 convolutional layer ensures the output channel number is consistent with the input, enhancing the quality of feature fusion and thereby improving network detection accuracy. The formula is expressed as (6) and (7), where:
• f_{1×1} represents the convolutional layer with a 1 × 1 convolutional kernel.
• S denotes the SE attention mechanism.

C. IL-SPPFCSPC
In YOLOv5, the Spatial Pyramid Pooling (SPP) module elevates the network's detection accuracy by integrating features from various scales. Despite the capabilities demonstrated by the SPPF [7] module and the SPPCSPC [8] module proposed in 2023 for multi-scale feature extraction, they still possess certain limitations in addressing the multi-target issue associated with insulator defects in power transmission lines.
To tackle these challenges, this paper introduces the Insulator Lack-SPPFCSPC (IL-SPPFCSPC) module. Drawing inspiration from the design principles of SPPF to enhance model training speed, this module integrates Global Average Pooling (GAP) and Global Max Pooling (GMP) to comprehensively consider crucial information from the target and global context. Research [24] suggests that the simultaneous use of these two pooling methods enhances feature diversity and expressive power, thereby improving the model's robustness and generalization, especially in handling multi-target problems in complex backgrounds. The IL-SPPFCSPC module, through cross-scale information fusion, adapts to the requirements of images with different resolutions, enhancing the model's adaptability and generalization performance, ultimately optimizing detection effectiveness and accuracy. The overall module design is illustrated in Fig 9.
The process initiates with the input feature layer undergoing feature extraction through convolutional layers with kernel sizes of 1 × 1, 3 × 3, and 1 × 1. Subsequently, the input flows into the Compound Pooling module for multi-scale feature fusion, as depicted in Fig 10. Following this, a 1 × 1 convolutional layer is applied to simultaneously reduce the number of channels and parameters. The processed output then enters a 3 × 3 convolutional layer to restore the channel number, and the resulting output is concatenated with the initial output of the 1 × 1 convolutional layer. Finally, a 1 × 1 convolutional layer is employed to maintain the output's channel number consistent with the original input feature layer. The formula is expressed as (8). The Compound Pooling module continuously feeds the input into three convolutional layers with kernel sizes of 5, padding of 2, and global max-pooling, as well as global average-pooling with the same operations. Each output is concatenated with the original input channels. The formula is expressed as (9), where:
• f_{1×1} represents the convolutional layer with a 1 × 1 convolutional kernel.
• C represents the concatenation operation.
• CP represents the Compound Pooling module.
• GAP represents the global average pooling.
• GMP represents the global max pooling.
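The SPPF design principle that the module borrows can be illustrated in isolation. The sketch below does not implement IL-SPPFCSPC or the Compound Pooling module, whose exact layout is only partly recoverable from the text. It only shows, under the assumption of stride-1 max pooling with kernel size 5 and padding 2, that applying the same small pooling window in series covers the same receptive field as larger parallel windows (9 × 9 and 13 × 13), which is why the serial form is cheaper. The function name and the test grid are hypothetical.

```python
def max_pool2d(grid, k):
    """Stride-1 max pooling with k // 2 padding, so the output keeps the input size.

    Clipping the window at the border is equivalent to padding with -infinity.
    """
    h, w, r = len(grid), len(grid[0]), k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


# Hypothetical 12 x 12 feature map with deterministic pseudo-random values.
feature = [[(7 * i + 13 * j) % 31 for j in range(12)] for i in range(12)]

# Serial (SPPF-style) pooling: reuse the previous output each time.
p1 = max_pool2d(feature, 5)
p2 = max_pool2d(p1, 5)
p3 = max_pool2d(p2, 5)

# Parallel (SPP-style) pooling with larger kernels gives identical maps.
print(p2 == max_pool2d(feature, 9))
print(p3 == max_pool2d(feature, 13))
```

Because each serial stage reuses the previous result, three 5 × 5 passes replace one 5 × 5, one 9 × 9 and one 13 × 13 pass over the input.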

D. IL-YOLO NETWORK MODEL
The IL-YOLO algorithm proposed in this paper represents an optimization of the YOLOv5 algorithm, incorporating significant enhancements to the backbone and neck networks of YOLOv5.
In the backbone, IL-YOLO introduces the IL-GAM attention mechanism module to enhance the network's focus on critical areas, eliminating interference from non-essential information and improving the feature extraction capability.
The symbols used in the bounding-box regression loss are defined as follows:
• ρ represents the Euclidean distance between the center points of two bounding boxes.
• b represents the center point of the predicted box.
• b^{gt} represents the center point of the ground truth box.
• w^{gt} represents the width of the ground truth bounding box.
• h^{gt} represents the height of the ground truth bounding box.
• w represents the width of the predicted bounding box.
• h represents the height of the predicted bounding box.
• v represents the parameter measuring the similarity of aspect ratios.
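The loss equations themselves are missing from this copy, but the symbols listed above match the standard Complete IoU (CIoU) loss. Assuming that this is the loss intended, the sketch below computes it for two boxes given as (center x, center y, width, height). The enclosing-box diagonal c and the trade-off weight α come from the standard CIoU definition rather than from the surviving text, and the function name is illustrative.

```python
import math


def ciou_loss(box, box_gt, eps=1e-9):
    """Standard CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v.

    Boxes are (cx, cy, w, h). This is an assumed reading of the paper's loss.
    """
    cx, cy, w, h = box
    gx, gy, gw, gh = box_gt

    # Corner coordinates of the predicted and ground truth boxes.
    x1, y1, x2, y2 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
    gx1, gy1, gx2, gy2 = gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter + eps
    iou = inter / union

    # rho^2: squared Euclidean distance between the center points b and b^gt.
    rho2 = (cx - gx) ** 2 + (cy - gy) ** 2

    # c^2: squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(x2, gx2) - min(x1, gx1)) ** 2 + (max(y2, gy2) - min(y1, gy1)) ** 2 + eps

    # v: aspect-ratio consistency term; alpha: its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


# Identical boxes give a loss near zero; a distant box is penalized
# by the center-distance term even though the IoU is already zero.
print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))
print(ciou_loss((0, 0, 2, 2), (10, 0, 2, 2)))
```

Unlike a plain IoU loss, the center-distance term still provides a gradient signal when the predicted and ground truth boxes do not overlap.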

III. EXPERIMENTAL RESULTS AND ANALYSIS

A. SELECTION AND CONSTRUCTION OF THE DATASET
For the task of insulator defect detection in power transmission lines, this study implemented a series of effective strategies. Recognizing the relatively small scale of the original dataset, which could lead to insufficient model training and underfitting issues, we conducted in-depth research and introduced strategies such as data augmentation and feature enhancement to improve training effectiveness. The primary datasets utilized include the China Power Line Insulator Dataset (CPLID) and some images collected by drones in actual scenarios. The division between the training set and the test set followed an 8:2 ratio. The specific number of data images is shown in Table 1.
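An 8:2 division such as the one described above can be sketched as follows. The text does not say whether the split was random or which seed was used, so the shuffling, the seed value, and the file names here are assumptions made purely for illustration.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle reproducibly and split into (train, test) by the given ratio."""
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    rng.shuffle(items)
    n_train = round(len(items) * train_ratio)
    return items[:n_train], items[n_train:]


# Hypothetical image file names standing in for the real dataset.
images = [f"img_{i:04d}.jpg" for i in range(100)]
train_set, test_set = split_dataset(images)
print(len(train_set), len(test_set))
```

Fixing the seed keeps the test images identical across the ablation and comparison runs, so differences in the metrics reflect the model rather than the split.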

B. EXPERIMENTAL ENVIRONMENT AND HYPERPARAMETER SETTINGS
The experiment was conducted on a Windows 10 operating system using an NVIDIA GeForce RTX 3080 for both training and testing. The model was built using the PyTorch 1.12.1 framework and the Python 3.10 programming language.
In our experiment, stochastic gradient descent (SGD) was employed as the optimizer. The initial learning rate was set to 0.01, and it was updated using the cosine annealing learning rate schedule. The optimizer's momentum and weight decay values were configured at 0.937 and 0.0005, respectively. Each model underwent training for 200 epochs, with a batch size of 16 images per iteration. Regarding data augmentation parameters, the augmentation factors for the hue, saturation, and brightness of input images were set to 0.015, 0.7, and 0.4, respectively. During preprocessing, each image was flipped horizontally with a probability of 50%. Additionally, cutout data augmentation was employed to enhance the model's generalization and accuracy. More detailed hyperparameter settings are provided in Table 2.
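The cosine annealing schedule mentioned above can be sketched with the reported initial learning rate (0.01) and epoch count (200). The final learning rate is not stated in this section, so the `lr_final_fraction` parameter below is an assumed placeholder, and the function is a generic cosine schedule rather than the exact one used in the experiments.

```python
import math


def cosine_lr(epoch, total_epochs=200, lr0=0.01, lr_final_fraction=0.01):
    """Generic cosine annealing from lr0 down to lr0 * lr_final_fraction.

    lr_final_fraction is an assumption; the paper defers details to Table 2.
    """
    lr_min = lr0 * lr_final_fraction
    cos_term = 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr0 - lr_min) * cos_term


# The rate starts at lr0, decays slowly at first, fastest mid-training,
# and flattens out again towards the final value.
for epoch in (0, 50, 100, 150, 200):
    print(epoch, round(cosine_lr(epoch), 6))
```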

C. EVALUATION METRICS
To rigorously evaluate the model's performance, this study employs commonly used metrics in object detection assessment. Specifically, True Positives (TP) denote instances where the model accurately predicts positive samples, indicating the number of correctly detected insulators. False Negatives (FN) represent cases where the model incorrectly predicts positive samples as negative, indicating the number of insulators that were missed (undetected). False Positives (FP) correspond to situations where the model incorrectly predicts negative samples as positive, reflecting the number of falsely detected insulators.
Recall (R) is the ratio of the number of samples correctly predicted as the positive class by the classifier to the total number of actual positive samples. The formula is expressed as (13).
Precision (P) is the ratio of the number of samples correctly predicted as the positive class by the classifier to the total number of samples classified as the positive class. The formula is expressed as (14).
Average Precision (AP) is the average of the areas under the precision-recall curves calculated for each class. The formula is expressed as (15).
Mean Average Precision (mAP) is the average of the Average Precisions (AP) calculated for all classes. The formula is expressed as (16), where:
• n represents the set of detection categories.
• ρ(r) represents the average precision for predicting targets within each category.
F1 is the harmonic mean of precision and recall. A higher F1 score indicates a more effective experimental method. The formula is expressed as (17).
Frames Per Second (FPS) is a measure of the number of frames processed per second in image or video processing.
Giga Floating Point Operations Per Second (GFLOPS) measures the number of floating-point operations executed per second during model inference, serving as an indicator of model computational efficiency.
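The precision, recall, F1, and mAP definitions given above follow the usual conventions and can be written out directly. The counts and per-class AP values in the example are made up for illustration and are not results from the paper.

```python
def precision(tp, fp):
    # Share of predicted positives that are correct.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Share of actual positives that were detected.
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r) if (p + r) else 0.0


def mean_average_precision(ap_per_class):
    # Mean of the per-class average precision values.
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 misses.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
print(p, r, f1_score(p, r))

# Hypothetical per-class AP values for two classes.
print(mean_average_precision([0.9, 0.8]))
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two values, so a large gain in recall with a nearly unchanged precision still raises F1 noticeably.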

D. RESULT ANALYSIS

1) ABLATION EXPERIMENTS
This study investigates the effectiveness of IL-GAM, IL-C3, and IL-SPPFCSPC for detecting defects in the transmission line dataset. To achieve this, ablation experiments are conducted to assess the impact of these improvement methods on the experimental results. The experimental data in Table 4 are based on the results of 200 training rounds, and the results of the ablation experiments are shown in Table 3.
G-YOLO, C-YOLO, S-YOLO, GC-YOLO, CS-YOLO, GS-YOLO, and IL-YOLO all achieve higher mAP values than the baseline YOLOv5 model, reaching 88.3%, 88.8%, 89.4%, 88%, 88.3%, 89.4%, and 91.2%, respectively. These improved models increase the average precision of defect detection by 0.7%, 1.2%, 1.8%, 0.4%, 0.7%, 1.2%, and 3.6%, respectively. For insulator detection, the average precision of these models is higher than that of the basic YOLOv5 model. For G-YOLO, the addition of the IL-GAM attention mechanism module has increased attention to small target objects to some extent. Compared to the basic model, the IL-GAM module increases the focus on small target information while improving model accuracy. Experimental results show that G-YOLO's F1 score has increased by 1.4%, and the detection accuracy of small target information has increased by 1.2%. C-YOLO replaces the original C3 module in the YOLOv5 model with the IL-C3 module, resulting in a 1.2% improvement in mAP compared to the original YOLOv5. S-YOLO adds the IL-SPPFCSPC module to replace the original SPPF module, significantly improving the network's detection accuracy by 1.8%.
From the GC-YOLO, CS-YOLO, and GS-YOLO models, we can observe that the combination of various modules has improved the model's detection accuracy to a certain extent. However, due to the increase in network layers and the introduction of complex modules, the inference time of the models has significantly increased, leading to a decrease in detection speed compared to the YOLOv5 baseline model. A comparison between the IL-YOLO model and the original YOLOv5 model training results is shown in Fig 12.
From Fig 12(a), (b), and (c), it can be observed that IL-YOLO shows a significant improvement in detection accuracy compared to YOLOv5. The mAP has increased by 3.6%, precision has improved by 0.5%, recall has increased by 6.3%, and the F1 score has risen by 4%. The results indicate that introducing the three mentioned modules simultaneously into our IL-YOLO model leads to a significant enhancement in both detection accuracy and precision.
However, this improvement comes with certain costs, as the introduction of additional attention mechanisms and complex modules increases the depth of the network, thereby raising the computational cost for detection time. Although IL-YOLO experiences a decrease in detection speed by 21.7%, the model still achieves a detection speed of 90 frames per second, which remains suitable for real-time detection needs.
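The two kinds of change quoted in this analysis can be checked with a short calculation. The speed figure is a relative change against the baseline frame rate (115 FPS for YOLOv5, as reported in the comparative experiments), whereas the mAP gain reads as a difference in percentage points. The baseline mAP of 87.6% used below is not stated explicitly; it is implied by the reported 91.2% and the 3.6% gain. The function names are illustrative.

```python
def relative_change_percent(new, baseline):
    # Change expressed relative to the baseline value.
    return (new - baseline) / baseline * 100.0


def percentage_point_gain(new, baseline):
    # Plain difference between two percentages.
    return new - baseline


# Detection speed: 90 FPS for IL-YOLO against 115 FPS for YOLOv5.
fps_change = relative_change_percent(90, 115)
print(round(fps_change, 1))

# mAP: 91.2% for IL-YOLO against an implied 87.6% baseline.
map_gain = percentage_point_gain(91.2, 87.6)
print(round(map_gain, 1))
```

The first value reproduces the 21.7% speed decrease reported above, and the second reproduces the 3.6% mAP gain as a percentage-point difference.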

2) COMPARATIVE EXPERIMENTS
To further validate the detection performance of IL-YOLO, we compared its performance with other models on the CPLID public dataset. The comparative experimental results are presented in Table 4.
IL-YOLO exhibits a significant improvement in mAP compared to other models. In comparison to the latest YOLO-S model, IL-YOLO achieved a 3.1% improvement. Relative to YOLOv3 and YOLOv4, IL-YOLO improved by 5.4% and 6.2%, respectively. Compared to YOLOv6 and YOLOv7, IL-YOLO improved by 3.9% and 0.1%, respectively. Furthermore, when compared to the base model YOLOv5, IL-YOLO demonstrated a 3.6% performance improvement. These results highlight the enhanced accuracy of IL-YOLO in detecting insulators and their defects.
Similarly, FPS is a crucial indicator for evaluating model performance. YOLOv7 achieves a detection speed of only 72 frames per second due to its memory consumption and computational complexity. In contrast, IL-YOLO reaches a detection speed of 90 frames per second, representing a 4.7%, 15.3%, and 24.3% increase compared to YOLOv3, YOLOv4, and YOLOv7, respectively. However, IL-YOLO's detection speed is 11.8% lower than that of YOLOv6. YOLOv5, the base model, exhibits the fastest detection speed among all models, reaching 115 frames per second, which is 11.3% higher than IL-YOLO. Nevertheless, IL-YOLO's 90 frames per second remains sufficient for real-time detection.
In terms of model computational load, YOLOv3, YOLOv4, and YOLOv7 are relatively large, reaching 193.9 GFLOPS, 119.8 GFLOPS, and 103.2 GFLOPS, respectively, which may pose challenges when testing on mobile devices. In contrast, other YOLO series models have relatively smaller computational loads, with the latest YOLO-S model only occupying 14.9 GFLOPS, making it the model with the smallest computational load. IL-YOLO's model computational load is 21.9 GFLOPS, representing a 37.9% increase compared to the baseline model.
In terms of precision, recall, and F1 score, IL-YOLO achieves significant improvements. IL-YOLO's precision is only 1.4% below that of YOLOv7, its recall is slightly higher by 0.1%, and its F1 score is only 0.4% below that of the YOLOv7 model. However, when compared to other models, IL-YOLO consistently demonstrates notable improvements. In particular, compared to the base model YOLOv5, precision increased by 0.5%, recall increased by 6.3%, and the F1 score increased by 4%.
Through the comparative experiments on the above models, we find that the IL-YOLO network structure has a significant advantage in detection speed when facing models with larger computational loads, such as YOLOv3, YOLOv4, and YOLOv7. Additionally, it demonstrates comparable detection accuracy to the latest YOLOv7 model. When facing models with relatively smaller computational loads, such as YOLOv5, YOLOv6, and YOLO-S, IL-YOLO lags slightly in detection speed but exhibits a clear advantage in detection accuracy. In summary, the proposed IL-YOLO detection model achieves a good balance between detection speed and accuracy.
In the comparison of the obtained detection results, it is evident that the proposed IL-YOLO insulator defect detection network has significantly improved the effectiveness of addressing insulators and their defects in power transmission lines. The results demonstrate noticeable enhancements in both the localization of insulators and the recognition of defects. In scenarios involving long distances and multiple targets, the original YOLOv5 network may encounter issues such as low detection accuracy, missed detections, and false positives. In contrast, the proposed network excels in accurately detecting and identifying insulators and their defects on power transmission lines under similar conditions, showcasing a substantial improvement in accuracy compared to the original network.
A detailed comparison of the experimental results examines the corresponding panels of Fig 14 and Fig 13.

IV. MODEL ROBUSTNESS TESTING AND ANALYSIS
In practical applications, power transmission lines typically involve high voltage levels. Consequently, drones may face [...]
[...] detection performance of the IL-YOLO model is more significant, with a defect detection accuracy of 0.94, higher than the baseline network's 0.93. In the presence of noise in the image, the baseline network misses distant insulator targets, while IL-YOLO accurately identifies them, with detection accuracy for insulators being 0.97, 0.82, and 0.78, respectively. Finally, in the comparison of Fig 16(d) and Fig 15(d), IL-YOLO's defect detection accuracy is 0.86, slightly lower than the baseline network's 0.87. However, when facing occluded insulator target information, the baseline network misses detections, while IL-YOLO still accurately identifies them, with detection accuracy of 0.98, 0.65, and 0.64, respectively.
Based on the comprehensive analysis of detection images in various complex backgrounds on power transmission lines, we conclude that the proposed IL-YOLO detection network for insulator defects in power transmission lines demonstrates excellent performance in the presence of external factors such as noise, shadows, and rotation. Moreover, when encountering scenarios involving multiple targets, small targets, or occlusions, the IL-YOLO detection model excels in accurately identifying insulators and their defects, showcasing its remarkable robustness.
[...] 90 frames per second, slightly reduced but still sufficient for real-time detection.
e) Ablation experiment results show that incorporating IL-GAM, IL-C3, and IL-SPPFCSPC modules into the network structure significantly improves detection results, with an increase of 3.6%.
f) The designed IL-YOLO in this paper performs well in detecting insulators and their defects in both normal and complex backgrounds, handling scenarios involving multiple targets, small targets, and occlusions.The analysis of detection results and robustness testing indicates an improvement in detection accuracy of up to 5%.
In future research, we will focus on lightweight improvements to the model to increase target detection speed while maintaining detection accuracy.Edge deployment considerations will also be explored, applying the network to unmanned aerial vehicles (UAVs) for real-time and efficient detection of power transmission lines.

FIGURE 4. Structure of the channel attention module.

FIGURE 5. Structure of the spatial attention module.
As illustrated in Fig 7 and Fig 8, the IL-BottleneckX module has two sub-structures. IL-Bottleneck1 addresses the relationship between input and output through a residual connection, alleviating information loss issues and consequently enhancing model performance and generalization ability. IL-Bottleneck2 sequentially transmits information.
The ablation data are based on the results of 200 training rounds. G-YOLO represents the model with only the IL-GAM module. C-YOLO represents the model with only the IL-C3 module. S-YOLO represents the model with only the IL-SPPFCSPC module. GC-YOLO represents the model containing both IL-GAM and IL-C3 modules. CS-YOLO represents the model with both IL-C3 and IL-SPPFCSPC modules. GS-YOLO represents the model with both IL-GAM and IL-SPPFCSPC modules. IL-YOLO includes all three modules: IL-GAM, IL-C3, and IL-SPPFCSPC. The results of the ablation experiments are shown in Table 3.

FIGURE 12. Comparison of model training results.
Comparing Fig 14(a) and Fig 13(a), IL-YOLO successfully achieved comprehensive recognition of insulators, attaining a defect detection accuracy of 0.88. Although slightly lower than the original network's 0.95, it is sufficient to ensure accurate identification of insulators without missed detections. Moving on to Fig 14(b) and Fig 13(b), IL-YOLO exhibits a defect detection accuracy of 0.91 when facing the multi-target problem, surpassing the original network's 0.90. IL-YOLO demonstrates good detection accuracy in the presence of multiple insulators, with respective accuracies of 0.96, 0.90, and 0.89. It is noteworthy that the original network exhibits false positives for insulators in multi-target scenarios, while IL-YOLO shows more reliable performance in this regard. Regarding Fig 14(c) and Fig 13(c), IL-YOLO achieves a defect detection accuracy of 0.92, slightly lower than the original network's 0.95. Due to the similarity between the background color and insulators, the original network has false positives for the defect positions, while IL-YOLO performs better in such backgrounds. Finally, in the comparison of Fig 14(d) and Fig 13(d), IL-YOLO's defect detection accuracy is 0.93, significantly higher than the original network's 0.88.
Through the detection of insulator defect images in power transmission lines under complex backgrounds, we found that IL-YOLO proposed in this study exhibits better detection performance in complex scenarios. The comparison of Fig 16(a) and Fig 15(a) shows that IL-YOLO achieves a defect detection accuracy of 0.87, significantly higher than the baseline network's 0.81, and the detection accuracy for insulators is 0.92, also higher than the baseline network's 0.87. In the comparison of Fig 16(b) and Fig 15(b), IL-YOLO's defect detection accuracy is 0.87, lower than the baseline model's [...]

TABLE 1. Composition of dataset.

TABLE 4. Comparison results of model performance.