Global-Similarity Local-Salience Network for Traffic Weather Recognition

Recognizing the current weather conditions from a single image is of great theoretical significance. It also has potential practical value for daily life and traffic scheduling. To achieve that, typical weather recognition methods focus on learning a general weather description, e.g., sunny, cloudy, foggy, rainy or snowy, for the overall weather condition. However, such a description is far from sufficient for many tasks, especially traffic management and control. To solve this problem, this paper proposes a Global-Similarity Local-Salience Network (abbreviated as GSLSNet) for traffic weather recognition. Specifically, a simple but effective Global-Similarity Module (GSM) is proposed to recognize the overall weather condition, and a Local-Salience Module (LSM) is presented to restrict the network to focus on road weather details. In addition, this paper provides a new traffic weather dataset, named TWData, which is the first finely categorized dataset designed specifically for highway weather recognition. Experimental comparisons with state-of-the-art methods on both public datasets and TWData demonstrate the superiority of the proposed GSLSNet.


I. INTRODUCTION
Weather recognition plays a fundamental role in daily applications, such as traffic management [1], street analysis [2], self-driving assistance [3]-[5] and robot navigation [6]. It is also of great significance for both computer vision and pattern recognition tasks [7]-[13].
Traditional weather recognition methods rely largely on meteorological stations with expensive sensors and on human observations. However, the weather conditions that can be recognized are largely restricted by these sensors [14]. Recently, with the wide spread of web and mobile cameras, people prefer to obtain an accurate weather description from images. Timely recognition of weather conditions from traffic cameras can also support accurate traffic scheduling for transport agencies.
Under the basic framework of discriminative feature extraction and effective pattern classification, a possible solution for weather recognition is to treat it as image classification from the perspective of machine learning. Building on this assumption, many studies have focused on extracting powerful features for weather description, such as region histogram [15], region template [16], global histogram [17], Sobel edge [18] and power spectrum [4], [5]. There are also methods devoted to seeking more effective classification models such as Support Vector Machine (SVM) [7], k-Nearest Neighbor (KNN) [14] and Convolutional Neural Networks (CNN) [10], [19]-[23].
The associate editor coordinating the review of this manuscript and approving it for publication was Paolo Napoletano.
As a part of daily life, traffic is easily affected by the current weather. Typical weather recognition methods tend to divide weather into simple categories such as sunny, cloudy, foggy, rainy, snowy or combinations of them, which is inadequate for traffic management and control. In practice, people focus more on detailed road conditions such as "the road is covered with snow", "the road is wet", or "the road is icy", instead of simple descriptions such as "it is snowy" or "it is rainy". To resolve these issues, this paper provides a new model and a new dataset for traffic weather recognition. The contributions of this paper are summarized as follows.
1) Benefitting from both channel-wise and spatial-wise attention, a global similarity module (GSM) is proposed to capture the general weather condition.
2) A local salience module (LSM) with a road prior is introduced, for the purpose of restricting the model to focus on road weather details.
3) Building on GSM and LSM, a global-similarity local-salience network (GSLSNet) is presented for traffic weather recognition from single images.
4) A new traffic weather dataset (TWData) with accurate weather labels is provided. To the best of our knowledge, it is the first weather classification dataset collected specifically for traffic weather recognition.

II. RELATED WORKS AND MOTIVATION
This paper aims to recognize the current road weather details given a single traffic image. Recognizing the outdoor weather condition is important to daily travel [24], [25] and industrial scheduling [26], [27]. Early weather classification methods simply classify the given image as either sunny or cloudy [7], [11], while other studies extend the label set with overcast [28], rainy, foggy [15] and haze [17].
One key ingredient of accurate weather recognition is how to extract discriminative features. To achieve this, many handcrafted features have been elaborately designed. Yan et al. [17] combine multiple elements, including the histogram of gradient amplitude, the HSV color histogram and road information, as the feature and employ AdaBoost for weather classification. Roser and Moosmann [15] propose a weather descriptor that can distinguish heavy rain and fog by taking visibility effects into consideration. Considering the accessible daily weather condition, Lu et al. [7], [11] apply the corresponding daily weather cues as an additional complement. The contrast, saturation, edge gradient and power spectral slope [29] have also proved effective for recognizing weather conditions. With the overwhelming success of deep learning in computer vision tasks, many CNN models have been designed recently. Specifically, Elhoseiny et al. [10] first propose to employ AlexNet, and An et al. [20] employ ResNet for single image weather recognition. Guerra et al. [30] explore multiple architectures and demonstrate the superiority of CNN features over handcrafted features. Since several weather conditions tend to occur simultaneously, e.g., foggy and cloudy, Zhao et al. [22], [23] extend weather recognition from single-label classification to multi-label learning. To reduce parameter redundancy, Liu et al. [2] take advantage of sparse decomposition and cut down the CNN computation dramatically. There are also methods employing multiple kernel learning and active learning for weather recognition.
Motivation. Early research has demonstrated the feasibility of classifying weather as sunny, cloudy, snowy, rainy or foggy from outdoor or in-vehicle images. However, to provide more accurate traffic scheduling, recognizing the road weather details is more important. This need motivates us to build a model that can not only identify the general weather condition (e.g., sunny, rainy, foggy) but also distinguish the road weather details (e.g., the road is wet, the road is covered with snow).

III. GLOBAL-SIMILARITY LOCAL-SALIENCE NETWORK
To obtain an accurate weather description for traffic images, this paper proposes a Global-Similarity Local-Salience Network (GSLSNet). An intuitive illustration of GSLSNet can be found in Figure 1. GSLSNet comprises two modules, i.e., the Global Similarity Module (GSM) and the Local Salience Module (LSM), which together provide a general weather description accompanied by accurate road weather details.

A. GLOBAL SIMILARITY MODULE
The Global Similarity Module (GSM) is designed to recognize the overall weather condition of a given image. The motivation behind this design is that the weather within a single image tends to be consistent. To achieve that, both a channel-wise branch and a spatial-wise branch are employed (see Figure 2 for details).
Specifically, suppose the input feature is denoted as $F \in \mathbb{R}^{W \times H \times C}$. The channel-wise branch first obtains a global description $G_C \in \mathbb{R}^{1 \times 1 \times C}$ among channels via Global Average Pooling (GAP). One-dimension convolution (Conv1d) is then employed to transform this descriptor into a latent space, and the transformed descriptor $\hat{G}_C \in \mathbb{R}^{1 \times 1 \times C}$ is denoted as
$$\hat{G}_C = \mathrm{Conv1d}(G_C), \qquad G_C = \mathrm{GAP}(F).$$
To distinguish the importance of different channels, the Sigmoid activation $\sigma$ is utilized, and the final output $\tilde{G}_C \in \mathbb{R}^{W \times H \times C}$ of the channel-wise branch can be denoted as
$$\tilde{G}_C = \sigma(\hat{G}_C) \odot F,$$
where $\odot$ represents vector-tensor multiplication. The channel-wise branch obtains a global description of the given feature maps but ignores spatial correlation. To take this information into consideration, a simple spatial-wise branch is employed. Similarly, the input feature $F \in \mathbb{R}^{W \times H \times C}$ is first transformed to a latent space via two-dimension convolution (Conv2d). After that, a spatial-aware description $G_S \in \mathbb{R}^{W \times H \times C}$ is obtained by multiplying the transformed feature with $F$ through $\otimes$, where $\otimes$ represents matrix-tensor multiplication. Finally, the output $G \in \mathbb{R}^{W \times H \times C}$ of GSM is defined as $G = \tilde{G}_C + G_S$, so that $G$ comprises both channel-wise and spatial-wise global information. Benefitting from these two branches, GSM is capable of obtaining a general descriptor of the overall weather condition.
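The two-branch computation described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function name `gsm`, the `[c][h][w]` list layout, the zero-padded Conv1d over the channel axis, the 1x1 single-output Conv2d, and the Sigmoid on the spatial branch are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gsm(feature, conv1d_kernel, conv2d_weights):
    """Toy Global Similarity Module on a nested list indexed [c][h][w].

    conv1d_kernel: odd-length kernel slid over the channel axis (assumed form).
    conv2d_weights: one weight per channel, i.e. a 1x1 Conv2d with a single
    output map (assumed form; the kernel size is not stated in the text).
    """
    c_dim, h_dim, w_dim = len(feature), len(feature[0]), len(feature[0][0])

    # Channel-wise branch: GAP -> Conv1d over channels -> Sigmoid.
    gap = [sum(map(sum, channel)) / (h_dim * w_dim) for channel in feature]
    pad = len(conv1d_kernel) // 2
    channel_weight = []
    for c in range(c_dim):
        latent = sum(
            k * gap[c + j - pad]
            for j, k in enumerate(conv1d_kernel)
            if 0 <= c + j - pad < c_dim  # zero padding at the borders
        )
        channel_weight.append(sigmoid(latent))

    # Spatial-wise branch: Conv2d -> Sigmoid (the Sigmoid is an assumption).
    spatial_weight = [
        [
            sigmoid(sum(conv2d_weights[c] * feature[c][h][w] for c in range(c_dim)))
            for w in range(w_dim)
        ]
        for h in range(h_dim)
    ]

    # Output: channel-weighted feature plus spatially weighted feature.
    return [
        [
            [
                (channel_weight[c] + spatial_weight[h][w]) * feature[c][h][w]
                for w in range(w_dim)
            ]
            for h in range(h_dim)
        ]
        for c in range(c_dim)
    ]
```

With all-zero weights both branches reduce to a constant gate of 0.5, so the output equals the input. This gives a quick sanity check of the wiring.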

B. LOCAL SALIENCE MODULE
Nevertheless, the road weather details may differ from those of the surroundings in some cases. For example, the road surroundings may be covered with snow while the road itself is not, due to manual cleanup, and the road surroundings might be wet while the road is dry, due to different heat capacities. Consequently, a Local Salience Module (LSM) that can distinguish the road weather details is necessary. In principle, commonly used lane detection techniques, such as ENet [31], SegNet [32] or DeepLab [33], can be exploited.
However, the purpose of LSM is to exploit the road weather details instead of the road itself. Inspired by the principle of ENet [31], each LSM block (LSB) comprises four convolutional layers with different strides.
Formally, suppose the input of an LSB is denoted as $F \in \mathbb{R}^{W \times H \times C}$; the corresponding output is produced by the convolutional layers of the block. In experiments, the LSBs are pre-trained on road detection tasks, and the outputs of the LSBs are capable of highlighting the road information (see Figure 7 for details). In order to restrict the weather recognition network to focus on road weather details, one additional convolutional layer with Sigmoid activation is applied to the LSB output, and the final output of LSM is computed from the result. Typically, the guiding information from multiple layers is necessary. A more detailed illustration of LSM can be found in Figure 3. Also note that the key point of LSM is not a new road detection block but the injection of road priors into weather recognition networks, which encourages the network to focus on road weather details and improves the weather recognition accuracy.
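As a rough illustration of how a road prior could guide the recognition features, the sketch below turns an LSB output into a salience mask and gates a backbone feature map with it. The text does not state how the final LSM output is formed. The element-wise gating, the 1x1 convolution, and the name `salience_gate` are therefore assumptions, not the authors' formulation.

```python
import math


def salience_gate(backbone_feat, lsb_out, weights, bias=0.0):
    """Gate backbone features with a road-salience mask (illustrative only).

    backbone_feat, lsb_out: nested lists indexed [c][h][w] with equal H and W.
    weights: one weight per LSB channel, i.e. an assumed 1x1 convolution that
    reduces the LSB output to a single map before the Sigmoid.
    """
    h_dim, w_dim = len(lsb_out[0]), len(lsb_out[0][0])

    # Additional convolution + Sigmoid: values near 1 mark likely road pixels.
    mask = [
        [
            1.0 / (1.0 + math.exp(-(bias + sum(
                weights[c] * lsb_out[c][h][w] for c in range(len(lsb_out))
            ))))
            for w in range(w_dim)
        ]
        for h in range(h_dim)
    ]

    # Assumed guidance step: scale every backbone channel by the mask.
    gated = [
        [[mask[h][w] * channel[h][w] for w in range(w_dim)] for h in range(h_dim)]
        for channel in backbone_feat
    ]
    return mask, gated
```

Under this assumption, pixels that the pre-trained road detector scores highly keep their activations, while the surroundings are suppressed.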

C. GSLSNet
GSM and LSM can be added to any state-of-the-art network, e.g., MobileNet [37], ShuffleNet [38], VGG [39] and ResNet [40], resulting in the proposed GSLSNet. A detailed comparison can be found in Section V, and an intuitive illustration of GSLSNet can be found in Figure 1. Also note that GSLSNet is mainly inspired by prevalent attention mechanisms, e.g., Soft Attention Mechanism [41], Weak Semantic Attention [42] and Efficient Channel Attention (ECA) [43]. However, GSLSNet differs from these attention frameworks in several aspects and has the following advantages.
1) GSM has two branches, i.e., a channel-wise branch and a spatial-wise branch, while other attention methods, e.g., Efficient Channel Attention [43] or Weak Semantic Attention [42], contain only a single channel-wise or spatial-wise branch. Benefitting from both branches, GSM is capable of capturing not only more detailed spatial relevance but also channel-dependent information.
2) As stated in Section I, the purpose of GSLSNet is to recognize the road weather details. The road priors are represented by a road detection network, which guides the network toward road weather recognition. Experimental results in Table 3 and Figure 7 also demonstrate the superiority of this mechanism.
3) GSLSNet is the first weather recognition network that is especially designed for highways. Different from typical object recognition or outdoor weather recognition networks, GSLSNet is more specialized and can be applied to traffic management.

IV. TRAFFIC WEATHER DATASET
Traffic management agencies pay more attention to road weather details, e.g., "the road is icy" or "the visibility of the current road is lower than 50 m", than to a general and simple description such as "sunny" or "foggy". To address this issue and to demonstrate the effectiveness of the proposed GSLSNet for traffic weather recognition, this paper provides a new dataset named TWData (abbreviation of Traffic Weather Dataset). Examples can be found in Figure 4.

A. DATASET CONSTRUCTION AND LABELING
Specifically, 73,861 traffic images are first collected from multiple traffic cameras. Nevertheless, most of the captured images and their corresponding weather conditions tend to be similar due to the high capture frequency of traffic cameras. After similarity elimination and quality control, 2,491 images are finally preserved for further precise annotation. Generally, traffic conditions are easily affected by the visibility and the road weather condition. Taking high-impact weather into consideration, TWData is carefully categorized into seven classes, i.e., Fog50, Fog200, Fog500, RoadIce, RoadSnow, RoadWet and Sunny. These seven categories are described as follows.
• Fog50 - The current road visibility is lower than 50 m.
• Fog200 - The current road visibility is higher than 50 m and lower than 200 m.
• Fog500 - The current road visibility is higher than 200 m and lower than 500 m.
• RoadIce - The road is partly or fully covered with ice.
• RoadSnow - The road is partly or fully covered with snow.
• RoadWet - The road is wet.
• Sunny - The current weather and road surroundings are sunny.
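The visibility thresholds of the three fog classes can be written as a small lookup. The function name `fog_class` is hypothetical. The handling of the exact boundary values (50 m, 200 m, 500 m) is an assumption, since the class descriptions only use strict inequalities.

```python
def fog_class(visibility_m):
    """Map a visibility reading in metres to a TWData fog class, or None.

    Boundary values are assigned to the higher-visibility class; this is an
    assumption, as the class descriptions leave the boundaries unspecified.
    """
    if visibility_m < 0:
        raise ValueError("visibility must be non-negative")
    if visibility_m < 50:
        return "Fog50"
    if visibility_m < 200:
        return "Fog200"
    if visibility_m < 500:
        return "Fog500"
    return None  # not a fog class; one of the other categories applies
```

For example, `fog_class(30)` yields `"Fog50"` and `fog_class(300)` yields `"Fog500"`.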
During label annotation, the images are first labeled automatically using meteorological stations and then carefully calibrated by domain experts. Specifically, given a traffic image, the corresponding weather conditions are retrieved from the nearest meteorological station. These meteorological stations are especially suitable for visibility and snow annotation. However, there might be discrepancies between the conditions in the given traffic image and those at the nearest meteorological station. Manual calibration by domain experts is therefore necessary. Three domain experts check the plausibility of the assigned labels.
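The automatic first-pass labeling can be sketched as a nearest-station lookup. The text does not say how "nearest" is measured. The great-circle (haversine) distance, the dictionary fields, and the function names below are assumptions for illustration only.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def initial_label(camera, stations):
    """Return (label, station name) of the station closest to the camera.

    camera: dict with "lat" and "lon"; stations: list of dicts with "name",
    "lat", "lon" and "label" (hypothetical record layout).
    """
    nearest = min(
        stations,
        key=lambda s: haversine_km(camera["lat"], camera["lon"], s["lat"], s["lon"]),
    )
    return nearest["label"], nearest["name"]
```

The label returned here is only a starting point; as described above, it still has to be checked by domain experts.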

B. COMPARISON WITH OTHER WEATHER DATASET
Many other public weather datasets (abbreviated as PWDatas) exist. Compared with these datasets, TWData has the following advantages.
1) Unlike most PWDatas, whose images are obtained simply from online web sources, the images of TWData are real-world traffic images acquired from a traffic management agency. Therefore, TWData is more practical than other PWDatas.
2) Unlike PWDatas, whose weather labels are assigned manually by volunteers, the labels of TWData are first assigned automatically via meteorological stations and then corrected by domain experts. It is difficult for volunteers to judge the current visibility accurately without any meteorological observation instrument. Consequently, the labels of TWData are more accurate.
3) TWData has more precise weather labels, e.g., Fog50, Fog200, Fog500, RoadIce, RoadSnow and RoadWet, while other PWDatas simply categorize weather into sunny, cloudy, rainy, foggy or snowy. As a result, the categories of TWData are more fine-grained.
4) TWData is constructed especially for traffic weather recognition, while the other PWDatas are designed for either general outdoor scenes or in-vehicle images. Hence the applications of TWData are more specific.
A more detailed comparison between TWData and other PWDatas can be found in Table 1.

V. EXPERIMENTS

A. DATASET AND TRAINING DETAILS
Taking both the accessibility and the feasibility of PWDatas into consideration, two PWDatas, i.e., WeatherImage [7] and Multi-Class Weather [36], together with the proposed TWData, are employed to demonstrate the effectiveness of the proposed GSM, LSM and GSLSNet. During model training, the widely used SGD optimizer with an initial learning rate of 1e-4 is employed, and the learning rate is multiplied by 0.1 every 10 epochs. Unless otherwise specified, training stops after 20 epochs.

B. EVALUATION OF GLOBAL SIMILARITY MODULE
1) QUANTITATIVE ANALYSIS
Table 2 provides a detailed comparison of various backbones without and with GSM, to demonstrate the effectiveness of the proposed Global Similarity Module. Adding GSM to state-of-the-art architectures, e.g., MobileNet [37], ShuffleNet [38], VGG [39] and ResNet [40], clearly improves the weather recognition accuracy. The reason is that the learned feature maps typically comprise a lot of redundant information [42], and GSM reduces these disturbances via both channel-wise and spatial-wise filtering.
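The learning-rate schedule given in the training details (initial rate 1e-4, reduced by a factor of 0.1 every 10 epochs) can be expressed as a one-line step function. The function name `step_lr` and its parameter names are hypothetical, and reading "decreased by 0.1" as a multiplicative factor is an interpretation of the text.

```python
def step_lr(epoch, base_lr=1e-4, gamma=0.1, step=10):
    """Learning rate at a zero-based epoch under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step)


# Rates over a 20-epoch run: epochs 0-9 use base_lr, epochs 10-19 use base_lr * gamma.
schedule = [step_lr(epoch) for epoch in range(20)]
```

Under this reading, a 20-epoch run uses only two distinct learning rates.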

2) QUALITATIVE ANALYSIS
In order to investigate whether the proposed GSM is capable of enhancing the discriminability of the learned features, Figure 5 presents an intuitive comparison of feature embeddings without and with GSM. Specifically, t-SNE [44] is employed to embed the final fc feature into 2D space. Each sample is visualized as a scatter point, and points with the same color belong to the same class. The results show that the feature embedding with GSM is semantically more separable.

C. EVALUATION OF LOCAL SALIENCE MODULE
1) QUANTITATIVE ANALYSIS
As described in Section III-B, LSM is designed to recognize the weather details on the road. Table 3 therefore presents a detailed comparison without and with LSM on TWData. As shown in Table 3, LSM improves the road weather recognition accuracy. The reason is straightforward. Generally, there are discrepancies between the road weather condition and its surroundings due to their different heat capacities. Taking the samples in Figure 6 as an example, the surroundings of both samples are covered with snow. In other words, their global weather conditions are similar. Nevertheless, the left sample is annotated as RoadIce while the right sample is labeled as RoadSnow, and the proposed LSM is capable of recognizing these minor differences.

2) QUALITATIVE ANALYSIS
Furthermore, Figure 7 shows the learned feature maps without and with the LSM restriction on TWData. Note that there is a minor difference between the input image and the feature map because the input image is typically cropped randomly for accurate ensemble prediction. Each column represents a given image and its corresponding feature maps, and the brighter a pixel is, the greater the weight it holds. As the results in Figure 7 show, LSM restricts the network to focus more on the road conditions. Consequently, the road weather recognition accuracy can be increased, especially when there is a discrepancy between the road itself and its surroundings.

D. EVALUATION OF GSLSNet
Finally, this subsection demonstrates the effectiveness of the proposed GSLSNet compared with other state-of-the-art algorithms. Specifically, Table 4 and Table 5 present the results on WeatherImage [7], Multi-Class Weather [36] and TWData.
WeatherImage [7] is a commonly used dataset for evaluating newly proposed methods. As shown in Table 1, WeatherImage contains only two weather classes, i.e., sunny and cloudy, yet it remains a challenging benchmark. Table 4 and Table 5 provide a detailed comparison of state-of-the-art weather recognition methods and the proposed GSLSNet. As Table 4 and Table 5 show, the recently proposed convolutional methods outperform typical machine learning techniques combined with hand-crafted weather features, e.g., AdaBoost [7], SVM [7] and Collaborative learning [7]. In addition, this section re-implements other state-of-the-art classification networks, e.g., MobileNet [37], ShuffleNet [38], VGG [39] and ResNet [40], for comparison. The results show that the proposed GSLSNet achieves better recognition accuracy.
Finally, Table 6 presents a detailed comparison of the proposed GSM, LSM and GSLSNet in terms of both the number of network parameters and the time consumption per sample (on TWData). GSM consists of a one-dimension and a two-dimension convolutional layer. The corresponding parameter counts are $k^2 C_{in} C_{out}$ and $k^2 C_{in}$ (here $C_{out} = 1$ for the second term), which are far smaller than the entire network size (see Table 6; there is no evident parameter increment with GSM). For LSM, each LSM block comprises five convolutional layers (four in the block and one for dimension reduction) with $k^2 C_{in} C_{out} \times 4 + k^2 C_{in}$ parameters. In experiments, LSM stacks four blocks with a slight parameter increase (approximately 0.35M according to Table 6). Additionally, the computational complexity of both GSM and LSM is $O(W H k^2 C_{in} C_{out})$, and the measured time consumption can also be found in Table 6. Figure 8 also presents an intuitive comparison of GSLSNet with various backbones. In conclusion, the proposed GSLSNet obtains higher recognition accuracy with comparable model complexity and time consumption.
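The parameter and complexity formulas above can be evaluated directly. The sketch below does so for a purely hypothetical configuration (3x3 kernels, 64 input and output channels, a 56x56 feature map). These values are not taken from the paper and are not meant to reproduce the figures in Table 6. Bias terms are ignored, as in the formulas.

```python
def conv_params(k, c_in, c_out):
    """Weights of one k x k convolution, biases ignored."""
    return k * k * c_in * c_out


def lsb_params(k, c_in, c_out):
    """One LSM block: four convolutions plus one single-output reduction layer."""
    return 4 * conv_params(k, c_in, c_out) + conv_params(k, c_in, 1)


def lsm_params(k, c_in, c_out, blocks=4):
    """LSM as a stack of identical blocks (identical sizes are an assumption)."""
    return blocks * lsb_params(k, c_in, c_out)


def conv_cost(w, h, k, c_in, c_out):
    """Multiply-accumulate count of one convolution: W * H * k^2 * C_in * C_out."""
    return w * h * k * k * c_in * c_out


# Hypothetical configuration, chosen only to make the arithmetic concrete.
per_block = lsb_params(3, 64, 64)        # 4 * 36864 + 576 = 148032
four_blocks = lsm_params(3, 64, 64)      # 4 * 148032 = 592128
one_layer_cost = conv_cost(56, 56, 3, 64, 64)
```

The reduction layer contributes only $k^2 C_{in}$ weights per block, so the four full convolutions dominate the count.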

VI. CONCLUSION
This paper proposes a new Global-Similarity Local-Salience Network designed especially for weather recognition from traffic images. To achieve that, a Global Similarity Module is proposed to identify the general weather description, and a Local Salience Module is presented for extracting road weather details. Furthermore, a new weather classification dataset carefully labeled with accurate weather cues is released. Experimental results on both public datasets and the newly proposed dataset demonstrate the effectiveness of the proposed method.
Future work will consider a joint multi-view learning strategy for weather recognition, due to the large variation of illumination and viewpoint among images. Also, extracting an effective and robust weather descriptor for images remains a challenging problem. An auxiliary low-rank regularized network is under consideration, because low-rank regularization has proved to be naturally appropriate for robust representation.
JIANGPING ZHENG is currently a Senior Engineer, mainly engaged in public meteorological services, emergency warning information release system planning, and related technical research. To serve the 2022 Beijing Winter Olympic Games, he carried out research on traffic meteorological services for the Winter Olympic Games.
XIAOYONG LI received the bachelor's degree in 1990. He is currently an Associate Professor with Luzhou Meteorological Bureau. His main research interests include meteorological information technology.
VOLUME 9, 2021