Estimation of PM2.5 Concentration Based on Support Vector Regression with Improved Dark Channel Prior and High Frequency Information in Images

The rapid development of the petrochemical industry has caused great harm to the environment. Fine suspended particles with a diameter of less than 2.5 microns (PM2.5) cause serious health problems when inhaled at high concentrations. Therefore, an estimate of the PM2.5 concentration is sought. However, measuring it generally requires expensive instruments installed in an air quality monitoring station and a professional to operate them. In addition to their high purchase cost, the instruments incur a high maintenance fee and are restricted by geographical location. To overcome these difficulties, this paper presents a low-cost and effective approach to estimate the concentration of PM2.5 using image processing schemes. The proposed approach consists of four stages. First, images with different concentrations of PM2.5 were taken and the related relative humidity (RH) was collected. Second, an automatically selected region of interest (RoI) was used to extract two features from an image, namely high-frequency information and transmittance by an improved dark channel. Third, the two extracted features, together with the RH measurement, were used to build a support vector regression (SVR) model. Fourth, the SVR model was applied to estimate the concentration of PM2.5, and its performance was evaluated and compared with four simple regression models and a modified reported SVR method. In the given data set, the proposed method outperforms the comparison methods in terms of R² and root mean squared error. The best performance of our method reaches R² = 0.816, which is generally satisfactory in related applications.


I. INTRODUCTION
The petrochemical industry has led to rapid economic development. It has been widely supported by other industries and therefore provides many employment opportunities. However, it also causes great harm to the environment by emitting toxic substances such as total suspended particulates, suspended particulate PM10, fine suspended particulate PM2.5, sulfur oxides, and nitrogen oxides. Among them, PM2.5 has the most serious impact on the environment [1]. PM2.5 refers to fine suspended particles with a diameter of less than 2.5 microns [2]. Inhalation of high concentrations of PM2.5 has a negative impact on human health [3], [4], causing respiratory diseases, physiological dysfunction, and irritation of the ocular and nasal mucosal tissues [5].
Currently, an accurate concentration of PM2.5 is measured by an air quality monitoring station [6], [7]. However, the cost of the measuring instruments is very high. In 2019, Taiwan had established a total of 76 air quality monitoring stations. Each station cost nearly one million NT dollars. The most expensive instrument, which costs 600,000 NT dollars, is used to measure the PM2.5 concentration. Moreover, the maintenance fee is about 800,000 NT dollars each year [8]. In addition to the high cost, the establishment of air quality monitoring stations is restricted by geographical location, and professional operators are required. These factors hinder the widespread establishment of air quality monitoring stations. Furthermore, a measurement of PM2.5 concentration is normally taken only once every hour at an air quality monitoring station. That is, a real-time measurement is not available, which may not meet the requirements of some applications. To solve the problems mentioned above, this paper presents a low-cost and effective method that could potentially be used in real-time applications.
In the past, visibility was visually estimated by experienced professionals in an air quality monitoring station. This was extremely labor intensive and prone to human error. Such error-prone human measurements can be eliminated by an image-based approach, where human observation is replaced by a digital camera. Visibility is the distance at which the outline of a target can be clearly seen. Previous studies [9], [10] suggest that the visibility of distant targets decreases as the concentration of PM2.5 and/or relative humidity (RH) increases. That is, PM2.5 concentration and RH significantly affect visibility. Researchers have attempted to estimate visibility from images taken with digital cameras [11], [12]. In these methods, features were extracted by a series of image processing schemes, and visibility was then estimated. The above research indicates that image processing is feasible for estimating visibility.
Note that visibility and PM2.5 concentration are highly correlated. Recently, researchers have attempted to estimate the PM2.5 concentration using image processing techniques. In [13], the features of an image were extracted by a convolutional neural network. Then, a support vector regression (SVR) model was trained and used to estimate the PM2.5 concentration. In [14], Liu et al. proposed an image analysis method to estimate PM2.5 concentration, where characteristics such as contrast, transmittance, and entropy were used. An SVR model was then used to relate the characteristics to the PM2.5 concentration. In addition, Liu et al. introduced the concept of a region of interest (RoI) to improve the performance of estimation. However, the RoI was manually selected and therefore may not be the most informative. In [15], based on a haze image model, a measure called the normalized first-order absolute sum of the high-frequency spectrum was used to estimate the concentration of PM2.5. These reports suggest that SVR, RoI, and frequency information may be appropriate for estimating the concentration of PM2.5. This paper proposes an approach to estimate the PM2.5 concentration. The proposed method uses transmittance and high-frequency information (HFI), both extracted from our automatically selected RoI in images, together with the RH measurement to build an SVR model. To obtain the transmittance feature, an improved dark channel prior (IDCP) presented in [16] is used. The dark channel prior (DCP) was originally proposed by He et al. in [17]. It is observed that, in a haze-free image, at least one of the RGB components of a pixel has a zero or very low value, except in sky regions and white objects. In contrast, higher pixel values of the dark channel are obtained in a hazy image. In [17], the atmospheric scattering model in Eq. (1) is used.
$$I(x, y) = J(x, y)\,t(x, y) + A\,[1 - t(x, y)], \tag{1}$$
where I(x, y) represents the observed image, J(x, y) the haze-free image, A the atmospheric light, and t(x, y) the transmittance, which refers to the portion of non-scattered light that reaches the camera. For details, see [17]. Since the level of haze, and hence the loss of visibility, is inversely related to t(x, y), t(x, y) is used in our method. Note that visibility is affected by PM2.5 concentration, and visibility is related to HFI. Consequently, HFI is used as a feature in this study. Moreover, an edge detector, for example, a Sobel edge detector, can extract HFI from an image, where the value of edge pixels is an indicator of HFI. That is, a strong edge means more HFI and vice versa. Thus, we used the Sobel edge detector to extract the HFI in an image.
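The behavior of the atmospheric scattering model in Eq. (1) can be illustrated numerically. The following is a minimal sketch, not part of the original method: the function name `apply_haze` and the pixel values are hypothetical, chosen only to show that a lower transmittance pulls the observed intensity toward the atmospheric light.

```python
import numpy as np

def apply_haze(J, t, A):
    """Atmospheric scattering model of Eq. (1): I = J * t + A * (1 - t)."""
    J = np.asarray(J, dtype=float)
    t = np.asarray(t, dtype=float)
    return J * t + A * (1.0 - t)

# Hypothetical values: one dark scene pixel seen through increasing haze.
J = 0.2   # haze-free intensity, scaled to [0, 1]
A = 0.9   # atmospheric light
for t in (1.0, 0.5, 0.1):
    # As t decreases, the observed value moves from J toward A.
    print(f"t = {t:.1f}  ->  I = {float(apply_haze(J, t, A)):.3f}")
```

With t = 1 the scene is observed unchanged, while with t close to 0 the camera records almost only the atmospheric light, which is why transmittance carries information about haze.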
It is observed that not all t(x, y) and HFI in an image are suitable for the estimation of PM2.5 concentration, because they could be extracted from a region that is not the most informative. Thus, an automatic selection of the RoI is required to find the region that is most informative for extracting t(x, y) and HFI. In this study, we used the difference in HFI between image pairs with the highest and lowest concentrations of PM2.5 to find the most informative RoI. The selected RoI is then used to extract t(x, y) and HFI.
A kernel-based SVR model, such as one with a radial basis function (RBF) kernel, has been shown to have advantages over simple or multiple linear regression models. Therefore, an SVR model with an RBF kernel is used in our method. The model is trained with the RH measurement and the two extracted features, t(x, y) and HFI, and is then used to estimate the PM2.5 concentration. In the given data set, the proposed method achieves a best performance of R² = 0.816, which is generally satisfactory in applications. The result of our method is superior to the comparison methods, including four simple regression models and an SVR model modified from [14].
There are at least four contributions in this study, as described below.
• Our method provides a low-cost and effective image-based alternative to the conventional method currently used in air quality monitoring stations. The cost of our method is low because a consumer digital camera is good enough to take images, and common image processing schemes are used. Thus, our method could potentially be applied in real-time applications.
• Our method does not require professional operation. Furthermore, the installation location of our method is flexible, because only a camera and a personal computer for image processing are required. Thus, our method eliminates the geographic restriction of an air quality monitoring station.
• Our method presents a scheme for automatically selecting an RoI that is more informative. It can eliminate the potential manual error in [14] and improve the selection of RoI in [20]. In the given data set, our proposed RoI selection shows its effectiveness in feature extraction. It may also be beneficial for extracting characteristic information in other image-based methods for estimating the concentration of PM2.5.
• We use two effective features in the SVR modeling, together with the RH measurement, that are extracted from the RoI selected by our methodology. They are the HFI feature and the transmittance feature of the improved dark channel in [16]. The experimental results indicate that they perform better than a modified SVR from [14] that uses four features, including the RH measurement, transmittance from [17], entropy, and contrast. This implies that the two proposed features might improve the performance of SVR models for the estimation of PM2.5 concentration.
This paper is organized as follows. Section II provides a brief review of IDCP in [16]. Section III introduces the proposed approach, including the automatic RoI selection, the feature extraction, and the SVR model used in this study. Section IV evaluates the proposed method, which is compared with four simple regression models and an SVR model. Finally, Section V concludes this study.

II. REVIEW OF IMPROVED DCP
In this section, the improved DCP (IDCP) in [16] is briefly reviewed. Due to its simplicity and effectiveness, the DCP scheme, originally developed by He et al. in [17], prevails in the community of single-image haze removal. However, there are four problems with DCP, namely artifacts, halos, color distortion, and computational cost. IDCP was originally proposed as an improvement on DCP to alleviate these problems. In this article, we use the improved dark channel to estimate the transmittance of an image. The dark channel can be obtained through a minimum filter and is related to the haze in the images. Fig. 1 shows an example to demonstrate the relation between haze and dark channel, where a 15 × 15 minimum filter was used. Fig. 1(a) shows a haze-free image taken from an air quality monitoring station in Taiwan, and the corresponding dark channel is shown in Fig. 1(b). As mentioned, the pixel values in Fig. 1(b) are very low, i.e., dark, except for sky regions and some white objects. A hazy image corresponding to Fig. 1(a) is shown in Fig. 1(c), and its dark channel is given in Fig. 1(d), which is brighter than that in Fig. 1(b) due to the haze in the image. Fig. 1 suggests that haze can be measured by the dark channel and, accordingly, by the transmittance.
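The relation between haze and the dark channel described above can be sketched in a few lines. The code below follows the basic dark channel of [17] (per-pixel minimum over RGB followed by a 15 × 15 minimum filter), not the improved version of [16]. The function name `dark_channel`, the edge-replicating border handling, and the synthetic images are assumptions made only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(image, window=15):
    """Minimum over the RGB channels, then a window x window minimum filter."""
    min_rgb = image.min(axis=2)                    # per-pixel minimum over color channels
    pad = window // 2
    padded = np.pad(min_rgb, pad, mode="edge")     # replicate borders (an assumption)
    patches = sliding_window_view(padded, (window, window))
    return patches.min(axis=(2, 3))                # local minimum around each pixel

# Synthetic example: a random "haze-free" image and a hazy version built
# from the scattering model with hypothetical t = 0.5 and A = 0.9.
rng = np.random.default_rng(0)
clear = rng.random((32, 32, 3))
hazy = 0.5 * clear + 0.9 * (1.0 - 0.5)
print("mean dark channel, clear:", dark_channel(clear).mean())
print("mean dark channel, hazy: ", dark_channel(hazy).mean())
```

The hazy image yields the brighter dark channel, which mirrors the comparison between Fig. 1(b) and Fig. 1(d).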
The visual quality of the dehazed image can be used to assess the quality of the dark channel in a haze removal scheme. It is well known that the DCP scheme in [17] introduces halos, color distortion, and artifacts in the dehazing process. It has been shown in [18] that these problems result from an incorrect estimation of the model parameters. In [16], it is observed that the problems come mainly from estimating A and t(x, y) with fixed scaling factors. To alleviate the problems, an improved DCP was proposed in [16], where scaling factors for A and t(x, y) and a parameter setting for the guided image filter (GIF) [19] were introduced. The experimental results in [16] indicate that the scheme is capable of alleviating the problems in the DCP scheme. In other words, a better estimate of the dark channel is obtained in [16]. Thus, the transmittance t(x, y) in [16] is used in our method. For more details, see [16]. Given an image in the RGB color space, the implementation steps for estimating A and t(x, y) in [16] are given as follows.
Step 4. Obtain the initial transmittance.
Step 5. Find the final transmittance t(x, y) by refining the initial transmittance with the GIF, using the guide image, a window size of 75, and a smoothing parameter of 0.25.
Fig. 2 gives an example with its initial transmittance and the final transmittance obtained by the above steps. As seen in Fig. 2(b), a block effect is found in the initial transmittance, especially along the contours of buildings, due to the 15 × 15 minimum filter. In single-image haze removal, this causes halos in the dehazed image. The halos would degrade the estimation performance of our method, because uncorrelated transmittance pixels would be involved in the calculation of the transmittance feature. Therefore, the initial transmittance is further refined by a GIF so that the edges of the buildings are retained, since they will be used in the automatic search for the final RoI. As shown in Fig. 2(c), the edges are recovered after the GIF refinement.
To illustrate the relation between t(x, y) and PM2.5 concentration, Fig. 3 shows the difference in transmittance t(x, y) for images with high and low concentrations of PM2.5. Fig. 3(a) is an image with low PM2.5 concentration, whose t(x, y) is given in Fig. 3(b). Fig. 3(c) shows an image with high PM2.5 concentration, whose t(x, y) is shown in Fig. 3(d). Fig. 3 indicates that t(x, y) is brighter for the image with low PM2.5 concentration and darker for the image with high PM2.5 concentration. That is, different concentrations of PM2.5 result in different t(x, y). Thus, t(x, y) is adopted in this study as a feature to build an SVR model.

III. THE PROPOSED APPROACH
The proposed approach is described in detail in this section. The approach consists of the following five stages. First, the original data set was preprocessed to align images and to exclude inappropriate data, which formed the input data set. Second, the input data set was divided into a training data set and a testing data set. Third, the training data set was used to automatically select the RoI from which the HFI and t(x, y) features are extracted. Fourth, an SVR model was built with the measured PM2.5 concentration, the RH measurement, and the two extracted features, that is, transmittance t(x, y) and HFI in images. Fifth, the testing data set was used to evaluate the performance of the trained SVR. Fig. 4 shows the overall flow chart of the proposed approach. Details of each stage are described below.

A. IMAGE ALIGNMENT AND DATA EXCLUSION
In the original data set, images taken of the same scene at different times may be translated vertically and/or horizontally. Thus, image alignment is required before the images can be used in the following steps. Details will be given in Section IV.A. After the image alignment, unreliable data are excluded as follows.
In [20], two factors are observed to affect the performance of the PM2.5 concentration estimation. One is RH and the other is the time difference between the time an image is taken and the time the concentration of PM2.5 is measured. In this study, six images were taken per hour from an air quality monitoring station, while the PM2.5 concentration was collected hourly. In other words, six images were related to only one concentration of PM2.5 for each hour. When the PM2.5 concentration changes within one hour, the performance of the estimate degrades. To solve this problem, the variance of the transmittance feature was calculated over the six images taken in the same hour. When the variance is greater than a threshold, the six-image set is considered unreliable and is discarded in this study. The details will be given in Section IV.A.

B. AUTOMATIC SELECTION OF ROI
In light of [14], an RoI helps improve the performance of PM2.5 concentration estimation. Consequently, in this study, a scheme for automatic selection of the RoI was developed. Note that the contours or edges of near and distant objects in images taken from a fixed point, e.g., an air quality monitoring station, change in clarity as the concentration of PM2.5 changes. Furthermore, the change is smaller for near objects than for distant ones. Thus, this change can be used to select the RoI that has the largest difference, that is, the one that is the most informative for PM2.5 concentration. Fig. 5 explains the above idea. When the PM2.5 concentration is low, the edges are clear for the marked near object (1 km away) and the marked distant object (2.5 km away), as shown in the first row. As the concentration of PM2.5 increases, the edges of the near object become vague, and those of the distant object are hardly visible, as shown in the last row. Based on this observation, images with high and low concentrations of PM2.5 are used to locate the most informative RoI. In this study, the Sobel edge detector was used to find the HFI. The Sobel edge images for Fig. 5 are shown in Fig. 6, which indicates that HFI decreases as the concentration of PM2.5 increases. Given the input data set, the implementation steps for the proposed automatic selection of RoI are given below.
Step 1. Select n images with the highest PM2.5 concentration and n images with the lowest PM2.5 concentration from the data set.
Step 2. Convert all selected images to grayscale images. For each image, perform Sobel edge detection to obtain HFI.
Step 3. Randomly combine an image of high PM2.5 concentration with one of low PM2.5 concentration to form an image pair. The total number of image pairs is n × n.
Step 4. Binarize the Sobel edge images of an image pair with the threshold found by the Otsu algorithm.
Step 5. Perform a morphological dilation with a k × k structuring element on the binarized image pair from Step 4.
Step 6. Perform a pixel-to-pixel XOR operation on the binarized image pair to obtain a high-frequency difference image.
Step 7. Find connected regions in the high-frequency difference image using the labeling algorithm [22].
Step 8. Mark the three largest connected regions, which are considered as three candidate RoIs.
Step 9. Repeat Step 4 to Step 8 for all n × n image pairs.
Step 10. Count the number of selections for each candidate RoI. The one having the highest number of selections is the selected RoI.
Step 11. Exclude the sky area in the selected RoI. The resulting region is the final RoI.
An example of the proposed automatic RoI selection is depicted in Fig. 7, where the intermediate results are also shown. Three points in the above steps deserve discussion: the determination of n in Step 1, the discrimination of HFI in Step 4, and the determination of k in Step 5.
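Steps 6 to 8 above (XOR, connected-region labeling, and picking the three largest regions) can be sketched as follows. This is an illustrative sketch only: the labeling algorithm of [22] is replaced here by a plain breadth-first flood fill with 4-connectivity, which is an assumption, and the function names `label_regions` and `candidate_rois` are hypothetical.

```python
from collections import deque
import numpy as np

def label_regions(mask):
    """Label 4-connected regions of a boolean mask; return labels and region sizes."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    sizes = {}
    next_label = 0
    for r in range(h):
        for c in range(w):
            if not mask[r, c] or labels[r, c]:
                continue
            next_label += 1
            labels[r, c] = next_label
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = size
    return labels, sizes

def candidate_rois(binary_low, binary_high, top_k=3):
    """XOR a binarized image pair and return masks of the top_k largest regions."""
    diff = np.logical_xor(binary_low, binary_high)     # high-frequency difference image
    labels, sizes = label_regions(diff)
    largest = sorted(sizes, key=sizes.get, reverse=True)[:top_k]
    return [labels == label for label in largest]

# Toy pair: edges present only in the low-concentration image survive the XOR.
low = np.zeros((8, 8), dtype=bool)
low[1:4, 1:4] = True      # distant building, visible only when PM2.5 is low
low[5:7, 0:2] = True      # smaller structure, also lost in haze
low[6, 6] = True          # near object, visible in both images
high = np.zeros((8, 8), dtype=bool)
high[6, 6] = True
print([int(roi.sum()) for roi in candidate_rois(low, high)])
```

Pixels that are edges in both images cancel in the XOR, so only regions whose clarity changes with PM2.5 concentration remain as candidates.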

Determination of n in Step 1
The total number of image pairs with the highest and lowest concentrations of PM2.5, which is n × n, is discussed here because it may affect the performance of our method. Since image pairs are used to obtain the HFI difference, a greater difference between the images in a pair is desired. To be robust, n should not be small. On the other hand, a large n is not desirable either, because it results in a degraded HFI difference. In our experiments, using about 0.6% of the total number of images generally gives satisfactory performance, that is, about 0.3% each for the highest and lowest concentrations of PM2.5. In the experiments of Section IV, n = 48 was used, which is 0.34% of the total number of images, that is, 14,046.

HFI discrimination in Step 4
The Sobel edge detection in Step 2 extracts the HFI, which then needs to be discriminated from low-frequency information. Note that the pixel value in the Sobel edge image is proportional to its HFI. Thus, we discern low-frequency and high-frequency information by the pixel value. That is, a higher pixel value means stronger HFI. The two parts can be separated by binarizing with a threshold. It is well known that the Otsu algorithm can provide an appropriate threshold. Consequently, we use the Otsu algorithm to find binary images in Step 4. Fig. 8 gives an example. Fig. 8(a) is a Sobel edge image; Fig. 8(b) is its binary image obtained through the Otsu algorithm; and Fig. 8(c) shows the histogram of Fig. 8(a) and the threshold obtained by the Otsu algorithm. In Fig. 8(c), pixel values less than the threshold were set to zero, whereas the other pixels were set to one. The resulting binary image is shown in Fig. 8(b).
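The thresholding described above can be sketched with a small histogram-based implementation of the Otsu algorithm, which picks the threshold that maximizes the between-class variance. The function names `otsu_threshold` and `binarize`, the number of histogram bins, and the toy edge image are assumptions for illustration and are not taken from the original implementation.

```python
import numpy as np

def otsu_threshold(edge_image, bins=256):
    """Return the threshold that maximizes the between-class variance (Otsu)."""
    hist, edges = np.histogram(edge_image, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                  # pixel count of the lower class per split
    w1 = w0[-1] - w0                      # pixel count of the upper class per split
    s0 = np.cumsum(hist * centers)        # intensity sum of the lower class
    s1 = s0[-1] - s0                      # intensity sum of the upper class
    valid = (w0 > 0) & (w1 > 0)           # both classes must be non-empty
    between = np.zeros(bins)
    mean0 = s0[valid] / w0[valid]
    mean1 = s1[valid] / w1[valid]
    between[valid] = w0[valid] * w1[valid] * (mean0 - mean1) ** 2
    return edges[np.argmax(between) + 1]  # upper edge of the best split bin

def binarize(edge_image):
    """Pixels below the Otsu threshold become 0; all other pixels become 1."""
    return (edge_image >= otsu_threshold(edge_image)).astype(np.uint8)

# Toy Sobel edge image: weak responses (10) and strong edges (200).
edge_image = np.array([[10, 10, 10, 200],
                       [10, 10, 200, 200]])
print(binarize(edge_image))
```

Only the strong edge responses are set to one, which corresponds to keeping the high-frequency part of the Sobel edge image.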

Determination of k in Step 5
In Step 2, we obtain a Sobel edge image that generally has unconnected pixels. In experiments, we also find that images in the data set may show a slight shift in the coordinates of the binary images obtained in Step 4, due to imaging conditions, such as light conditions that vary the intensity of the pixels. To link the unconnected pixels, we perform a morphological dilation on the images obtained in Step 4. The results with different sizes k of the k × k structuring element are shown in Fig. 9. Fig. 9(a) is an original image block cut from an image in the data set, and its Sobel edge image from Step 2 is given in Fig. 9(b), which shows unconnected pixels around the contour of the building and in its interior. As k increases, the contour pixels become connected, and the interior of the building is filled. When k = 7, the dilated contour and the interior region are appropriate compared to Fig. 9(a), while in the case of k = 9, more pixels in the sky region are included, which will degrade the results of the following steps. Therefore, k = 7 is used in our method.
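The effect of the structuring element size can be sketched with a simple binary dilation. The sketch below assumes a square structuring element of ones and zero padding at the image border. The function name `dilate` and the parameter name `k` are hypothetical, and the toy image only illustrates how a larger element links edge pixels that a smaller one leaves separated.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dilate(binary, k=7):
    """Binary dilation with a k x k square structuring element (k odd)."""
    pad = k // 2
    padded = np.pad(binary.astype(bool), pad, mode="constant", constant_values=False)
    windows = sliding_window_view(padded, (k, k))
    return windows.any(axis=(2, 3))       # a pixel is set if any neighbor in the window is set

# Two isolated edge pixels on the same row, six columns apart.
binary = np.zeros((9, 9), dtype=bool)
binary[4, 1] = True
binary[4, 7] = True
print("gap bridged with k = 3:", bool(dilate(binary, 3)[4, 4]))
print("gap bridged with k = 7:", bool(dilate(binary, 7)[4, 4]))
```

A larger element connects more pixels but also spreads into neighboring regions, which is the trade-off that led to the choice of size 7 in the text.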

C. THREE FEATURES IN OUR METHOD
This section describes the three features used in the SVR modeling, that is, the RH measurement, the transmittance feature, and the HFI feature.

RH measurement
As described previously, RH was measured hourly by an air quality monitoring station. In [20], it has been shown that RH significantly affects the estimation of PM2.5 concentration. Thus, we include the RH measurement in the SVR modeling to enhance estimation performance.

Transmittance feature
As shown in Fig. 3, images of different concentrations of PM2.5 have different transmittance. Furthermore, the improved dark channel in [16] is better than the original dark channel in [17], because it results in a much better dehazing performance. Therefore, this article uses the transmittance of IDCP as a feature in the proposed approach to estimate the concentration of PM2.5. The transmittance t(x, y) of an image is found by the steps in Section II. Then the transmittance feature t* within the final RoI, which is obtained in Section III.B, is calculated as follows.
$$t^{*} = \operatorname*{mean}_{(x, y) \in \mathrm{RoI}} \left[ t(x, y) \right]$$

HFI feature
Note that an edge detector can be considered a high-pass filter that extracts HFI. Therefore, the 3 × 3 Sobel edge detector was applied to find the HFI. Two directions are involved in the detection of Sobel edges, that is, vertical and horizontal. To find vertical edges, the 3 × 3 mask G_x is applied, whereas the 3 × 3 mask G_y is used for horizontal edges. The corresponding masks are given below.
$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$
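The two image features can be sketched as follows, using the standard 3 × 3 Sobel masks. The transmittance feature is the mean of the transmittance inside the RoI, as stated in the text. For the HFI feature, the sketch assumes the mean Sobel gradient magnitude inside the RoI, which is an assumption of this example and not a definition confirmed by the text. The function names and the toy inputs are likewise hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # responds to vertical edges
SOBEL_Y = SOBEL_X.T                                                    # responds to horizontal edges

def sobel_magnitude(gray):
    """Gradient magnitude from the two 3 x 3 Sobel masks (borders replicated)."""
    padded = np.pad(gray.astype(float), 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))
    gx = (windows * SOBEL_X).sum(axis=(2, 3))
    gy = (windows * SOBEL_Y).sum(axis=(2, 3))
    return np.hypot(gx, gy)

def roi_features(gray, transmittance, roi_mask):
    """Return (transmittance feature, HFI feature) computed inside the RoI."""
    t_star = float(transmittance[roi_mask].mean())
    h_star = float(sobel_magnitude(gray)[roi_mask].mean())  # assumed HFI definition
    return t_star, h_star

# Toy inputs: a sharp vertical edge versus a flat (fully hazed) patch.
sharp = np.zeros((6, 6))
sharp[:, 3:] = 1.0
flat = np.full((6, 6), 0.5)
roi = np.ones((6, 6), dtype=bool)
transmittance = np.full((6, 6), 0.6)
print(roi_features(sharp, transmittance, roi))
print(roi_features(flat, transmittance, roi))
```

The sharp patch gives a positive HFI value and the flat patch gives zero, consistent with the observation that HFI decreases as haze blurs the edges.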

D. SUPPORT VECTOR REGRESSION
The SVR used in this study is briefly described in the following. For details, see [23]. SVR is a generalization of the support vector machine (SVM). That is, SVR extends SVM, a classifier, to the estimation of a real-valued function. Since kernel-based SVR has excellent performance, this article adopts it to estimate the PM2.5 concentration. Assume a training data set {(x_i, y_i)} for 1 ≤ i ≤ N, where x_i is the input vector and y_i is the desired output; the subscript i denotes the i-th pattern; and N is the total number of patterns. The kernel-based SVR used in this study is defined by a weight, a radial basis function (RBF) kernel Φ(·), and a bias. In this study, the training data set had a three-dimensional input vector, which includes the transmittance feature, the HFI feature, and the measured RH; the desired output is the measured PM2.5 concentration. In a trained SVR, the output of the model is used as the estimate of the PM2.5 concentration in the testing stage.
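For intuition, the way a trained kernel-based SVR produces an estimate can be sketched with the common textbook kernel expansion, a weighted sum of RBF kernel values plus a bias. This is a generic form and is not claimed to be the exact formulation used in the paper. The function names, the kernel width `gamma`, and all numeric values below are hypothetical.

```python
import numpy as np

def rbf_kernel(x, centers, gamma):
    """K(x, c) = exp(-gamma * ||x - c||^2) for every row c of centers."""
    diff = centers - x
    return np.exp(-gamma * np.sum(diff * diff, axis=1))

def svr_predict(x, support_vectors, weights, bias, gamma):
    """Kernel expansion: f(x) = sum_i weights[i] * K(x, sv_i) + bias."""
    return float(weights @ rbf_kernel(x, support_vectors, gamma) + bias)

# Hypothetical trained model with two support vectors in a 3-D feature space
# (transmittance feature, HFI feature, RH), all scaled to comparable ranges.
support_vectors = np.array([[0.0, 0.0, 0.0],
                            [1.0, 1.0, 1.0]])
weights = np.array([2.0, -1.0])
bias = 0.5
x = np.array([0.0, 0.0, 0.0])
print(svr_predict(x, support_vectors, weights, bias, gamma=1.0))
```

In practice the weights, bias, and support vectors come from training, and the three input features are usually scaled first so that no single feature dominates the kernel distance.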

IV. RESULTS AND DISCUSSION
This section will verify the proposed approach using a given data set obtained from the Taiwan government. In the following, data preparation, data exclusion, and comparison with other regression models are described in order.

Original data set
The experiment used an image data set taken from the Kaohsiung Renwu Air Quality Monitoring Station, which is under the Environmental Protection Agency of the Executive Yuan of Taiwan. The images were collected from 7:00 to 17:00 each day from August 2018 to July 2019. An image was taken every 10 minutes. The total number of images in the data set is 21,720. Furthermore, the data set included the hourly measured PM2.5 concentration and RH. Fig. 11 shows the histogram of the measured PM2.5 concentration, whereas Fig. 12 shows the histogram of the RH measurements. This data set was considered the original data set S in the following experiment.

Image alignment
Since the images in data set S were taken manually, the same scene was often captured at different times under different shooting conditions. In other words, images of the same scene taken at different times may be translated vertically and/or horizontally. Fig. 13 shows an example of this case. Figs. 13(a) and 13(b) are two images of the same scene taken at different times with a vertical translation. If the images were used as is, this would significantly affect the location of the final RoI and, accordingly, the feature extraction. Consequently, the images in data set S should be aligned before being used in the following experiment. The image alignment uses the first image taken in S as a reference, since it was taken with a just-calibrated camera. Then the alignment was performed on the rest of the images. Fig. 13(c) gives the adjusted result of Fig. 13(b). The aligned image set then takes the place of the original image set in S.

Data exclusion
As described previously, the data set S consists of three parts. For each hour, there are six images, one PM2.5 concentration measurement, and one RH measurement. The hourly data is considered a subset of S. When a subset is discarded (retained), six images and two measurements are discarded (retained). Note that one measurement of the concentration of PM2.5 is associated with six images every hour. Furthermore, the PM2.5 concentration can vary within one hour as the weather changes. When the variation is large, it will degrade the estimation performance. Thus, the data in this case should be excluded from the SVR modeling. Table 1 shows an example, where the time, the transmittance feature, and the estimated PM2.5 concentration are given. As shown in Table 1, the transmittance feature decreases from 0.837 to 0.348 in one hour. Consequently, the differences among the PM2.5 concentrations estimated by the proposed approach for the six images are large. In other words, the data in this case are not reliable and should be discarded.
To exclude unreliable data, for each subset, the following three steps are performed.
Step 1. Calculate the standard deviation of the transmittance feature over the six images within a subset, which is denoted as σ. This continues until all images in each subset in S are processed.
Step 2. Obtain the mean of σ over all subsets and denote it as μ_σ.
Step 3. Use μ_σ as a threshold. Discard the subset if σ > μ_σ. Otherwise, retain the subset.
This process applies to all subsets in S. Fig. 14 shows the box plot of σ for data set S, where μ_σ = 0.0181 is indicated as well. According to the criterion in Step 3, approximately 35% of the subsets in S were considered not reliable and therefore discarded. After data exclusion, the total number of images is reduced from 21,720 to 14,046.

Training set and testing set
After data exclusion, the retained data in S were the input data set for the following experiments. The input data set was divided into a training set and a testing set. Each training pattern consists of a three-dimensional input vector of features, namely RH, the transmittance feature, and the HFI feature, and its corresponding desired output, that is, the measured PM2.5 concentration. An SVR model was built with the training set. The trained SVR model was then used to estimate the PM2.5 concentration in the testing set. The experiments were conducted with ratios of training set to testing set from 1:9 to 9:1. The number of patterns in the training and testing sets for different ratios is shown in Table 2. For each ratio, three experiments with randomly selected examples were performed. Then the average performance indices, R² and root mean squared error (RMSE), were recorded and compared.
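The three data exclusion steps of this subsection can be sketched as follows. The function name `exclude_unreliable`, the dictionary layout keyed by hour, and the use of the population standard deviation are assumptions of this sketch. Apart from the endpoints 0.837 and 0.348 quoted from Table 1, the feature values are hypothetical.

```python
import numpy as np

def exclude_unreliable(t_star_by_hour):
    """Keep hourly subsets whose std of the transmittance feature is at most the mean std."""
    # Step 1: standard deviation of the feature over the six images of each hour.
    sigmas = {hour: float(np.std(values)) for hour, values in t_star_by_hour.items()}
    # Step 2: the mean of all standard deviations serves as the threshold.
    threshold = float(np.mean(list(sigmas.values())))
    # Step 3: discard subsets above the threshold, retain the others.
    kept = [hour for hour, sigma in sigmas.items() if sigma <= threshold]
    return kept, threshold

# Three hourly subsets, six transmittance feature values each.
t_star_by_hour = {
    "09:00": [0.80, 0.80, 0.80, 0.80, 0.80, 0.80],      # stable hour
    "10:00": [0.80, 0.81, 0.80, 0.79, 0.80, 0.80],      # nearly stable hour
    "11:00": [0.837, 0.70, 0.60, 0.50, 0.40, 0.348],    # rapidly changing hour
}
kept, threshold = exclude_unreliable(t_star_by_hour)
print("threshold:", round(threshold, 4), "kept:", kept)
```

Only the hour with the rapidly changing transmittance feature is discarded, since a single hourly PM2.5 measurement cannot represent all six of its images.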

B. PERFORMANCE INDICES
In the experiment, two performance indices were used to evaluate the proposed approach and the comparison methods. They were the root mean squared error (RMSE) and the coefficient of determination R². The RMSE is calculated as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2},$$
where N is the total number of examples; ŷ_i is the estimate of PM2.5 concentration in the i-th example; and y_i is the i-th measured PM2.5 concentration. A smaller RMSE means that ŷ_i is closer to y_i, i.e., better estimation performance is achieved.
The second performance index R² is calculated as
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2},$$
where ȳ is the mean of y_i. R², ranging from 0 to 1 in this study, represents how well a model explains the variation of the output by the input of the model. A higher R² means better estimation performance for a model.
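The two performance indices can be computed in a few lines using their standard definitions. This is a minimal sketch with hypothetical measured and estimated concentrations, and the function names `rmse` and `r_squared` are chosen only for this example.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between measurements and estimates."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Hypothetical measured and estimated PM2.5 concentrations.
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 39.0]
print("RMSE:", rmse(measured, estimated))
print("R^2: ", r_squared(measured, estimated))
```

Here the squared errors sum to 18 and the total sum of squares is 500, so RMSE is the square root of 4.5 and R² is 0.964.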

C. PERFORMANCE COMPARISON
In this subsection, the performance of the proposed approach is investigated and compared with a modified Liu method from [14], the Liaw method in [20], and three simple regression models, namely linear regression (LR), polynomial regression (PR), and exponential regression (ER) models, each using a single extracted feature. The original Liu method [14] used the features of distance, transmittance by [17], entropy, contrast, sky color, and solar zenith angle to build an SVR model. However, features such as distance, sky color, and solar zenith angle are not available in this study. Consequently, these features were not considered in the comparison. Instead, we modified Liu's method to use four features: relative humidity, transmittance by [17], entropy, and contrast, where the image features are calculated within the final RoI. The modified Liu method is abbreviated as the mLiu method.
The original Liaw method [20] used HFI as a characteristic in an LR model to estimate the concentration of PM2.5. Apart from using different features, Liaw's method differs from the LR model in the data exclusion described in Section IV.A and in the automatic selection of RoI without exclusion of the sky region. The models and features used in the proposed approach and the comparison methods are summarized in Table 3.

Average performance of feature sub-setting in our method
In this study, our method uses three features, that is, the RH measurement, the transmittance feature, and the HFI feature. In this experiment, we investigate their effect on the performance of the proposed method. Table 4 shows the average performance of feature subsets over the ratios in Table 2. Table 4 indicates that RH alone has the poorest performance, R² = 0.017, whereas the transmittance feature and the HFI feature have much higher R², that is, 0.616 and 0.629, respectively. When RH is added to each of them, R² improves by 0.122 and 0.136, respectively. R² reaches 0.793 when all three features are used. A similar result is obtained for the RMSE performance. The results suggest that RH is able to enhance the performance of the other two features. The three selected features are appropriate, since the estimation performance improves as additional features are used.

Average performance for different ratios of training and testing data
The average performance of the three experiments for our method and the comparison methods with each ratio in Table 2 is given in Fig. 15 for R² and in Fig. 16 for RMSE. The average performance over all ratios in R² and RMSE is recorded in Table 5. In Table 5, in terms of R², our method is superior to the Liaw method by 0.110, to the mLiu method by 0.059, and to LR, PR, and ER by 0.169, 0.169, and 0.190, respectively. For RMSE, the proposed approach has a lower RMSE than Liaw, mLiu, LR, PR, and ER by 0.371, 0.959, 2.59, 2.569, and 3.503, respectively. Table 5 indicates that the LR, PR, and ER models have similar performance, and that the Liaw method, an LR model, performs better than the LR, PR, and ER methods with the help of data exclusion and automatic RoI selection. However, it is much inferior to the SVR models, that is, our method and the mLiu method. Additionally, our SVR model, which uses three features, is better than the mLiu method with four features. This suggests that the transmittance feature from IDCP and the HFI feature are more effective than the transmittance by [17], entropy, and contrast in the estimation of PM2.5 concentration.

Best performance comparison
To further investigate the performance of the comparison methods, we discuss the best case of each method. Fig. 15 shows that the best performance of our method and the mLiu method occurs at the 8:2 ratio, whereas that of the remaining methods occurs at 7:3. In the 8:2 case, our method reaches the highest R2 of 0.816, whereas the mLiu method reaches 0.803. The Liaw method, LR, PR, and ER yield R2 values of 0.700, 0.627, 0.630, and 0.601, respectively. Therefore, the 8:2 ratio is recommended when our method or the mLiu method is used in an application, while the 7:3 ratio should be used when simple regression is considered.
The scatter plots of the best performance for our method and the comparison methods are depicted in Fig. 17. Figs. 17(a) to 17(c) indicate that LR, PR, and ER show only a weak positive relationship between the estimated and actual values. In other words, LR, PR, and ER are not good enough to estimate the PM 2.5 concentration in the given data set. LR underestimates high-level PM 2.5 concentrations. A similar result is found for PR, while ER cannot adequately estimate the PM 2.5 concentration at any level, as shown by the wide spread in its scatter plot. Regarding the Liaw method, Fig. 17(d) shows that only images with a PM 2.5 concentration below 70 were retained after data exclusion. Additionally, it underestimates concentrations above 58. The mLiu method is much better than the simple regression models, since it shows a strong positive relationship, as in Fig. 17(e). However, its estimates have a large variance for concentrations above 40. In contrast, our method shows a strong positive relationship with a more confined variation for concentrations above 40, as in Fig. 17(f). However, our method produces several poor estimates at low PM 2.5 concentrations, which will be addressed in future work.
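The under- and overestimation read from a scatter plot can also be quantified as the mean signed error within a concentration band. The following Python sketch is illustrative only: the helper name `mean_signed_error`, the band limits, and the toy values are assumptions and do not come from the data set or from Fig. 17.

```python
def mean_signed_error(actual, predicted, lower=None, upper=None):
    """Mean of (predicted - actual) over samples with lower < actual <= upper.

    A negative result indicates underestimation in the band, a positive
    result overestimation. Returns None when no sample falls in the band.
    """
    errors = [
        p - a
        for a, p in zip(actual, predicted)
        if (lower is None or a > lower) and (upper is None or a <= upper)
    ]
    if not errors:
        return None
    return sum(errors) / len(errors)


# Toy values for illustration only (not from the paper's data set).
actual = [20.0, 35.0, 50.0, 60.0, 65.0]
predicted = [22.0, 34.0, 49.0, 52.0, 55.0]

print(mean_signed_error(actual, predicted, lower=58))  # band above 58
print(mean_signed_error(actual, predicted, upper=58))  # band up to 58
```

Reporting this value per band alongside R2 and RMSE would make statements such as "underestimation above a given concentration" directly comparable across methods.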

V. CONCLUSION
This article has presented an image-based approach to estimating the PM 2.5 concentration, in which relative humidity, together with the transmittance and high-frequency information extracted from the proposed automatically selected RoI, is used to build an SVR model. In essence, our method consists of four main stages: data exclusion, automatic selection of the RoI, feature extraction, and SVR modeling. On a data set obtained from the Taiwan government, the proposed method was evaluated and compared with a modified Liu method and four simple regression models, that is, the Liaw method and linear, polynomial, and exponential regression. The results indicate that our method is superior to the comparison methods in terms of R2 and RMSE. The best performance obtained by our method is R2 = 0.816 in the given data set. The results suggest that the proposed method could be an effective and low-cost alternative to the instruments used in air quality monitoring stations for estimating the PM 2.5 concentration. Additionally, our method could alleviate the restrictions on installation location and the need for professional operation in air quality monitoring stations, and could possibly be used in real-time applications. In future research, we will improve the estimation of low-level PM 2.5 concentrations and search for more effective features for estimating the PM 2.5 concentration.