Measuring Particulate Matter: An Investigation on Sensor Technology, Particle Size, Monitoring Site

Light-scattering particulate matter sensors can be very inexpensive, but also unreliable. They are often presented as a solution for the creation of dense monitoring networks, offering an alternative approach to mathematical modeling and interpolation techniques applied to data from institutional monitoring stations. The purpose of this article is to study the benefits provided by light-scattering sensors in measuring daily concentrations of particulate matter, when they are used to complement the existing institutional network of high-quality instrumentation. A one-year experimental campaign was performed, placing 56 light-scattering sensors close to an official monitoring station. The effectiveness of the sensors is evaluated by comparing their correlation with the official data of the nearby station against the correlation between the same station and other stations in the official monitoring network. Finally, the different data sources are calibrated on a single reference station, in order to act as predictors of the PM concentration at the point of interest. The estimates are compared using the root mean square error (RMSE).


I. INTRODUCTION
Particulate matter (PM) is a relevant air pollutant that can have a negative impact on human health and the environment. PM is made up of a mixture of solid particles and liquid droplets that are suspended in the air and can be inhaled. Particles are usually classified according to their size: PM10 consists of particles with a diameter smaller than 10 micrometers (µm), whereas in PM2.5 the maximum diameter is 2.5 µm. The latter is of particular concern, as smaller particles can penetrate deep into the lungs and enter the bloodstream [1]. Exposure to PM has been linked to a range of health effects, including respiratory and cardiovascular disease [2], lung cancer [3], and premature death. PM can also harm the environment, damaging crops and ecosystems [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Nadeem Iqbal.
The sources of PM are both anthropogenic, e.g., vehicle exhausts and industrial processes, and natural, e.g., dust storms and wildfires. The World Health Organization (WHO) states that 92% of the world's population lives in places where air pollution exceeds safe levels [5]. In order to reduce exposure to PM, it is important to control and reduce emissions from these sources through regulations and policies, as well as to promote public awareness and education. Overall, PM is an important issue that requires attention and action to protect human health and the environment.
Currently, institutional monitoring networks employ very expensive and bulky devices. As a result, the number of air quality monitoring stations is limited. Mathematical modeling, interpolation techniques, neural networks, and clustering algorithms can be used to analyze and extend the data from existing monitoring stations throughout the area [6], [7]. By using meteorological data, topography, and other environmental factors, a model can be created for predicting the distribution of PM concentrations [8], [9].
Another way to improve the spatial density of the collected data is to use low-cost light-scattering sensors, which are a miniaturization of traditional optical particle counters (OPC) [10], [11], [12] and nephelometers, making them cheap and small in size. Measurements from these sensors could be integrated with the ones provided by the institutional network, increasing the spatial density of information. Furthermore, these sensors normally provide instantaneous sampling, unlike institutional sensors, which only provide hourly or daily averages of PM concentrations.
Although this approach is described in many scientific papers [13], [14], [15], [16], [17], [18], [19], low-cost light-scattering sensors are considered not to be very reliable or precise. This is because the evaluation of PM mass concentrations involves different approximations and assumptions, such as the refractive index (RI) of the particles and their density. In addition, the process of miniaturization introduces limitations on the percentage of detected particles and on the size ranges. Low-cost particulate matter sensors are also negatively affected by high levels of relative humidity [20], [21].
Indeed, these defects can reduce the usefulness of such sensors. Consequently, current research has evaluated the performance of light-scattering sensors [21], [22], [23], [24], [25], [26], [27], [28]. Most of the evaluations performed in the literature, however, only provide a limited comparison with official instruments, which is often just a single device, without fully considering their integration in official networks. Therefore, the purpose of this paper is to investigate the possible benefits provided by light-scattering sensors in the measurement of daily averages of PM concentration, when integrated into existing monitoring networks. Even if a coarse time granularity is considered, this approach could be useful for the monitoring of areas where official stations are not present.
In this study, a one-year experimental campaign was performed by placing 56 PM2.5 sensors near the institutional monitoring station of T. Rubino, located in the city of Turin (northern Italy). Data from other official PM monitoring stations in a large surrounding area, the metropolitan city of Turin, was also collected and analyzed. The correlation of the sensors with the official data of the nearby station was compared to the correlation between the same station and other stations in the official monitoring network, in order to identify the best data source to predict the PM concentration at the place of interest.
It should be noted that adopting correlation as a metric to evaluate how good a PM data source might be for estimating the PM at another point is a simplification. Indeed, the correlation evaluates the linear relationship between two numerical series, without considering the actual estimation error. For this reason, the most relevant data sources were calibrated to act as predictors of PM2.5 concentrations measured at T. Rubino, and benchmarked by comparing the root mean square error (RMSE) of their estimates.
The paper is organized as follows. Section II surveys related research on the spatial concentration of particulate matter, while Section III presents the background on the PM sensor technologies involved in this work. Section IV describes the characteristics of the area under examination and of the monitoring stations of the institutional network. In Section V the low-cost monitoring system is presented, together with the data collection and data pre-processing phases. In Section VI, the methodology and metrics used in the analysis are described. In Section VII, the results of the comparative analysis based on correlation are presented and discussed. Finally, Section VIII draws some conclusions about the outcomes of the analysis.

II. RELATED WORK
This section presents an overview of the literature, discussing spatial and size correlation of PM and light-scattering technology. The limitations that this work tries to address are also highlighted.

A. SPATIAL AND SIZE CORRELATION OF PM
A widely discussed topic in particulate matter monitoring concerns the relationship between PM2.5 and PM10. Table 1 surveys the findings of some previous research. Commonly, daily measurements of PM are considered, but more frequent (e.g., hourly [34]) or sporadic data (e.g., weekly or annual [32]) have also been analyzed. It can be noted that there is a general agreement about a positive correlation ρ or a high coefficient of determination R² between PM2.5 and PM10. The present work adds more data to support this finding. It also evaluates the correlation between the two sizes of particulate matter measured at distinct points: this analysis was missing in the studies listed in Table 1.
In August 2013, hourly mass concentrations of PM were collected in 13 cities of the North China Plain region and in 20 cities of the Yangtze River Delta region in China [35]. The correlation coefficient was computed for each pair of cities in the same region. A dependence on distance was found: the correlation coefficients between cities in the North China Plain region were usually lower than 0.6 for both PM2.5 and PM10 when cities were 250 km apart. In the Yangtze River Delta region, the correlation was usually lower than 0.6 when cities were 250 km apart for PM2.5 and 180 km apart for PM10. There were exceptions for cities exposed to local phenomena. In the present work, the same approach is followed, but the correlation has been computed between pairs of closer locations (i.e., in the same metropolitan area). In fact, the significance of the analysis increases as the distance is reduced, due to the lower impact of external causes.
PM2.5 and PM10 were measured by means of gravimetric and beta attenuation sensors at a suburban site in Athens, Greece, over a period of 4 years (2009-2012) [36]. The data revealed a good correlation between the two sensors. The correlation for PM2.5 was 0.79 for the whole length of the trial, and 0.72 and 0.84 for the warm and cold seasons, respectively. Similar values were computed for the PM10 correlation: 0.72 overall, 0.80 in the warm seasons, 0.88 in the cold ones. The present work extends this kind of analysis by considering more locations for the comparison of gravimetric and beta attenuation sensors; furthermore, it adds light-scattering sensors to the comparison.
An approach for the estimation of PM2.5 was developed based on the correlation between PM2.5 and PM10 [37]. Data were collected from May 2010 to December 2011 at one PM2.5 monitoring station and at 18 PM10 stations in Beijing, China. A technique for spatially extending the PM2.5 value to all the locations of the PM10 stations was developed. The spatial correlation between PM2.5 concentrations is approximated according to the spatial correlation of PM10 concentrations. However, in the validation of the technique, PM2.5 estimates cannot be compared to real data, as the latter are missing. Instead, the present work analyzes the spatial correlation of PM2.5 and PM10, and relies on their simultaneous measurement at the same place.

B. LOW-COST SENSOR EVALUATIONS
Light-scattering sensors are a promising technology for increasing the spatial and temporal resolution of PM measurements, since their affordability, low energy consumption, and small dimensions facilitate the development of widely dispersed sensor networks. The studies presented in this section range from a theoretical and laboratory perspective to field evaluations closely related to our work.
A survey on the design and the technology of low-cost light-scattering sensors suggested good practices for calibration models and metrics for performance evaluation [26].
Simulation software was developed to study the behavior of light-scattering sensors, exploring the limits of the technology [24]. According to the analysis, the most important factors affecting the quality of the measurements are the optical properties of PM, such as light absorption, the distribution of particulate size, and humidity.
Low-cost light-scattering sensors were evaluated in the laboratory using an acrylic glass chamber [23]. The study assesses the linearity of the sensor response, the precision of the sensors, and their limits of detection. It also analyzes the influence of PM composition, particle size, relative humidity, and temperature.
Most of the works about field evaluations of low-cost light-scattering sensors study their effectiveness by means of a comparison with a limited number of reference instruments, without considering their integration in the context of an existing monitoring network [21], [22], [25], [38], [39]. The present work, instead, tries to evaluate whether the quality of their measurement is sufficient to provide benefits in a scenario where a network of high-precision instruments is present.
A long-term field evaluation was carried out in the city of Bologna, considering the effect of seasonal variability, time resolutions, and meteorological conditions [25]. The adopted reference instrument was the MetOne Profiler 212, a high-precision light-scattering monitoring device. The study concludes that data from low-cost sensors can be valuable and extremely informative, but care must be taken during high humidity conditions and in the presence of mineral dust.
Low-cost light-scattering sensors from different manufacturers were tested for a period of over a year in the city of Southampton, UK, by placing them in several locations [22]. However, only two official monitoring stations were present in the area under consideration.
Similarly, low-cost light-scattering sensors were placed at three different official monitoring stations in Santiago, Chile [38]. The study also analyzed the effect of relative humidity and the quality of relative humidity measurements. However, it does not compare inter-station correlations and errors with the correlations and errors of the low-cost sensors at the reference stations.
Four low-cost light-scattering sensors were compared with the TEOM 1400a gravimetric sensor [39]. It was observed that low-cost sensors generally followed the trend of PM2.5 measured by the gravimetric sensor, although they are likely to overestimate the PM2.5 concentrations and they need a new calibration in the measurement environment, instead of relying on the manufacturer's calibration. Furthermore, several failures of the sensors were noticed, so four units of each kind of sensor were deployed. In the present study, individual differences between the sensors were investigated by deploying more units (56 instead of 4), tested over a longer campaign (one year instead of six months) to better account for seasonal variability.
In a similar study, high-precision instruments of different technologies are used as a reference for evaluating low-cost light-scattering sensors [21]. In addition, a model for humidity correction has been developed.

III. PM SENSOR TECHNOLOGIES
In this section, the principal technologies used for Particulate Matter monitoring [40] are briefly discussed.

A. GRAVIMETRIC MEASUREMENTS
Gravimetric particulate matter sensors typically consist of a vacuum pump, a filter, and a weighing system. The vacuum pump draws in a known volume of air, which then passes through the filter. The filter is typically made of glass fiber or Teflon, and is used to capture the PM dispersed in the air. After a certain period of time, the filter can be weighed to determine the mass of PM that was collected. The weighing procedure can be performed either manually or automatically, and often requires filter conditioning both before and after the exposure. Particle size selection can be performed in multiple stages by the air inlet, inertial impactors, and the filter itself.
Automatic gravimetric instruments can employ tapered element oscillating microbalance (TEOM) technology for continuous measurements. In these instruments, the filter is positioned on top of a glass tube that is maintained in oscillation. Changes in the frequency of oscillation depend on the weight of the PM deposit on the filter.
Gravimetric particulate matter sensors are widely used in a variety of applications, including monitoring air quality in industrial settings, measuring PM emissions from vehicles, and monitoring PM levels in ambient air. They are known for their high accuracy and precision, and are used for regulatory monitoring of PM concentrations. However, they are also relatively expensive and require regular maintenance, such as changing filters and calibrating the balance [41].

B. BETA ATTENUATION MONITORS
Beta attenuation monitors are very similar to gravimetric sensors. The main difference is the method used for measuring the mass of PM captured by the filter. A low-energy radioactive source generates beta radiation whose intensity is measured by a Geiger counter positioned on the other side of the PM filter. Since the attenuation of beta radiation from

Low-cost light-scattering sensors are a miniaturization of traditional optical particle counters and nephelometers, being more suited for IoT applications due to their smaller sizes and lower power consumption. The air is drawn inside the sensors via a small fan. A laser beam illuminates the air volume, and the intensity of the scattered light is measured at a specific angle on the opposite side of the air sample by a photodiode. Employing Mie Theory, which models the light scattered by a perfect sphere, sensors can identify the presence of particles in the analyzed air volume. OPCs are able to count single particles for different size intervals, expressed in terms of particle diameter. Particle counts in the different size bins are then converted to standard mass concentrations, such as PM1.0, PM2.5, and PM10. Nephelometers, instead, correlate the intensity of the light scattered by the entire air sample, usually measured at a wider angle range, with a reference mass measurement defined during calibration. Low-cost light-scattering sensor manufacturers often do not declare whether their devices are able to count individual particles.
The measurement process introduces multiple approximations and assumptions. Mie Theory assumes particles to be perfectly spherical and of known refractive index (RI), which instead is unknown and depends on their chemical composition. In addition, only a small percentage of the particles in the air sample is detected. Since larger particles, such as the ones that compose PM10, are few, the sensors should integrate over long time intervals (hours or days) in order to collect enough data to estimate their concentrations. For this reason, PM10 is often not measured directly but derived from PM2.5 or PM1.0. In order to evaluate the PM mass from the particle size counts, assumptions must be made about the density of the particles and their distribution in each size interval. To evaluate concentrations, the air volume should be known: instead of being measured directly, it is considered constant given the speed of the fan.
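The conversion from per-bin particle counts to a mass concentration can be illustrated with a minimal sketch. It assumes spherical particles, a single representative diameter per size bin, a uniform particle density, and a known sampled air volume. The function name and its parameters are hypothetical and chosen for illustration only; this is not the algorithm of any specific sensor.

```python
import math


def mass_concentration_ug_m3(bin_counts, bin_diameters_um,
                             density_g_cm3, air_volume_cm3):
    """Estimate a PM mass concentration (in µg/m³) from size-bin counts.

    Assumptions (illustrative only): every particle is a sphere whose
    diameter equals the representative diameter of its bin, and all
    particles share the same density.
    """
    total_mass_ug = 0.0
    for count, diameter_um in zip(bin_counts, bin_diameters_um):
        diameter_cm = diameter_um * 1e-4                   # µm -> cm
        sphere_volume_cm3 = math.pi / 6.0 * diameter_cm ** 3
        particle_mass_g = sphere_volume_cm3 * density_g_cm3
        total_mass_ug += count * particle_mass_g * 1e6     # g -> µg
    air_volume_m3 = air_volume_cm3 * 1e-6                  # cm³ -> m³
    return total_mass_ug / air_volume_m3
```

Each assumption listed in the paragraph above appears as an explicit input: an error in the assumed density or in the assumed air volume scales the estimate proportionally, whereas an error in the representative diameter is amplified by the cubic term.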
Low-cost particulate matter sensors are also negatively affected by high levels of relative humidity. PM particles are subject to hygroscopic growth, due to water condensing on them. Since the detected particle size increases, sensors overestimate the mass of PM. Water uptake can also influence the optical properties of the particles. Full-size particle counters solve this problem by heating the air before performing the measurement. They also use better optics and provide lower size thresholds for particle detection.

IV. AREA UNDER CONSIDERATION
The area considered for the analysis corresponds to the administrative division called Metropolitan City of Turin. This area comprises 312 municipalities, covers 6827 km², and has a population of more than 2 million. The metropolitan city is composed of a mountainous part in the west and north, and a flat or hilly part in the south and east. The most populous municipality is Turin, which has more than 800,000 inhabitants and an area of 130 km². To the north, west, and east, it borders other large municipalities.

A. OFFICIAL MONITORING NETWORK
Fig. 1 shows the distribution of the PM institutional monitoring stations, which are managed by ARPA Piemonte, a regional agency for the protection of the environment. The locations of the stations differ in terms of the level of urbanization and the main source of pollution. Regarding the level of urbanization, three kinds of monitoring stations can be recognized:
• urban: the station is located in a continuous or at least predominantly built-up area;
• suburban: the station is located in a mostly built-up area, which also contains non-urbanized fields;
• rural: the station is located in a sparsely built-up area.
With respect to the source of pollution, two categories of stations can be identified:
• traffic station: the level of pollution around the station is mainly influenced by emissions from traffic, coming from neighboring roads with medium-high traffic intensity;
• background station: the level of pollution around the station is not influenced by any specific sources (industry, traffic, residential heating, etc.). Instead, it is due to the integrated contribution of all the sources located upwind of the station, considering the predominant wind direction at the site.
Official monitoring stations in the Metropolitan City of Turin exclusively use gravimetric and beta attenuation instruments for measuring PM levels. Furthermore, according to the European Directive 2008/50/EC, PM2.5 and PM10 are the only PM sizes used for regulatory monitoring in Europe, and therefore the only sizes measured by the monitoring network.

B. PM SOURCES
According to studies conducted by ARPA Piemonte [43], [44], [45], the main sources of PM10 in the city of Turin are traffic and domestic heating. As for domestic heating, PM is generated by the combustion of wood and pellet fuel. These sources are mainly located outside the city of Turin, where district heating systems are less common. Traffic, instead, is responsible for the emission of NOx, which acts as a precursor of PM formation in the atmosphere. Direct emissions from exhaust systems and tyre wear are also common.
During the colder months, PM levels in the city of Turin tend to be much higher than during the rest of the year, due to the use of domestic heating and to meteorological phenomena such as thermal inversion, which prevents air circulation. PM concentration is also greatly affected by the morphology of the territory, which favors air stagnation.
Another relevant meteorological phenomenon of the area under examination is the Fohn, which is characterized by strong wind (with speeds greater than 1.5 m/s), rising temperature, and low relative humidity (less than 40%). Fohn, together with other wind and precipitation phenomena, greatly reduces PM concentrations in the air, especially in winter months.

V. EXPERIMENT
This section presents the data collection and data pre-processing phases. The resulting data, which is used in the analysis presented in this work, is provided as supplementary material.

A. DATA COLLECTION
The low-cost monitoring system is composed of 14 monitoring stations, described in Fig. 2. Each station contains four low-cost PM sensors (Honeywell HPMA115S0-XXX), one temperature and relative humidity sensor (DHT22), and one atmospheric pressure sensor (BME/BMP280). The sampling time of the PM sensors was set to one second, while the other sensors generated measurements every 3-4 seconds.
The Honeywell sensors provide measurements for both PM10 and PM2.5, but only PM2.5 was considered, since PM10 is not measured directly but estimated from PM2.5 concentrations with a proprietary algorithm. Also, the manufacturer does not specify whether the sensor is able to count single particles. Multiple PM sensors are used in the same station in order to provide redundancy against failure.
The datasheet of the sensor [46] provides the accuracy for PM2.5 readings:
• ±15 µg/m³ from 0 µg/m³ to 100 µg/m³;
• ±15% from 100 µg/m³ to 1000 µg/m³.
However, these accuracy values are only valid at 25 °C ± 5 °C, with relative humidity from 0% to 95%, non-condensing. The experiment, instead, was exposed to a wider range of environmental conditions, with temperatures ranging from −4 °C to 35 °C, heavy rains, and condensing humidity. In addition, the PM composition used for factory calibration is cigarette smoke, which differs from the urban air analyzed during the experiment. Therefore, an independent evaluation of the sensors' performance is required.
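As a small illustration, the two accuracy bands quoted above can be written as a piecewise function. This is only a sketch of the figures reported in the text: the function name is hypothetical, and the sketch assigns the boundary value of 100 µg/m³ to the first band, which is an assumption since the two bands overlap at that point.

```python
def pm25_accuracy_band_ug_m3(reading_ug_m3):
    """Return the declared +/- accuracy (µg/m³) for a PM2.5 reading.

    Valid only under the datasheet conditions quoted in the text
    (25 °C ± 5 °C, non-condensing humidity).
    """
    if not 0.0 <= reading_ug_m3 <= 1000.0:
        raise ValueError("reading outside the declared 0-1000 µg/m³ range")
    if reading_ug_m3 <= 100.0:
        return 15.0                 # absolute band: ±15 µg/m³
    return 0.15 * reading_ug_m3     # relative band: ±15% of the reading
```

Under these figures, a reading of 20 µg/m³ carries a declared uncertainty of ±15 µg/m³, that is, 75% of the reading itself.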
The monitoring system was positioned at the background urban monitoring station in T. Rubino, at 1.5 meters from the air inlets of the official gravimetric and beta attenuation instruments. The data used in this work were collected from November 1, 2020, for a period of 12 months. Official PM2.5 and PM10 data were also collected during the same period from all the institutional monitoring stations of the metropolitan city of Turin, shown in Fig. 1. ARPA stations usually provide daily averages of PM concentration, with the exception of T. Rubino, where data is also provided with hourly granularity. In order to have comparable results on the same temporal granularity, all the data produced by the 56 PM sensors and the official stations was averaged over 24 hours.
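The 24-hour averaging step can be sketched as follows. The sketch assumes that samples arrive as (timestamp, value) pairs and that a day runs from midnight to midnight; the text does not state the day boundaries that were actually used, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_means(samples):
    """Average (timestamp, value) samples over calendar days.

    Returns a dict mapping each date to the mean of the samples
    recorded on that date, in chronological order.
    """
    accumulators = defaultdict(lambda: [0.0, 0])   # date -> [sum, count]
    for timestamp, value in samples:
        acc = accumulators[timestamp.date()]
        acc[0] += value
        acc[1] += 1
    return {day: total / count
            for day, (total, count) in sorted(accumulators.items())}
```

A running sum and count per day avoids keeping the roughly 86,400 one-second samples of each day in memory.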

B. DATA PRE-PROCESSING
The 56 light-scattering sensors, being placed close to each other, should have generated similar data, or at least highly correlated data.
However, multiple sensor failures and malfunctions occurred during the experiment. The most common malfunction observed was the point anomaly, in which a sensor generates extremely high readings for a short period. These anomalies are temporary and affect individual sensors, and therefore can be easily corrected by taking the median of more than two sensors.
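A minimal sketch of the median-based correction is given below. It assumes that the simultaneous readings of the sensors in one station are available as a list, with `None` marking a missing value; the function name and the handling of missing values are illustrative assumptions.

```python
from statistics import median


def robust_station_reading(readings):
    """Combine simultaneous readings from several sensors with the median.

    Returns None when fewer than three valid readings are available,
    since the median of two values cannot reject a point anomaly.
    """
    valid = [r for r in readings if r is not None]
    if len(valid) < 3:
        return None
    return median(valid)
```

For instance, with the readings 12.0, 13.0, 950.0, and 12.5 µg/m³, the spike of 950.0 has no influence on the result, whereas it would dominate an arithmetic mean.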
As for permanent sensor failures, most of the sensors stopped working by the end of the one-year experimental campaign. This could be attributed to exposure to temperatures below freezing and to 100% humidity levels. The main type of permanent failure that occurred was the sensor being stuck at values close to zero. Some sensors also exhibited non-deterministic behavior before getting stuck.
Data related to extensive sensor malfunctions was removed from the analysis because, in an ideal system, these behaviors should be identified and the sensors replaced. In addition, if such data were considered, the evaluation of data quality would not reflect the actual performance of functioning light-scattering devices.
Each one of the 56 sensors was compared to the median of all of them, which represents the idealized behavior. Sensors with a correlation with the median lower than 0.8 were discarded, since this would indicate extensive sensor malfunction. According to this criterion, 22 sensors were discarded.
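The selection criterion can be sketched as follows. The sketch assumes that all series are aligned, have the same length, and contain no missing values; the function names are hypothetical, and returning a correlation of 0.0 for a constant series (for example, a sensor stuck at one value) is a choice made for this illustration.

```python
import math
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0   # assumption: a constant series is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_sensors(series_by_sensor, threshold=0.8):
    """Keep the sensors whose correlation with the median series is >= threshold."""
    all_series = list(series_by_sensor.values())
    length = len(all_series[0])
    reference = [median(s[i] for s in all_series) for i in range(length)]
    return {name for name, series in series_by_sensor.items()
            if pearson(series, reference) >= threshold}
```

With the threshold of 0.8 used in the text, a sensor producing readings unrelated to the common trend is excluded, while sensors that differ from the median only by small offsets are retained.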
The remaining 34 sensors initially worked correctly, but most of them got stuck at a certain point, providing low values close to zero. Consequently, moving backward from the last sample recorded by each sensor, all PM data was discarded until the first value higher than a threshold, i.e., 2 µg/m³, was found.
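The backward trimming of a stuck tail can be sketched in a few lines. The function name is hypothetical; the default threshold is the 2 µg/m³ value quoted in the text.

```python
def trim_stuck_tail(values, threshold=2.0):
    """Drop the trailing run of readings that do not exceed the threshold.

    Scans backward from the last sample and stops at the first value
    higher than the threshold; everything after that value is discarded.
    """
    end = len(values)
    while end > 0 and values[end - 1] <= threshold:
        end -= 1
    return values[:end]
```

Only the tail is affected: a low reading in the middle of a series, such as a genuinely clean day, is kept as long as a later reading exceeds the threshold.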
Finally, the median was recalculated with the remaining values, in order to evaluate the reliability of the final sensors.

VI. METHODOLOGY
The purpose of the analysis is to compare PM measurements from low-cost light-scattering sensors with the data provided by the institutional network. In more detail, the study aims to assess whether low-cost light-scattering sensors can be useful for measuring daily averages of PM2.5, as compared to using values provided by official instruments positioned at a different station and/or measuring concentrations of different PM sizes (e.g., PM10).

A. METRICS
In order to measure the correlation between two instruments, independently of their technology, position, or measured PM size, the Pearson Correlation Coefficient was evaluated for the whole year. This metric was chosen to assess whether the measurements of a sensor can provide indications about the measurements of a different one, in case the latter is not available. It is important to note that the Pearson coefficient does not evaluate the quality of a prediction, because it does not take into account the error between two sensors, but only their correlation. For this reason, to evaluate a sensor as a possible predictor of a different one, the first sensor was calibrated with a simple linear regression model using the second as a reference. This process was performed on 30-day windows, evaluating the RMSE of the calibrated sensor over the entire year (also considering the calibration window).
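For reference, the two metrics can be written in their standard textbook form. The notation is introduced here for illustration and is not taken from the original: x_i and y_i denote the daily averages of the predictor and of the reference instrument over N days, and a and b are the slope and intercept fitted on one 30-day window.

```latex
\rho_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}
                  {\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,
                   \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}},
\qquad
\hat{y}_i = a\,x_i + b,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2}
```

Because the correlation is invariant under any linear transformation of x with positive slope, a sensor with a large systematic bias can still reach a correlation close to 1; the RMSE after calibration is what captures the residual error.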

B. ANALYSIS DESCRIPTION
The first step of the analysis is to evaluate the correlation between official sensors positioned at the same station. The correlation between gravimetric and β-attenuation sensors measuring the same size of PM is used as a benchmark value for other PM estimations. In addition, the correlation between different PM sizes, PM10 and PM2.5, is computed at the same station for official sensors.

VOLUME 11, 2023
The second step is to assess the correlation between nearby official stations. In this part of the analysis, only urban stations inside Turin are considered: T. Rubino, T. Consolata, T. Grassi, T. Lingotto, and T. Rebaudengo. As before, the analysis is performed both on the same and on different PM sizes.
The next step considers the station of T. Rubino, where the sensors are installed. The correlation of PM2.5 measured by the official sensors at T. Rubino is evaluated with respect to the PM2.5 and PM10 official sensors of stations positioned outside Turin. Then, the correlation between the working light-scattering sensors and the official β-attenuation sensors of T. Rubino is evaluated.
In the last part of the analysis, the correlation values obtained for the different stations, PM sizes, and sensor technologies are compared. Furthermore, the best light-scattering sensor, i.e., sensor 25, is calibrated using a simple linear regression model. The model uses as the independent variable the daily averages of PM2.5 measurements provided by the low-cost light-scattering sensor, and the β-attenuation sensor of T. Rubino as the target value. A 30-day rolling window with one-day shifts is used to select different training datasets. In this way, the calibration process is performed multiple times to evaluate how the training period influences the calibration results.
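The rolling-window calibration can be sketched as follows. The sketch assumes two aligned lists of daily averages without missing days and a predictor that is not constant within any window; the function names are hypothetical, and the ordinary least-squares fit is written out explicitly to keep the example self-contained.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx   # assumes x is not constant within the window
    return slope, mean_y - slope * mean_x


def rmse(predicted, reference):
    """Root mean square error between two equally long series."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def rolling_calibration_rmse(sensor, reference, window=30):
    """Calibrate on each window (one-day shifts), score on the whole series."""
    scores = []
    for start in range(len(sensor) - window + 1):
        stop = start + window
        slope, intercept = fit_line(sensor[start:stop], reference[start:stop])
        calibrated = [slope * value + intercept for value in sensor]
        scores.append(rmse(calibrated, reference))
    return scores
```

The returned list contains one RMSE per training window, so its spread indicates how strongly the choice of the calibration period affects the quality of the estimate.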
The same procedure is applied for calibrating the PM2.5 β-attenuation sensors of two other stations in Turin, T. Lingotto and T. Rebaudengo, to target the T. Rubino PM2.5 β-attenuation sensor. Finally, the RMSE values of the low-cost sensor and of the two official ones are evaluated with respect to the T. Rubino reference, and the results are compared.
As for the sensitivity to environmental factors, a previous study [27] analyzes how the performance of the same low-cost monitoring system varies across different days and months due to changing environmental conditions.

VII. ANALYSIS
In this section, the results of the analysis are presented and discussed. In the presentation of the results, the stations are cited in the following order: T. Rubino as the first one, then the stations in the municipality of Turin, and finally those in the other municipalities of the metropolitan city of Turin. Stations in the same category are listed in alphabetical order.

A. CORRELATION BETWEEN GRAVIMETRIC AND β-ATTENUATION MEASUREMENTS AT THE SAME SITE
As a first element of comparison, we observe the correlation between the values of particulate matter of the same size (either PM10 or PM2.5) collected at the same place by official measuring instruments based on different technologies. Given that the devices monitor the same quantities, they are expected to produce the same result. However, errors specific to the technology, depending on its inherent characteristics, and random errors, which are implicit in the measurement, can generate different results.
During the experiment, there were six stations equipped with both gravimetric and β-attenuation instruments measuring the same size of particulate matter: three in the municipality of Turin, and three in the metropolitan city. Table 2 lists the stations, the size of the particulate matter under examination, and the correlation. The correlation is between 0.984 and 0.993 with an average of 0.989. These values can be considered the optimum for a measuring instrument and are used during the whole analysis as a benchmark.
Fig. 3 shows the scatter plot between the two PM10 measurement devices at the T. Rubino station.Since both instruments are correctly calibrated, the slope of the data corresponds to a straight line at 45 • .
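As an illustration of how a same-site correlation such as those in Table 2 could be computed, the following minimal sketch assumes that the reported values are Pearson correlation coefficients over paired daily means, and that days on which either instrument has no valid reading are skipped. The function name, the handling of missing days, and the sample values are illustrative assumptions, not data from the experiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long series.

    Days where either instrument reports None are skipped
    (an assumption about how missing data are treated).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily means in µg/m³ (not measured data).
gravimetric = [12.0, 25.0, None, 40.0, 18.0, 33.0]
beta = [13.0, 24.0, 30.0, 41.0, None, 35.0]
r = pearson(gravimetric, beta)
print(f"correlation = {r:.3f}")
```

The correlation is insensitive to a constant scale factor or offset between the two instruments, which is why it is complemented by a calibration and RMSE check later in the analysis.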

B. CORRELATION BETWEEN PM2.5 AND PM10 AT THE SAME SITE
When only one of the PM10 and PM2.5 measuring devices is present at a given location, a possible solution is to estimate the missing quantity starting from the measured one [37].
Fig. 4 shows the ratio of PM2.5 over PM10 detected at T. Rubino by β-attenuation sensors. The average value is 0.684 (with standard deviation σ = 0.108). The variations in the ratio may depend on different meteorological and atmospheric conditions, and on differences in the emission sources. For example, a previous study conducted in the southwestern area of Piedmont [47], close to the area monitored in the current study, identified a lowering of the PM2.5 fraction during Foehn days. The same study also identified an increase in the PM10 fraction in correspondence with the spreading of sandy material after snowfalls.
Fig. 5 shows the scatter plot of PM10 over PM2.5 at the T. Rubino station. Both measuring instruments are based on the attenuation of beta radiation. It is possible to note that the points no longer lie along a line at 45°, as in Fig. 3, since the average ratio between PM2.5 and PM10 at T. Rubino is 0.684 (the intercept is considered to be negligible). However, this does not affect the correlation and could be easily corrected with a linear transformation. On the other hand, some points are clearly far from the line of best fit.
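The ratio-based estimate discussed above can be sketched as follows. This is a minimal example under stated assumptions: the standard deviation is taken as the population standard deviation (the text does not say which estimator was used), days with a zero PM10 reading are skipped, and the helper names and sample values are hypothetical. Only the default ratio of 0.684 comes from the measurements at T. Rubino.

```python
from statistics import mean, pstdev


def ratio_stats(pm25, pm10):
    """Mean and population standard deviation of the daily PM2.5/PM10 ratio."""
    ratios = [a / b for a, b in zip(pm25, pm10) if b > 0]
    return mean(ratios), pstdev(ratios)


def estimate_pm25(pm10_value, ratio=0.684):
    """Estimate PM2.5 from a PM10 reading using a fixed average ratio.

    The default is the average ratio reported for T. Rubino; the intercept
    is taken as zero, as in the text.
    """
    return ratio * pm10_value


# Hypothetical daily means in µg/m³ (not measured data).
pm25_daily = [6.0, 8.0, 21.0]
pm10_daily = [10.0, 10.0, 30.0]
avg_ratio, sigma = ratio_stats(pm25_daily, pm10_daily)
print(f"mean ratio = {avg_ratio:.3f}, sigma = {sigma:.3f}")
print(f"estimated PM2.5 for PM10 = 50: {estimate_pm25(50.0):.1f}")
```

A fixed ratio cannot follow the day-to-day variations described above, so an estimate of this kind carries an uncertainty on the order of the observed spread of the ratio.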
During the period under examination, there were seven stations equipped with both PM2.5 and PM10 sensors: two in the municipality of Turin, and five more in the metropolitan city. Table 3 lists the stations, the technology of the devices, and the correlation between PM2.5 and PM10 values.
The correlation is between 0.875 and 0.987 with an average of 0.952. However, the minimum, recorded at Ceresole, is an outlier, as the next lowest value is 0.935. Ceresole deviates from the other stations because it is a high-mountain monitoring station, located in a natural park, where the atmospheric conditions are significantly different. Overall, the values in Table 3 show a good approximation of the dust levels, but are lower than the benchmark identified in Section VII-A.

C. CORRELATION OF THE SAME SIZE OF PARTICULATE MATTER AT DIFFERENT URBAN SITES
In this part of the analysis, only urban stations located inside Turin are considered. T. Rubino and T. Lingotto are background stations, while T. Consolata, T. Grassi, and T. Rebaudengo are traffic ones.

1) PM2.5 STATIONS
Table 4 lists the three PM2.5 urban stations in Turin, the technology of the devices, and the correlation between PM2.5 values in every pair of stations. The correlation between the measurements of the gravimetric sensor and the β-attenuation sensor at T. Lingotto is not reported in Table 4 because the station is the same: this comparison has already been considered in Table 2. The correlation is between 0.955 and 0.969 with an average of 0.961. These results are comparable to the correlation between PM2.5 and PM10 reported in Table 3.
Table 5 shows the average value of PM2.5 collected at the three stations. It can be noted that, despite the good correlation, there are significant differences in the average values, in particular between the traffic station and the background stations. The two T. Lingotto sensors differ by 1 µg/m³, a quantity that can be considered negligible. The average of T. Rubino is similar to the average of T. Lingotto, while T. Rebaudengo has the highest mean, with a 4 µg/m³ difference. This difference may seem small, but during the summer period, all values tend to zero, flattening the difference. Overall, these data show that although the areas are atmospherically different, with different PM2.5 values, the trends of PM2.5 follow the same behavior.
Fig. 6 shows the scatter plot of PM2.5 gravimetric at T. Lingotto against T. Rubino β. In Fig. 6, the PM2.5 values measured by the gravimetric sensor at T. Lingotto were preferred to those measured by the β-attenuation sensor at the same station (although the best correlation of T. Rubino β is with T. Lingotto β), because the β sensor was not working during the first two weeks of the experiment.

2) PM10 STATIONS
The same analysis has been conducted on the correlation of PM10 measured at the monitoring stations in Turin. In detail, there are 2 PM10 background stations (i.e., T. Rubino and T. Lingotto), and 3 traffic stations (T. Consolata, T. Grassi, and T. Rebaudengo). Table 6 lists the stations, the technology of the devices, and the correlation between PM10 values. The correlation is between 0.945 and 0.981: this range is larger than the one of PM2.5 (as reported in Table 4) due to the higher number of PM10 monitoring stations. However, the average correlation of 0.964 is close to the average for PM2.5: this confirms the high level of correlation of PM within the urban area.

D. CORRELATION BETWEEN PM2.5 AND PM10 AT DIFFERENT URBAN SITES
This section evaluates the correlation between the measurements of different sizes of particulate matter within the same urban area. In particular, PM2.5 values collected at one station are compared with PM10 values measured at different places.
Table 7 lists the stations, the technology of the devices, the size of PM, and the correlation values.
The correlation is between 0.913 and 0.959 with an average of 0.933. These results are slightly worse than those obtained with sensors measuring the same size of particulate matter. In addition, it is possible to note some significant differences, in particular between the traffic station and the background stations. Nevertheless, the trends of PM10 and PM2.5 are similar.

E. CORRELATION BETWEEN PM2.5 AT T. RUBINO AND OUTSIDE TURIN
It is possible to extend the analysis to external stations. Given the large number of stations in the metropolitan city of Turin, the analysis in this section is focused on the comparison between the β-attenuation PM2.5 sensor at T. Rubino and PM2.5 sensors outside the urban area of Turin.
Table 8 lists the stations outside Turin equipped with PM2.5 sensors, the technology of their devices, and the correlation values with T. Rubino. The correlation ranges between 0.032 and 0.968 with an average of 0.846. These results are significantly worse than the ones reported in Table 4 for the correlation of PM2.5 measurements between urban stations. However, it has to be noted that the correlation between T. Rubino and Ceresole is exceptionally low and out of scale compared to the others. Without considering Ceresole, the minimum would be 0.886. Moreover, some stations located in municipalities within the agglomeration of Turin show a good correlation level with T. Rubino. In detail, the station most correlated to T. Rubino is Baldissero, a background station in a rural area at the border of Turin. The second most correlated station is Borgaro, another background station in a suburban area bordering Turin.

F. CORRELATION BETWEEN PM2.5 AT T. RUBINO AND PM10 OUTSIDE TURIN
This section examines the correlation between the β-attenuation PM2.5 sensor at T. Rubino and PM10 sensors outside the urban area of Turin.
Table 9 lists the stations outside Turin equipped with PM10 sensors, the technology of the devices, and the correlation values with T. Rubino. The correlation is between 0.095 and 0.941 with an average of 0.769. These results are significantly worse than the ones shown in Table 8. Again, the station with the lowest correlation is Ceresole, which is completely unrelated. The best station is Beinasco, which is a background monitoring station in a suburban area bordering Turin. In general, the measurements of different sizes of particulate matter in urban and rural stations do not seem to have a meaningful relationship.

G. CORRELATION BETWEEN PM2.5 β-ATTENUATION AND LIGHT-SCATTERING MEASUREMENTS
Table 10 lists the eight light-scattering sensors that worked properly during the whole test period. In addition, it shows the correlation of each sensor with the institutional values provided at T. Rubino and with the median computed over the 34 selected sensors, as explained in Section V-B. The station of T. Rubino also provides hourly measurements from the beta attenuation instrument; therefore, it was possible to compute correlations at a higher sampling rate.
The data show that light-scattering sensors can provide daily measurements with a strong correlation with the official reference, higher than 0.9 in most cases. However, the hourly correlation with T. Rubino is substantially lower for all sensors, never reaching 0.9. An interesting aspect emerges when comparing the correlation with the reference and the correlation with the median: the correlation with the median does not seem to be affected by the change in sampling rate. This shows that the sensors still agree with each other at a higher sampling rate, but not with the reference. The observed behavior seems to indicate that the reduced correlation is due to a technological limitation, rather than the imprecision of the single sensors. This is also confirmed by the reduction of the correlation between the median and T. Rubino when increasing the sampling rate. Sensor 25 is the most correlated to the median and also one of the two most correlated with T. Rubino. Fig. 7 and Fig. 8 show the scatter plot of sensor 25 with respect to the T. Rubino beta monitor on daily and hourly averages.
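The two series used in this comparison, the cross-sensor median and the daily averages derived from hourly data, can be sketched as follows. This is a minimal illustration: the function names, the dictionary layout, and the rule of ignoring missing readings (None) are assumptions, and the aggregation actually used in the experiment may differ.

```python
from statistics import mean, median


def cross_sensor_median(series_by_sensor):
    """Median over all sensors at each time step.

    `series_by_sensor` maps a sensor id to a list of readings of equal
    length; missing readings are None and are ignored.
    """
    length = len(next(iter(series_by_sensor.values())))
    result = []
    for t in range(length):
        values = [s[t] for s in series_by_sensor.values() if s[t] is not None]
        result.append(median(values) if values else None)
    return result


def daily_means(hourly, hours_per_day=24):
    """Aggregate an hourly series into daily means, ignoring missing hours."""
    days = []
    for start in range(0, len(hourly), hours_per_day):
        chunk = [v for v in hourly[start:start + hours_per_day] if v is not None]
        days.append(mean(chunk) if chunk else None)
    return days


# Hypothetical readings in µg/m³ from three sensors over four hours.
readings = {
    "sensor_a": [10.0, 12.0, None, 20.0],
    "sensor_b": [11.0, 15.0, 18.0, 22.0],
    "sensor_c": [30.0, 13.0, 16.0, 21.0],
}
hourly_median = cross_sensor_median(readings)
print(hourly_median)
print(daily_means(hourly_median, hours_per_day=2))
```

Taking the median rather than the mean keeps a single faulty sensor (such as the first reading of `sensor_c` in the example) from distorting the group series.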

H. OVERALL EVALUATION
This section compares all the analyzed data, in order to select the best approaches to determine the PM in the absence of a monitoring station on site.
Fig. 9 summarizes the correlations of all the analyzed sensors. For completeness, these values are also reported in Table 11. The highest level of correlation occurs between pairs of sensors with different technologies (beta and gravimetric) at the same place.
The correlation between PM10 and PM2.5 in the same place spans a wide interval. A high correlation may be due to local phenomena, specific to a given location, that link PM10 values to PM2.5 values. However, the difference between PM10 and PM2.5 is largely due to environmental and human phenomena that are difficult to predict. Consequently, assessing PM2.5 with a PM10 sensor in the same location can give a good correlation, but with high uncertainties about possible changes in atmospheric conditions.
The use of sensors for the same size of particulate matter within the same urban area, but in different zones, leads to a very good correlation. Although the correlation never reaches the optimum, it is notably high. It should be noted that the range of correlation is wider in the case of PM10, but this can be explained by the greater number of existing monitoring stations for this size of particulate matter.
Using PM10 sensors in different locations within the same conurbation produces worse results. Even worse is the correlation between sensors, both PM2.5 and PM10, positioned outside the urban agglomeration. In all three cases, it is possible to find stations that give fair results, but inferior to the previous cases.
Light-scattering sensors may have a good level of correlation, but they do not reach the benchmark level of correlation between gravimetric and β-attenuation sensors. In terms of correlation, their performance is comparable with the estimates from PM sensors, including those for different PM sizes, positioned in stations inside Turin. Therefore, if many nearby high-precision stations are present, their usefulness is limited when considering daily averages. However, using low-cost sensors can be relevant in places where institutional stations are sparse or missing.
Furthermore, most institutional monitoring stations are limited to daily sampling: in the considered area, only one station has hourly sampling and none has instantaneous sampling. By contrast, instantaneous sampling is easily obtainable with light-scattering sensors. This can provide insightful information on local and short-lived emission phenomena, even if, as discussed in Section VII-G, their precision is lower at higher sampling rates.
Finally, the reliability of light-scattering sensors over extended use is limited. By positioning multiple low-cost devices at the same location, faulty sensors can be identified and replaced, so that malfunctions do not affect the reliability of the data. However, if a large number of sensors is deployed over the territory, the maintenance effort can be high.

I. RMSE VALIDATION
For validation purposes, this section computes the RMSE that would be obtained by calibrating the data of the light-scattering sensor 25 and of the PM2.5 sensors in the Turin monitoring stations to target the PM2.5 sensor at T. Rubino. The adopted procedure is as follows: all days were discarded in which one of the four PM2.5 stations in Turin or sensor 25 did not produce data, except those directly at the beginning or end of the period under analysis. Linear regression models were trained for each sensor using a 30-day rolling window with one-day shifts to select the training period. For each one of these training periods, the RMSE of the calibrated sensor was calculated over the whole year, including the 30 days of training.
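The rolling-window procedure described above can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares fit of the reference on the sensor readings and two already aligned daily series with no missing days. The function names and the synthetic data are hypothetical; the original analysis may rely on a different regression implementation.

```python
import math


def fit_line(x, y):
    """Least-squares slope and intercept of y = slope * x + intercept.

    Assumes x is not constant within the window (otherwise the slope
    is undefined and a ZeroDivisionError is raised).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, reference):
    """Root mean square error between two equally long series."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def rolling_calibration_rmse(sensor, reference, window=30):
    """RMSE over the whole series for every training window.

    The window is shifted by one day at a time; for each position, the
    model is trained on the window and evaluated on all days, including
    the training ones.
    """
    results = []
    for start in range(len(sensor) - window + 1):
        slope, intercept = fit_line(sensor[start:start + window],
                                    reference[start:start + window])
        calibrated = [slope * v + intercept for v in sensor]
        results.append(rmse(calibrated, reference))
    return results


# Synthetic example: a sensor that reads half the reference minus an offset.
reference_series = [float(10 + (3 * d) % 17) for d in range(60)]
sensor_series = [0.5 * r - 1.0 for r in reference_series]
errors = rolling_calibration_rmse(sensor_series, reference_series, window=30)
print(len(errors), max(errors))
```

With real data, plotting `errors` against the start day of each window gives a curve of the kind used to compare calibration periods across the year.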
Figures 10, 11, and 12 compare the RMSE of sensor 25 and the three PM2.5 stations in Turin, taking the sensor at T. Rubino as a reference. The last few days of sensor 25 are missing, due to planned maintenance. Also, the early days of the sensor T. Lingotto β are missing, because it was installed after the start of the trial. The three figures show the amount of error that would have been produced when carrying out a 30-day calibration in any part of the year. In particular, sensor 25, which is present in all figures, has a fairly constant RMSE with respect to T. Rubino regardless of the period in which the calibration is performed. The only exception, with higher RMSE, occurs in summer: this is compatible with the lower reliability of the sensors when measuring low PM concentrations, which characterize the summer period. This is due to the fact that during the summer months the absolute error of the sensor (15 µg/m³ per datasheet) is comparable with the measured PM concentrations.
Figure 10 reveals that the RMSE of the beta sensor at T. Lingotto is almost constant and usually lower than the RMSE of sensor 25. Similarly, the RMSE of the gravimetric sensor, in Fig. 11, is constant and lower than the RMSE of sensor 25 most of the time, but it is higher than the RMSE of the β sensor. Finally, Fig. 12 shows that the β sensor at T. Rebaudengo has an error very similar to sensor 25, but without the summer negative peak. In all the figures it can be seen that a linearization carried out in May shows a peak in error, probably due to a local phenomenon that happened in the T. Rubino area. These figures show that the choice of the calibration period of the sensors can have a meaningful impact on the measurement quality.
Figure 13 shows the accuracy and precision of hourly PM2.5 measurements of sensor 25 with respect to the β-attenuation instrument at T. Rubino, for both the calibrated and non-calibrated case. The calibration is performed during the first 30 days of the experiment. For each integer value of PM2.5 provided by the reference device, the accuracy is computed as the difference between the average of the sensor measurements corresponding to the reference value and the reference value itself. This is performed only when at least six measurements are available for the given PM concentration. In a similar way, the precision is computed as the standard deviation of the sensor measurements for each of the reference values. The calibration process improves both of these metrics. Regarding accuracy, the calibrated sensor satisfies the accuracy of ±15 µg/m³ provided by the manufacturer, even considering a wider range of environmental conditions.
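The accuracy and precision defined above can be sketched as follows. This is a minimal illustration with hypothetical names and sample values. It assumes that reference values are rounded to the nearest integer before grouping and that the population standard deviation is used; the text does not specify the estimator.

```python
from collections import defaultdict
from statistics import mean, pstdev


def accuracy_precision(sensor, reference, min_count=6):
    """Per-reference-value accuracy and precision of a sensor.

    Returns {reference_value: (accuracy, precision)}, where accuracy is the
    mean sensor reading minus the reference value and precision is the
    standard deviation of the sensor readings. Reference values with fewer
    than `min_count` paired readings are skipped.
    """
    groups = defaultdict(list)
    for s, r in zip(sensor, reference):
        groups[round(r)].append(s)
    result = {}
    for ref_value, readings in sorted(groups.items()):
        if len(readings) >= min_count:
            result[ref_value] = (mean(readings) - ref_value, pstdev(readings))
    return result


# Hypothetical hourly pairs in µg/m³: six readings at reference 10, one at 20.
sensor_hourly = [11.0, 12.0, 13.0, 11.0, 12.0, 13.0, 30.0]
reference_hourly = [10, 10, 10, 10, 10, 10, 20]
metrics = accuracy_precision(sensor_hourly, reference_hourly)
for value, (acc, prec) in metrics.items():
    print(f"reference {value}: accuracy {acc:+.2f}, precision {prec:.2f}")
```

In this example only the reference value 10 is reported, since the value 20 has a single paired reading and falls below the six-measurement threshold.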

VIII. CONCLUSION
This paper has investigated the reliability of light-scattering PM sensors as a complement to institutional monitoring networks, which are based on gravimetric sensors or beta-ray attenuation sensors. The higher availability and mobility of light-scattering sensors allow direct measurements in place, becoming an alternative to modeling PM concentrations based on a few measurements taken at the fixed locations of gravimetric or beta-ray attenuation sensors.
A year-long experiment has been conducted by placing 56 inexpensive PM2.5 light-scattering sensors close to an institutional monitoring station. The correlation between the two groups of sensors has been analyzed, as well as the correlation between institutional sensors of PM2.5 and PM10 deployed over a large area. The analysis has shown that sensors based on light scattering are easily subject to failures and malfunctions; therefore, they require maintenance and a careful evaluation of their data. Nevertheless, properly working light-scattering sensors have a good correlation with official data, almost the same as the correlation found between institutional sensors placed in different areas but with similar characteristics. These results have been validated with calibration and RMSE verification.
All analyses have focused on daily averages, which represent the standard frequency of institutional sensors. The reliability of the data produced by the light-scattering sensors is sufficient to improve the spatial density of information in areas where official monitoring stations are missing. They could also help to increase the sampling frequency of the institutional network, with clear advantages for real-time applications. Results show that when measurement granularity is increased to hourly averages, the sensors still achieve a good correlation with the reference, even if it is lower than for daily measurements.
Future work should further verify their measurement accuracy at higher sampling rates by using a TEOM device or a high-precision light-scattering instrument as a reference. In addition, measurement quality should be analyzed considering shorter time intervals, to better understand the effect of seasonal and meteorological changes. A more detailed evaluation of the sensors' performance should also take into account sensitivity to environmental factors and particle size, using high-precision measurements of meteorological parameters. Finally, calibration intervals and methodology should be evaluated carefully, especially when high-precision references cannot be used for on-site calibration.

FIGURE 1. Institutional monitoring stations in the metropolitan city of Turin [42].

FIGURE 9. Range of correlation and mean.

FIGURE 13. Precision and accuracy on hourly averages of light-scattering sensor 25 over T. Rubino beta.

TABLE 1. Previous studies on the relationship between PM2.5 and PM10.

TABLE 2. Correlation between measurements of the same size of particulate matter collected at the same place by institutional monitoring devices.

TABLE 3. Correlation between institutional measurements of different sizes of particulate matter collected in the same place.

TABLE 4. Correlation between measurements of PM2.5 in the Turin monitoring stations.

TABLE 5. Average PM2.5 collected in the Turin monitoring stations.

Table 10
lists the eight light-scattering sensors that worked properly during the whole test period.In addition, it shows the correlation of each sensor with the institutional values provided at T. Rubino and with the median computed over the

TABLE 6. Correlation between measurements of PM10 collected at different places.

TABLE 7. Correlation between measurements of PM2.5 and PM10 collected at different places.

TABLE 8. Correlation between PM2.5 collected in the metropolitan city of Turin and in T. Rubino.

TABLE 9. Correlation between PM10 collected in the metropolitan city of Turin and PM2.5 in T. Rubino.

TABLE 10. Correlation of the PM2.5 light-scattering sensors that worked for the entire experiment.

TABLE 11. Range of correlation and mean for all considered particulate sizes and relative locations.