Application of a Low-Cost Electronic Nose to Detect Forest Tree Pathogens: Fusarium oxysporum and Phytophthora plurivora

The presence of forest tree pathogens may lead to substantial problems, and their early detection during seed storage or in nurseries may be critical for the choice of an appropriate management strategy. A new construction of a low-cost electronic nose was tested on samples of the pathogenic fungus Fusarium oxysporum and the pathogenic oomycete Phytophthora plurivora. The electronic nose uses Figaro Inc. TGS series sensors with applied heater voltage modulation. This mode of operation may be more appropriate for continuous monitoring of seed storage than the method based on modulation of the gas concentration. A rectangular heater voltage modulation pattern, with a shallow drop of the heater voltage from the nominal voltage, was proposed. Principal component analysis was used for data visualization, and the random forest machine learning technique was used to build classification models. A classification accuracy of 97% was obtained by a fusion of data collected by TGS 2610 and TGS 2602 sensors.

The associate editor coordinating the review of this manuscript and approving it for publication was Fan Zhang.

cost of equipment and the need for highly skilled personnel, this method is usually limited to applications under laboratory conditions.

The concept of the electronic nose is to use a series of nonspecific gas sensors [1], [2], [3]. In this approach, the individual chemical components of the measured gas are not identified; instead, odors are classified and recognized using pattern recognition techniques supported by machine learning algorithms.

Several applications for electronic noses have been proposed, focusing on forestry and agriculture [4], [5], [6], [7], [8], [9]. In addition, applications for fungal species detection and identification have recently been explored by Mota et al. [10].

Gas sensors based on various physical phenomena can be used to construct electronic noses, such as electrochemical [11], gravimetric [12], and optical [13] sensors. However, when simple, low-cost devices are proposed, they are usually based on commercially available metal oxide (MOX) sensors.
its nominal working conditions. The new electronic nose presented in this manuscript was applied to classify measured samples of fungi and oomycetes and to differentiate them from non-infested medium. To achieve these objectives, principal component analysis was used for data visualization, and the random forest machine learning technique was used to build classification models and to identify the modeling features, extracted from the sensor data, that gave the best classification accuracy. Different subsets of the data used for classification were tested to optimize the list of sensors in the electronic nose sensor array and the depth of the voltage modulation profile of the sensor heating.

As international trade in plants and plant materials increases, the accidental introduction of insects or pathogens into new areas becomes a serious problem. Their spread can lead to forest health problems at a very early stage, such as damage to plants in forest nurseries, resulting in a significant reduction in the number of seedlings and economic losses. One of the most frequently observed problems is caused by the pathogens of the so-called "damping-off seedlings disease". Damping-off is a disease that causes the death of germinating seeds and young seedlings, especially in forest nurseries [33]. This disease is caused by several organisms, such as the fungi Fusarium, Rhizoctonia, and Cylindrocarpon, and the oomycetes Phytophthora and Pythium. In Poland, the genera Fusarium and Phytophthora represent the most numerous pathogens in forest nurseries. Their pathogenic soil-borne strains are among the most harmful microorganisms in the world due to their potential adaptability. They cause root rot, tuber blight, and wilt [34], [35]. Pathogenic strains of the fungal species F. oxysporum particularly affect seedlings of coniferous species in nurseries. The most commonly observed symptoms are needle wilt and, in some cases, small root and stem rots. Seedlings lose their fine roots (in which case they are easily pulled out of the ground) or fall over due to infected stem tissue, which is usually damaged near the ground. The pathogen moves upwards from the roots to the stems and hinders water uptake, gradually clogging the xylem tissue, which leads to wilting of the plant, yellowing of the needles, and death. Pathogenic oomycetes of the genus Phytophthora pose an even greater threat to plants. When these organisms destroy the fine roots (< 2 mm), the plants die quickly.
If the seedlings are raised in a water regime suitable for them, they often do not show disease symptoms (are asymptomatic), and chlamydospores are formed in the rhizosphere of the soil, which become active only when the plants are planted in moist habitats. Since the seedlings do not show external signs of disease, visual selection is not effective, and the problem is shifted from the nursery to the forest plantation. Molecular diagnostic tests conducted in many countries have shown that infestation of plants prepared for planting in nurseries in Europe is high, sometimes reaching 80% [36]. In addition, fungicides are intended to control fungi, not oomycetes. Thus, if used improperly, they only mask the disease, which is usually the case in nurseries. Therefore, it is critical to identify the organisms that foresters and arborists are dealing with there, and accurate and rapid analysis with an e-nose would be very useful for this purpose. The currently recommended method for detecting oomycetes in soil is baiting with plant material [37]. In this approach, the organisms sought grow on oak, beech, or rhododendron leaves as bait, and the infected leaf pieces are usually placed on selective media (e.g., PARP) [38]

[40], [41], [42]. However, that approach also requires that the repeatable [43]. That requires advanced designs of sensor array chamber [44] and precise pneumatic gas supply.

Another approach is to capture the sensor resistance characteristics while the sensor temperature is modulated. That approach allows the construction of a simpler low-cost electronic nose, as it does not require such precise and advanced pneumatic modulation of the supplied gas. Modulation of the sensor temperature with the required time profile is much easier to achieve with a relatively simple electronic circuit. The modulation is performed once the sensor has reached a steady state in the presence of the measured gas.

An important aspect of the construction of an electronic nose based on heater temperature modulation is the choice of the modulation profile. Various approaches have been proposed in this domain. As reviewed in the Introduction section, modulation patterns such as sinusoidal, stair-like, or rectangular have been demonstrated. Furthermore, most researchers investigated cases in which the sensor heater voltage spans a wide range. In most cases, multiple periods of modulation were used to collect the data required for sample classification. In some cases, variation or tuning of the modulation frequency was also required.

The PW7 electronic nose is designed as low-cost equipment to detect smells emitted by various types of fungus. Each construction of an electronic nose consists of two main parts: the sensor probe and the main electronic unit connected to the computer. The probe is a round aluminum block in which the sensors are placed. As in our earlier devices, we used various types of metal oxide sensors made by Figaro Inc., Japan. The sensor types are listed in Table 1. Additionally, an HIH 4031 humidity sensor (Honeywell, Charlotte, NC, USA) and an LM35 temperature sensor (Texas Instruments, Austin, TX, USA) were placed in the probe.

[45], [46]. That may suggest that operation and modulation at higher heater voltages could give better gas recognition results. In our approach, we decided to

The profile of the modulation is presented in Fig. 3(a).

in the Forest Protection Department of the Forest Research Institute in Sękocin Stary (Poland). Two organisms, Phytophthora plurivora and Fusarium oxysporum, were selected for detailed analysis as those most frequently responsible for the occurrence of damping-off symptoms in Polish forest nurseries [33].
The pathogen isolates were cultured on classical PDA agar media (20 g dextrose, 15 g agar, 4 g potato starch, and 1 L distilled water) in 9 cm Petri dishes. They were kept at room temperature until the mycelium completely covered the surface of the dishes.

The whole experiment lasted two weeks, with measurements taken on 7 days during this time. In total, we prepared six Petri dishes for each category of samples. One set of three dishes of each category was used during the first week of the experiment. Since the samples could be contaminated by other species, another set of dishes was used in the second week of the experiment. Each Petri dish was measured only once per day. Since the fungal and oomycete samples were

to ensure that the sensor response was flat, which means that the sensors had already recovered to the baseline and were not contaminated by residues from the previous measurement. Then, the sensor array was placed over a Petri dish containing the measured sample. The array was held there for 100 readings (1 minute 15 seconds), during which the resistance of the sensors reached a steady state. At this point, modulation of the sensor heater voltage started, with three rectangular steps, each 50 sensor readings long and each with a different modulation depth, as schematically shown in Fig. 3(a). Voltage drops of -0.3, -0.6, and -0.9 V, each from the nominal heater voltage of 5 V, were applied. After the temperature modulation, the sensor array was manually moved to clean air, where the residues of the measured gas could be desorbed and the sensors were cleaned and prepared for the next measurement.
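The timing of the measurement sequence described above can be made concrete with a short sketch. It uses only the numbers given in the text (100 baseline readings in 1 minute 15 seconds, three steps of 50 readings, drops of 0.3, 0.6, and 0.9 V from 5 V). The sampling period of 0.75 s is derived from those numbers rather than stated, and `recovery_readings` is a hypothetical parameter: the text does not say whether the heater returns to the nominal voltage between steps, so the default of 0 is an assumption, not the actual profile of Fig. 3(a).

```python
NOMINAL_V = 5.0            # nominal heater voltage given in the text
DROPS_V = (0.3, 0.6, 0.9)  # modulation depths given in the text
BASELINE_READINGS = 100    # readings taken before modulation starts
STEP_READINGS = 50         # readings per rectangular step
SAMPLE_PERIOD_S = 0.75     # derived: 75 s / 100 readings (not stated directly)


def heater_profile(recovery_readings=0):
    """Return the heater voltage (V) for every sensor reading.

    recovery_readings is a hypothetical knob: readings at nominal voltage
    inserted after each step. Defaults to 0 because the text does not
    describe such a gap.
    """
    profile = [NOMINAL_V] * BASELINE_READINGS
    for drop in DROPS_V:
        profile += [round(NOMINAL_V - drop, 2)] * STEP_READINGS
        profile += [NOMINAL_V] * recovery_readings
    return profile


profile = heater_profile()
duration_s = len(profile) * SAMPLE_PERIOD_S
print(len(profile), "readings,", duration_s, "s")
```

Under these assumptions a single measurement, excluding the recovery in clean air, takes 250 readings.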

The data preparation, statistical analysis, and machine learning models presented in this manuscript were implemented in code written in Python 3.8. The statsmodels package [47] was used for statistical tests and the scikit-learn package [48] for machine learning modeling.

transformation is an intuitive interpretation, as the rotation of the coordinate system, which gives the new coordinates in order of the amount of variability captured from the data set. We used all features extracted from the response curves of the sensors as input for the PCA transformation. Since the input features represent non-comparable quantities expressed in different units and have different ranges of values, we first normalized the input dataset to equal variance. In our analysis, the PCA method was used only to visualize the patterns of the data points in the two-dimensional space of the first two principal components.
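The normalization and projection steps described above might look as follows in scikit-learn. This is a minimal sketch, not the authors' code: the feature matrix is synthetic random data standing in for the extracted sensor features, and the choice of `StandardScaler` as the equal-variance normalization is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table: 60 measurements x 10 features
# whose scales differ by orders of magnitude (different units and ranges).
X = rng.normal(size=(60, 10)) * np.logspace(-2, 3, 10)

# Scale every feature to zero mean and unit variance, then keep the
# first two principal components for a 2-D scatter plot.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
scores = pipeline.fit_transform(X)

explained = pipeline.named_steps["pca"].explained_variance_ratio_
print(scores.shape, explained)
```

Without the scaling step, the features with the largest numeric range would dominate the first components regardless of how informative they are.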

One goal of the electronic nose measurements is to apply the collected data to create classification models that are able to discriminate between the samples studied. Different types of machine learning models have been applied to the data collected by the sensors, and in the present work we have chosen the Random Forest model [49]. This method has been successfully used by other authors for classification tasks of electronic noses [

The Random Forest models used in this analysis offer several important advantages. Since during Random Forest training the individual decision tree models are fit using a subset of the entire training data set, the remaining portion of the data can be used to estimate model performance. The so-called out-of-bag (OOB) score can be calculated as the model classification accuracy based on the observations that were not used to fit the decision tree. The score calculated from each tree is then averaged and used as a fair estimate of model performance. The advantage of such an approach is that fewer computations are required and the model can be tested while it is trained. The OOB score is similar to the commonly used cross-validation method for estimating the performance of classification models. The OOB score converges to the leave-

Interestingly, different patterns can be observed when looking at Fig. 5. For example, looking at the features extracted from the TGS 2610 sensor data, it is possible to distinguish between Fusarium and the two other categories if we use the feature SEnd, but this feature does not allow us to distinguish between Phytophthora and medium samples. Other features extracted from this sensor allow us to distinguish between medium and other samples, but not between infested samples. As a result, all sample categories can be distinguished by extracting at least two features from the data collected by this sensor. We mention here the example of the TGS 2610 sensor because, as we will show in the following sections, the data acquired by this sensor exhibited the best performance in terms of classification accuracy.
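The out-of-bag estimate discussed above is available directly in scikit-learn. The sketch below is illustrative only: the data are synthetic (three well-separated classes with five features each, mimicking three sample categories), and the number of trees is an arbitrary choice. Note that scikit-learn computes `oob_score_` from the aggregated out-of-bag predictions for each observation, which may differ slightly from averaging per-tree scores.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 3 categories x 30 measurements, 5 features each.
# Class k is centered at 3*k in every feature, with unit noise.
y = np.repeat(np.arange(3), 30)
X = rng.normal(loc=3.0 * y[:, None], scale=1.0, size=(90, 5))

# oob_score=True evaluates each observation only with the trees that
# did not see it during bootstrap sampling.
model = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
model.fit(X, y)

oob_accuracy = model.oob_score_
# Feature indices ordered from most to least important.
ranking = np.argsort(model.feature_importances_)[::-1]
print(oob_accuracy, ranking)
```

The same fitted model also yields the feature importance ranking, so no separate validation run is needed for either quantity.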

As shown in the previous section, the distributions of the extracted modeling features differ significantly among the three categories of samples considered. The data presented above also show very similar behavior for different features, whether extracted using different techniques or obtained from data collected by different sensors. Further insight can be gained by using more advanced data visualization techniques and plotting the data after transforming the modeling features

VOLUME 10, 2022

In Fig. 7, we compared the performance of several classification models trained on different sets of modeling features, with the goal of assessing whether the sensor array can be reduced to a smaller number of sensors, or the data collection time shortened, without significantly decreasing the classification accuracy.

Fig. 7(a) compares the classification performance of models trained with the data from only one sensor with that of models whose modeling features were extracted from the data from all sensors. The results in this figure show that the models trained with the data from all sensors gave the best classification performance, while the models trained with the data from only one sensor

In our opinion, there is an additional argument for choosing the stage with the smallest heater voltage drop (-0.3 V). This depth of modulation results in the least disturbance in the operation of the sensors and should be preferred, since the time to reach a steady state should be the shortest. It also avoids sensor operation at low heater temperatures, at which gas composition does not influence the electrical resistance.

An interesting output of the Random Forest classification model is the ranking of the importance of the modeling features. Fig. 9 shows such data for the models based on features extracted from data from a single sensor and the -0.3 V voltage drop of the sensor heater. An interesting observation for the TGS 2610 sensor is that the most important feature is SEnd (the slope of the sensor response curve at the end of the observation range). For other sensors, SEnd is also often the most important feature identified by the Random Forest model. This may indicate that the observation time necessary to differentiate between the studied sample categories cannot be reduced to a much shorter period, as it is necessary to reach the region of the sensor response where the linear slope is detected after the minimum value is reached.

FIGURE 8. Out-of-bag score (accuracy) of Random Forest classification models. Models that use data from a single stage of heater voltage drop are compared to models that use features from all stages. Each model uses data from a single sensor; the sensor type is specified in the subframes.

the data from multiple sensors, we were able to improve the classification accuracy up to 95%.

We trained a series of Random Forest models with the goal of optimizing the set of sensors in the electronic nose. In this task, we chose to use only data collected during the first step of the heater voltage drop (-0.3 V). In addition to the previously presented models based on data collected from a single sensor, we evaluated models based on all combinations of two, three, etc. sensors. For each number of sensors, we selected the model with the best performance, and the results are shown in Fig. 10. As can be seen, merging the data from the TGS 2610 and TGS 2602 sensors resulted in a much better classification accuracy of nearly 97%.
Adding data from additional sensors may slightly improve the estimated performance, but in our opinion the electronic nose with two sensors is sufficient.
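The exhaustive search over sensor combinations described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `best_subsets` and the sensor names `sensor_a`, `sensor_b`, and `sensor_c` are hypothetical, the data are synthetic (one informative sensor and two pure-noise sensors), and the OOB accuracy is used as the selection criterion as in the text.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def best_subsets(features_by_sensor, labels, n_estimators=100, seed=0):
    """For each subset size, return (best sensor tuple, its OOB accuracy)."""
    sensors = sorted(features_by_sensor)
    best = {}
    for size in range(1, len(sensors) + 1):
        for subset in combinations(sensors, size):
            # Fuse the data by concatenating the feature columns of each sensor.
            X = np.hstack([features_by_sensor[name] for name in subset])
            model = RandomForestClassifier(
                n_estimators=n_estimators, oob_score=True, random_state=seed
            )
            model.fit(X, labels)
            if size not in best or model.oob_score_ > best[size][1]:
                best[size] = (subset, model.oob_score_)
    return best


rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)  # three sample categories

# Synthetic data: only sensor_a carries class information.
features_by_sensor = {
    "sensor_a": rng.normal(loc=3.0 * labels[:, None], scale=1.0, size=(90, 5)),
    "sensor_b": rng.normal(size=(90, 5)),
    "sensor_c": rng.normal(size=(90, 5)),
}

best = best_subsets(features_by_sensor, labels)
for size, (subset, score) in best.items():
    print(size, subset, round(score, 3))
```

The number of models grows as 2^n - 1 with the number of sensors n, which is manageable for a small array but would call for a greedy search with many more sensors.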

Our results can be compared to other reports in which measurements and classification of similar types of samples were reported. Lebanska et al. [59]

reported constructions of electronic noses, we decided to use sensor modulation in the heater voltage range close to the sensor nominal operating conditions. We also used a relatively shallow modulation magnitude, which gives faster operation and especially faster sensor accommodation to changes in environmental conditions. In addition, a small change of the sensor's heater voltage, starting from the nominal one (shallow modulation), avoids sensor operation at lower temperatures, where the sensor resistance does not depend on the composition of the gas in which the sensor is immersed.

Measurements were performed on samples of pathogenic fungi (Fusarium oxysporum) and oomycetes (Phytophthora plurivora) cultured on classical PDA agar media, with the aim of using the collected sensor responses to discriminate between sample categories.

Five types of features describing the obtained curves were extracted from the sensor responses and used for further analysis. The Principal Component Analysis method was used to transform the modeling features for data visualization. Random Forest machine learning models were trained with the data, and the out-of-bag score, which corresponds to the accuracy of classification between the three categories under study, was used as a measure of model performance.

It was found that the rectangular heater voltage step of -0.3 V from the nominal voltage of 5 V allowed the collection of data that gave the best classification performance. The analysis also allowed us to estimate the time (duration of the rectangular heater voltage modulation steps) necessary to collect the data required for sample classification.

The fusion of the data collected by the two sensors, TGS 2610 and TGS 2602, was the optimal configuration of the electronic nose sensor array, which allowed a classification accuracy of 97%. This result is very promising, as the obtained accuracy is higher than in the case of our previous constructions of low-cost electronic noses without dynamic sensor temperature modulation.

The authors are grateful to Przemysław Wacławik for the assembly of the electronic nose device.