Abstract:
Bamboo, a grass, belongs to the Poaceae family, with 1642 species from 116 genera worldwide. It has exceptional physical, chemical, and mechanical qualities, which allow it to be employed in over a thousand different ways and contribute to a trade value of USD 2.76 billion. Bamboo is propagated from rhizomes, tissue culture, or short branch cuttings without any further checks, which results in incorrect species identification and categorisation. Classifying or identifying these bamboo species from their DNA barcode sequences, using a K-mer based method together with machine learning (ML), is therefore an effective strategy for resolving the problems of conventional species categorisation. A DNA barcode is a short genetic signature that helps identify the species to which an organism belongs. K-mer based approaches extract useful features from genome sequences, which can then be used to increase comparison accuracy. In this research, we evaluate the classification performance of four supervised ML models on the DNA barcode sequences of six Indian commercial bamboo species with different K-mer combinations. For this classification, we choose the matK barcode region and four supervised ML models: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Gradient Boosting Machine (GBM). Analysis of the results of these models on the matK DNA sequences with different K-mers demonstrates that the classification capability of the GBM approach is quite promising, with an average accuracy of 95.3%.
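To make the K-mer idea concrete, the following is a minimal sketch of how overlapping K-mers might be counted and turned into a fixed-length feature vector. The abstract does not specify the exact feature encoding, so the function names (`kmer_counts`, `kmer_feature_vector`), the restriction to the A/C/G/T alphabet, and the toy sequence are all illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in a DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_feature_vector(sequence: str, k: int) -> list:
    """Map a sequence to counts over all 4**k possible K-mers.

    Assumption: only the unambiguous bases A, C, G, T form the vocabulary;
    K-mers containing other symbols (e.g. N) are ignored.
    """
    counts = kmer_counts(sequence, k)
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Hypothetical toy fragment, not a real matK sequence.
toy_sequence = "ATGCATGC"
print(kmer_counts(toy_sequence, 3))
print(len(kmer_feature_vector(toy_sequence, 2)))
```

Vectors of this kind, one per barcode sequence, could then be passed to any of the supervised models named above; varying `k` would give the different K-mer settings that the study compares.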
Published in: 2023 International Conference on Computer, Electronics & Electrical Engineering & their Applications (IC2E3)
Date of Conference: 08-09 June 2023
Date Added to IEEE Xplore: 29 September 2023