Machine learning (ML) techniques are being employed in bioinformatics with increasing success. However, two problems remain prohibitive for symbolic ML methods: the huge amount of data and the lack of examples for training. This paper therefore introduces the use of reinforcement learning (RL) to deal with these two drawbacks. Our work proposes and implements an RL method for the Bio Agents system in order to improve the annotation of biological sequences in genome sequencing projects. Experiments were performed with real data from two different genome sequencing projects: the fungus Paracoccidioides brasiliensis (Pb) and the guaraná plant Paullinia cupana. To assign reinforcement signals, we used reference genomes with curated annotations that are considered correct; these signals target specific databases and alignment algorithms. The results obtained with the RL layer included in Bio Agents were better than those of the system without the proposed method. To the best of our knowledge, this is the first attempt to apply RL techniques to annotation in bioinformatics projects.