Abstract:
Symbolic regression (SR) is the process of finding an unknown mathematical expression given the input and output and has important applications in interpretable machine l...Show MoreMetadata
Abstract:
Symbolic regression (SR) is the process of finding an unknown mathematical expression given the input and output and has important applications in interpretable machine learning and knowledge discovery. The major difficulty of SR is that finding the expression structure is an NP-hard problem, which makes the entire process time-consuming. In this study, the solution of expression structures was regarded as a classification problem and solved by supervised learning such that SR can be solved quickly by using the solving experience. Techniques for classification tasks, such as equivalent label merging and sample balance, were used to enhance the robustness of the algorithm. We proposed a symbolic network called DeepSymNet to represent symbolic expressions to improve the performance of the algorithm. DeepSymNet has been proven to have a strong representation ability with a shorter label compared to the current popular representation methods, reducing the search space when predicting. Moreover, DeepSymNet conveniently decomposes SR into two smaller subproblems, which makes solving the problem easier. The proposed algorithm was tested on artificially generated expressions and public datasets and compared with other algorithms. The results demonstrate the effectiveness of the proposed algorithm.
Published in: IEEE Transactions on Neural Networks and Learning Systems ( Volume: 36, Issue: 1, January 2025)