Abstract:
De novo molecular design and generation are frequently studied in chemistry and biology, as they play a critical role in sustaining the chemical industry and advancing drug discovery. In recent years, reinforcement learning-based methods have used graphs to represent molecules and have framed molecule generation as a decision-making process. However, this vanilla graph representation may neglect the intrinsic context information of molecules, which limits generation performance. In this paper, we propose to augment the original graph states with SMILES context vectors. SMILES representations are easily processed by a simple language model, so that the general semantic features of a molecule can be extracted, while graph representations are better at capturing the topological relationships among atoms. Moreover, we propose a framework that combines supervised learning and reinforcement learning to account for these two heterogeneous state representations of a molecule; it fuses the information from both and extracts more comprehensive features, so that the policy network can make more sophisticated decisions. Our model also introduces two attention mechanisms, action-attention and graph-attention, to further improve performance. We conduct experiments on a practical dataset, ZINC, and the results demonstrate that our framework outperforms the baselines in learning performance on molecule generation and chemical property optimization.
Published in: 2019 IEEE International Conference on Data Mining (ICDM)
Date of Conference: 08-11 November 2019
Date Added to IEEE Xplore: 30 January 2020