September 13, 2017
David Steinberg
A major challenge in de novo drug design is identifying molecules that act effectively on the causes of disease. Computational strategies can be an effective tool for generating novel molecules with strong affinity for a biological target. This work explores the use of recurrent neural networks trained as generative models for molecular structures. The authors demonstrate a high correlation between the properties of the generated molecules and the properties of the molecules used to train the neural network. They propose a strategy for enriching molecule libraries with new molecules that are active towards a given biological target: the model is fine-tuned on small sets of molecules known to be active against that target. Retrospective test cases are used to examine the performance of the algorithm. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists had designed. Against Plasmodium falciparum (malaria), it reproduced 28% of 1240 test molecules. The authors describe how their model, when coupled with a scoring function, can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery.
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
Segler, M.H.S., Kogej, T., Tyrchan, C. and Waller, M.P.
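
To make the hold-out figures in the summary concrete, the sketch below shows one way a "reproduction rate" of this kind could be computed: the fraction of hold-out test molecules that also appear among the generated molecules. It is an illustration only, not the authors' code. It assumes molecules are represented as canonical SMILES strings, so that string equality means the same molecule; the function names `reproduction_rate` and `approx_count` and the toy data are hypothetical. The last two lines convert the reported percentages into approximate molecule counts; because the percentages are rounded, these counts are estimates.

```python
def reproduction_rate(generated, holdout):
    """Fraction of hold-out molecules that also appear among the generated ones.

    Both arguments are iterables of canonical SMILES strings (an assumption:
    canonicalisation is taken to have been done beforehand).
    """
    holdout_set = set(holdout)
    if not holdout_set:
        raise ValueError("hold-out set is empty")
    recovered = holdout_set & set(generated)
    return len(recovered) / len(holdout_set)


def approx_count(percent, total):
    """Approximate number of molecules implied by a rounded percentage."""
    return round(percent / 100 * total)


# Toy data, made up for illustration only.
generated = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
holdout = ["CCO", "CC(=O)O", "CCCC", "c1ccncc1"]

rate = reproduction_rate(generated, holdout)  # 2 of 4 hold-out molecules recovered

# Rough counts implied by the figures quoted in the summary.
approx_staph = approx_count(14, 6051)    # Staphylococcus aureus test set
approx_malaria = approx_count(28, 1240)  # Plasmodium falciparum test set
```

With the reported figures, this works out to roughly 847 of the 6051 Staphylococcus aureus test molecules and roughly 347 of the 1240 Plasmodium falciparum test molecules.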