Coarse-Grained Models for Automated Fragmentation and Parametrization of Molecular Databases by Johannes G. E. M. Fraaije, Jan van Male, Paul Becherer, and Rubèn Serral Gracià
November 2, 2016, Journal of Chemical Information and Modeling, Copyright © 2016 American Chemical Society
We calibrate coarse-grained interaction potentials suitable for screening large data sets in a top-down fashion. Three new algorithms are introduced: (i) automated decomposition of molecules into coarse-grained units (fragmentation); (ii) Coarse-Grained Reference Interaction Site Model-Hypernetted Chain (CG RISM-HNC) as an intermediate proxy for dissipative particle dynamics (DPD); and (iii) a simple top-down coarse-grained interaction potential based on activity coefficient theories from engineering (using COSMO-RS). We find that the fragment distribution follows Zipf and Heaps scaling laws. The accuracy of the Gibbs energy of mixing calculations is a few tenths of a kilocalorie per mole. As a final proof of principle, we use full coarse-grained sampling through DPD thermodynamic integration to calculate log P_OW (the octanol-water partition coefficient) for 4627 compounds with an average error of 0.84 log units. The computation time per calculation is a few seconds for CG RISM-HNC and a few minutes for DPD thermodynamic integration.
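To make the Zipf and Heaps statement concrete, the following is a minimal sketch of how such scaling could be checked on a list of fragment labels. It is not the authors' fragmentation or analysis code: the fragment labels are made-up toy data, and the function names (`rank_frequency`, `heaps_curve`, `loglog_slope`) are hypothetical. Zipf's law says the frequency of the r-th most common fragment falls off roughly as a power of r; Heaps' law says the number of distinct fragments grows roughly as a sublinear power of the number of fragments seen. Both exponents are estimated here by an ordinary least-squares fit on log-log axes, which is only a rough diagnostic.

```python
import math
from collections import Counter


def rank_frequency(fragments):
    """Fragment counts sorted from most to least frequent (Zipf view)."""
    return sorted(Counter(fragments).values(), reverse=True)


def heaps_curve(fragments):
    """Number of distinct fragments seen after each token (Heaps view)."""
    seen, curve = set(), []
    for frag in fragments:
        seen.add(frag)
        curve.append(len(seen))
    return curve


def loglog_slope(ys):
    """Least-squares slope of log(y) against log(index), index starting at 1."""
    xs = [math.log(i) for i in range(1, len(ys) + 1)]
    ls = [math.log(y) for y in ys]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ls) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ls))
    return sxy / sxx


# Toy, made-up fragment labels (assumption: not taken from the paper's database).
toy = ["CH3", "CH2", "CH2", "OH", "CH3", "CH2", "C6H5", "CH2", "OH", "COOH"]

freqs = rank_frequency(toy)            # counts by rank
growth = heaps_curve(toy)              # distinct fragments vs. tokens seen
zipf_exponent = -loglog_slope(freqs)   # positive if frequency decays with rank
heaps_exponent = loglog_slope(growth)  # below 1 if vocabulary growth is sublinear
```

On a real fragment database the fits would be made over many more points, and a log-log least-squares slope should be read as indicative rather than as a rigorous power-law test.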