|

Leveraging DFT and Molecular Fragmentation for Chemically Accurate p Prediction Using Machine Learning.

Researchers

Journal

Modalities

Models

Abstract

We present a quantum mechanical/machine learning (ML) framework based on random forest to accurately predict the pKas of complex organic molecules using inexpensive density functional theory (DFT) calculations. By including physics-based features from low-level DFT calculations and structural features from our connectivity-based hierarchy (CBH) fragmentation protocol, we can correct the systematic error associated with DFT. The generalizability and performance of our model are evaluated on two benchmark sets (SAMPL6 and Novartis). We believe the carefully curated input of physics-based features lessens the model’s data dependence and need for complex deep learning architectures, without compromising the accuracy of the test sets. As a point of novelty, our work extends the applicability of CBH, employing it for the generation of viable molecular descriptors for ML.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *