
pH-dependent solubility prediction for optimized drug absorption and compound uptake by plants.

Abstract

Aqueous solubility is the most important physicochemical property for agrochemical and drug candidates and a prerequisite for uptake, distribution, transport, and ultimately bioavailability in living species. Here we present the first direct machine learning models for pH-dependent solubility in water. To build them, we combined almost 300,000 data points from 11 solubility assays performed over 24 years with more than one million data points from lipophilicity and melting point experiments. Data were split into three pH classes (acidic, neutral, and basic), representing the conditions of the stomach and intestinal tract in animals and humans, and of the phloem and xylem in plants. We find that multi-task neural networks using ECFP-6 fingerprints outperform baseline random forests and single-task neural networks on the individual tasks. Our final model, with three solubility tasks that use the pH-class data combined across assays and five helper tasks, achieves root mean square errors (RMSE) of 0.56 log units overall (acidic 0.61; neutral 0.52; basic 0.54) and Spearman rank correlations of 0.83 (acidic 0.78; neutral 0.86; basic 0.86), making it a valuable tool for profiling compounds in pharmaceutical and agrochemical research. The model also allows the prediction of compound pH profiles, with mean and median per-molecule RMSE of 0.62 and 0.56 log units. © 2023. The Author(s), under exclusive licence to Springer Nature Switzerland AG.
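The abstract reports two kinds of error: RMSE and Spearman rank correlation per pH class, and RMSE per molecule across its pH profile. The sketch below illustrates how such metrics can be computed in plain Python. It is not the authors' evaluation code; the record layout, the field names (`molecule`, `ph_class`, `measured`, `predicted`), the helper names, and the toy values are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, median


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(y_true, y_pred):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rt, rp = average_ranks(y_true), average_ranks(y_pred)
    mt, mp = mean(rt), mean(rp)
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / math.sqrt(var_t * var_p)


def grouped_rmse(records, key):
    """RMSE of each group of records sharing the same value of `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    return {
        name: rmse([r["measured"] for r in recs], [r["predicted"] for r in recs])
        for name, recs in groups.items()
    }


# Hypothetical toy data: log solubility, measured vs. predicted, per pH class.
records = [
    {"molecule": "mol_a", "ph_class": "acidic", "measured": -3.0, "predicted": -3.5},
    {"molecule": "mol_a", "ph_class": "neutral", "measured": -4.0, "predicted": -4.0},
    {"molecule": "mol_a", "ph_class": "basic", "measured": -4.5, "predicted": -4.0},
    {"molecule": "mol_b", "ph_class": "acidic", "measured": -2.0, "predicted": -2.0},
    {"molecule": "mol_b", "ph_class": "neutral", "measured": -2.5, "predicted": -3.5},
    {"molecule": "mol_b", "ph_class": "basic", "measured": -1.0, "predicted": -1.0},
]

per_class = grouped_rmse(records, "ph_class")      # one RMSE per pH class
per_molecule = grouped_rmse(records, "molecule")   # one RMSE per pH profile

print("RMSE per pH class:", per_class)
print("mean / median RMSE per molecule:",
      mean(per_molecule.values()), median(per_molecule.values()))
```

Grouping by pH class measures how well a task performs across compounds, while grouping by molecule measures how well the predicted pH profile of a single compound matches its measurements; the two views can differ noticeably on the same predictions.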
