Application of machine learning and deep learning methods for hydrated electron rate constant prediction.

Abstract

Accurately determining the second-order rate constant with eaq- (keaq-) for organic compounds (OCs) is crucial in eaq--induced advanced reduction processes (ARPs). In this study, we collected 867 keaq- values at different pHs from peer-reviewed publications and applied a machine learning (ML) algorithm, XGBoost, and a deep learning (DL) algorithm, the convolutional neural network (CNN), to predict keaq-. Our results demonstrated that the CNN model with transfer learning and data augmentation (CNN-TL&DA) greatly improved the prediction results and overcame over-fitting. Furthermore, we compared the ML/DL modeling methods and found that CNN-TL&DA, which uses molecular images (MI) as input, achieved the best overall performance (R2test = 0.896, RMSEtest = 0.362, MAEtest = 0.261) when compared to the XGBoost algorithm combined with Mordred descriptors (MD) (R2test = 0.692, RMSEtest = 0.622, MAEtest = 0.399) and with Morgan fingerprints (MF) (R2test = 0.512, RMSEtest = 0.783, MAEtest = 0.520). Moreover, interpretation of the MD-XGBoost and MF-XGBoost models using the SHAP method revealed the significance of MDs (e.g., molecular size, branching, electron distribution, polarizability, and bond types), MFs (e.g., aromatic carbon, carbonyl oxygen, nitrogen, and halogen), and environmental conditions (e.g., pH) in influencing the keaq- prediction. Interpretation of the 2D molecular image-CNN (MI-CNN) models using the Grad-CAM method showed that they correctly identified key functional groups, such as -CN, -NO2, and -X, that can increase keaq- values. Additionally, the MI-CNN model highlighted almost all electron-withdrawing groups and a small fraction of electron-donating groups when estimating keaq-. Overall, our results suggest that the CNN approach has smaller errors than the ML algorithms, making it a promising candidate for other rate constant predictions. Copyright © 2023. Published by Elsevier Inc.
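The three test-set metrics quoted in the abstract (R2, RMSE, MAE) follow standard definitions, and a minimal Python sketch of them is given here for readers who want to reproduce such comparisons. The function name `regression_metrics` and the toy values are illustrative assumptions; the abstract does not describe the authors' evaluation code or the scale on which keaq- was modeled.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Toy values, made up for this example (not data from the study)
observed = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 9.0, 9.5, 11.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

With the same held-out test set, a higher R2 and lower RMSE and MAE indicate the better model, which is the basis on which the abstract ranks CNN-TL&DA above the two XGBoost variants.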
