Predicting the stereoselectivity of chemical reactions by composite machine learning method
Jihoon Chung¹, Justin Li², Amirul Islam Saimon³, Pengyu Hong⁴ & Zhenyu Kong³
¹ Department of Industrial Engineering, Pusan National University, Busan, Korea.
² Management, Entrepreneurship, and Technology, University of California, Berkeley, CA, USA.
³ Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA, USA.
⁴ Department of Computer Science, Brandeis University, Waltham, MA, USA.
Stereoselective reactions have played a vital role in the emergence of life, evolution, human biology, and medicine. For a long time, however, most industrial and academic efforts in asymmetric synthesis have relied on trial and error. In addition, most previous studies have examined the influence of steric and electronic effects on stereoselective reactions only qualitatively. Consequently, quantitatively understanding the stereoselectivity of a given chemical reaction remains extremely difficult. As a proof of principle, this paper develops a novel composite machine learning method for quantitatively predicting enantioselectivity, that is, the degree to which one enantiomer is preferentially produced by a reaction. Specifically, it uses machine learning methods that are widely used in data analytics, including Random Forest, Support Vector Regression, and LASSO. Bayesian optimization and permutation importance tests are applied to support accurate prediction and an in-depth understanding of the reactions. Finally, the proposed composite method approximates the key features of the available reactions with Gaussian mixture models, which in turn indicate a suitable machine learning method for new reactions. Case studies on real stereoselective reactions show that the proposed method is effective and provides a solid foundation for further application to other chemical reactions.
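The abstract mentions permutation importance tests as one way of understanding which reaction features drive the prediction. The sketch below illustrates the general idea only and is not the authors' code (their scripts are in the "Machine Learning code scripts" archive). It assumes synthetic data, a mean-squared-error score, and a small k-nearest-neighbour regressor standing in for Random Forest, Support Vector Regression, or LASSO; the names `knn_predict`, `mse`, and `permutation_importance` are hypothetical helpers written for this example.

```python
import random


def knn_predict(train_X, train_y, x, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def mse(predict, X, y):
    """Mean squared error of a prediction function on (X, y)."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (assumption, not from the paper): the response depends
# on feature 0 only, while feature 1 is pure noise.
rng = random.Random(42)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] + rng.gauss(0, 0.1) for row in X]
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]


def model(row):
    """Stand-in regressor; any fitted model's predict function would do."""
    return knn_predict(X_train, y_train, row)


importances = permutation_importance(model, X_test, y_test)
print(importances)
```

Because the synthetic response depends only on the first feature, shuffling that column should degrade the held-out error far more than shuffling the noise column. With real reaction descriptors, the same procedure could be applied to whichever regressor is selected, and a library routine would normally replace these hand-written helpers.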
File Name | File Description | File Type | File Size |
---|---|---|---|
Reaction information | This file provides details on different Chiral Phosphoric Acid (CPA) reactions. | zip | 175.57 KB |
Others | This file contains different combinations of the dependent and independent variables (more specifically, CPA reaction features and response variables) related to the analysis in the published paper (DOI: https://doi.org/10.1038/s41598-024-62158-0). | zip | 7.57 MB |
Machine Learning code scripts | This file contains machine learning (ML) code scripts. Each script name in the zipped folder includes method names, table numbers, and figure numbers that correspond to those in the paper (DOI: https://doi.org/10.1038/s41598-024-62158-0). | zip | 402.62 KB |