Machine Learning for Structure Search of Ligand-protected Nanoclusters

School of Science | Doctoral thesis (article-based) | Defence date: 2024-02-23
90 + app. 36
Aalto University publication series DOCTORAL THESES, 45/2024
Understanding the atomic structures of ligand-protected nanoclusters is essential for their application in various fields. These structures not only determine the physical and chemical properties of the nanoclusters but also play a crucial role in their stability and reactivity. Knowing the precise atomic structures allows us to tailor nanoclusters for specific functions. However, the search space is extraordinarily high-dimensional and contains an exceptionally large number of candidate structures, which makes it difficult to find the low-energy structures of ligand-protected nanoclusters with quantum mechanical methods such as density functional theory. Machine learning methods could make this structure search more efficient and accurate. In this dissertation, I developed machine learning methods to search for the atomic structures of ligand-protected nanoclusters by decomposing the problem into three steps. In the first step, I developed a molecular conformer search procedure based on Bayesian optimization to search for the structures of isolated molecules. Using four amino acids as examples, I showed that the procedure is both efficient and accurate. In the second step, I modified the procedure to search for the structures of a single ligand on a nanocluster. I also developed and tested strategies to avoid steric clashes between the ligand and parts of the cluster. I then tested and demonstrated the modified procedure by searching for the structures of a cysteine molecule on a well-studied gold-thiolate cluster. I found that cysteine conformers in the cluster inherit the hydrogen bond types of the isolated conformers, while the energy rankings of the conformers and the spacings between them are altered. In the final step, I applied a machine learning method based on kernel ridge regression (KRR) models to relax the structures of ligand-protected nanoclusters, and I used an active learning workflow to enhance the relaxation performance of the KRR models. To test and demonstrate the method, I applied it to search for structures of Au25(Cys)18−. I found that the low-energy structures with II-type hydrogen bonds (OH···N, with OH from trans-COOH and N from NH2) are dominant and that the different configurations of the ligand layer indeed influence the properties of the clusters.
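The Bayesian optimization idea behind the first step can be illustrated with a minimal sketch. This is not the code used in the thesis: the one-dimensional energy profile `toy_energy`, the periodic kernel and its length scale, the lower-confidence-bound acquisition rule, and all parameter values are assumptions chosen for illustration only. A real conformer search would optimize several dihedral angles at once and evaluate energies with a quantum mechanical method.

```python
import numpy as np

def toy_energy(phi):
    """Hypothetical torsional energy profile (arbitrary units), period 2*pi."""
    return np.cos(phi) + 0.5 * np.cos(2.0 * phi + 0.7) + 0.3 * np.cos(3.0 * phi)

def periodic_kernel(a, b, length=0.6):
    """Periodic squared-exponential kernel between two arrays of angles."""
    d = a[:, None] - b[None, :]
    return np.exp(-2.0 * np.sin(d / 2.0) ** 2 / length ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a Gaussian process surrogate."""
    shift = y_train.mean()  # constant mean estimated from the data
    K = periodic_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = periodic_kernel(x_query, x_train)
    mean = Ks @ np.linalg.solve(K, y_train - shift) + shift
    # Prior variance is 1 because the kernel equals 1 at zero distance.
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayesian_search(n_init=3, n_iter=20, kappa=2.0, seed=0):
    """Search for the lowest-energy angle with few energy evaluations."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
    visited = np.zeros(len(grid), dtype=bool)
    x = rng.uniform(0.0, 2.0 * np.pi, n_init)  # random initial angles
    y = toy_energy(x)
    for _ in range(n_iter):
        mean, std = gp_posterior(x, y, grid)
        # Lower confidence bound: favors low predicted energy and high uncertainty.
        lcb = mean - kappa * std
        lcb[visited] = np.inf  # never evaluate the same grid point twice
        idx = int(np.argmin(lcb))
        visited[idx] = True
        x = np.append(x, grid[idx])
        y = np.append(y, toy_energy(grid[idx]))
    best = int(np.argmin(y))
    return x[best], y[best], len(y)

phi_best, e_best, n_evals = bayesian_search()
print(f"lowest energy {e_best:.3f} at angle {phi_best:.3f} rad after {n_evals} evaluations")
```

The surrogate model is refit after every energy evaluation, so each new point is chosen with all previous evaluations taken into account. This is what lets such a search get by with far fewer energy calls than a dense scan of the angle.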
Supervising professor
Rinke, Patrick, Prof., Aalto University, Department of Applied Physics, Finland
Thesis advisor
Xi, Chen, Prof., Lanzhou University, China
machine learning, Bayesian optimization, density-functional theory, active learning, nanocluster, structure search
Other note
  • [Publication 1]: Lincan Fang, Esko Makkonen, Milica Todorović, Patrick Rinke, and Xi Chen. Efficient Amino Acid Conformer Search with Bayesian Optimization. Journal of Chemical Theory and Computation, 17, 3, 1955-1966, February 2021.
    DOI: 10.1021/acs.jctc.0c00648
  • [Publication 2]: Lincan Fang, Xiaomi Guo, Milica Todorović, Patrick Rinke, and Xi Chen. Exploring the Conformers of an Organic Molecule on a Metal Cluster with Bayesian Optimization. Journal of Chemical Information and Modeling, 63, 3, 745-752, January 2023.
    DOI: 10.1021/acs.jcim.2c01120
  • [Publication 3]: Lincan Fang, Jarno Laakso, Patrick Rinke, and Xi Chen. Machine-learning accelerated structure search for ligand-protected clusters. Journal of Chemical Physics, Submitted.