Foundation machine-learning interatomic potentials (MLIPs), pre-trained on large, chemically and structurally diverse datasets, are transforming atomistic simulation. While these models have been shown to give physically reasonable qualitative predictions for a remarkably wide range of chemistries and physical phenomena, they are still far from providing the quantitative accuracy on specific materials and tasks that custom-trained MLIPs provide. However, fine-tuning the foundation models on problem-specific datasets promises MLIPs that rival the accuracy of these custom, from-scratch potentials while requiring orders of magnitude less training data.
This proposal requests computing time to train and benchmark such fine-tuned models on a set of complex materials science problems, as well as to generate carefully chosen reference training data using density functional theory (DFT) and beyond-DFT techniques.
We will adopt publicly available foundation models based on the equivariant message passing graph neural network architecture MACE, and evaluate three distinct fine-tuning scenarios:
1. Single-material models, where the aim is to rapidly obtain highly accurate MLIPs for individual compounds using GGA/metaGGA-level DFT reference data. A key objective will be to identify the smallest amount of reference data needed for this task.
2. Materials-family-level models, where the task is to obtain transferable MLIPs that span chemically related materials, also using GGA/metaGGA-level DFT reference data.
3. Beyond-DFT-level models. Here we aim to use reference data from a beyond-DFT method, specifically the random phase approximation (RPA). We will fine-tune on small, carefully selected RPA datasets, guided by the data-efficiency insights from scenario 1.
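The data-efficiency question in scenario 1 can be pictured as a learning-curve scan: train on increasingly large subsets and stop at the first size whose validation error meets a target. The following is a minimal sketch of that loop only, not the proposal's actual workflow. The names `smallest_sufficient_size` and `toy_train_and_validate`, the subset sizes, and the error target are all hypothetical, and a noisy polynomial fit stands in for a real fine-tuning run, which would instead return an energy or force error of the fine-tuned model on held-out DFT data.

```python
import numpy as np

def smallest_sufficient_size(sizes, train_and_validate, target_rmse):
    """Return (n, rmse) for the smallest training-set size whose
    validation RMSE meets the target, or None if no size does."""
    for n in sorted(sizes):
        rmse = train_and_validate(n)
        if rmse <= target_rmse:
            return n, rmse
    return None

# Toy stand-in for "fine-tune on n reference configurations":
# fit a degree-7 polynomial to n noisy samples of sin(3x).
rng = np.random.default_rng(0)
NOISE = 0.01  # assumed noise level of the toy reference data
x_pool = rng.uniform(-1.0, 1.0, 256)
y_pool = np.sin(3.0 * x_pool) + rng.normal(0.0, NOISE, x_pool.size)
x_val = np.linspace(-1.0, 1.0, 200)
y_val = np.sin(3.0 * x_val) + rng.normal(0.0, NOISE, x_val.size)

def toy_train_and_validate(n):
    """Fit on the first n pool points and return the validation RMSE."""
    coeffs = np.polyfit(x_pool[:n], y_pool[:n], deg=7)
    residual = np.polyval(coeffs, x_val) - y_val
    return float(np.sqrt(np.mean(residual ** 2)))

sizes = [16, 32, 64, 128, 256]
result = smallest_sufficient_size(sizes, toy_train_and_validate, target_rmse=0.02)
print(result)
```

In a real scan, each call to the training function is a separate fine-tuning job, so the cost of the scan is dominated by the largest sizes tried; stopping at the first size that meets the target keeps that cost bounded.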
After fine-tuning, the models will be used in molecular dynamics (MD) simulations to evaluate their performance in predicting a range of key physical quantities. We will focus on phase transformations in materials. This is a particularly challenging task for MLIPs, as it requires accurately describing the relative energetics of multiple, potentially quite different phases of a material. A key application area will be prospective barocaloric materials for solid-state cooling.
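To illustrate why relative phase energetics are so demanding, consider a simplified zero-temperature estimate of a pressure-driven transition: two phases coexist where their enthalpies H = E + PV are equal. The sketch below treats each phase's energy and volume per atom as pressure-independent and neglects all thermal contributions, which is a strong simplification. The function name and all numerical inputs are hypothetical and chosen only for illustration.

```python
EV_PER_A3_TO_GPA = 160.21766208  # 1 eV/Å^3 expressed in GPa

def transition_pressure_gpa(e1, v1, e2, v2):
    """Pressure (GPa) at which E1 + P*V1 = E2 + P*V2.

    Energies in eV/atom, volumes in Å^3/atom, both assumed to be
    independent of pressure (a crude approximation)."""
    if v1 == v2:
        raise ValueError("phases with equal volume have no finite transition pressure")
    return (e2 - e1) / (v1 - v2) * EV_PER_A3_TO_GPA

# Hypothetical phases: phase 2 is 50 meV/atom higher in energy and 2 Å^3/atom denser.
p_ref = transition_pressure_gpa(0.0, 20.0, 0.050, 18.0)

# The same estimate if the model overestimates the energy difference by 5 meV/atom.
p_err = transition_pressure_gpa(0.0, 20.0, 0.055, 18.0)

print(f"reference: {p_ref:.2f} GPa, with 5 meV/atom error: {p_err:.2f} GPa")
```

With these illustrative numbers, a 5 meV/atom error in the energy difference shifts the estimated transition pressure by about 0.4 GPa, and the shift grows as the volume difference between the phases shrinks.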