In this proposal, we request compute time on Alvis, Dardel, and Tetralith for ongoing and expanded research projects in the AI Laboratory for Molecular Engineering, headed by Dr. Rocío Mercado Oropeza in the CSE Department at Chalmers. This proposal supports 7 PhD students (not including 2 incoming), 4 postdoctoral researchers, 1-6 MSc students (depending on the term), and one faculty member, all working on AI-driven molecular engineering and molecular dynamics. These projects span various aspects of data-driven molecular engineering, including single-cell image and omics analysis, molecular dynamics simulations, and AI methods for biochemical applications such as generative AI for drug and materials discovery. We aim to address challenges in representation learning and predictive modeling for molecular systems.
Our research objectives are to: (1) Develop large-scale multi-modal neural networks for single-cell data and cell image analysis. (2) Train advanced generative models for molecular design and optimization. (3) Apply molecular dynamics simulations to study biomolecular interactions, and use the simulation results to design surrogate models for important biochemical properties in our generative models. (4) Train language models for synthesizability-constrained molecular generation, metabolite prediction, and molecular representation.
These efforts have led to novel tools and methods for AI-driven molecular engineering, as evidenced by our team's recent publication history (https://ailab.bio/publications), and to work currently in progress, with publications anticipated in leading computer science and bioinformatics journals. All code, models, and datasets will be released as open source, and our findings will be published primarily in computer science venues.
For this project, we request an extension of our current allocation (5K GPU-h monthly on Alvis, 15x1000 core-h monthly on Dardel) to a larger Medium Compute allocation (10K GPU-h monthly on Alvis, 50x1000 core-h monthly on Dardel, 15x1000 core-h monthly on Tetralith) to meet our team's increased computational needs.