SUPR
Storage for PReSTO pilot NAISS 2023-22-811
Dnr:

NAISS 2024/6-3

Type:

NAISS Medium Storage

Principal Investigator:

Martin Moche

Affiliation:

Karolinska Institutet

Start Date:

2024-01-03

End Date:

2025-02-01

Primary Classification:

10601: Structural Biology

Secondary Classification:

20908: Medical Biotechnology

Tertiary Classification:

20906: Biocatalysis and Enzyme Technology

Abstract

In 2013, the Protein Science Facility at Karolinska Institutet and the National Supercomputer Centre (NSC) started a pilot project to evaluate the performance of macromolecular X-ray crystallography applications on NSC Triolith. In 2015, the Swedish light source MAX IV decided to fund an extension called PReSTO for integrated structural biology calculations, and in 2018 the Swedish Research Council granted funds to this project (dnr. 2018-06479). Members of this project have access to the PReSTO installation via NAISS and may have additional complementary NAISS compute and storage allocations. ThinLinc software from Cendio supports the structural biology workflow by enabling remote graphical applications such as coot/chimera/ccp4mg/pymol for interactive model building and visualization of protein structure and surface properties. Since 2017, we have used EasyBuild to install PReSTO, which has several advantages over a standard HPC installation: A) software environments can be sent to compute nodes, B) software build and runtime dependencies are made explicit in easyconfigs and easyblocks, and C) the version-controlled software stack can be moved to new hardware with minor effort. PReSTO is now available at NSC Tetralith, LUNARC Cosmos, the MAX IV online and offline clusters, and the NSC local resource Berzelius. The PReSTO homepage is designed for newcomers and points to software developer manuals, batch scripts with default options, and Slurm configurations for certain graphical user interfaces. We also developed a PReSTO menu that enables users to A) launch software on login or compute nodes where appropriate, B) select compute node time and number of cores, and C) select the output directory for some software such as hkl2map. Code optimizations have been made to adapt the forkxds script of the popular XDS package to Slurm and to run CryoSPARC on HPC resources. We must contact the CryoSPARC developers to request that selected code snippets be improved, by them or by us, to avoid power termination at NSC Berzelius.
In 2020, MAX IV developed a fragment screening web application on top of the stable PReSTO installation (1), and PReSTO was acknowledged for making serial X-ray crystallography software available to Swedish researchers (2) doing time-resolved diffraction experiments using X-ray free-electron lasers. In 2024, we will involve the SciLifeLab National Bioinformatics Infrastructure (NBIS) to support PReSTO and contribute to Cryo-EM and integrated structural biology courses, as in 2022. NBIS will develop a training program for all branches of structural biology, and the large infrastructures SciLifeLab Cryo-EM and the Swedish NMR Centre (SNC) will develop branch-specific PReSTO documentation, as MAX IV has already done. The collaboration between NAISS, MAX IV, SNC and SciLifeLab staff is key to project progress. On our homepage, we added instructions to include grant numbers when acknowledging NAISS and PReSTO.
1. G. M. A. Lima et al., FragMAXapp: crystallographic fragment-screening data-analysis and project-management system. Acta Crystallogr D Struct Biol 77, 799-808 (2021).
2. V. Srinivas et al., High-Resolution XFEL Structure of the Soluble Methane Monooxygenase Hydroxylase Complex with its Regulatory Component at Ambient Temperature in Two Oxidation States. Journal of the American Chemical Society 142, 14249-14266 (2020).