SUPR
Building Foundation Models for Interactive BioImage Analysis

Dnr: NAISS 2023/5-297
Type: NAISS Medium Compute
Principal Investigator: Wei Ouyang
Affiliation: Kungliga Tekniska högskolan
Start Date: 2023-06-29
End Date: 2024-07-01
Primary Classification: 10799: Other Natural Sciences not elsewhere specified
Secondary Classification: 10299: Other Computer and Information Science
Tertiary Classification: 10610: Bioinformatics and Systems Biology (methods development to be 10203)

Abstract

Analyzing and interpreting large-scale image data has become a central challenge in the life sciences. As the volume of biological image data grows, local data management and processing can no longer meet the demands of tasks such as AI-powered image analysis. This project addresses these challenges by combining recent advances in deep learning, cloud-based data management, and model serving.

Our primary objective is to train foundation models for interactive bioimage analysis. Building on recent developments in self-supervised learning, large language models, and diffusion models, we aim to construct foundation models that can perform a range of image transformation tasks in the life sciences, including segmentation, denoising, and deconvolution. The models will be trained on existing public bioimage datasets and on data contributed by users of the BioImage.IO portal (https://bioimage.io), so that they can continue to improve as new data arrive. In parallel, we will develop interactive annotation and training tools to support the construction of these models. The resulting models are intended to serve researchers and practitioners across a diverse range of biological image analysis tasks.

To address the data management challenges, we propose BioEngine, a platform hosted at the BioImage Model Zoo (https://bioimage.io). BioEngine is a web-based platform built on top of Hypha that integrates containerized services for scalable data management and AI model serving, and it can be deployed in both private and public clouds. BioEngine will also support the test run feature of the BioImage Model Zoo website, as part of our wider AI4Life project. To facilitate user engagement, we plan to develop a deployment toolkit that lets users set up their own servers, whether on an institutional Kubernetes cluster or a workstation. The ultimate goal is to establish a standard for managing and sharing image data, in collaboration with the BioImage Archive.

This project is supported by the KAW Data-Driven Life Science Fellows program and the EU Horizon 2020 Research Infrastructure project AI4Life. By combining deep learning techniques with flexible, cloud-based data management, we aim to advance biological image analysis and contribute to the wider goals of data-driven life science research.