NAISS
SUPR
Scalable Federated Learning For Privacy-Preserving Training of Machine Learning Models
Dnr:

NAISS 2025/22-1350

Type:

NAISS Small Compute

Principal Investigator:

Li Ju

Affiliation:

Uppsala universitet

Start Date:

2025-11-01

End Date:

2026-11-01

Primary Classification:

10105: Computational Mathematics

Webpage:

Allocation

Abstract

This project's primary aim is to develop novel algorithms and tools at the intersection of federated machine learning (FL) and multi-modal language models (MMLM). FL is a distributed learning paradigm that addresses the privacy and logistical challenges of traditional centralized machine learning, particularly for data generated at the computational edge. While related to distributed optimization, FL introduces unique constraints not typically present in high-performance computing. These include a lack of control over data partitioning, leading to non-IID (not independent and identically distributed) data distributions, and considerable heterogeneity across client systems. Concurrently, MMLMs have shown significant promise in processing and integrating information from diverse data types. A key challenge is their adaptation to specialized tasks or domains, which is often computationally prohibitive. This project will focus on post-hoc adaptation of MMLMs within a federated framework, enabling privacy-preserving model specialization on decentralized, multi-modal data. The core research objective is to design efficient and robust federated schemes that are resilient to the statistical and systemic heterogeneity inherent in this setting. Project outcomes will include novel algorithms, open-source software implementations, and comprehensive experimental validation.
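To make the federated setting described above concrete, the sketch below shows the canonical federated-averaging (FedAvg-style) aggregation step that schemes of this kind build on: each client takes local update steps on its own (possibly non-IID) data, and the server averages the resulting models weighted by local dataset size. This is a minimal illustration, not the project's proposed algorithm; all function names and the toy gradients are hypothetical.

```python
def local_update(weights, gradient, lr=0.1):
    """One simulated local SGD step on a client (toy, hypothetical gradient)."""
    return [w - lr * g for w, g in zip(weights, gradient)]

def fed_avg(client_weights, client_sizes):
    """Server-side aggregation: average client models, weighted by local
    dataset size. Under non-IID data these local models can drift apart,
    which is the heterogeneity the project's schemes must be robust to."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n * w[i] for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# One toy communication round: two clients start from a shared global model.
global_model = [0.0, 0.0]
clients = [
    (local_update(global_model, [1.0, -2.0]), 100),  # client with 100 samples
    (local_update(global_model, [-1.0, 4.0]), 300),  # client with 300 samples
]
weights, sizes = zip(*clients)
global_model = fed_avg(list(weights), list(sizes))  # → [0.05, -0.25]
```

Only model parameters, never raw data, leave the clients, which is what makes the scheme privacy-preserving by construction.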