Advances in machine learning have driven remarkable progress across diverse domains, progress that typically relies on cloud-scale training and deployment. However, real-world applications increasingly demand models that adapt continuously to changing environments and evolving data distributions. Meeting this demand calls for continual learning (CL) and self-supervised learning (SSL), two paradigms that together address the scarcity of labeled data and the risk of catastrophic forgetting. In this project, we focus on unifying SSL and CL to enable models that learn autonomously and adaptively over time.
Self-supervised learning allows models to exploit vast amounts of unlabeled data, which is critical in many scenarios where annotation is costly or infeasible. Continual learning complements this by preserving past knowledge while integrating new information, making it possible to handle non-stationary or task-incremental data streams. Combining these two approaches is essential for long-term deployment, as models must both extract useful representations from unlabeled inputs and remain resilient to distribution drift. Despite rapid progress in each area separately, SSL and CL remain difficult to combine effectively: self-supervised models often degrade under sequential training, while continual learning methods struggle without labeled feedback.
This research introduces generative models, particularly diffusion models, as a key enabler of continual self-supervised learning. Diffusion models have recently demonstrated state-of-the-art performance in generative modeling, with robust training dynamics and expressive capacity. Their U-Net backbones provide strong feature representations for SSL, while their generative ability makes them well suited to replay-based continual learning. By synthesizing past samples, diffusion models mitigate forgetting, and by learning representations from unlabeled inputs, they adapt to new tasks without supervision. This dual role positions diffusion models as ideal candidates for advancing SSL-CL integration.
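To make this dual role concrete, the following sketch shows one possible instantiation of diffusion-based generative replay in PyTorch. It is a minimal illustration under stated assumptions, not the proposed method: the `TinyDenoiser` stands in for a full U-Net, the noise schedule, `diffusion_loss`, `sample`, and `train_task` helpers are hypothetical, and the data loader is left abstract.

```python
# Minimal sketch of diffusion-based generative replay (illustrative only).
# The tiny ConvNet stands in for the U-Net backbone; all hyperparameters
# and helper names are assumptions for this sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 200  # number of diffusion steps (kept small for illustration)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class TinyDenoiser(nn.Module):
    """Stand-in for the U-Net epsilon-predictor described in the proposal."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x, t):
        # Broadcast the normalized timestep as an extra input channel.
        t_map = (t.float() / T).view(-1, 1, 1, 1).expand(-1, 1, *x.shape[2:])
        return self.net(torch.cat([x, t_map], dim=1))

def diffusion_loss(model, x0):
    """Standard DDPM noise-prediction loss on a clean batch x0."""
    t = torch.randint(0, T, (x0.size(0),))
    eps = torch.randn_like(x0)
    ab = alpha_bars[t].view(-1, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
    return F.mse_loss(model(x_t, t), eps)

@torch.no_grad()
def sample(model, shape):
    """Ancestral DDPM sampling used to synthesize replay of past tasks."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = model(x, t_batch)
        a, ab = alphas[t], alpha_bars[t]
        x = (x - (1 - a) / (1 - ab).sqrt() * eps) / a.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x

def train_task(model, frozen_prev, loader, replay_ratio=0.5, steps=100):
    """Train on the current task while replaying samples drawn from a frozen
    snapshot of the previous model (pass None for the first task)."""
    opt = torch.optim.Adam(model.parameters(), lr=2e-4)
    for _, (x_real, _) in zip(range(steps), loader):
        if frozen_prev is not None:
            n_replay = int(replay_ratio * x_real.size(0))
            x_replay = sample(frozen_prev, (n_replay, *x_real.shape[1:]))
            x_real = torch.cat([x_real, x_replay], dim=0)
        loss = diffusion_loss(model, x_real)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

In a continual pipeline, one would call `train_task` once per task, freezing a copy of the model afterwards to serve as the replay generator for the next task; the ratio of replayed to real samples is one of the knobs the proposed experiments would study.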
We will design and evaluate a framework in which diffusion models serve as both representation learners and replay generators in a continual self-supervised pipeline. Experiments will be conducted on benchmark datasets such as CIFAR-100 and ImageNet subsets, under class-incremental and domain-shift scenarios. Evaluation will include standard CL metrics (accuracy, forgetting, transfer) and generation metrics (FID, SSIM, PSNR), enabling us to study the generation–representation trade-off: how the quality of generative replay influences downstream representation learning.
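For reference, the continual-learning metrics named above are typically computed as follows; the notation is assumed here, following common practice, with $a_{k,j}$ denoting accuracy on task $j$ after training through task $k$.

```latex
% Standard continual-learning metrics (notation assumed):
% a_{k,j} = accuracy on task j after training up to task k.
\begin{align}
  \text{Average accuracy:} \quad
    A_k &= \frac{1}{k} \sum_{j=1}^{k} a_{k,j} \\
  \text{Forgetting:} \quad
    F_k &= \frac{1}{k-1} \sum_{j=1}^{k-1}
      \Big( \max_{l \in \{1,\dots,k-1\}} a_{l,j} \; - \; a_{k,j} \Big)
\end{align}
```

Pairing these with FID, SSIM, and PSNR on the replayed samples lets us correlate replay fidelity with retained representation quality, which is the core of the generation–representation trade-off.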
Beyond algorithmic development, this project emphasizes resource efficiency and deployability. While much prior work assumes cloud-scale infrastructure, our methods will be designed for constrained settings, such as deployment on IoT devices. We will study model compression, lightweight architectures, and efficient replay strategies to ensure feasibility under limited compute and memory. A real-world case study in IoT security will serve as the application domain, where devices must adapt continually to dynamic and adversarial environments without labeled data.
The expected outcome of this project is twofold: (1) new insights into the interplay between generative modeling and continual self-supervised learning, and (2) practical methods for resource-constrained adaptive intelligence. By leveraging diffusion models as both learners and generators, this research will advance the theoretical foundations of SSL and CL, while also delivering deployable solutions for privacy-preserving, adaptive, and secure AI systems.