SUPR
Probabilistic multimodal synthesis with conditional flow matching
Dnr: NAISS 2024/22-101
Type: NAISS Small Compute
Principal Investigator: Shivam Mehta
Affiliation: Kungliga Tekniska högskolan
Start Date: 2024-01-23
End Date: 2025-02-01
Primary Classification: 10209: Media and Communication Technology

Abstract

Matcha-TTS is a fast text-to-speech (TTS) architecture based on conditional flow matching. Conditional flow matching can be considered a fast diffusion-style model, which opens the possibility of applying it wherever diffusion has been successful, including image synthesis and multimodal synthesis. The project aims to build on these advances in probabilistic synthesis to create a larger and stronger multimodal model.