SUPR
Collaborative filtering for non-intrusive speech quality assessment
Dnr: NAISS 2025/22-438
Type: NAISS Small Compute
Principal Investigator: Fredrik Cumlin
Affiliation: Kungliga Tekniska högskolan
Start Date: 2025-05-07
End Date: 2026-06-01
Primary Classification: 20205: Signal Processing

Abstract

We propose to preprocess the BVCC dataset, released for the VoiceMOS Challenge 2022, with algorithms inspired by collaborative filtering, and to use the preprocessed data to train deep neural networks (DNNs) for mean opinion score (MOS) prediction. The sparse rating data will be processed with memory-based techniques such as cosine similarity and Top-N recommendation. With the preprocessed data, we will train a smaller DNN with end-to-end learning and a larger DNN with self-supervised learning. The accuracy of the trained models will be evaluated from their predicted MOS. We intend to publish the results in standard venues such as Interspeech, ICASSP, and the journal IEEE TASLP. Training these models requires computational resources, for which we are applying.
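To illustrate what memory-based collaborative filtering with cosine similarity can look like on sparse rating data, the following is a minimal sketch and not the project's actual preprocessing. It assumes the ratings are arranged as a listener-by-utterance matrix with NaN for missing entries; the function names, the neighbour count `k`, and the toy matrix are all hypothetical. Each missing rating is filled with a similarity-weighted average over the `k` most similar listeners who rated that utterance, which is a nearest-neighbour scheme and not Top-N recommendation.

```python
import numpy as np


def cosine_similarity_sparse(R):
    """Cosine similarity between rows of R, computed on co-rated columns only."""
    mask = ~np.isnan(R)
    X = np.where(mask, R, 0.0)
    n = R.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            both = mask[i] & mask[j]
            if i == j or not both.any():
                continue  # no self-similarity, no overlap: leave at 0
            a, b = X[i, both], X[j, both]
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            if denom > 0:
                S[i, j] = a @ b / denom
    return S


def fill_missing(R, k=2):
    """Fill NaN entries with a similarity-weighted mean of up to k neighbours."""
    S = cosine_similarity_sparse(R)
    mask = ~np.isnan(R)
    filled = R.copy()
    for i, u in zip(*np.where(~mask)):
        # Candidate neighbours: rows that rated column u and overlap with row i.
        raters = np.where(mask[:, u])[0]
        raters = raters[S[i, raters] > 0]
        if raters.size == 0:
            continue  # nothing to base a prediction on; keep NaN
        top = raters[np.argsort(S[i, raters])[::-1][:k]]
        w = S[i, top]
        filled[i, u] = w @ R[top, u] / w.sum()
    return filled


# Toy data (made up): 3 listeners x 3 utterances, ratings on a 1-5 scale.
R = np.array([
    [5.0, 4.0, np.nan],
    [5.0, 4.0, 2.0],
    [1.0, np.nan, 5.0],
])
filled = fill_missing(R, k=2)
print(filled)
```

One known weakness of this sketch is that two listeners who share only a single rated utterance always get a cosine similarity of 1, so a real pipeline would likely require a minimum overlap or use mean-centred ratings.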