Depression is a leading cause of global disability, yet its clinical assessment remains resource-intensive and prone to rater variability. Automated detection of depressive symptoms from clinical interview transcripts offers a scalable complement to traditional assessment; however, existing approaches produce point estimates without quantifying predictive uncertainty, limiting their trustworthiness in high-stakes clinical settings.
This project develops Probabilistic Textual Time Series Depression Detection (PTTSD), a framework that models sequences of conversational utterances within a probabilistic deep learning paradigm. Rather than producing single predictions, PTTSD outputs calibrated distributions over depression severity scores (PHQ-8/PHQ-9), enabling clinicians to distinguish confident predictions from uncertain ones. The architecture combines transformer-based language encoders with probabilistic output heads trained under heteroscedastic objectives to capture aleatoric uncertainty, while epistemic uncertainty is estimated via ensembling. Temporal structure across interview turns is modelled hierarchically, aggregating utterance-level representations over the session.
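To make the heteroscedastic objective concrete, the following is a minimal illustrative sketch (not the project's actual implementation): a head that predicts a per-example mean and log-variance for the PHQ score is trained with the Gaussian negative log-likelihood, so that reporting higher variance down-weights the squared error at the cost of a log-variance penalty. All variable names and the toy numbers are hypothetical.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Heteroscedastic Gaussian negative log-likelihood (per example).

    The model predicts a mean `mu` and log-variance `log_var` for each
    example. A large predicted variance shrinks the squared-error term
    but incurs the `log_var` penalty, so the network is rewarded for
    reporting its aleatoric uncertainty honestly rather than inflating
    it everywhere.
    """
    return 0.5 * (log_var + (y - mu) ** 2 / np.exp(log_var))

# Toy example: two predictions with identical error (|y - mu| = 3)
# but different reported uncertainty.
y = np.array([10.0, 10.0])        # true PHQ-8 scores (hypothetical)
mu = np.array([13.0, 13.0])       # predicted means
log_var = np.array([0.0, 2.0])    # predicted log-variances

nll = gaussian_nll(y, mu, log_var)
# The confident-but-wrong prediction (log_var = 0) is penalised
# more heavily than the one that admitted high uncertainty.
```

Under this objective the loss gradient pushes `log_var` upward exactly where the squared error is large, which is how the model learns input-dependent (heteroscedastic) noise estimates.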
Experiments benchmark PTTSD against deterministic baselines on established clinical interview datasets, evaluating predictive performance, calibration, and robustness, including on raw, unprocessed transcripts as encountered in real clinical deployments. Training large transformer variants (RoBERTa-large, DeBERTa-v3) with ensemble-based uncertainty estimation across multiple configurations requires substantial GPU compute beyond what local resources can provide.
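The ensemble-based uncertainty estimation mentioned above admits a standard decomposition, sketched here as an assumption-laden toy (array shapes, member counts, and numbers are hypothetical): with each ensemble member predicting a mean and variance per session, the mean of member variances approximates aleatoric uncertainty and the variance of member means approximates epistemic uncertainty.

```python
import numpy as np

def decompose_uncertainty(mus, sigmas2):
    """Split total predictive variance from a deep ensemble.

    `mus` and `sigmas2` have shape [n_members, n_examples]:
    per-member predicted means and variances. The aleatoric
    component is the average of the members' predicted variances;
    the epistemic component is the spread (variance) of the
    members' means, which shrinks as members agree.
    """
    aleatoric = sigmas2.mean(axis=0)
    epistemic = mus.var(axis=0)
    return aleatoric, epistemic

# Hypothetical 3-member ensemble scoring 2 interview sessions.
mus = np.array([[12.0, 5.0],
                [14.0, 5.2],
                [13.0, 4.8]])       # members disagree on session 0
sigmas2 = np.array([[1.0, 0.5],
                    [1.2, 0.4],
                    [0.8, 0.6]])

alea, epis = decompose_uncertainty(mus, sigmas2)
# Session 0 shows higher epistemic uncertainty than session 1,
# reflecting the members' disagreement about its severity score.
```

This decomposition is what makes calibration analysis meaningful: a clinician can see whether a wide interval stems from noisy input (aleatoric) or from the model being out of its depth (epistemic).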
Findings will be submitted to machine-learning-for-health and NLP venues (ML4H, CHIL, EMNLP), contributing methodological advances toward the responsible deployment of AI in mental health care.
Supervisor: Vladimir Vlassov, KTH