Depression is a leading cause of global disability, yet its clinical assessment remains resource-intensive and prone to rater variability. Automated detection of depressive symptoms from clinical interview transcripts offers a scalable complement to traditional assessment; however, existing approaches produce point estimates without quantifying predictive uncertainty, limiting their trustworthiness in high-stakes clinical settings.
This project develops Probabilistic Textual Time Series Depression Detection (PTTSD), a framework that models sequences of conversational utterances within a probabilistic deep learning paradigm. Rather than producing single predictions, PTTSD outputs calibrated distributions over depression severity scores (PHQ-8/PHQ-9), enabling clinicians to distinguish confident predictions from uncertain ones. The architecture combines transformer-based language encoders with probabilistic output heads trained under heteroscedastic objectives to capture aleatoric uncertainty, while epistemic uncertainty is estimated via ensembling. Temporal structure across interview turns is modelled hierarchically, aggregating utterance-level representations over the session.
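To make the heteroscedastic objective concrete, the following is a minimal illustrative sketch (not the project's actual implementation): a head that predicts a per-example mean and log-variance for the PHQ score is trained with the Gaussian negative log-likelihood, so that reporting higher variance down-weights the squared error at the cost of a log-variance penalty. All variable names and the toy numbers are hypothetical.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Heteroscedastic Gaussian negative log-likelihood (per example).

    The model predicts a mean `mu` and log-variance `log_var` for each
    example. A large predicted variance shrinks the squared-error term
    but incurs the `log_var` penalty, so the network is rewarded for
    reporting its aleatoric uncertainty honestly rather than inflating
    it everywhere.
    """
    return 0.5 * (log_var + (y - mu) ** 2 / np.exp(log_var))

# Toy example: two predictions with identical error (|y - mu| = 3)
# but different reported uncertainty.
y = np.array([10.0, 10.0])        # true PHQ-8 scores (hypothetical)
mu = np.array([13.0, 13.0])       # predicted means
log_var = np.array([0.0, 2.0])    # predicted log-variances

nll = gaussian_nll(y, mu, log_var)
# The confident-but-wrong prediction (log_var = 0) is penalised
# more heavily than the one that admitted high uncertainty.
```

Under this objective the loss gradient pushes `log_var` upward exactly where the squared error is large, which is how the model learns input-dependent (heteroscedastic) noise estimates.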
Experiments benchmark PTTSD against deterministic baselines on established clinical interview datasets, evaluating predictive performance, calibration, and robustness, including on raw, unprocessed transcripts as encountered in real clinical deployments. Training large transformer variants (RoBERTa-large, DeBERTa-v3) with ensemble-based uncertainty estimation across multiple configurations requires substantial GPU compute beyond what local resources can provide.
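The ensemble-based uncertainty estimation mentioned above admits a standard decomposition, sketched here as an assumption-laden toy (array shapes, member counts, and numbers are hypothetical): with each ensemble member predicting a mean and variance per session, the mean of member variances approximates aleatoric uncertainty and the variance of member means approximates epistemic uncertainty.

```python
import numpy as np

def decompose_uncertainty(mus, sigmas2):
    """Split total predictive variance from a deep ensemble.

    `mus` and `sigmas2` have shape [n_members, n_examples]:
    per-member predicted means and variances. The aleatoric
    component is the average of the members' predicted variances;
    the epistemic component is the spread (variance) of the
    members' means, which shrinks as members agree.
    """
    aleatoric = sigmas2.mean(axis=0)
    epistemic = mus.var(axis=0)
    return aleatoric, epistemic

# Hypothetical 3-member ensemble scoring 2 interview sessions.
mus = np.array([[12.0, 5.0],
                [14.0, 5.2],
                [13.0, 4.8]])       # members disagree on session 0
sigmas2 = np.array([[1.0, 0.5],
                    [1.2, 0.4],
                    [0.8, 0.6]])

alea, epis = decompose_uncertainty(mus, sigmas2)
# Session 0 shows higher epistemic uncertainty than session 1,
# reflecting the members' disagreement about its severity score.
```

This decomposition is what makes calibration analysis meaningful: a clinician can see whether a wide interval stems from noisy input (aleatoric) or from the model being out of its depth (epistemic).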
Findings will be submitted to machine-learning-for-health and NLP venues (ML4H, CHIL, EMNLP), contributing methodological advances toward the responsible deployment of AI in mental health care.
Supervisor: Vladimir Vlassov, KTH