Probabilistic Word Embeddings with TensorFlow and Stan
Dnr: NAISS 2025/22-1419

Type: NAISS Small Compute

Principal Investigator: Isac Boström

Affiliation: Chalmers tekniska högskola

Start Date: 2025-11-01

End Date: 2026-11-01

Primary Classification: 10106: Probability Theory and Statistics (Statistics with medical aspects at 30118 and with social aspects at 50907)

Abstract

Quantifying uncertainty in word embeddings is crucial for reliable inference from textual data, yet existing methods such as the bootstrap and mean-field variational inference are either computationally intensive or rely on limiting assumptions. We explore alternative approaches: Gibbs sampling via Pólya-Gamma augmentation, which is our key contribution, alongside the Laplace approximation and Hamiltonian Monte Carlo. We also address the non-identifiability of word embeddings. We evaluate the effectiveness and accuracy of these methods in simulation studies with known ground truth, and we use the MovieLens data set to study their feasibility and accuracy on real data.
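
The non-identifiability mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the model, so this assumes a likelihood that depends on the embeddings only through inner products between word and context vectors, as in skip-gram-style models. The names `W`, `C`, `Q` and the sizes `V`, `D` are hypothetical and chosen only for illustration. Under that assumption, rotating both embedding matrices by the same orthogonal matrix leaves every inner product unchanged, so the data cannot distinguish the rotated embeddings from the originals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary of 6 words, 3-dimensional embeddings.
V, D = 6, 3
W = rng.normal(size=(V, D))  # word vectors (illustrative values)
C = rng.normal(size=(V, D))  # context vectors (illustrative values)

# Draw a random orthogonal matrix via the QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))

# Inner products before and after rotating both matrices by Q.
logits = W @ C.T
logits_rotated = (W @ Q) @ (C @ Q).T

# The inner products agree although the embeddings themselves differ.
same_logits = np.allclose(logits, logits_rotated)
embeddings_differ = not np.allclose(W, W @ Q)

# An isotropic Gaussian prior would not break the tie either,
# because the squared norm is also invariant under rotation.
same_sq_norm = np.isclose(np.sum(W**2), np.sum((W @ Q) ** 2))

print(same_logits, embeddings_differ, same_sq_norm)
```

In this assumed setting the posterior is invariant under such rotations, which is one reason why uncertainty estimates for individual embedding coordinates are hard to interpret without an identifying constraint.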