SUPR
evolang@uu
Dnr:

NAISS 2024/22-117

Type:

NAISS Small Compute

Principal Investigator:

Yingqi Jing

Affiliation:

Uppsala universitet

Start Date:

2024-04-01

End Date:

2025-01-01

Primary Classification:

60201: General Language Studies and Linguistics

Abstract

This project develops Bayesian phylogenetic models for detecting the evolutionary trajectories of typological features in the history of language families. We rely on cross-linguistically annotated corpora such as Universal Dependencies and on time-calibrated phylogenies of Indo-European or Uralic to model the character histories. To achieve this, we first validate our phylogenetic models through simulation-based experiments and then apply them to real-world data to examine the evolutionary biases in language use. Both the simulations and the model fitting are time- and resource-intensive, because the samplers (Hamiltonian Monte Carlo or the No-U-Turn Sampler) must explore a high-dimensional parameter space to fit the model. This becomes more challenging when we use large-scale corpus data and thousands of phylogenetic trees to fully capture the uncertainty in tip data, tree topology and branch lengths. This kind of research is not feasible on a personal computer; the analyses need to run on a cluster, where the whole procedure can be parallelized so that results are obtained in a reasonable time.