Satellite DNA and centromere evolution across bird phylogeny
Dnr:

NAISS 2024/22-310

Type:

NAISS Small Compute

Principal Investigator:

Valentina Peona

Affiliation:

Uppsala universitet

Start Date:

2024-03-20

End Date:

2025-04-01

Primary Classification:

10615: Evolutionary Biology

Abstract

Satellite DNA is a type of repetitive element organised in homogeneous tandem arrays that can be megabases long. In many organisms, satellite DNA forms the building blocks of centromeres (chromosomal structures that allow proper segregation of chromosomes during cell division). Because of its repetitive and homogeneous nature, satellite DNA is extremely difficult to assemble correctly, and it is one of the major causes of assembly fragmentation (e.g., in birds). These limitations have hindered the thorough characterisation of the sequence diversity, structure, and chromosomal location of satellite DNA arrays from assemblies and sequencing data. Over the last couple of years, our lab has started to investigate these difficult parts of genomes using a combination of sequencing technologies. We first focused on how best to assemble satellite DNA with different technologies, and then on its evolution at relatively short timescales in avian genomes and species. Recently (and thanks to this SNIC/NAISS project) we managed to investigate the diversity and evolution of satellite DNA across the phylogeny of passerine birds (Passeriformes, which comprise more than half of all bird species). While doing so, we also discovered which satellite DNA sequences belong to centromeres in some bird species, and these data helped us develop methods to identify the position of evolutionary new centromeres. Evolutionary new centromeres are centromeres that newly form in a species and over time become fixed in the population through mechanisms such as centromere drive (similar to meiotic drive). The most interesting feature of these centromeres is that they shift to a new location without the mediation of chromosomal rearrangements, so synteny between species is maintained.
We are therefore interested in understanding the features of the genomic loci that host the new centromeres and how the repositioning affects genomic features such as recombination rate. In addition, we want to understand how satellite DNA arrays that were previously associated with centromeres evolve and how the arrays for the new centromeres are established. We are now expanding the research on neocentromeres to Darwin's finches, where the availability of high-quality genome assemblies and hundreds of re-sequencing libraries gives us the possibility to find polymorphic centromeres (at different loci and/or polymorphic in sequence) segregating in closely related species. This allows us to characterise the diversity of centromeres and test whether it is associated with species diversification. On top of this, with population data, we can see how polymorphic centromeres become established and fixed in the population through centromere drive and how this can be mistaken for signals of selection and adaptation. We therefore kindly ask for the continuation of this NAISS project (22-217), which we need to support our research, in order to finish the analyses and publish the associated papers. The bioinformatic analyses required for this project are extensive, but all the necessary tools are already installed on Rackham and Snowy.