Satellite DNA and centromere evolution across bird phylogeny

NAISS 2023/22-217


NAISS Small Compute

Principal Investigator:

Valentina Peona


Uppsala universitet

Start Date:


End Date:


Primary Classification:

10615: Evolutionary Biology




Satellite DNA is a type of repetitive element that is organised in homogeneous tandem arrays that can be megabases long. In many organisms, satellite DNA forms the building blocks of centromeres (chromosomal structures that allow proper segregation of chromosomes during cell division). Because of its repetitive and homogeneous nature, satellite DNA is extremely difficult to assemble correctly, and it is one of the major causes of assembly fragmentation (e.g., in birds). Consequently, the thorough characterisation of the sequence diversity, structure, and chromosomal location of satellite DNA arrays from assemblies and sequencing data has been hindered.
In the last couple of years, our lab has started to investigate these difficult parts of genomes using a combination of sequencing technologies. We first focused on how best to assemble satellite DNA with different technologies, and then on its evolution at relatively short timescales in avian genomes and species. Recently (and thanks to this SNIC project), we investigated the diversity and evolution of satellite DNA across the phylogeny of Passeriformes (the order that contains the majority of bird species). While doing so, we also discovered which satellite DNA sequences belong to centromeres in some bird species, and these data helped us to develop methods to identify the position of evolutionary new centromeres. Evolutionary new centromeres are centromeres that form at a new location in a species and, over time, become fixed in the population through mechanisms such as centromere drive (similar to meiotic drive). The most interesting feature of these centromeres is that they shift to a new location without the mediation of chromosomal rearrangements, so synteny between species is maintained.
We are therefore interested in understanding the features of the genomic loci that host the new centromeres and how the repositioning affects genomic features such as recombination rate. In addition, we want to understand how satellite DNA arrays that were previously associated with centromeres evolve and how the arrays for the new centromeres are established. To address these points, we want to expand the investigation of satellite DNA sequences and arrays from Passeriformes to a larger set of vertebrate genomes for which we now have very high-quality genome assemblies thanks to the Vertebrate Genomes Project. We therefore kindly ask for the continuation of this SNIC project (22-116), because what we found in the previous round opened up many exciting new lines of research that depend on these computing resources. The new bioinformatic analyses required for this project are extensive, but all the necessary tools are already installed on Rackham and Snowy.