Pangenomes and structural variants in birds
Dnr:

NAISS 2023/6-254

Type:

NAISS Medium Storage

Principal Investigator:

Jacob Höglund

Affiliation:

Uppsala universitet

Start Date:

2023-09-26

End Date:

2024-10-01

Primary Classification:

10615: Evolutionary Biology

Secondary Classification:

10609: Genetics (medical to be 30107 and agricultural to be 40402)

Tertiary Classification:

10610: Bioinformatics and Systems Biology (methods development to be 10203)

Abstract

This project investigates the conservation genomics and fitness effects of structural variation in bird species across the world, using both new pangenomic approaches based on multiple de novo long-read assemblies and read-mapping-based discovery of structural variants (SVs) from short reads. Although the importance of specific demographic histories in shaping patterns of mutational load conferred by deleterious single nucleotide polymorphisms (SNPs) has received considerable attention in the recent past, few studies have investigated the corresponding fitness consequences of structural variation in distinct evolutionary lineages. Long- and short-read data are already available for two globally distributed ptarmigan (Lagopus) species. We have already performed heuristic-based filtering and rapid automated curation of short-read-discovered SV callsets from 102 re-sequenced individuals across two recently diverged (~2 million years ago) ptarmigan (Lagopus) species. Population genetic analyses of the resulting high-confidence SV callsets (27,644 deletions, 108 duplications, 96 inversions) reveal that the relative proportion of deleterious structural variants is consistently greater in populations with small effective population sizes (Ne), but that the relative frequency of deleterious variants differs between populations that have experienced recent bottlenecks and those with longer-term low Ne. As with SNPs, ratios of non-synonymous to synonymous polymorphisms in SVs are higher for historically small than for historically large populations, suggesting that many SVs may largely conform to nearly neutral expectations. Future pangenomic approaches will continue these investigations in multiple newly sequenced and publicly available datasets.