SUPR
Application of a Novel Telomere-to-Telomere Reference Genome in a cross-sectional Swedish Cohort
Dnr:

sens2022031

Type:

SNIC SENS

Principal Investigator:

Åsa Johansson

Affiliation:

Uppsala universitet

Start Date:

2022-11-24

End Date:

2023-12-01

Primary Classification:

10203: Bioinformatics (Computational Biology) (applications to be 10610)

Allocation

  • Castor /proj/nobackup at UPPMAX: 20000 GiB
  • Cygnus /proj/nobackup at UPPMAX: 20000 GiB
  • Castor /proj at UPPMAX: 2000 GiB
  • Cygnus /proj at UPPMAX: 2000 GiB
  • Bianca at UPPMAX: 100 x 1000 core-h/month

Abstract

In March 2022, the Telomere-to-Telomere (T2T) Consortium released the first complete telomere-to-telomere assembly of a human genome, T2T-CHM13. Apart from filling in regions missing from the latest human genome assembly, GRCh38 (e.g., centromeres), the T2T Consortium promises better representation of variation in the human genome across ethnicities. SweGen is a cohort of 1000 people representing a cross-section of the Swedish population. Whole-genome sequencing (WGS) was performed for all individuals, and the WGS reads were mapped to GRCh37, the previous version of the human reference genome.

In this study, we aim to explore the applicability of T2T-CHM13 as a reference for a cross-sectional cohort. By adapting existing pipelines for use with the T2T-CHM13 reference, we will compare both mapping and variant call quality to the existing dataset. This will allow us to investigate the claims made by the T2T Consortium about the improvements their reference offers, and to provide information for labs looking to upgrade from GRCh37, which is still widely used even though state-of-the-art research has mostly moved away from it.