Evolutionary constraint non-coding regulatory driver mutations in pan-cancer genomes

NAISS 2024/6-147


NAISS Medium Storage

Principal Investigator:

Sergey Kozyrev


Uppsala universitet

Start Date:


End Date:


Primary Classification:

30107: Medical Genetics



This project aims to evaluate the regulatory potential of evolutionarily constrained non-coding driver mutations identified in a pan-cancer whole-genome dataset. Most cancer driver mutations detected to date are protein-coding somatic mutations, while the role and impact of non-coding regulatory mutations have not been fully examined. The main challenge in studying non-coding mutations is to reliably identify the functionally significant ones among the numerous changes occurring across the large non-coding space. We analyzed whole-genome sequencing (WGS) data from 2,539 cancer genomes collected by the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and intersected the non-coding regions of the genome with Zoonomia phyloP scores, a constraint metric derived from the sequence alignment of 240 mammalian species, to identify non-coding constraint mutations (NCCMs) with regulatory potential. We limited the analysis to introns, UTRs and ±100 kb flanking regions. We then selected 30,000 NCCMs for functional evaluation using a massively parallel reporter assay (MPRA). To generate the MPRA reporter library, we ordered oligo synthesis from Agilent and cloned the oligos into standard plasmid vectors. The MPRA library will be transfected into various cancer cell lines, and after RNA purification and cDNA synthesis, reporter expression will be analyzed by high-throughput sequencing on the NovaSeqX+ instrument. The NCCMs showing the largest allelic differences between the reference and mutant alleles will be examined more closely in follow-up studies using cell culture and various assays, with the goal of understanding the biological consequences that lead from mutation to tumorigenicity.
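The two computational steps described above, selecting constrained non-coding mutations and ranking them by allelic difference in the MPRA, can be illustrated with a minimal Python sketch. It is not the project's pipeline. The phyloP cutoff, the region labels, the `Mutation` record, the pseudocount, and the use of an RNA/DNA log2 ratio as the activity measure are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from dataclasses import dataclass

# Assumed cutoff for calling a position "constrained"; the abstract does not
# state the threshold that was actually used.
PHYLOP_CUTOFF = 2.27

# Assumed annotation labels for the regions the analysis was limited to
# (introns, UTRs, and flanking regions within 100 kb).
ALLOWED_REGIONS = {"intron", "UTR", "flank_100kb"}


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for one somatic non-coding mutation."""
    chrom: str
    pos: int
    ref: str
    alt: str
    region: str    # annotation label, e.g. "intron"
    phylop: float  # phyloP score at this position


def select_nccms(mutations, cutoff=PHYLOP_CUTOFF):
    """Keep mutations in the allowed regions whose phyloP score meets the cutoff."""
    return [
        m for m in mutations
        if m.region in ALLOWED_REGIONS and m.phylop >= cutoff
    ]


def log2_activity(rna_count, dna_count, pseudocount=1.0):
    """Reporter activity as log2(RNA / DNA); the pseudocount avoids division by zero."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_effect(ref_rna, ref_dna, alt_rna, alt_dna):
    """Difference in log2 activity between the mutant and reference alleles."""
    return log2_activity(alt_rna, alt_dna) - log2_activity(ref_rna, ref_dna)


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    candidates = [
        Mutation("chr1", 1000, "A", "G", "intron", 5.1),
        Mutation("chr1", 2000, "C", "T", "intron", 0.3),
        Mutation("chr2", 3000, "G", "A", "intergenic", 6.0),
    ]
    for m in select_nccms(candidates):
        print(m.chrom, m.pos, m.ref, m.alt, m.phylop)
    print(allelic_effect(ref_rna=99, ref_dna=99, alt_rna=399, alt_dna=99))
```

Ranking mutations by the absolute value of `allelic_effect` would put those with the largest reference-versus-mutant difference first. A real analysis would also need replicate-aware statistics, which this sketch omits.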