While hundreds of genetic loci have been associated with human complex diseases, >80% of these loci reside outside protein-coding genes. Understanding the human genome, and the regulatory mechanisms likely to be affected by disease mutations, remains challenging. The Zoonomia Project (previously called 200 mammals) is investigating the genomics of shared and specialized traits in eutherian mammals. We have sequenced 137 species and combined these data with genomes already available for ~110 species. From these assemblies we have generated a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from over 80% of mammalian families and including genome assemblies for 131 previously uncharacterized species. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.
Further analysis of this very large dataset is ongoing. We are using this new dataset of 200 mammals to systematically identify and study constrained regions (ultra-conserved elements (UCEs) and CTCF sites, both of potential relevance for genome organization) and accelerated regions in humans, dogs and other key mammals. We will complement the constraint dataset with brain tissue and single-cell RNA-seq data, as well as functional genomics datasets, which will allow us to study the connection between functional constraint and genome regulation in a key tissue, the brain. We will also intersect the human and canine accelerated regions and key variants with human and canine disease loci affecting the brain. Starting with ALS, OCD and schizophrenia, we will use comparative genomics and functional assays, including CRISPR, to systematically delete or alter promoter and enhancer/insulator elements in or near candidate gene regions. We plan to elucidate causal variants and their downstream effects, including pleiotropic effects in other, non-disease tissues. This work will lead to a deeper understanding of genome regulation, disease mechanisms and the effect of selection on disease.
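
As an illustration of the intersection step described above, the following is a minimal sketch of overlapping a set of accelerated regions with a set of disease loci. It is not the project's pipeline: the `Interval` type, the `intersect` function, the region and locus names, and all coordinates are hypothetical, and intervals are assumed to be half-open (0-based start, exclusive end), as in BED files.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A named genomic interval (hypothetical type for this sketch)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def intersect(regions, loci):
    """Return (region, locus) pairs that overlap on the same chromosome."""
    # Group loci by chromosome so each region is only compared
    # against loci that could possibly overlap it.
    by_chrom = {}
    for locus in loci:
        by_chrom.setdefault(locus.chrom, []).append(locus)

    hits = []
    for region in regions:
        for locus in by_chrom.get(region.chrom, []):
            # Half-open intervals overlap when each starts before the other ends.
            if region.start < locus.end and locus.start < region.end:
                hits.append((region, locus))
    return hits


# Made-up coordinates, for illustration only.
accelerated = [
    Interval("chr1", 100, 200, "AR_1"),
    Interval("chr1", 500, 600, "AR_2"),
    Interval("chr2", 50, 80, "AR_3"),
]
disease_loci = [
    Interval("chr1", 150, 400, "locus_A"),
    Interval("chr2", 80, 120, "locus_B"),
]

overlaps = intersect(accelerated, disease_loci)
for region, locus in overlaps:
    print(region.name, locus.name)
```

With these toy inputs only `AR_1` and `locus_A` overlap; `AR_3` and `locus_B` merely touch at position 80 and are not reported under the half-open convention. The nested loop is adequate for small sets; for genome-scale data a sorted sweep or an interval tree would be the more usual choice.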