The Zoonomia Project (previously called 200 mammals) is investigating the genomics of shared and specialized traits in eutherian mammals. We have performed sequencing of 137 species and combined this data with ~110 already available species. From these assemblies we have generated a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from over 80% of mammalian families and including genome assemblies for 131 previously uncharacterized species. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.
Using the Zoonomia dataset, we identified and studied constrained and accelerated sites, including sites of potential relevance for genome organization and accelerated regions in dogs. We ran custom pipelines to identify dog-specific regions of divergence, termed Canine Accelerated Regions (CARs), on an updated version of the alignment that includes the newest canFam4 dog genome plus five other canid assemblies. The same method was also applied to identify candidate accelerated regions for narwhal and Cairo spiny mouse. These results require further curation: we are now assessing the quality of the alignment and correlating the candidate regions with known variants from the Dog10K project to substantiate our findings.
Recently, we have been updating, improving and performing large-scale multiple sequence alignments to increase the quality of our data. The project has also been used to test theories linked to the “unwanted transcript hypothesis” published in January 2024.
We will continue our work to find accelerated regions in the dog, narwhal and Acomys (Cairo spiny mouse), and to combine our Zoonomia phyloP scores with other projects such as our in-progress canine brain atlas and disease gene mapping. These projects will benefit from improved phyloP scores that include primate constraint. Generating new alignments and constraint scores is compute- and storage-intensive.