With more than 93,000 species and vast, still undescribed diversity, the molluscan class Gastropoda is one of the most species-rich animal lineages on Earth. Despite its huge economic and ecological importance, the systematics of Gastropoda is in a state of flux, and a robust phylogenetic hypothesis for the class is critically needed to facilitate future research and conservation. I propose to fill this gap by resolving the phylogeny of Gastropoda using a representative dataset of 60 complete genomes and a novel integrative approach. First, we will assemble de novo four genomes representing crucial, currently missing deep lineages of Gastropoda. Traditional phylogenomic approaches rely on sequence comparison and tend to generate conflicting or inaccurate resolutions due to homoplasy. To overcome this limitation, we will use gene order information to complement concatenation- and coalescence-based phylogenies.
The order in which genes are arranged on a chromosome is generally conserved across evolutionary scales, and its modifications at the chromosome and sub-chromosome levels often generate irreversible, phylogenetically informative gene linkages. By tracking the origin of these linkages across the Gastropoda radiation, we will find decisive evidence for a single topology at challenging deep nodes. In doing so, we will be the first to combine dense taxon sampling, complete genome data, and an all-in-one approach to phylogenetic inference in a hyperdiverse yet largely neglected animal lineage.
The project corresponds to the first work package of a planned research programme for which I am applying for VR funding this year. The planned research will be carried out even if the VR application is not funded, as I will seek alternative funding sources. The NAISS project I am applying for will handle the first set of data analyses: annotation of published genomes for which only genomic FASTA files are available at NCBI. It includes computationally intensive tasks that can only be handled efficiently on HPC resources.