De novo assembly of the Nemertoderma westbladi (Xenacoelomorpha) genome

NAISS 2023/22-853


NAISS Small Compute

Principal Investigator:

Ulf Jondelius


Naturhistoriska riksmuseet, Stockholms universitet

Start Date:


End Date:


Primary Classification:

10612: Biological Systematics



This project is supported by a Swedish Research Council (VR) grant to UJ (2018-05191, Urmaskarnas evolution). We intend to assemble the genome of Nemertoderma westbladi from Illumina and PacBio data obtained in 2015 and 2016 at SciLifeLab. At the time, we had difficulties assembling the genome and did not achieve a satisfactory result, in spite of bioinformatics support through a WABI grant. We have been working with SciLifeLab Uppsala to generate PacBio HiFi sequence data from this microscopic species, and we are expecting delivery of these data during September 2021. We intend to apply WGA tools to the combined Illumina and PacBio data to improve on the previous results.

Nemertoderma westbladi is part of an ancient animal group, Xenacoelomorpha, that forms the sister clade of all other bilaterian animals. There is no reference genome for Nemertoderma. An assembly of its genome would enable us to study the genomic basis for the evolution of complex traits such as brain, circulatory, and excretory systems in animals, e.g. through comparative analyses of gene content in Nemertoderma versus other bilaterian animals and cnidarians.

A main aim of the VR project is to generate robust hypotheses regarding the phylogeny, rates of evolution, and clade age of Xenacoelomorpha. Our approach is to sequence transcriptomes from a wide sample of Xenacoelomorpha species and use the data for phylogenomic analysis. The first year of the project has been devoted to generating and refining the transcriptome data. We are now ready to start the phylogenomic analyses, and we foresee an increased need for processor hours, as the parameter-rich nucleotide substitution models implemented in e.g. PhyloBayes are computationally very demanding. Furthermore, we will need to perform a set of sensitivity analyses in which the dataset is modified in various ways and then reanalysed, which multiplies the resource requirements.