Earth Biogenome Project collaborations
Dnr:

NAISS 2024/5-420

Type:

NAISS Medium Compute

Principal Investigator:

Henrik Lantz

Affiliation:

Uppsala universitet

Start Date:

2024-08-30

End Date:

2025-09-01

Primary Classification:

10610: Bioinformatics and Systems Biology (methods development to be 10203)

Secondary Classification:

10203: Bioinformatics (Computational Biology) (applications to be 10610)

Tertiary Classification:

10615: Evolutionary Biology

Abstract

Our team is part of the bioinformatics platform at SciLifeLab (NBIS), and a large part of our work is genome assembly in national and international projects. We assemble complete genomic sequences for organisms where this has not been done before, a task that is beyond most research groups, and we support the research community with this expertise. Our projects cover a wide variety of organisms, including fish, fungi, worms, insects, and mammals.
We will use this compute project to run genome assembly analyses in two funded projects: VR-EBP and Biodiversity Genomics Europe (BGE).
VR-EBP received funding in the 2020 VR call "Increased accessibility to existing infrastructures". The application, titled "A Swedish Earth Biogenome Project platform: building a pipeline and proof of principle studies", is driven by the NBIS and NGI platforms at SciLifeLab together with several researchers in Uppsala and Stockholm. The project also has external partners at whom the "increased accessibility" efforts are aimed, including SVA and the Swedish Agency for Marine and Water Management. The funding covers 4 years and ends at the end of 2024.
BGE is a European project funded through a Horizon Europe call. It consists of two streams, of which we are involved in the European Reference Genome Atlas (ERGA). We are funded (through Uppsala University and SciLifeLab) to assemble genomes of European species, most of which are threatened or found in biodiversity hotspots. BGE runs until the end of 2025.
The work will be performed by staff at the NBIS platform at SciLifeLab. All of the assembled genomes will also be reported to the Earth Biogenome Project and will contribute to the global aim of assembling genomes for all eukaryote species on Earth. The data will mostly be long-read PacBio HiFi data, with Illumina RNA-seq used for annotation.
We are currently working mostly on species from Spain and Slovenia. Of these, the stone crayfish Austropotamobius torrentium is influencing our work considerably: its genome is huge at 16 Gbp, i.e., more than 5x the size of the human genome, which greatly increases our need for compute resources and storage. VR-EBP also has a population genomics component, in which we will work with Swedish samples and use the results to determine population structure and to inform decisions in conservation efforts.
Note: Of importance for this proposal is that we have a deadline on Sep 30, 2024, which is very soon. The deadline is for the EU-funded BGE project, where the whole project at the European level needs to deliver 100 Gbp of assembled genomes by Sep 30. To deliver what we have promised by then, we need compute resources, and we expect to use a large number of compute hours as soon as the data is made available to us.