Boreal forests are the largest terrestrial biome and store more carbon than temperate and tropical forests combined. Swedish boreal forests are dominated by Norway spruce and Scots pine. Any change to the productivity and functioning of this biome could therefore have global-scale consequences.
Norway spruce and Scots pine are, by far, the two most economically and ecologically dominant species in Sweden, but reference genomes have been lacking because of the challenges of assembling their extremely large genomes (20-25 gigabases, 7-8 times the size of the human genome), which are composed predominantly of repetitive elements.
This application concerns a 6-year, 80 MSEK KAW strategic investment project at Umeå Plant Science Centre (UPSC) and Science for Life Laboratory (SciLifeLab) to maintain Sweden’s world-leading position in conifer genomics research and to facilitate new genomics-based breeding strategies that meet future demands for efficient and sustainable forestry.
Using previous SNIC resources, we have now generated a high-quality (N50 > 10 Mb), chromosome-scale scaffolded assembly of the Norway spruce genome. Within the next few months we expect to finish a genome assembly of similar quality for Scots pine. A major undertaking of our study will now be to generate population-scale re-sequencing data. These data will allow modern genomics-based breeding strategies to be developed and incorporated into the unique Swedish breeding programs, both through genomic selection and through genome-wide association studies that identify adaptive alleles and associated genes. Sweden has unique plant material and controlled crosses for the two dominant boreal tree species of interest.
For Norway spruce, large genetic resources are available, including 650 field progeny trials. Similarly, an intensive breeding program has been initiated for Scots pine with an initial collection of about 6000 elite trees from across Europe and the development of 24 breeding populations targeting climate zones across Sweden.
In the ongoing phase of our project, for which we are now applying for a medium compute allocation, we are performing whole-genome re-sequencing of a total of 700 spruce and 300 pine trees to (a) infer the genetic basis of important conifer traits by genome-wide association studies, (b) perform population genomics studies, and (c) design high-quality genotyping arrays for genotyping the entire breeding populations of Norway spruce and Scots pine in Sweden and develop models to shorten breeding cycles by genomic selection.
This will create an unprecedentedly detailed map of the genetic landscape in all Swedish conifer forests, which will aid conservation biology and provide the basis for a better understanding of the links between genotype and phenotype. We have so far generated whole-genome sequence data from the first 415 trees, and these data have been pre-processed and quality-controlled. In parallel, we have established efficient pipelines for the downstream full-scale population analyses of both species using the final genome assemblies, which will be conducted during 2021 and 2022.