Identifying the genetic basis of the phenotypic features that define us and differentiate us from other individuals has been of interest for decades. Traits of high interest range from biological sex to striking externally visible features such as hair, eye and skin color. The prediction of phenotypic features from genetic data has applications in various fields, such as medicine, anthropology and forensics.
Despite years of investigation, most traits can still be predicted only with low accuracy. This is a consequence of genetic and environmental contributions interacting in complex and multi-faceted ways to form and modify phenotypic traits. These interactions differ between traits and can result in different levels of heritability, and thus predictability, among traits.
While more and more individual features are being investigated, the ultimate feature that represents us as individuals, and by which we are recognized, is the face. A face consists of many connected and interacting features with different functions, which makes it a particularly complex feature to investigate. Recent studies exploring the link between genotypes and faces are thus very limited and strongly influenced by potential confounders, such as the biogeographic ancestry of individuals.
In this pilot study, we want to conduct initial tests that contribute to the understanding of different phenotypic features, including complex facial features, and of the applicability of trait prediction tools. For this, we want to (1) test and compare different genetic data preprocessing steps using publicly available genetic data from open databases (such as 1000 Genomes Project data). Specifically, we want to test imputation performance on datasets derived from the public data in which missing data have been simulated, as well as on computationally modelled DNA profiles that represent two-person mixtures or contain artificially introduced sequencing errors. (2) We further want to run initial tests comparing different methods (photogrammetry and 3D scanning) for generating 3D models of faces and obtaining quantitative values that describe them. (3) Finally, we want to run preliminary tests of current phenotypic trait prediction models (covering categorical traits, such as hair, eye and skin color, as well as complex traits, such as facial features) using the preprocessed publicly available genetic data.
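To illustrate the kind of imputation test described under (1), the following is a minimal sketch in Python, not the pipeline of this study. It rests on several assumptions: genotypes are simulated toy data (0, 1 or 2 alternate alleles per site) rather than 1000 Genomes Project data, missingness is introduced completely at random, and a naive per-site mode fill stands in as a placeholder for a real imputation tool. All function names are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter

MISSING = None  # marker for a masked (missing) genotype call


def simulate_genotypes(n_samples, n_sites, rng):
    """Draw toy genotypes (0, 1 or 2 alternate alleles) per site.

    Assumption: two independent allele draws per site, i.e. Hardy-Weinberg
    proportions, with an arbitrary allele frequency between 0.05 and 0.5.
    """
    freqs = [rng.uniform(0.05, 0.5) for _ in range(n_sites)]
    return [
        [int(rng.random() < p) + int(rng.random() < p) for p in freqs]
        for _ in range(n_samples)
    ]


def mask_genotypes(genotypes, missing_rate, rng):
    """Return a masked copy and the list of (sample, site) positions masked."""
    masked = [row[:] for row in genotypes]
    positions = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < missing_rate:
                row[j] = MISSING
                positions.append((i, j))
    return masked, positions


def impute_by_site_mode(masked):
    """Placeholder imputation: fill with the most common observed genotype per site."""
    imputed = [row[:] for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not MISSING]
        # Fall back to 0 if every call at this site was masked.
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in imputed:
            if row[j] is MISSING:
                row[j] = fill
    return imputed


def concordance(truth, imputed, positions):
    """Fraction of masked calls whose imputed genotype equals the true one."""
    if not positions:
        return float("nan")
    hits = sum(truth[i][j] == imputed[i][j] for i, j in positions)
    return hits / len(positions)


if __name__ == "__main__":
    rng = random.Random(42)
    truth = simulate_genotypes(n_samples=200, n_sites=50, rng=rng)
    for rate in (0.05, 0.2, 0.5):
        masked, positions = mask_genotypes(truth, rate, rng)
        imputed = impute_by_site_mode(masked)
        score = concordance(truth, imputed, positions)
        print(f"missing rate {rate:.2f}: concordance {score:.3f}")
```

Because the true genotypes are known before masking, concordance at the masked positions gives a direct measure of imputation performance. The same evaluation loop could in principle be reused with a different perturbation step (for example simulated sequencing errors or two-person mixtures) and with an actual imputation tool in place of the mode fill.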