Biobank Data Anonymization and Analysis




Principal Investigator:

Beatrice Melin


Umeå universitet

Start Date:


End Date:


Primary Classification:

30599: Other Medical and Health Sciences not elsewhere specified



With a focus on understanding disease development, identifying early detection biomarkers, and investigating comorbidities, PREDICT aims to foster scientific collaboration, establish a robust platform for precision medicine, and pave the way for early disease risk identification and improved treatments. At the heart of PREDICT lie two key multidisciplinary data processing projects:

1. Anonymization of Data in Precision Medicine Research: Sensitive personal data generated from human biobank samples and medical records poses a significant challenge because of the stringent regulations governing data protection, particularly the General Data Protection Regulation (GDPR). To address this challenge, PREDICT's "Anonymization of Data in Precision Medicine Research" project aims to develop advanced machine-learning methods for anonymizing biobank data while maintaining data quality and minimizing the risk of re-identification. This will enable unrestricted sharing and reuse of biobank data, maximizing its research potential.

2. Fast, Low-Distortion Interactive Visualization of Precision Medicine Data: Current visualization methods for hypothesis generation and testing often rely on algorithms that distort the data, which hinders reliable online interactive data exploration. To address this limitation, PREDICT's "Fast, Low-Distortion Interactive Visualization of Precision Medicine Data" project focuses on developing fast, reliable, and intuitive visualization models that allow researchers to explore biobank data without tedious data movement, storage, and modeling. This will make biobank data accessible to a broader research community, facilitating the identification of significant structural features and enabling hypothesis-driven biomarker discovery.

To accelerate the training and evaluation of machine-learning models for both the anonymization and visualization tasks, PREDICT intends to use the NAISS SENS Small supercomputer, which will make it possible to handle larger training datasets and more complex models with higher parameter counts. This will strengthen the project's capabilities and speed up progress in precision medicine research.