22 March 2024

DKK 11 million for developing synthetic health datasets

Data science

A research project with researchers from the University of Copenhagen and Aalborg University has received DKK 11.3 million from the Novo Nordisk Foundation to develop and test methods for creating synthetic health datasets, a promising method that helps protect patient privacy.

Jennifer Bartell and Anders Krogh.
"Our project is about developing and testing methods for creating synthetic health datasets, a promising approach to sharing insights encoded in personal health data while protecting patient privacy," say the researchers behind the project. Photos: UCPH.

The Novo Nordisk Foundation has granted DKK 11.3 million for a project focused on synthetic health datasets, which the researchers hope will accelerate training and prototyping of computational models within health data research. Synthetic health datasets are not actual patient data but non-sensitive, computer-generated data that can be shared more easily and used for testing and developing models.

The project is called "SE3D: Synthetic health data: ethical development and deployment via deep learning approaches" and is a collaboration between the University of Copenhagen and Aalborg University. It is part of the Novo Nordisk Foundation Collaborative Research Programme in data science.

About the project

What is your project about, and why is it important?

Our project is about developing and testing methods for creating synthetic health datasets, a promising approach to sharing insights encoded in personal health data while protecting patient privacy. While we do not expect synthetic datasets to replace sensitive datasets in support of research findings, we hope to facilitate early-stage project design and hypothesis generation, provide methods for a more efficient application process for data access, and accelerate training and prototyping of computational models within health data research.

What can you do with the grant?

The grant supports a four-year collaboration between researchers at Aalborg University and the University of Copenhagen who specialize in biostatistics and precision medicine, generative modeling, and regulatory concerns and the GDPR. We will be hiring several PhD and postdoctoral trainees to develop synthetic datasets and privacy evaluation metrics, assess privacy risk from a regulatory perspective, and perform thorough comparisons of these factors across different benchmark datasets to establish usage guidelines for synthetic health datasets.

What do you hope that your grant will change in the future/for future research?

We hope to provide new computational methods for synthetic dataset creation that better address the challenges of working with real-world health data, are well constructed for GDPR compliance, and produce fit-for-purpose synthetic datasets that are of sufficient quality for meaningful data exploration and model prototyping. We aim to protect patient privacy while realizing the potential of synthetic datasets in supporting data-driven research in the health sector.

Contact

Professor Anders Krogh
anders.krogh@sund.ku.dk

Data Scientist/Special Consultant Jennifer Bartell
bartell@sund.ku.dk

Press Officer Sascha Kael Rasmussen
sascha.kael.rasmussen@sund.ku.dk
+45 93 56 51 68
