University of Oulu

Impacts of data synthesis : a metric for quantifiable data standards and performances

Author: Chandra, Gunjan¹
Organizations: ¹University of Oulu, Faculty of Information Technology and Electrical Engineering, Computer Science
Format: ebook
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 3.6 MB)
Pages: 67
Persistent link:
Language: English
Published: Oulu : G. Chandra, 2020
Publish Date: 2020-06-30
Thesis type: Master's thesis (tech)
Tutor: Siirtola, Pekka
Reviewer: Siirtola, Pekka; Tamminen, Satu


Publicly shared data could enable a wide range of innovative pedagogical and learning techniques. In healthcare, open data could save lives. Combining medical data with lifestyle information can support further development of current approaches to medical diagnosis and treatment. However, healthcare data contains sensitive information about patients and could therefore lead to harmful consequences if such details reach the wrong hands. Data anonymisation is the standard practice for reducing the risk of disclosure when data is shared publicly, but current anonymisation techniques have failed multiple times in the past. The goal of this study is to evaluate the performance of an emerging practice for data sharing, data synthesis, using a tool called Synthpop. The synthetic data is generated with multiple imputation methods, although applied differently. This study describes and analyses Synthpop by establishing data standards and measuring the impact of the data synthesis process on the utility and quality of the information contained in the data. The analyses reveal that the synthetic data simulates the original data while adequately preserving the utility and quality of its information content.

Copyright information: © Gunjan Chandra, 2020. This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.