
Evaluating the Utility and Privacy of Synthetic Breast Cancer Clinical Trial Data Sets

Written by Admin | Nov 1, 2023

This paper, published in the Journal of Clinical Oncology: Clinical Cancer Informatics, describes a study evaluating synthetic data generation on diverse breast cancer clinical trial datasets. We present a quantitative methodology for evaluating the replicability of analyses using synthetic data. We evaluate two common and defensible privacy metrics: attribution disclosure and membership disclosure. We compare the performance of three types of generative models. The results from replicating eight clinical trial analyses show that generative models can produce datasets with both high utility and high privacy. The study was performed with colleagues at the Ottawa Hospital and collaborators across Canada and the US.