Admin | Nov 30, 2023 | 1 min read

Synthetic Data Summit 2023 | Session 3: Applications of Synthetic Data in the Life Sciences Industry I


Title: Applications of Synthetic Data in Supporting Data Products and Analytics for Real World Evidence

Speaker: Leonardo D’Ambrosi, Senior Lead Data Scientist, Bayer

Abstract: Synthetic data generation is an emerging technology that offers several advantages over traditional “real” data. These advantages include preserving privacy, augmenting data to enhance machine learning model accuracy, simulating scenarios for algorithm testing, and even mitigating bias. This presentation will delve into the insights gathered from using synthetic data to facilitate the creation of healthcare data products and analytics in the context of Real World Evidence. It will also discuss the applicability of synthetic data, along with its opportunities and challenges, including resistance to adoption.

Title: Exploring the Potential of Synthetic Data in Clinical Research: Applications, Benefits, and Challenges

Speaker: Jan Seidel, Principal Methodology Statistician, Boehringer Ingelheim Pharma

Abstract: In clinical research, obtaining adequate data quantity and quality is often a challenge. Synthetic data, which possesses the same statistical properties as a specific real patient population, can help address these issues. When used correctly, this type of data can serve as a valid representation of the target population, offering various analytical and scientific benefits. Although data synthesis shares similarities with simulation methods, significant differences indicate that it should be viewed as a complement to, rather than a component of, such traditional techniques. Synthetic data can not only enhance the statistical power of analyses by including additional information but can also enrich specific patient subgroups, for instance, extreme or rare cases. Furthermore, synthetic methods can be employed to extrapolate to similar yet distinct real patient populations. Since no actual patient information is involved, synthetic data can be shared with others, promoting result communication, increasing confidence in findings, and enhancing knowledge gained from clinical analyses. Various concepts and approaches exist for generating this type of data, namely from Bayesian statistics and, more recently, generative machine learning. This talk will provide an overview of both the potential and the pitfalls of synthetic data applications in clinical development.



Following the acquisition of Replica Analytics by Aetion, the generative AI technology previously known as Replica Synthesis is now Aetion® Generate and continues to create privacy-enhancing synthetic data.