Artificial Intelligence‐Generated Synthetic Data in Healthcare: A False Promise for Underserved Populations?

Stéphanie Baggio

Published online on May 19, 2026

Abstract

Bioethics, EarlyView.

Artificial intelligence (AI)‐generated synthetic data has emerged as a promising solution to address the underrepresentation of underserved populations in medical AI systems. By artificially generating data that mimics real‐world patient information, proponents argue that AI‐generated synthetic data can fill data gaps, improve algorithmic fairness, and mitigate bias without requiring costly data collection or raising privacy concerns. However, in this article, I challenge this optimistic view. I argue that AI‐generated synthetic data may amplify existing biases rather than mitigate them, and complicate informed consent processes in ways that disproportionately harm underserved populations. I examine two critical ethical dimensions: (1) how the opacity inherent in synthetic data generation can reproduce and amplify discriminatory patterns present in the source data, and (2) how AI‐generated synthetic data complicates informed consent. I conclude that while AI‐generated synthetic data offers apparent technical advantages, it fails to address, and may worsen, data disparities affecting underserved populations. Rather than pursuing synthetic solutions, I believe that we should address the structural barriers that prevent genuine inclusion of underserved populations in medical research.