MITRE-Created Synthea™ Designated a "Digital Public Good"

By Mike Murphy

Originally designed to meet a pressing need within the health information technology community, MITRE's Synthea produces realistic, artificial patient data to power medical research and innovation. It's now been honored for its contributions to creating a more equitable world.

Artificial patient data gives medical communities realistic data with which to innovate for better patient outcomes. The Digital Public Goods Alliance (DPGA) recognized Synthea™'s public interest impact and added it to its DPG Registry. The DPGA’s mission is to promote digital public goods to create a more equitable world.

Synthea was developed in 2018 as a MITRE independent research and development (IR&D) project to solve an issue plaguing the health IT community: the lack of realistic patient data to drive innovation. Synthea produces realistic, but not real, patient population data sets for testing and research.

Governed by an international board including the United Nations Development Programme and UNICEF, the DPGA works to accelerate the attainment of the Sustainable Development Goals in low- and middle-income countries by facilitating the discovery, development, use of, and investment in digital public goods. The digital public good designation confirms that Synthea meets the DPG Standard, which includes open source principles and best practices, and recognizes its potential to tackle global challenges.

“We’re honored by the recognition that the Digital Public Goods Alliance has bestowed upon Synthea,” says MITRE’s Chief Medical and Technology Officer, Dr. Jay Schnitzer. “Creating this free, publicly available, and open source synthetic patient data generator supporting medical research and innovation underscores our commitment to protect and promote the well-being of the nation.”

“Synthea doesn’t produce deidentified health data,” says Jason Walonoski, MITRE’s senior principal, Health and Life Sciences, and co-creator of Synthea. “Instead, Synthea uses population and health statistics to create a realistic, but absolutely not real, population data set from the ground up.

“There is no risk of data spillage or privacy concerns with this method because as ‘real’ as these records appear, they’re created from a model that researchers can adjust to meet their specific needs.”

Open Source Synthea Embraced and Advanced by Broad Health Community

In 2020, the Veterans Health Administration Innovation Ecosystem and the Food and Drug Administration used the Synthea Patient Generator to power the COVID-19 Risk Factor Modeling Challenge. The precisionFDA Challenge leveraged 147,000 synthetic patients “to develop and evaluate computational models to predict COVID-19 related health outcomes in Veterans.” Subsequently, the U.S. Department of Veterans Affairs designated Synthea a Technical Reference Model.

In 2021, the Synthetic Health Data Challenge from the Department of Health and Human Services’ Office of the National Coordinator for Health Information Technology awarded $100,000 to teams that developed “novel solutions to further cultivate Synthea capabilities and the synthetic data it generates for healthcare and research purposes.”

Independent R&D Addresses Gaps in Research

Gartner estimates that by “2030, synthetic data will completely overshadow real data in AI models.”

Walonoski foresees a similar trend given “synthetic data has been shown to improve predictive models by providing supplemental training data for scarce data sets, such as modeling a rare health condition.

“But synthetic data can also be used where it is important not to embed individual information in the model, such as personal health data. Synthetic health data is getting better and better, and we’ll increasingly see these kinds of use cases.”

MITRE’s IR&D experts work with government agencies to identify their hardest problems that could be solved through R&D. Their research often focuses on gaps that no one else is filling. The IR&D program develops new ways to use technology, creating tools, processes, and approaches that help agencies carry out their missions and ensure a safer world.
