For Patient Data, Synthea Is the "Missing Piece" in Health IT

MITRE-designed Synthea™ gives medical communities artificial yet realistic patient data and tools to innovate for better outcomes, including combating COVID-19.

Electronic health records (EHRs) hold great promise for helping researchers understand diseases and improve health in general. However, necessary privacy regulations and security concerns make it hard for software developers to obtain real patient data. Researchers and medical professionals need this data to evaluate new treatment models or to develop clinical decision-support tools that improve medical results.

That's why MITRE created Synthea™. A contraction of "synthetic" and "health," Synthea is an open-source data generation and disease modeling tool. It fills a critical gap by giving medical communities the data they need for discovery, analysis, and development while meeting patient record privacy and security requirements. Synthea does this by generating a synthetic population of mock patients. The "histories" of these synthetic patients are based on clinical care data, academic research, demographic data, and disease incidence and prevalence statistics.
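
To make the idea of a statistics-driven synthetic population concrete, here is a minimal Python sketch. It is not Synthea's implementation: the age bands, prevalence figures, and every name in it (SyntheticPatient, generate_population, ASSUMED_PREVALENCE) are hypothetical and chosen only for illustration. It shows mock patients being drawn so that, in aggregate, they roughly match the statistics fed in.

```python
import random
from dataclasses import dataclass

# Illustrative numbers only: NOT real statistics and NOT Synthea's inputs.
AGE_BANDS = [((0, 17), 0.22), ((18, 64), 0.62), ((65, 90), 0.16)]
ASSUMED_PREVALENCE = {"hypertension": 0.30, "asthma": 0.08}


@dataclass
class SyntheticPatient:
    patient_id: int
    age: int
    sex: str
    conditions: list


def generate_population(size, seed=0):
    """Draw mock patients whose traits follow the assumed distributions."""
    rng = random.Random(seed)  # seeded so repeated runs give the same people
    bands = [band for band, _ in AGE_BANDS]
    weights = [weight for _, weight in AGE_BANDS]
    population = []
    for patient_id in range(size):
        # Pick an age band by its assumed share, then an age inside the band.
        low, high = rng.choices(bands, weights=weights, k=1)[0]
        age = rng.randint(low, high)
        sex = rng.choice(["female", "male"])
        # Assign each condition independently with its assumed prevalence.
        conditions = [
            name for name, p in ASSUMED_PREVALENCE.items() if rng.random() < p
        ]
        population.append(SyntheticPatient(patient_id, age, sex, conditions))
    return population


if __name__ == "__main__":
    people = generate_population(1000, seed=42)
    share = sum("hypertension" in p.conditions for p in people) / len(people)
    print(f"Share with hypertension: {share:.2f}")
```

No record in such a population belongs to a real person, yet the group as a whole tracks the input statistics, which is why this kind of data can be shared freely.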

Synthea provides developers and researchers with usable, realistic data and human disease models, helping them advance their work toward treatments and cures for diseases.

Responding to COVID-19

Developers and researchers are employing Synthea in their efforts to address the COVID-19 epidemic.

"Members of the Synthea community clamored for it, so we built a COVID-19 module to generate data they can work with to respond to the virus," reports Jay Walonoski, who leads the Synthea team.

Synthea data is being used in COVID-19 virtual challenge hackathons across the country, including one held by the Massachusetts Institute of Technology. Another project uses Synthea data to model the progression of COVID-19 and track patients.

Open-source software makes it easy for community members to use each other's contributions, and the realistic synthetic data provided by Synthea fills the gap in access to real health records.

"We're making Synthea available for people who are trying to create health IT solutions to the epidemic," Walonoski says.

Enabling the Sharing of Health Records

One crucial element in achieving EHR interoperability is a standardized application programming interface (API). An API is a software intermediary that allows applications to interact with each other and exchange information.

"That's where we come in, from a testing perspective," says MITRE's Rob Scanlon, a software systems engineer who is using Synthea data to build healthcare API certification systems. "Synthea generates the test data that developers need to build and test APIs.

"If you're building a system you need to have test data, but patient data is very complex," he explains. "It's not something a developer can sit down and mock up in a few hours, given the work required and the privacy and security concerns. It helps if they have a tool that can generate test data for them.

"Synthea generates data in a standard way. That's driving developer interest in using it, because right out of the box they get the data they need."
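
As a rough illustration of what testing an API against standard-format mock data can look like, consider the Python sketch below. Synthea can export records in HL7 FHIR format, and the mock record here is shaped like a minimal FHIR Patient resource. Everything else is an assumption made for the example: the helper names, the in-memory stand-in for an API endpoint, and the list of fields the check expects are hypothetical and do not describe MITRE's certification systems.

```python
import json

# Fields this hypothetical test expects in every response. This list is an
# assumption for the example, not a statement of what FHIR itself requires.
EXPECTED_FIELDS = ("resourceType", "id", "gender", "birthDate")


def make_mock_patient(patient_id):
    """Return a mock record shaped like a minimal FHIR Patient resource."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "name": [{"family": "Example", "given": ["Pat"]}],
        "gender": "female",
        "birthDate": "1980-04-12",
    }


def fake_api_get_patient(store, patient_id):
    """Hypothetical stand-in for an API endpoint; returns a JSON string."""
    return json.dumps(store[patient_id])


def check_patient_response(body):
    """Return a list of problems found in an API response body."""
    record = json.loads(body)
    problems = [f"missing {field}" for field in EXPECTED_FIELDS if field not in record]
    if record.get("resourceType") != "Patient":
        problems.append("resourceType is not Patient")
    return problems


if __name__ == "__main__":
    store = {"p1": make_mock_patient("p1")}
    print(check_patient_response(fake_api_get_patient(store, "p1")))
```

Because the mock records follow a shared standard, the same check can run against any system that claims to support it, with no real patient data involved.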

Providing the "Missing Piece for Health IT"

"The question among health startups has always been, 'Where can we get high-quality data to understand how to create applications and test them once they are created?'" says Sunnie Southern, vice president of Health and Life Sciences for Onix.

Google Cloud asked Onix to use the Synthea platform to build tools that help startups, developers, and data scientists take advantage of Google's Cloud Healthcare API.

"It's an ideal use case," she says. "You have a platform that needs to be configured so that developers can leverage that platform with clinical data—but getting access to that data is difficult and takes time and approvals. Synthea is helping us to overcome those obstacles."

EHR interoperability also requires connectors between different systems, she notes. Having realistic data lets developers who work across systems set up and test their tools.

Southern concludes: "Synthea has been the missing piece for health IT since health IT began."

Using Synthea within the Federal Health Sphere

Chris Brossart, MITRE's Health Transformation Portfolio Director, says agencies within the Department of Health and Human Services (HHS) are starting to use Synthea.

For example, the Centers for Disease Control and Prevention (CDC) has adopted Synthea for one of its key initiatives.

The CDC is on a mission to improve data capacity and address childhood obesity more effectively through its Childhood Obesity Data Initiative (CODI). To do this, the agency needs data to understand the problem and arrive at potential solutions.

MITRE's Andy Gregorowicz enhanced Synthea to create a data set that reflects pediatric growth curves. In the first pilot, he created representative data for children ages 2 to 19 in the Denver metropolitan area.

"We then created simulations of what happens when kids go into—or don't go into—clinical weight-loss programs, including the range of successful and unsuccessful outcomes," he says.

"Now, we have data that supports obesity researchers. This provides a starting point for them to develop research questions," he adds.

"When ready, researchers will take the work refined on synthetic data and run it using real-world data to develop evidence-based recommendations for interventions and programs that can help children with overweight or obesity."

"Medicare and Medicaid alone have hundreds of different types of IT systems," Brossart observes. "HHS needs good-quality data that isn't patient-sensitive for testing their applications and services.

"Synthea fits that need."

—by Jim Chido