Part 1: Ensuring Data Quality for HIE Querying

In this blog post, we’ll cover the quality processes that occur when patient data is retrieved from Health Information Exchanges (HIEs). This is the first post of a two-part blog series. Part two will discuss the quality processes that occur after data retrieval.

In the landscape of data-driven decision-making, the journey begins long before analysts delve into complex algorithms or impressive visualizations. It starts at the very foundation: data retrieval. Picture this stage as the excavation site where raw material is unearthed, awaiting refinement into a valuable commodity. But just as miners sift through tons of ore to find precious gems, data retrieval involves meticulous processes to ensure the purity and integrity of the information extracted.

As a company, our mission is to unlock the power of medical records in an intelligent platform that focuses health back on the patient. To realize this vision, we’ve made substantial investments in our data quality at each stage: retrieval, processing, and output.

Geographic and Facility Coverage
We’ll start with where our data comes from. Currently, we’re connected to the three national networks (Carequality, CommonWell, and eHealth Exchange). We recently expanded our coverage to include Healthix, the public HIE for downstate New York, after we noticed that some downstate New York facilities were not included in our coverage via the national networks. Over time, we will likely expand to include additional state HIEs to address critical gaps in geographic coverage. This wide net allows us to give healthcare providers a more complete view of a patient's medical history, regardless of where they received care.

Streamlined Data Retrieval with Record Locator Service
One of the core components of our data quality strategy is our proprietary Record Locator Service (RLS), which allows us to pinpoint the exact location of patient records across various networks. We invested heavily in this area to ensure we maximize our ability to strategically locate records, while being careful not to over-query the networks. In evaluations with customers, we typically capture about 10-15% more data than traditional solutions.

Enhanced Patient Identification with EMPI
The Enterprise Master Patient Index (EMPI) is a key service we use as part of the query process. By accurately identifying patients, the EMPI prevents duplicate records and ensures that all data retrieved is associated with the correct individual. This is essential for both the integrity of the data and the safety of patient care.

Address Verification
Accurate data starts with accurate inputs. We use a service that helps us verify and normalize patients' addresses. This ensures that all location data we collect and store is precise, which is especially important when tracking patient information across multiple care settings.
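
To make the idea of normalization concrete, here is a deliberately simplified sketch in Python. It is not our production service or any vendor's API: the function name, the abbreviation table, and the rules are illustrative assumptions, and real address verification also checks the address against postal reference data.

```python
import re

# Hypothetical, tiny subset of common street-suffix abbreviations.
SUFFIX_ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "BOULEVARD": "BLVD",
    "ROAD": "RD",
    "DRIVE": "DR",
}


def normalize_street_line(line: str) -> str:
    """Return a canonical form of a street line for comparison purposes.

    Upper-cases the text, strips punctuation other than '#' and '-',
    collapses whitespace, and abbreviates the suffixes listed above.
    """
    cleaned = re.sub(r"[^\w\s#-]", "", line.upper())
    tokens = cleaned.split()
    return " ".join(SUFFIX_ABBREVIATIONS.get(token, token) for token in tokens)
```

With this sketch, "123  Main Street." and "123 main st" both normalize to the same string, so two spellings of one address compare as equal. A toy rule set like this will mishandle edge cases (for example, a street actually named "Drive"), which is one reason to rely on a dedicated verification service.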

Record Validation
Historically, we’ve observed cases where the networks returned records that did not belong to the requested patient. This poses a risk to patient safety and confidentiality. As a result, we conduct an additional check to verify that the patient demographics supplied by the customer match the patient demographics in the clinical documents obtained from the EMRs.
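
A check of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our actual matching logic: the `Demographics` fields, the normalization rule, and the exact-match policy are all hypothetical, and a real check would likely compare more fields and tolerate more variation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Demographics:
    # Hypothetical minimal set of fields used for the comparison.
    first_name: str
    last_name: str
    date_of_birth: date


def _normalize_name(name: str) -> str:
    # Case-fold and keep letters only, so "O'Brien" and "OBRIEN " compare equal.
    return "".join(ch for ch in name.casefold() if ch.isalpha())


def demographics_match(requested: Demographics, returned: Demographics) -> bool:
    """Return True only if the returned document appears to describe the requested patient."""
    return (
        requested.date_of_birth == returned.date_of_birth
        and _normalize_name(requested.last_name) == _normalize_name(returned.last_name)
        and _normalize_name(requested.first_name) == _normalize_name(returned.first_name)
    )
```

In a pipeline built this way, a document whose demographics fail the check would be held back rather than delivered, since passing along another patient's record is worse than returning nothing.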

Network Health Monitoring
To ensure that we’re always getting all the necessary data, we closely monitor network error rates. We carefully tune our retries and timeouts to ensure we get all the data we can. We also regularly communicate with other network implementers when we see issues.
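
For readers curious what tuned retries can look like, below is a generic sketch of retrying with capped exponential backoff and jitter. The function, the `TransientNetworkError` type, and the default values are assumptions made for illustration; they are not our production settings or any network's API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientNetworkError(Exception):
    """Hypothetical error type for failures worth retrying, such as a timeout."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    max_delay_s: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying transient failures up to `max_attempts` times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the failure to the caller.
            # Exponential backoff capped at max_delay_s, with random jitter
            # so that many clients do not retry at the same instant.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

The cap and the attempt limit are the tuning knobs: larger values recover more data from a slow responder, but they also add load to the networks and delay the overall result, so the right values depend on observed error rates.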

Prompt Adoption of New Network APIs
To get the most from the networks, we participate in their committees and adopt new APIs promptly after release.

Robust Logging & Metrics
A data quality program encompasses more than just ensuring the integrity of data; it also involves promptly addressing concerns from customers and other network participants, as well as troubleshooting issues to enhance service quality. We also leverage internal dashboards and other tools that allow us to better understand performance metrics, which ultimately helps ensure good service.

Getting As Much Data As Possible
We tune our timeouts and retries to ensure we get as much data as possible.

Although not directly related to data quality, availability is critically important to our customers, given that we retrieve tens of millions of files each month. Our system encounters various querying patterns, ranging from one patient at a time to batches of several hundred patients or more, particularly when customers seek information on patients with upcoming appointments. Our event-based architecture lets us scale to accommodate spikes in query volume without compromising processing speed.

When discussing data and data quality, it’s always important to note limitations. Here are a few:

  • Although the overwhelming majority of providers are part of health networks, not all are yet, and not all EHRs support connectivity. This means we can’t always access every piece of data through our HIE connections.
  • Some data exists in less accessible formats like PDF or TIFF files, which can be challenging to integrate seamlessly.
  • Mobility of populations also poses a significant challenge, as patients moving from one location to another may have fragmented health records across various systems. Data on the networks is indexed, meaning you have to know exactly where to look to successfully find data. Since we can’t query every provider for each patient due to network limitations, this may cause gaps, even with our sophisticated Record Locator Service.

In conclusion, maintaining high-quality data when retrieving records from health information exchanges involves a combination of advanced technology, careful monitoring, and constant adaptation to new challenges.

Stay tuned for our next post to learn more about the quality measures we take once data has been retrieved.