Electronic Medical Records can provide valuable insight into patient health risk. What can be done to increase their accuracy as a risk prediction tool?

The healthcare system in the United States is notoriously fragmented. Information sharing across care teams has long been hindered by the historical reliance on paper records, the need to protect patient information, and the disconnect between specializations despite the holistic nature of the body.

Electronic Medical Records (EMRs) emerged as an attempt to resolve these issues, and their adoption accelerated under the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009. By maintaining a holistic account of an individual’s medical history across time, EMRs allow care teams to better interoperate and provide whole-person care rather than the fragmented, short-term care that characterizes many patient experiences.

Limitless Potential and Countless Hurdles

As the use of EMRs grows, new applications are emerging. EMR data may now be used to predict a patient’s risk of developing future medical conditions.

Risk prediction is based on analysis by artificial intelligence (AI) algorithms that use a person’s medical history to calculate their probability of developing a condition. High risk scores are assigned to individuals whom the AI determines to have a likelihood of developing an illness above a particular threshold. In turn, care teams can direct early prevention measures toward those in high risk groups to slow the development of their condition.
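In code, this thresholding step is straightforward. The sketch below is purely illustrative: the patient identifiers, the predicted probabilities, and the 0.2 cutoff are all made-up assumptions, not values from any real system.

```python
# Minimal sketch: flag patients whose predicted probability of developing
# a condition exceeds a chosen threshold. All values are hypothetical.

def stratify(risk_scores, threshold=0.2):
    """Return a high-risk flag for each patient based on the threshold."""
    return {pid: score >= threshold for pid, score in risk_scores.items()}

scores = {"patient_a": 0.35, "patient_b": 0.08, "patient_c": 0.21}
flags = stratify(scores)
# patient_a and patient_c would be directed to early-prevention programs.
```

In practice the threshold itself is a policy choice: lowering it enrolls more patients in prevention programs at higher cost, while raising it risks missing people who would benefit.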

However, turning the EMR into a tool for reliable risk prediction is proving to be a significant challenge.

The data governance regulations that burdened providers during the paper era became even more complex with the switch to electronic records.¹ In addition, poor user design has made data entry a burden, even compared to paper records, while structured templates limit a provider’s ability to capture a patient’s full, unique experience.

From a technical perspective, incomplete records, mixes of structured and unstructured data, and data integrity issues make it difficult to depend on EMRs for reliable risk stratification, prediction, and decision making.² Even when these issues are resolved, the challenge of integrating the data in one place for meaningful AI analysis remains.

Nevertheless, EMRs are a source of unprecedented scale and scope of patient data with the potential to help providers deliver value-based care.³

Historical Bias in Today’s Data Models

When used effectively, risk-based early prevention interventions can substantially reduce care costs by identifying who would benefit most from interventions and by reducing the prevalence and severity of poor health outcomes. But the ways risks are determined and assigned are not always reliable or equitable.

AI tools use historical data to measure trends. They apply those patterns to identify the probability of a similar outcome in individuals with comparable indicators. To do this, the tools rely on proxies for medical conditions rather than the conditions themselves.

Algorithms don’t identify medical conditions in the same ways doctors do. While doctors have tests for identifying physiological abnormalities, algorithms have numeric representations of those abnormalities. They therefore identify the condition using data, like the prevalence of billing codes, that represent a diagnosis.

Consider a stroke risk model that relies on the prevalence of stroke billing codes as a proxy for the condition. It measures the probability of stroke by aggregating data from multiple sources and using the patterns that emerge to identify who is likely to have a stroke billing code associated with them in the future. Since a billing code usually goes hand in hand with a real diagnosis, it is taken as a reliable way to predict the probability of the condition. But the algorithm predicts the probability of a billing code for a stroke, not the stroke itself.
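The proxy step can be made concrete with a small sketch. The patient records below are fabricated; the only grounded detail is that ICD-10 codes beginning with I63 denote cerebral infarction (stroke).

```python
# Illustrative only: deriving a proxy outcome label from billing codes.
# The records are fabricated; I63.* is the ICD-10 family for cerebral infarction.

records = [
    {"patient": "a", "codes": ["E11.9", "I10"]},    # diabetes, hypertension
    {"patient": "b", "codes": ["I63.9"]},           # cerebral infarction
    {"patient": "c", "codes": ["I10", "I63.50"]},   # hypertension and stroke
]

def has_stroke_code(record):
    # Note what this label actually means: "a stroke billing code appeared
    # in the record", not "a stroke occurred".
    return any(code.startswith("I63") for code in record["codes"])

labels = {r["patient"]: has_stroke_code(r) for r in records}
```

Everything the model learns downstream is anchored to this label, so any gap between "code was billed" and "condition occurred" propagates into every prediction.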

Understanding what the algorithms predict, or the criteria used to classify ‘risk’, is critical for deriving meaningful risk stratification.

Relying on historical data often produces algorithmic outcomes imbued with the systemic biases embedded in that data. These effects are either minimized or amplified by the criteria used to define ‘risk’.⁴ For example, bias emerges when the cost of care is used to determine health risk: acute and severe medical conditions cost the healthcare system more, so some prediction algorithms use predicted future costs to determine risk.

In other words, the model uses the probability of incurring high medical expenses as a proxy for the probability of a medical condition emerging or worsening. Those predicted to require more spending receive higher risk scores. However, the healthcare system has historically spent less money treating marginalized groups, not because they require less care, but because of systemic bias. As a result, algorithms that rely on future cost as a proxy for risk assign marginalized patients lower risk scores than their privileged counterparts, causing them to miss out on potentially lifesaving early preventive interventions.

Minimizing Harm, Optimizing Solutions

Audits are one way to reduce the potential harm of biased AI outputs.

Regular audits of algorithmic outputs can ensure that AI tools measure what they are intended to measure and that the resulting health risk stratification does not reproduce bias. When bias does emerge in the outputs, it can be caught early and corrected before causing significant harm.

When Ziad Obermeyer and colleagues uncovered the biased outputs produced by using future cost as a proxy for health, they adjusted the criteria used to determine risk and saw an 84% reduction in output bias.⁴ Instead of using future cost alone, they created an index variable that combines cost predictions with health predictions. This tweak alone improved the equitability of AI recommendations, directing early prevention interventions where they are truly needed and reducing costs over the long term.
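The shape of that kind of label change can be sketched in a few lines. To be clear, this is not the actual formulation from the study: the equal weighting, the normalization to [0, 1], and the input names are all assumptions made for illustration.

```python
# Sketch of blending a cost prediction with a direct health prediction
# (e.g. a count of active chronic conditions, normalized to [0, 1]).
# The 50/50 weighting and the inputs are illustrative assumptions.

def composite_risk(cost_pred, health_pred, weight=0.5):
    """Blend normalized cost and health predictions into one index score."""
    return weight * cost_pred + (1 - weight) * health_pred

# A patient with low predicted spending but poor predicted health
# still receives a meaningful risk score.
score = composite_risk(cost_pred=0.2, health_pred=0.8)
```

Under a cost-only label this patient would have scored 0.2 and likely been passed over; the blended index lifts them into consideration for early intervention.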

Another way to improve risk stratification is to combine EMR data with behavioral, environmental, and patient-reported data.³ Each of these sources provides information about factors that significantly affect people’s health and reduces dependency on the healthcare system’s biased historical data.
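At its simplest, combining these sources means merging per-patient features into a single record before prediction. The sketch below is hypothetical throughout: the field names, values, and three-source split are assumptions, and real integration involves far harder problems of identity matching and data quality.

```python
# Illustrative merge of EMR-derived features with patient-reported and
# environmental data ahead of risk prediction. All fields are hypothetical.

emr_features = {"patient_1": {"a1c": 7.2, "bmi": 31.0}}
reported = {"patient_1": {"smoker": False, "exercise_days_per_week": 2}}
environment = {"patient_1": {"air_quality_index": 110}}

def merge_features(*sources):
    """Combine per-patient feature dicts from multiple data sources."""
    merged = {}
    for source in sources:
        for patient, feats in source.items():
            merged.setdefault(patient, {}).update(feats)
    return merged

features = merge_features(emr_features, reported, environment)
# features["patient_1"] now holds all three kinds of signals in one record.
```

The later sources win on key collisions here, which is a deliberate simplification; a real pipeline would reconcile conflicts explicitly.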


Despite the challenges of data integration and analysis, EMRs facilitate interoperability among care providers by maintaining a shareable, secure record of a patient’s complete medical history. Beyond that, EMR data can be used to determine genuine health risks. To do so, the AI tools that derive predictions from EMR data must be trained and monitored so they don’t reproduce harmful bias, and a key part of that work is choosing the appropriate criteria for classifying risk.