Why the FDA’s new real-world evidence guidance ends the era of structured-data-only submissions

Friday, June 19, 2026 · David Talby

On February 17, 2026, the FDA’s final guidance on the use of real-world evidence to support regulatory decision-making for medical devices became operational. It asks sponsors to demonstrate that their real-world data is relevant, reliable, complete and traceable for every clinical fact, rather than for each dataset as a whole. The first wave of submissions under the new rules is now landing at the agency, and a structural problem with how most secondary-use clinical data is built today is about to become visible.

The premise behind those pipelines is that structured electronic health record (EHR) fields plus claims data offer a defensible foundation for evidence. Those sources are easier to extract and standardize, and they map cleanly to common data models like OMOP. The implicit assumption is that what is missing from the structured fields is either marginal or available somewhere else. The peer-reviewed record says otherwise.

The clinical signal that matters lives in text

Across condition areas where regulatory submissions depend on completeness, structured fields capture a small fraction of what clinicians have documented.

Social determinants of health are the starkest case. A 2024 study in npj Digital Medicine compared natural language processing on clinical notes against ICD-10 Z-codes for the same patients: NLP identified adverse SDoH in 93.8% of patients, while the structured codes identified 2.0%. For a regulatory question about outcomes by housing, food or transportation security, structured data is not a partial view. It is effectively absent.

Family history follows a similar shape. A 2015 study in the AMIA Annual Symposium Proceedings found specified family history in 58.7% of neurology admission notes against 5.2% in the structured record, a gap of more than elevenfold. Any genetics-aware risk model that draws only from structured fields operates without most of its predictive signal.
In oncology, the data that drives staging, therapy and outcomes lives in pathology reports and clinic notes rather than discrete fields. A 2022 study in the Journal of Medical Internet Research reported 93.5–97.6% accuracy for cancer site and histology extracted directly from free-text pathology reports. Without that extraction, the structured oncology record on its own is too incomplete to defend cancer registry and external-control-arm work.

For diagnoses more generally, a 2021 audit in the International Journal of Medical Informatics found that nearly 40% of important inpatient diagnoses appeared only in free-text notes and never reached the structured problem list.

A 2025 study presented at the PHUSE/FDA Computational Science Symposium reported that observed suicidality and self-harm events doubled once unstructured EHR data was added to the surveillance window. This is consistent with earlier work showing that only about 3% of suicidal ideation events and 19% of suicide-attempt events documented in notes carry corresponding ICD codes. For pharmacovigilance and safety analyses, the gap is the difference between detecting a signal and missing it.

And what is captured is noisier than it looks

Treating the structured record as ground truth understates a second problem: the codes that are present are frequently wrong. A 2017 simulation study in the AMIA Annual Symposium Proceedings found that just over half of entered diagnosis codes were appropriate for the clinical scenario, and about a quarter of the codes expected from the chart were omitted entirely. A 2022 study in the Annals of Translational Medicine reported an average of 4.9 medication discrepancies per patient, with more than 90% of patients carrying at least one. And the CDC has documented that about one in five new prescriptions is never filled, and roughly half of those filled are taken incorrectly. The structured layer is not only thin.
It is also unreliable in ways that propagate silently into derived measures. This brings the discussion to the most uncomfortable finding.

Completeness changes the answer, not just the coverage

A 2018 study in the American Journal of Managed Care computed Charlson comorbidity scores (a widely used mortality-prediction index) from two sources for the same patients: from free-text clinical notes and from the structured problem list. The version computed from the notes predicted long-term mortality. The version computed from the structured record did not. The math was identical. The data layer changed which conclusions were valid.

This is the pattern the new FDA guidance is responding to. The agency’s relevance-and-reliability framework cares less about volume than about accuracy. The clinical facts in a submission have to accurately represent what happened to the patient, and critical information cannot be systematically missing. A submission whose underlying measure is built on the structured-only Charlson is, by the agency’s own framework, not fit for the regulatory question it is being used to answer.

What this means for the architecture, not just the dataset

The implication runs deeper than “add NLP to your pipeline.” It changes the unit of work. Under the new guidance, the question is no longer “is this dataset complete enough?” but “is this fact about this patient accurate, and where did it come from?” Every clinical assertion in a real-world evidence submission has to be treatable as a claim: sourced, dated, contextualized, scored for confidence and reconcilable when sources disagree.

That has architectural consequences. It means ingesting and parsing every modality losslessly, including text, FHIR, HL7, DICOM and PDFs, without throwing away the original. It means extraction with healthcare-specific language models that handle negation, assertion status, temporality and clinical context. It means terminology mapping that survives audit.
It means a reconciliation layer that knows what to do when the chart says 80 mg and the pharmacy feed says 40 mg, and that surfaces the conflict rather than silently picking one.

None of that is exotic engineering. But it is incompatible with pipelines whose first design assumption was that structured fields would carry the load. Sponsors operating under the new guidance will need to rebuild that assumption from the ground up. Capturing the right data is the easier part. Proving you captured it correctly, fact by fact, is the harder one. The new guidance treats both as requirements, not options.

This article is published as part of the Foundry Expert Contributor Network.
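
To make the claim-as-unit-of-work idea from the architecture discussion concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from the FDA guidance or from any particular product: the `Claim` and `Reconciled` types, the `reconcile` function, every field name, and the sample records (including the `example_drug_daily_dose` concept) are hypothetical. It shows one way each assertion could carry its source, date, assertion status and confidence, and how a reconciliation step might flag the 80 mg versus 40 mg disagreement instead of resolving it silently.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """One clinical assertion plus its provenance (all field names are hypothetical)."""
    patient_id: str
    concept: str        # mapped concept label or terminology code
    value: str          # normalized value, e.g. "80 mg"
    source: str         # where the assertion came from, e.g. "chart_note"
    recorded_on: date   # when the source documented it
    assertion: str      # e.g. "present", "negated", "historical"
    confidence: float   # extraction confidence in [0, 1]


@dataclass
class Reconciled:
    patient_id: str
    concept: str
    status: str         # "agreed" or "conflict"
    claims: list        # every contributing claim is kept for audit


def reconcile(claims):
    """Group claims by patient and concept; flag disagreements rather than pick a winner."""
    groups = defaultdict(list)
    for claim in claims:
        groups[(claim.patient_id, claim.concept)].append(claim)

    results = []
    for (patient_id, concept), group in groups.items():
        distinct_values = {claim.value for claim in group}
        status = "agreed" if len(distinct_values) == 1 else "conflict"
        results.append(Reconciled(patient_id, concept, status, group))
    return results


# Hypothetical sample data mirroring the chart-versus-pharmacy example in the text.
claims = [
    Claim("p1", "example_drug_daily_dose", "80 mg", "chart_note",
          date(2026, 1, 5), "present", 0.94),
    Claim("p1", "example_drug_daily_dose", "40 mg", "pharmacy_feed",
          date(2026, 1, 7), "present", 0.99),
    Claim("p1", "housing_insecurity", "yes", "chart_note",
          date(2026, 1, 5), "present", 0.88),
]

for result in reconcile(claims):
    sources = ", ".join(f"{c.source}={c.value}" for c in result.claims)
    print(f"{result.patient_id} {result.concept}: {result.status} ({sources})")
```

In this sketch the dose claims come back with status `conflict` and both source records attached, while the single housing claim comes back as `agreed`. A real reconciliation layer would likely also weigh assertion status, dates and confidence before comparing values; the sketch leaves that out to keep the core idea visible: nothing is discarded, and disagreement is reported rather than hidden.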
