The true toll of long COVID may be double that of current estimates and hidden from current surveillance systems that rely on capturing diagnostic codes, according to new research led by Mass General Brigham. Investigators used a novel AI algorithm to comb through medical records of nearly 460,000 patients with COVID-19 across 58 U.S. hospitals, finding approximately 1 in 6, or roughly 16%, developed long COVID. These rates, which translate to more than 18 million Americans, are twofold higher than current estimates and reflect the growing cumulative burden of chronic conditions following COVID-19 infection. Results are published in JAMA Network Open.
“Over 10 million people with long COVID would go entirely undetected by the diagnostic code that health systems and policymakers rely on to track the disease burden,” said study corresponding author Hossein Estiri, PhD, a faculty member in the Mass General Brigham Department of Medicine. “The figures we uncovered are almost certainly an undercount.”
Current diagnostic coding, including the ICD code U09.9 designated for post-COVID conditions, captures fewer than 7% of patients with long COVID.
Mass General Brigham researchers deployed a novel “precision-phenotyping” algorithm they designed specifically to identify long COVID in longitudinal electronic health records by analyzing temporal sequences of clinical events from hundreds of thousands of COVID-19 patients. The algorithm was previously validated to identify long COVID as a diagnosis of exclusion: it flags conditions that appeared after COVID-19 infection and cannot be explained by preexisting conditions already in a patient’s medical history.
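As a rough illustration of the diagnosis-of-exclusion idea described above, the following minimal Python sketch flags conditions that first appear after an infection date and are absent from the patient's earlier history. It is not the study's algorithm: the function name, the record format, and the condition labels are all hypothetical, and a real phenotyping pipeline would also need time windows, code mappings, and clinical validation.

```python
from datetime import date


def new_post_infection_conditions(events, infection_date):
    """Return conditions first recorded after infection_date
    that do not appear anywhere in the earlier history.

    events: iterable of (date, condition_label) tuples (hypothetical format).
    """
    # Everything documented before the infection counts as preexisting.
    prior = {label for when, label in events if when < infection_date}

    new_conditions = []
    for when, label in sorted(events):
        is_after = when > infection_date
        if is_after and label not in prior and label not in new_conditions:
            new_conditions.append(label)
    return new_conditions


# Illustrative, made-up record for a single patient.
record = [
    (date(2019, 3, 1), "hypertension"),
    (date(2021, 1, 10), "covid19_positive"),
    (date(2021, 4, 2), "hypertension"),   # preexisting, so excluded
    (date(2021, 5, 20), "dysautonomia"),  # new after infection
    (date(2021, 9, 5), "prediabetes"),    # new after infection
]

flagged = new_post_infection_conditions(record, date(2021, 1, 10))
print(flagged)  # ['dysautonomia', 'prediabetes']
```

The sketch treats any pre-infection mention of a condition as an alternative explanation, which is the simplest possible reading of "cannot be explained by preexisting conditions."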
Researchers analyzed electronic health records from 457,950 patients who had previously tested positive for COVID-19 across four U.S. regions: New England, Southeast Texas, Southern California and Western Pennsylvania. They identified long COVID in 16.3% of patients overall, with rates ranging from 13.6% to 22.7% across regions. Across the full study cohort, 14.5% of COVID-19 patients (66,587 individuals) developed chronic conditions requiring sustained clinical care. The study also uncovered regional variations of long COVID clinical manifestations, such as dramatically different rates of prediabetes – an emerging sequela of long COVID – across various parts of the U.S.
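For readers who want to check the arithmetic behind these figures, a short Python sketch follows. It uses only the numbers reported above; the long COVID count is an approximation derived from the rounded 16.3% rate, not a figure stated in the release.

```python
cohort = 457_950        # patients with a prior positive COVID-19 test
chronic_care = 66_587   # patients who developed chronic conditions needing sustained care

# Share of the cohort with chronic conditions requiring sustained care.
chronic_share = chronic_care / cohort
print(f"{chronic_share:.1%}")  # 14.5%

# Approximate number of long COVID cases implied by the rounded 16.3% rate.
long_covid_estimate = round(cohort * 0.163)
print(long_covid_estimate)  # 74646
```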
Contrary to the assumption that long COVID is a legacy of early waves of the pandemic, the researchers also found that cumulative prevalence continued to increase across all regions studied. This indicates the virus continues to act as a catalyst for new, long-term chronic health conditions impacting different systems in the body. Statistical modeling showed significant quarterly increases in New England, Southern California and Western Pennsylvania, with trends pointing to continued growth over the next decade if current patterns persist.
“This work demonstrates how longitudinal clinical data in a health system can be structured and analyzed to support more consistent identification of complex post-viral conditions,” said Shawn Murphy, MD, PhD, study co-author and Chief Research Information Officer for University of Washington. “There is significant potential for clinical AI when it is designed for public health and integrated across real-world care settings.”
The researchers note that their findings do not include undocumented infections, which have become the majority since widespread testing ended, and exclude patients without longitudinal medical records. These limitations suggest the overall disease toll of long COVID may be even higher.
“These patients are not absent from clinical care; they are absent from the diagnostic code that would identify them as long COVID patients,” said lead study author Jiazi Tian, MSc, a data scientist in the Clinical Augmented Intelligence Group at Mass General Brigham. “The cardiologist seeing new dysautonomia, the endocrinologist seeing new metabolic disease, the neurologist seeing unexplained cognitive complaints — some of these presentations are long COVID arriving without the label that would connect them to a COVID-19 infection.”
“This study demonstrates how hospitals can leverage AI to help fill surveillance gaps that public health agencies are no longer tracking. What excites me most is what can come next with this new surveillance data,” said Estiri. “Once we can distinguish different clinical and organ-specific manifestations of long COVID, we gain the ability to launch new trials and test targeted treatments for the right patients.”
Authorship: In addition to Estiri, Murphy and Tian, research co-authors included Alaleh Azhir, Matthew Decaro, Ngan Chau, Jonas Hügel, Michele Morris, Jingya Cheng, Pedram Fard, PhD, Ingrid V. Bassett, Douglas S. Bell, Elmer V. Bernstam, Shyam Visweswaran, and Jeffrey G. Klann.
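Disclosures: Dr. Hügel reported receiving grants from the German Academic Exchange Service and the German Research Foundation during the conduct of the study. No other disclosures were reported.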
Funding: This study was supported by the National Institutes of Health/National Institute of Allergy and Infectious Diseases (R01 AI165535) and the National Center for Advancing Translational Sciences (U24 TR004111). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Paper cited: Tian J et al. “Long COVID Persistence and Surveillance Gaps Across 58 US Hospitals.” JAMA Network Open. DOI: 10.1001/jamanetworkopen.2026.14909.
###
About Mass General Brigham
Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.
Journal: JAMA Network Open
DOI: 10.1001/jamanetworkopen.2026.14909
Method of research: Data/statistical analysis
Subject of research: People
Article title: Long COVID Persistence and Surveillance Gaps Across 58 US Hospitals
Article publication date: 27-May-2026
COI statement: Dr Hügel reported receiving grants from the German Academic Exchange Service and the German Research Foundation during the conduct of the study. No other disclosures were reported.