OMNY Health dataset surpasses 100M patients, creating largest US repository of unstructured clinical data

OMNY Health has reached a major milestone, announcing that its real-world data (RWD) platform now includes de-identified health information from over 100 million patients across all 50 U.S. states. According to the company, this represents nearly 30% of the U.S. population and makes the platform one of the largest sources of longitudinal, HIPAA-compliant clinical data available for healthcare research and AI development.

Founded in 2020, OMNY Health now partners with more than 46 healthcare organizations, including St. Luke’s University Health Network, Bon Secours Mercy Health, and Baptist Health System KY & IN. Together, these partners contribute data from more than 650,000 providers and over 6.5 billion clinical notes. The company says its dataset spans over eight years of care and supports use cases across all therapeutic areas.

Unlike many other datasets, the OMNY Health platform integrates both structured medical records and unstructured data from clinician notes, transforming them into research-ready variables using AI-driven natural language processing (NLP) and large language models (LLMs). This includes information on symptoms, treatment rationale, adverse events, and social determinants of health that are often missed in standard datasets.

Partner perspectives

Health system leaders say the ability to securely share de-identified, high-quality clinical data is key to supporting equitable AI development and driving innovation.

“Unlocking AI’s transformative power in healthcare demands a new approach to collaboration,” said Matthew Fenty, managing director of innovation & strategic partnerships at St. Luke’s. “It’s not just about data volume, but also quality, diversity, and real-world representation.”

Dr Mark Townsend, chief clinical digital ventures officer at Bon Secours Mercy Health, added: “The COVID-19 pandemic highlighted how crucial shared data is in closing information gaps. Our partnership with OMNY supports safe and responsible innovation.”

Dr Brett Oliver, CMIO at Baptist Health, compared the dataset to “a patchwork quilt” of insights. “True clinical equity emerges when every thread of the patient story is woven into the model. OMNY’s platform helps make that possible.”

Platform growth

OMNY Health says its platform includes comprehensive data for more than 10 million patients per therapeutic area, enabling insights into disease progression, treatment pathways, and patient outcomes at scale. The company plans to grow its network to 125 million patient lives by the end of 2025, adding more hospitals, academic medical centers, and integrated delivery networks.

“This milestone reflects our vision of accelerating innovation through high-quality real-world data,” said Dr Mitesh Rao, OMNY Health’s CEO and founder. “As we continue to expand, we aim to support research that solves some of healthcare’s most complex challenges.”
