Turning AI Skeptics into AI Advocates Through High-Integrity Data
With 89% of AI pilots failing to scale, biopharmas can move from data skepticism to advocacy by building a trusted, globally harmonized foundation.
By Sebastian Wurst, Director of OpenData Strategy, Veeva Systems
For years, biopharma has accumulated vast volumes of data. Now, as the industry races toward an AI-driven future, that data must become a true strategic asset.
Veeva’s The State of Data, Analytics, and AI in Commercial Biopharma report shows that while 95% of biopharma companies are actively pursuing AI initiatives in marketing and sales, nearly all of them (89%) fail to scale more than half of their pilots. That lack of success often stems from a readiness gap, with the same research revealing that 67% of leaders abandon AI projects because of poor data quality.
Clean, accurate data is foundational for any AI initiative, with 73% of leaders saying that poor data quality is the single biggest hurdle to scaling AI. But most companies’ current data curation strategy relies on trying to harmonize siloed data acquisitions. That approach is unsustainable, especially as the volume of data continues to grow. Data scientists I’ve spoken with estimate that they spend up to 80% of their time preparing, cleaning, and transforming data to make it usable.
For AI to move from failed pilots to enterprise value, we need to address the issue of data quality. This requires a shift from fragmented data management toward a globally harmonized data foundation with proactive data stewardship.
Poor data quality remains the biggest barrier to scaling AI
The industry’s reliance on flawed, fragmented data has led to growing skepticism, with 96% of leaders saying their data is not ready for AI. We see this impact clearly in the field: 72% of companies plan to use AI to help field teams by summarizing HCP updates for call preparation, yet adoption is lagging behind.
Field teams tend to reject AI recommendations because they don’t believe the data behind those recommendations is accurate. When a “next-best-action” model suggests a tactic based on a three-month-old affiliation change, the rep or MSL doesn’t just ignore the suggestion. They lose faith in the entire platform. Many biopharma data leaders we interviewed are resistant to using next-best-action models because of these deep-seated trust issues.
Erika Husing, business analyst, commercial operations at GSK, explains, “If we don’t trust the data, how can we draw conclusions from it? It’s really important that we move away from the data skepticism that we see right now, toward data advocacy. Having high-quality reference data is important for all of our team to trust the data.”
The true cost of manual data stewardship
The other resource drain comes from the countless hours teams spend manually mapping local specialties and HCP types to global standards. This creates a massive administrative burden with every new market.
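To make the mapping task concrete, here is a minimal sketch of how a local-to-global specialty crosswalk works. The table entries, field names, and global codes below are invented for illustration, not an actual schema; the point is that unmapped values fall into a steward's review queue instead of silently fragmenting the data.

```python
# Illustrative crosswalk: (country, local specialty label) -> global code.
# All entries and field names here are hypothetical examples.
GLOBAL_SPECIALTY_MAP = {
    ("DE", "Kardiologie"): "CARDIOLOGY",
    ("FR", "Cardiologie"): "CARDIOLOGY",
    ("JP", "循環器内科"): "CARDIOLOGY",
    ("DE", "Onkologie"): "ONCOLOGY",
}

def harmonize(records):
    """Attach a global specialty code; collect unmapped rows for human review."""
    mapped, needs_review = [], []
    for rec in records:
        key = (rec["country"], rec["local_specialty"])
        if key in GLOBAL_SPECIALTY_MAP:
            mapped.append({**rec, "global_specialty": GLOBAL_SPECIALTY_MAP[key]})
        else:
            # A steward resolves the gap once, then the crosswalk is extended
            # so every later record from that market maps automatically.
            needs_review.append(rec)
    return mapped, needs_review

mapped, review = harmonize([
    {"country": "DE", "local_specialty": "Kardiologie"},
    {"country": "US", "local_specialty": "Cardiology"},  # not yet mapped
])
```

The administrative burden the text describes is exactly the manual upkeep of a table like this, one market at a time; a global standard means the crosswalk is built once rather than per affiliate.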
At one top 20 biopharma company, data scientists ask the field force to review customer segmentation data every six months. That task pulls field teams away from their primary role of building relationships in the field.
Perhaps even more frustrating, the biopharma estimates that only 10% of their data is clean and well-curated enough to use, and only 1% is being accessed for relevant use cases. As data volumes continue to grow, biopharma companies can no longer afford to solve data quality issues at the local level.
Globally harmonized data foundation enables AI at scale
Data management is inherently complex, with local affiliates maintaining data differently to meet regional regulations. The result is a fragmented system where records don’t look the same across countries, making cross-country analytics and AI challenging.
A globally harmonized data foundation ensures that data is consistent and accessible across every market. Before implementing a global data model, Bayer AG dealt with inconsistent data definitions and a lack of a single customer view across markets. “Our global data landscape was fragmented — different countries relied on different sources,” explains Stefan Schmidt, digital capability lead at Bayer. “To see the full picture, we needed a unified customer master.”
For Bayer, having a centralized, accurate data foundation has not only provided a single source of truth for HCP and HCO data, but also increased confidence that AI insights are reliable. In turn, this means field teams are less likely to second-guess the system and more likely to use AI recommendations.
A globally harmonized data foundation provides the architecture to make data useful across the enterprise, but the true efficiency gain comes from the data source itself. By starting with a better data foundation, the focus shifts from fixing errors to maintaining ongoing excellence through agentic curation.
Maintain high-integrity data through human and agentic data curation
For decades, the industry has relied on manual, human stewardship to maintain data quality. We now have an opportunity to elevate millions of records to a new level of data quality by combining human expertise with agentic data curation.
AI agents can take over specific, repetitive data curation tasks like cross-referencing or checking for duplicates. Specialized agents review 100% of records daily, and a human data steward then reviews the findings.
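This human-in-the-loop division of labor can be sketched in a few lines. The thresholds, record fields, and similarity measure below are invented for illustration: the agent auto-resolves confident duplicate matches and queues only the uncertain middle band for a human steward, instead of asking people to review every record.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Crude name similarity; a real curation agent would use richer matching
    (addresses, identifiers, affiliations), not just string comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def triage(pairs, auto_merge=0.95, needs_human=0.75):
    """Agent resolves confident matches; the steward sees only the gray zone."""
    merged, review = [], []
    for rec_a, rec_b in pairs:
        score = similarity(rec_a["name"], rec_b["name"])
        if score >= auto_merge:
            merged.append((rec_a, rec_b))          # agent merges automatically
        elif score >= needs_human:
            review.append((rec_a, rec_b, score))   # queued for human review
    return merged, review

merged, review = triage([
    ({"name": "Dr. Jane Smith"}, {"name": "Dr Jane Smith"}),  # near-identical
    ({"name": "Jane Smith"}, {"name": "Jane Smyth"}),         # ambiguous
])
```

Because the agent reviews 100% of records daily, the steward's queue stays small and focused on genuinely ambiguous cases.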
Because agents run constantly, they capture changes immediately as they occur, often finding signals before they even hit public registries. For example, agents can capture affiliations at a granular level and identify when they change. That real-time accuracy allows AI to make meaningful next-best-action recommendations that are based on live data changes. It also prevents the common pitfall of field teams receiving insights about things they already know, like a doctor’s office relocation they visited last week.
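In code terms, a constantly running agent is essentially a diff over successive snapshots of affiliation data. A minimal sketch, with invented identifiers and hospital names, that emits a change event the same day an affiliation moves:

```python
def detect_changes(previous, current):
    """Diff two affiliation snapshots keyed by HCP id; emit change events."""
    events = []
    for hcp_id, affiliation in current.items():
        old = previous.get(hcp_id)
        if old is None:
            events.append((hcp_id, "new", affiliation))       # newly seen HCP
        elif old != affiliation:
            events.append((hcp_id, "moved", affiliation))     # affiliation change
    return events

# Hypothetical snapshots taken on consecutive days:
events = detect_changes(
    previous={"hcp-001": "City Hospital", "hcp-002": "Westside Clinic"},
    current={"hcp-001": "City Hospital", "hcp-002": "Northgate Medical"},
)
```

Run daily, a diff like this surfaces the affiliation change immediately, so a next-best-action model never recommends a tactic built on a three-month-old relationship.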
By largely shifting the burden of data curation to autonomous agents, we can stop relying on field reps to submit updates and move from a reactive model that often leads to data skepticism, to a proactive one. Agentic data curation with human stewardship delivers the verified, trusted, and high-quality data needed to scale AI.
You can’t scale what you can’t trust, and you can’t trust what you haven’t harmonized. A globally consistent data foundation enables biopharma to focus on using data, rather than spending time cleaning it. It’s this focus that will turn the biggest AI skeptics into confident advocates.