In AI-Driven Research Workflows, Data Quality Is the Bottleneck

What we're hearing from analysts and PMs in 2026, and the new questions they're asking of their data providers.

C. Max Magee, Principal, Research Operations
April 30, 2026

The pilots are over. AI is moving into production at investment firms, and the conversations I'm having with analysts and PMs have shifted from whether to adopt AI to how to use it successfully. So why are so many workflows still stalling? The answer is turning out to be less about the AI than most people expected, and it's changing how firms are vetting their data providers.

The Wall Analysts Keep Hitting

At a recent AI-focused event where Verity was presenting, I spoke with an analyst at a multi-billion-dollar long/short equity fund, and he described something I've been hearing a lot. He'd been using AI to help populate fundamental models and kept hitting a wall: the models would stall, produce garbage outputs, or require so much manual cleanup that the productivity gain evaporated. That changed when he got access to the VerityData MCP (Model Context Protocol). VerityData's inputs were reliable, and the downstream work actually held up. For him, the immediate need was 10-Q and 10-K parsing, which is just a small part of what the VerityData MCP covers.

I've had some version of that conversation in nearly every client meeting this year. Clients want to talk about the AI part of their research workflow, and as we dig in, we discover that data quality is the limiting factor.

The Shift From Ad Hoc to Always-On

Firms that have moved AI from pilot to production are rebuilding their research stacks around MCPs and APIs, inputs that are always on, continuously ingested, structured, and available to downstream systems. That’s a different architecture than the old model, where an analyst would pull a filing or reference a dataset when they needed it. In that world, data quality was a concern but it wasn’t a bottleneck. You could compensate manually.

In automated workflows, you can’t.

When the data is wrong or inconsistent, the error propagates. When a filing is parsed incorrectly, the model that depends on it is wrong. When data normalization is unreliable, every downstream output built on it is suspect. The workflow breaks, or worse, it produces outputs that look plausible but aren’t.

The Questions Firms Are Now Asking

Sourcing and normalizing high-quality data has always been a challenge, and firms have always known that the data-gathering layer is expensive. A PwC study of finance professionals from the pre-AI era found that in typical firms, nearly twice as much time is spent on data gathering as on analysis. A CFA Institute report from the same period found that analysts spent most of their time collecting information, not analyzing it. What's changed is that the cost of bad data has shifted from a friction problem to a structural one.

The questions I’m hearing from clients reflect that shift:

  • Whose filings are parsed and structured most reliably?
  • Which datasets can operate inside automated workflows without manual intervention?
  • Who has the quality controls to be trusted at scale?

These are architectural questions now, and firms are starting to make architectural decisions based on the answers. The speed, reliability, and scalability of the entire research workflow depend on getting them right.


Ready for Quality Data in Your Research Stack?

VerityData delivers structured, decision-ready data from SEC filings, insider transactions, institutional ownership, and more — built for the automated workflows investment teams are deploying today. Access 15+ years of the most accurate and complete data of its kind, plus 2,500 annual research briefs from VerityData experts.

Book a demo

C. Max Magee, Principal, Research Operations

Max Magee is Principal, Research Operations, at Verity. For 15+ years, he’s helped VerityData clients understand and interpret data that leads to faster, more confident investment decisions. Alongside his work producing daily insights for VerityData clients, Max is the research lead on all of VerityData’s GenAI offerings, developing the conceptual frameworks upon which the products are built.

Related Resources

Outperformance Starts Here

See how Verity accelerates winning investment decisions for the world's leading asset managers.

Request a Demo