Retab launches developer-first document AI platform and raises $3.5M in pre-seed funding
Retab, a startup aiming to simplify document automation for developers, has launched its AI-powered platform following a $3.5 million pre-seed funding round. The San Francisco- and Paris-based company plans to use the funds to expand platform capabilities and scale operations, supporting demand from AI-first startups and innovation teams across logistics, healthcare and finance.
The funding round includes backing from VentureFriends, Kima Ventures, K5 Global, Eric Schmidt (via StemAI), Olivier Pomel (CEO, Datadog) and Florian Douetteau (CEO, Dataiku).
The company was founded by engineers who previously built internal tools for document-heavy workflows. Frustrated by unreliable AI pipelines and the lack of scalable solutions for extracting data from unstructured formats, they created Retab to solve the challenge.
“People keep building demos that look like magic, but break the moment you put them into production,” said Louis de Benoist, co-founder and CEO of Retab.
“We lived that pain ourselves. Wiring up fragile pipelines just to extract a few fields from a PDF. We built Retab because it’s the developer-first platform we always wished we had.”
The platform enables developers to define the schema of data they want to extract, and then handles the full document automation lifecycle, from dataset labeling and evaluations to prompt engineering and model selection.
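To make the schema-first idea concrete, here is a minimal sketch in plain Python. It is not Retab's API: the `INVOICE_SCHEMA` mapping, its field names and the `validate` helper are all hypothetical, chosen only to show what declaring the target fields up front and checking an extracted record against them might look like.

```python
from typing import Any

# Hypothetical schema: field name -> expected Python type.
# The fields are illustrative and do not come from Retab.
INVOICE_SCHEMA: dict[str, type] = {
    "invoice_id": str,
    "total": float,
    "currency": str,
}


def validate(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems: list[str] = []
    # Check every declared field for presence and type.
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Report anything the extractor returned that the schema did not ask for.
    for name in record:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems


if __name__ == "__main__":
    extracted = {"invoice_id": "INV-1", "total": "120"}
    for problem in validate(extracted, INVOICE_SCHEMA):
        print(problem)
```

A check of this kind is what lets downstream code treat model output as typed data rather than free text.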
Retab's platform is not another large language model, but rather an orchestration layer that enables large models from providers such as OpenAI, Google and Anthropic to be used more reliably in production environments.
The platform includes:
Self-optimizing schemas that refine instructions for maximum accuracy based on a user’s own documents before launch.
Intelligent model routing that benchmarks tasks and assigns them to the best-performing models based on cost, speed or precision.
Guided reasoning and k-LLM consensus to reduce risk by forcing step-by-step logic and using multiple models to flag uncertainty and strengthen reliability.
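The consensus idea in the last item can be illustrated with a simple per-field majority vote. The sketch below is an assumption about how such a check might work, not a description of Retab's implementation: the `consensus` function, the `min_agreement` threshold and the sample outputs are hypothetical, and field values are assumed to be hashable scalars such as strings and numbers.

```python
from collections import Counter
from typing import Any


def consensus(
    extractions: list[dict[str, Any]],
    min_agreement: float = 0.75,  # hypothetical threshold, not a Retab default
) -> dict[str, dict[str, Any]]:
    """Majority-vote each field across k model outputs and flag low agreement."""
    if not extractions:
        raise ValueError("need at least one extraction")
    k = len(extractions)
    # Union of field names across all outputs, in first-seen order.
    fields = list(dict.fromkeys(name for e in extractions for name in e))
    report: dict[str, dict[str, Any]] = {}
    for name in fields:
        # A model that omitted the field counts as a vote for None.
        votes = Counter(e.get(name) for e in extractions)
        value, count = votes.most_common(1)[0]
        agreement = count / k
        report[name] = {
            "value": value,
            "agreement": agreement,
            "uncertain": agreement < min_agreement,
        }
    return report


if __name__ == "__main__":
    # Made-up outputs from three models reading the same document.
    outputs = [
        {"invoice_id": "INV-1", "total": 120.0},
        {"invoice_id": "INV-1", "total": 120.0},
        {"invoice_id": "INV-1", "total": 102.0},
    ]
    for field, info in consensus(outputs).items():
        print(field, info)
```

In this example all three outputs agree on `invoice_id`, while `total` has only two of three votes and falls below the threshold, so it would be flagged for review rather than silently accepted.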
“Retab is the OS for reliably extracting structured data,” said de Benoist. “It wraps the best models in a layer of logic that actually makes them usable with error handling and structured outputs. That’s what devs need if they want to build production apps, not just prototypes.”
Early users include companies in logistics and financial services. A large trucking firm used the platform to identify the smallest and fastest model that met a 99% accuracy target, reducing costs significantly. A financial firm used it to extract metrics and risk factors from long-form quarterly reports, cutting down work that previously required days of analyst time.
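The trucking example amounts to a constrained selection: among the models that clear the accuracy bar, take the cheapest and fastest. A minimal sketch of that selection follows, assuming benchmark results are already available. The `Benchmark` record, the `pick_model` function and every number in it are invented for illustration and do not reflect Retab's benchmarks or any real model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Benchmark:
    model: str           # hypothetical model label
    accuracy: float      # fraction of fields extracted correctly on a labeled set
    cost_per_doc: float  # illustrative cost in dollars
    latency_s: float     # illustrative seconds per document


def pick_model(results: list[Benchmark], target_accuracy: float = 0.99) -> Optional[Benchmark]:
    """Return the cheapest (then fastest) model meeting the target, or None."""
    eligible = [r for r in results if r.accuracy >= target_accuracy]
    if not eligible:
        return None
    # Sort key: cost first, latency as the tie-breaker.
    return min(eligible, key=lambda r: (r.cost_per_doc, r.latency_s))


if __name__ == "__main__":
    # Made-up benchmark figures.
    results = [
        Benchmark("large", 0.995, 0.020, 4.0),
        Benchmark("medium", 0.992, 0.008, 2.0),
        Benchmark("small", 0.990, 0.002, 0.8),
        Benchmark("tiny", 0.950, 0.001, 0.3),
    ]
    print(pick_model(results))
```

With these invented figures the "small" model is chosen: "tiny" is cheaper but misses the 99% target, and the larger models meet it at a higher cost.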
According to Douetteau, “The AI-fication of the economy depends on the capability to convert operations based on millions of documents into verified, structured data that autonomous systems can utilize. On a large scale, this process hinges on quality control, cost efficiency and rapid implementation. The team at Retab understands this thoroughly and is uniquely positioned to solve it for the thousands of AI first companies that are emerging.”
The company is now building integrations with platforms such as Zapier, Dify and n8n. It also plans to apply its extraction tools to web content and further expand its developer community.
The long-term goal is to become a middleware layer between unstructured data (such as contracts, reports and customs forms) and the AI agents that need to interpret it. With just ten employees and growing interest from developers, the startup is positioning itself as a core component in the next wave of AI infrastructure.




