95% of companies see no measurable return on their generative AI investments. That's not a headline; it's the baseline reported in the MIT Project NANDA study The GenAI Divide: State of AI in Business 2025.
The most common reason isn't the model. It's the starting point. Most companies begin AI where a demo looks impressive quickly: a customer service chatbot, a copilot with no data foundation, an “assistant for everything.” That generates internal visibility, but rarely a process gain.
Productive AI usually starts somewhere else. Less visible, but operationally effective from day one: with document-heavy processes.
Why the wrong starting point derails most AI projects
“What could we actually do with AI?” sounds open and strategic. Operationally, it's the question that most often leads to a vague use case: too broad, too hard to measure, too hard to justify internally.
In environments with data protection, quality, or governance requirements, this gets worse. A scope that's too broad doesn't fail because of the model — it fails on questions like: Who reviews the outputs? How is it documented? How are edge cases handled? Who is liable?
Productive AI doesn't need maximum visibility first. It needs a clear operational context. The easiest place to find it is where your company already works in a structured way: in document-heavy processes.
What makes document-heavy processes suitable for AI
A document-heavy process brings five things that an AI project would otherwise have to construct from scratch:
Clear inputs — PDFs, forms, official notices, emails with attachments. Recurring formats.
Recurring patterns — document types, mandatory fields, standard layouts.
Defined rules — deadlines, thresholds, responsibilities, required data points.
Recognizable exceptions — incomplete documentation, edge cases, liability topics.
Clean handover points — from intake to specialist, from specialist to approval.
This shifts the key question. Instead of “Can AI somehow help here?” you ask specifically: Which information needs to be extracted first? Which cases are standard, which aren't? At what uncertainty threshold does a human take over? Where does the decision land — and how is it documented?
In practice, that's the difference between a demo and a workflow that runs at 8 a.m. on Monday.
A concrete example: claims processing in the back office
To make the frame concrete, here's a real process type from our consulting work.
A European mid-market insurance company processes around 400 incoming claim notifications per day via email. Each notification contains a different combination of: cover letter, claim report PDF, photos, cost estimate, sometimes a police report.
Before: A case handler spends about 6 minutes per case checking completeness, classifying the claim type, assigning the right processing path, and entering the case into the system. The actual professional decision only happens in the next step. At 400 cases, that's roughly 40 hours of pure pre-screening per day.
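The 40-hour figure follows directly from the two numbers in the example. As a quick check (the variable names are illustrative, not taken from any system):

```python
# Daily pre-screening load, using the figures from the example above.
cases_per_day = 400
minutes_per_case = 6

total_minutes = cases_per_day * minutes_per_case  # 2400 minutes
total_hours = total_minutes / 60                  # 40.0 hours

print(f"{total_hours:.0f} hours of pre-screening per day")
```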
With AI upstream, the process looks like this:
Extraction: AI identifies claim number, policyholder, claim date, type, amount, and attachments.
Completeness check: The system flags missing required data points and drafts automatic follow-up requests.
Classification: Claim type (e.g., water damage, storm, liability) and urgency are set.
Triage by confidence threshold: Below a defined confidence score, or for recognized edge cases (personal injury, major loss, subrogation), the case is immediately escalated to a specialist.
Handover: Case handlers see a pre-structured, prioritized inbox — instead of an unsorted email stack.
What this shifts: The professional decision stays with the human. But the 40 hours of daily pre-screening become a structured input. The scarce resource — the time of experienced case handlers — goes where their judgment is actually needed.
This isn't a full-automation project. It's a pre-screening and triage system with a clear escalation rule.
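The escalation rule can be sketched in a few lines. This is a minimal illustration, not the system from the example. The threshold value, field names, route names, and the order of the checks are all assumptions. Extraction and classification are taken as given and arrive as a pre-filled Claim object.

```python
from dataclasses import dataclass, field

# Hypothetical values: the article names the edge cases but no concrete threshold.
CONFIDENCE_THRESHOLD = 0.85
EDGE_CASE_TYPES = {"personal_injury", "major_loss", "subrogation"}
REQUIRED_FIELDS = ("claim_number", "policyholder", "claim_date", "claim_type", "amount")


@dataclass
class Claim:
    """Result of the extraction and classification steps (not shown here)."""
    fields: dict
    confidence: float  # classifier confidence between 0 and 1
    flags: set = field(default_factory=set)  # recognized edge-case markers


def missing_fields(claim: Claim) -> list:
    """Completeness check: required data points that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if claim.fields.get(name) in (None, "")]


def triage(claim: Claim) -> tuple:
    """Return (route, reasons) so a handler can see why a case was routed."""
    edge_cases = sorted(claim.flags & EDGE_CASE_TYPES)
    if edge_cases:
        return "escalate_to_specialist", [f"edge case: {e}" for e in edge_cases]
    if claim.confidence < CONFIDENCE_THRESHOLD:
        return "escalate_to_specialist", [f"low confidence: {claim.confidence:.2f}"]
    missing = missing_fields(claim)
    if missing:
        return "request_follow_up", [f"missing: {m}" for m in missing]
    return "standard_queue", ["complete, confident, no edge case"]


# Usage: a complete, confidently classified water-damage claim.
claim = Claim(
    fields={
        "claim_number": "C-1",
        "policyholder": "A. Example",
        "claim_date": "2025-01-10",
        "claim_type": "water_damage",
        "amount": 1200,
    },
    confidence=0.93,
)
print(triage(claim))
```

Returning the reasons alongside the route is a deliberate choice in this sketch: it gives the case handler the "why" behind each prioritization or escalation, and it gives the process something to document.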
Pre-screening and triage beat full automation
The most common reasoning error in early AI projects: the assumption that value only emerges with full automation. In most mid-market processes, that's wrong. The leverage sits upstream of that.
Pre-screening brings an incoming case into a usable form. Extraction, completeness, structure.
Triage separates standard cases from edge cases, complete from incomplete, critical from non-critical. Not by gut feeling, but by defined rules and thresholds.
Escalation ensures that professional, liability-relevant, or uncertain decisions land where they belong — with a human, with full context and a prepared case.
In regulated environments, that's not a compromise; it's the actual product. The system doesn't displace specialist work. It relieves specialists of tasks that absorb their attention today only because every case has to be prepared by hand.
Human-in-the-loop isn't an intermediate step — it's the design
Human-in-the-loop is often described as a placeholder: “Until AI is good enough.” In regulated processes, it's the opposite.
When a process deals with deadlines, auditability, or sensitive data, maximum autonomy isn't the goal at all. The goal is a system that handles routine reliably, recognizes uncertainty, and prepares decisions transparently.
In the European mid-market especially, this determines adoption. When the specialist team can see why a case was prioritized or escalated, usage grows. When they can't, the system stays a pilot — just like the 95% in the MIT statistics.
Which process in your company absorbs the most time in pre-screening and triage today? In a 30-minute introductory call, we'll assess together whether it's suited for an AI pilot. Free, no commitment. Book your call.
Where the leverage is greatest in the mid-market
Not every industry is equally well-suited, but process structure matters more than industry. Particularly suitable are workflows with:
Many recurring documents (forms, notices, applications, reports).
A manual first-pass review that currently ties up specialist time.
Rule-based decisions with recognizable exceptions.
Multiple processing paths split by case characteristics.
High routine-work load on experienced specialist roles.
Typical candidates we see in Diagnostic Sprints: internal approval and review processes, service and back-office workflows, compliance-related flows, quote and tender evaluation, intake processes with follow-up and handoff.
A checklist: how to recognize a good first use case
Ask yourself, for a specific process:
Do inputs arrive recurrently and in comparable formats?
Are there defined criteria for evaluating cases?
Can standard and edge cases be separated?
Does the process currently consume measurable time from experienced specialists in first-pass review?
Is there friction around completeness, prioritization, or handoffs?
Do decisions need to be documented or better prepared?
Four or more yes answers: you have a candidate for a clean, narrow pilot. Not a transformation program.
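For teams that want to screen several processes side by side, the rule can be written down as a small scoring helper. This is only a sketch: the question keys are illustrative shorthand for the six questions, and only the threshold of four comes from the checklist itself.

```python
# The six checklist questions, abbreviated. The key names are illustrative.
CHECKLIST = (
    "recurring_comparable_inputs",
    "defined_evaluation_criteria",
    "standard_and_edge_cases_separable",
    "measurable_specialist_time_in_first_pass",
    "friction_in_completeness_prioritization_handoffs",
    "decisions_need_documentation",
)
MIN_YES = 4  # "four or more yes answers"


def is_pilot_candidate(answers: dict) -> bool:
    """True if at least MIN_YES of the six questions are answered with yes."""
    yes_count = sum(1 for question in CHECKLIST if answers.get(question, False))
    return yes_count >= MIN_YES


# Usage: a process with four yes answers qualifies as a pilot candidate.
example_answers = {
    "recurring_comparable_inputs": True,
    "defined_evaluation_criteria": True,
    "standard_and_edge_cases_separable": True,
    "measurable_specialist_time_in_first_pass": True,
    "friction_in_completeness_prioritization_handoffs": False,
    "decisions_need_documentation": False,
}
print(is_pilot_candidate(example_answers))
```

Unanswered questions count as no, which keeps the screening conservative.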
Why a narrow Diagnostic Sprint delivers more than an AI program
Many companies start AI with vision papers, target pictures, and programs. That isn't wrong, but it's too abstract for the decisive operational question: Is this specific process AI-suitable, and if so, what would a pilot look like that delivers a reliable result in four to six weeks?
A Diagnostic Sprint answers exactly that. We analyze a specific process against the six checklist questions, examine data flows, rules, and escalation logic, and deliver a pilot scope including effort estimate and success metrics. At the end, you are not left wondering whether AI would be “somehow useful”; you know whether this process is pilot-ready and what it can realistically deliver.
Start small. But start right.
You don't have to start with the most spectacular AI use case. You should start with the most concrete one.
Document-heavy processes bring exactly what productive AI needs: clear inputs, defined rules, recognizable edge cases, clean escalations. Starting there doesn't build a pilot that gets shelved after three months. It builds a workflow that runs on Monday morning.
And that's where — not in the next AI vision — it's decided whether your company lands in the 5% or the 95%.
Source: MIT Project NANDA, The GenAI Divide: State of AI in Business 2025, July 2025.