June 14, 2026
How do you know if your AI is actually working?
Most AI teams obsess over deflection rate. How many conversations did the AI touch? How many stayed away from a human?
But when Gladly studied what retail's best AI deployments have in common, a different pattern emerged.
The strongest predictor isn't the AI. It's ownership.
One finding from the data stands out above everything else. Across the companies Gladly studied, the strongest predictor of AI performance wasn't the model, the vendor, or the number of integrations built.
It was whether one person woke up every day responsible for how the AI served customers.
One company in the dataset lost its AI lead and did not replace them. Workflow updates stopped. Configuration went stale. Their addressable resolution rate (the share of AI-reachable conversations that AI actually resolved) fell to 0.38%.
Another company had let its AI sit for nearly a year — no updates, no new channels, no iteration. Resolution rate oscillated between 6% and 20%. Within two months of one person taking dedicated ownership, they launched new channels, refreshed their knowledge base, and added use cases. Resolution rate climbed to 48.9%.
Same technology. One clear variable.
Which means the metrics below aren't really AI metrics. They're operational health metrics. They go up when someone owns the AI and keeps it current. They erode when no one does.
Four metrics that tell you what's actually happening
These aren't four disconnected KPIs. They form a progression — from outcome to performance to coverage to operations. Together they tell you not just whether your AI is working, but why.
1. Reopen rate (outcome metric)
A reopen happens when a customer comes back about the same issue. It's the most direct signal that the first resolution didn't hold — the answer was incomplete, the action didn't work, or the customer left more confused than before.
Across the top performers in Gladly's 2026 data guide, AI-resolved conversations reopened less frequently than agent-handled ones:
Fintech platform: 17.3% AI reopen rate vs 32.9% agent, roughly half the agent rate
Home décor retailer: 18.4% vs 24.3%
Beauty retailer: 11.8% vs 12.1% — essentially parity, which is itself a strong quality signal
Speed gets the headlines. Reopen rate is where quality actually shows up.
A low reopen rate is rarely an accident. It's the outcome of someone keeping the AI current — workflows updated, integrations working, knowledge base reflecting real policies.
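As a rough illustration of how a comparison like the one above could be computed, here is a minimal sketch. It assumes each conversation record carries a hypothetical `resolved_by` label ("ai" or "agent") and a `reopened` flag. Those field names and the sample data are invented for illustration and are not Gladly's data model or figures.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    resolved_by: str  # hypothetical label: "ai" or "agent"
    reopened: bool    # True if the customer came back about the same issue


def reopen_rate(conversations, resolved_by):
    """Share of conversations resolved by `resolved_by` that were later reopened."""
    resolved = [c for c in conversations if c.resolved_by == resolved_by]
    if not resolved:
        return None  # nothing to measure; avoids dividing by zero
    return sum(c.reopened for c in resolved) / len(resolved)


# Made-up sample: 10 AI-resolved and 10 agent-resolved conversations
sample = (
    [Conversation("ai", True)] * 2
    + [Conversation("ai", False)] * 8
    + [Conversation("agent", True)] * 3
    + [Conversation("agent", False)] * 7
)

ai_rate = reopen_rate(sample, "ai")        # 2 of 10 reopened
agent_rate = reopen_rate(sample, "agent")  # 3 of 10 reopened
print(f"AI reopen rate: {ai_rate:.1%}, agent reopen rate: {agent_rate:.1%}")
```

The key design choice is the denominator: each rate is measured only against conversations that the same resolver closed, so the AI and agent numbers stay comparable.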
2. Addressable resolution rate (performance metric)
Addressable resolution rate = the share of AI-reachable conversations that AI actually resolves.
This is different from overall resolution rate, which includes conversations the AI was never configured to handle. Addressable resolution rate isolates AI performance within the scope it's actually operating in.
When this number drops — especially suddenly — it's usually a signal that:
Configuration is stale
A new question type has emerged that AI isn't equipped for
An integration or workflow broke behind the scenes
This is the metric that fell to 0.38% when the company above lost its AI owner. No one was watching, so no one caught it.
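The difference between the two rates is easiest to see with numbers. The sketch below assumes you already have three counts for a period: all inbound conversations, the subset within the AI's configured scope, and the ones the AI resolved. How a conversation gets classified as AI-reachable is not specified in this article, so the function and parameter names here are hypothetical.

```python
def resolution_rates(total, addressable, ai_resolved):
    """Return (overall_rate, addressable_rate) as fractions.

    total:        all inbound conversations in the period
    addressable:  conversations within the AI's configured scope (assumed known)
    ai_resolved:  conversations the AI resolved
    """
    if not 0 <= ai_resolved <= addressable <= total:
        raise ValueError("expected 0 <= ai_resolved <= addressable <= total")
    overall = ai_resolved / total if total else None
    addressable_rate = ai_resolved / addressable if addressable else None
    return overall, addressable_rate


# Made-up counts: 1,000 conversations, 400 in scope, 200 resolved by AI
overall, addressable_rate = resolution_rates(
    total=1000, addressable=400, ai_resolved=200
)
print(f"Overall: {overall:.0%}, addressable: {addressable_rate:.0%}")
```

With these invented counts, the overall rate is 20% while the addressable rate is 50%. The first number mixes AI performance with scope; the second isolates how the AI does on the work it was actually set up for.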
3. Topic coverage (coverage metric)
Ask: what percentage of your real inbound volume do your AI use cases actually cover?
Most teams start with FAQs, which typically cover 20–40% of volume. That's a reasonable first move. But the question is what they prioritize next.
The data here is counterintuitive: one company built a single integration, connecting its order management system to its AI, and achieved 100% topic coverage. Another built 11 integrations and covered only 33% of volume. More integrations don't mean more coverage.
A team with 6 use cases covering 80% of volume is in better shape than one with 12 use cases covering 40%
Breadth of deployment is not the same as coverage
Topic coverage tells you whether your AI is pointed at the right problems — the ones your customers are actually bringing you, not the ones that were easiest to configure.
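One way to make this concrete is to weight each topic by its inbound volume rather than counting use cases. The sketch below assumes you can tag inbound conversations by topic and know which topics your AI use cases handle. The topic names and volumes are made up for illustration.

```python
def topic_coverage(volume_by_topic, covered_topics):
    """Fraction of inbound volume that falls in topics the AI is configured for."""
    total = sum(volume_by_topic.values())
    if total == 0:
        return None  # no inbound volume to measure against
    covered = sum(
        volume
        for topic, volume in volume_by_topic.items()
        if topic in covered_topics
    )
    return covered / total


# Hypothetical monthly volume by topic
volume = {
    "order_status": 600,
    "returns": 200,
    "sizing": 100,
    "gift_cards": 60,
    "other": 40,
}

one_use_case = topic_coverage(volume, {"order_status"})
three_use_cases = topic_coverage(volume, {"sizing", "gift_cards", "other"})
print(f"1 use case: {one_use_case:.0%}, 3 use cases: {three_use_cases:.0%}")
```

In this invented example, a single use case aimed at the highest-volume topic covers 60% of volume, while three use cases aimed at low-volume topics cover 20%.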
4. Workflow update frequency (operational metric)
This is the most operational of the four, but it predicts all the others.
The leading beauty retailer in the dataset made 202 workflow updates in 30 days. 131 came in a single week. One person was driving it — updating flows as policies changed, products launched, and new questions showed up in the data.
Teams that update regularly maintain performance over time. Teams that don't update see their numbers erode:
Reopen rate creeps up
Addressable resolution rate falls
Topic coverage lags behind what customers are actually asking
If you can't remember the last time someone updated an AI workflow, you've likely already started to see it in your other metrics.
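If you keep a log of workflow changes, a simple staleness check is easy to automate. The sketch below assumes a plain list of update dates exported from wherever your team records changes. The function name, the 30-day window, and the sample dates are all assumptions for illustration.

```python
from datetime import date, timedelta


def update_activity(update_dates, today, window_days=30):
    """Return (updates in the trailing window, days since the last update)."""
    cutoff = today - timedelta(days=window_days)
    recent = [d for d in update_dates if cutoff < d <= today]
    last = max((d for d in update_dates if d <= today), default=None)
    days_since = (today - last).days if last is not None else None
    return len(recent), days_since


# Made-up update log
updates = [date(2026, 6, 10), date(2026, 5, 20), date(2026, 3, 1)]
recent_count, days_since_last = update_activity(updates, today=date(2026, 6, 14))
print(f"{recent_count} updates in the last 30 days, "
      f"{days_since_last} days since the last one")
```

What counts as "too stale" depends on how often your policies and products change, so any alert threshold on these two numbers is a judgment call for the AI's owner.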
Metrics don't improve themselves
Reopen rate is the most honest signal of AI quality. But the deeper lesson from Gladly's data is that metrics don't improve themselves.
Every top performer had a named owner, regular workflow updates, strong topic coverage, and executive visibility into how the AI was performing. They treated AI like any other operational system — something that needs ongoing attention after launch, not a project that ends at go-live.
If you're not sure whether your AI is working, start with reopen rate. Then ask who's responsible for improving it.
Get the data behind this article
This post draws on Gladly's 2026 AI data guide — a look at what retail's best AI deployments actually have in common, built from proprietary platform data across a focused set of retail customers.

Gladly Team