Every AI vendor offers a free trial. Most of them look great. Dashboards are polished, demo scripts are tight, and resolution numbers are high. But trials are curated experiences — and the questions that matter most are rarely on the agenda.
Before you sign anything, here are six areas where the gap between trial performance and production reality tends to be widest.
1. How they define "resolution"
Resolution rate is the headline metric in almost every AI pitch. It's also one of the most inconsistently defined numbers in the industry.
Ask for the contractual definition in writing. "Resolved" should mean the customer's issue was actually solved, not that they stopped replying, timed out, or clicked away. Some vendors measure resolution only on a curated subset of AI conversations, not your full traffic: if the AI attempts 60% of conversations and resolves 90% of those, the real rate across all traffic is 54%, not 90%. That's not your resolution rate. That's their best-case sample.
Request sample invoices from the trial period. How metrics translate to line items is rarely shown upfront. And ask specifically about partial resolutions — if the AI handles step one and a team member closes the loop, who gets credit?
The vendor that welcomes this question has thought it through. The one that deflects probably hasn't.
2. Whether the AI will go off-script
Hallucination is the risk most vendors hope you won't probe. If a vendor can't clearly explain how they measure hallucinations, or won't share that data, that's your answer.
During the trial, test with out-of-scope questions. Does the AI make something up, or does it redirect cleanly? More importantly: what guardrails do you control? Can you constrain topics, set prohibited responses, and enforce brand tone — or is the model a black box you're just trusting?
The most important test here isn't one the vendor will run for you. Bring your actual edge cases — the conversations your team dreads, the angry customers, the unusual requests. That's where the cracks show.
3. What "setup time" actually means
A go-live estimate without a detailed plan behind it is just a sales line. Get week-by-week milestones in writing. Ask who does the work: is it self-service, vendor-led, or does it require a third-party systems integrator? Hidden implementation costs add up fast and rarely appear in the initial proposal.
Clarify what "live" actually means. Is it a limited beta on one channel, or full production on your real traffic? Those are very different things, and vendors don't always volunteer the distinction.
Ask for peer references from companies your size. Implementation complexity scales with volume and organizational complexity — the logo on a vendor's website doesn't tell you what the rollout actually looked like.
4. How they handle your edge cases
Standard demos are built to succeed. Multi-intent conversations — where a customer asks three things at once, changes their mind midway, or escalates emotionally — are where AI systems fail silently.
Bring your top 10 escalation scenarios to the trial. If the vendor won't let you test them, that's the answer. Pay attention to how the AI handles distressed or angry customers — empathy and tone matter as much as accuracy. A technically correct response delivered poorly can make a bad situation worse.
Insist on metrics from your data, not their benchmark averages. Your volume, complexity, and customer base are what matter.
5. Where your data goes
Data practices are the part of an AI contract most teams review too late. Ask directly: is your conversation data being used to train their models today? What happens if that policy changes?
Review retention and deletion policies: how long are conversations stored, and who controls the purge? Don't take compliance on faith; ask for the SOC 2 report and documentation of GDPR and CCPA compliance. And ask about model update disclosure: will you be notified before a change that could alter behavior in production?
This isn't just a legal question. It's an operational one. A model update that shifts tone or changes how certain intents are handled can affect your customer experience overnight.
6. What happens when it fails
No AI system succeeds 100% of the time. The question isn't whether it will fail; it's what happens when it does.
Map the failure path before you go live. When confidence is low, does the AI auto-escalate, loop, or dead-end the customer? Test handoff quality, not just handoff speed — does the team member receive full conversation context, or does the customer have to start over?
Ask about failure reporting: can you see where and why the AI handed off, and can you use that data to improve? And ask what happens if the AI layer goes down entirely. Does the platform stay up and route to your team, or does your support operation go dark?
Free trials are designed to impress. These six questions are designed to find out what they're not showing you. The right vendor will welcome every one of them — and put their resolution definition in writing before the trial begins.
See how Gladly answers every one of these questions
Get a personalized demo and bring your edge cases. We'll show you exactly how Gladly performs on your traffic, not ours.