perspective 30 May 2026 · 6 min read

The Honest-UX Bar Every AI Product Should Clear

ToRun Team

Most AI products fail on a dimension that has nothing to do with model quality. They fail on honesty. Not by lying outright, but in quieter ways: burying what a call costs, presenting confident nonsense without a hedge, discarding state the user cannot recover, and never learning from the friction they inflict. These failures are design choices, even when they look like oversights.

Below is the five-pattern framework we use internally at ToRun to audit every surface we build. We are not claiming we always pass. We are claiming these are the right questions.

Transparency: show the real state and real cost

The canonical dishonest pattern is a vague "some credits were used." Transparent billing means the user can look at any past call and reconstruct exactly what was charged and why — provider, model, modality, units consumed, rate at time of execution, currency conversion if applicable, and the platform margin. If any of those fields are absent or estimated after the fact, the invoice is fiction.
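To make "reconstruct exactly what was charged" concrete, here is a minimal sketch of a per-call charge record. The class and field names are hypothetical, chosen to mirror the list above; this is not a description of any real billing schema. The point is that every input to the total is stored at execution time, so the total can be recomputed later instead of being taken on trust.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CallCharge:
    """One billed call. All field names are illustrative assumptions."""
    provider: str
    model: str
    modality: str
    units: Decimal          # e.g. tokens, images, or seconds consumed
    rate_per_unit: Decimal  # provider rate captured at execution time
    fx_rate: Decimal        # provider currency -> billing currency (1 if same)
    margin_pct: Decimal     # platform margin, e.g. Decimal("0.10") for 10%

    def provider_cost(self) -> Decimal:
        # What the provider charged, expressed in the billing currency.
        return self.units * self.rate_per_unit * self.fx_rate

    def total(self) -> Decimal:
        # Provider cost plus the platform margin.
        return self.provider_cost() * (Decimal("1") + self.margin_pct)

    def explain(self) -> str:
        # A human-readable breakdown built only from stored fields.
        return (
            f"{self.units} {self.modality} units on {self.provider}/{self.model} "
            f"at {self.rate_per_unit} per unit, FX {self.fx_rate}, "
            f"margin {self.margin_pct * 100}% = {self.total()}"
        )
```

Because the record is frozen and holds the rate and conversion as they were when the call ran, a later price change cannot silently rewrite history.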

A subtler transparency failure is hiding model selection. When a product silently downgrades your request to a cheaper model because you hit a soft limit, you may get a worse answer and never know it. Honest behavior is to tell the user which model ran, why it was chosen, and what the alternatives were. The routing decision should be inspectable, not a black box.
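One way to keep routing inspectable is to attach a small decision record to every response. The sketch below is an assumption about what such a record could hold (which model ran, why, and what else was available); the names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RoutingDecision:
    """Illustrative record of how a request was routed."""
    requested_model: str
    model_used: str
    reason: str                    # e.g. "soft limit reached"
    alternatives: Tuple[str, ...]  # other models that could have served the call

    @property
    def substituted(self) -> bool:
        # True when the user did not get the model they asked for.
        return self.model_used != self.requested_model

    def summary(self) -> str:
        line = f"Ran on {self.model_used} ({self.reason})."
        if self.substituted:
            line += f" You requested {self.requested_model}."
        if self.alternatives:
            line += " Alternatives: " + ", ".join(self.alternatives) + "."
        return line
```

Surfacing `summary()` next to the answer turns a silent downgrade into a visible one.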

Transparency also covers state. If a background job is running on your behalf, the UI should say so and give you a way to stop it. "Processing" spinners that cannot be cancelled are a small lie.

Reversibility: let users undo

Irreversibility is the most underrated UX failure in AI products. You ask a workflow to send an email or post to a channel, it fires immediately, and the only recovery path is a manual apologetic follow-up. No staging. No confirmation when the stakes are high. No undo.

The reversibility pattern requires products to distinguish between reads and writes, surface confirmation steps for consequential actions, and hold state long enough that mistakes can be caught. For AI-generated content, this means drafts exist before publication, generated artifacts are recoverable after deletion, and chat sessions can be exported or forked before you lose them.
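"Hold state long enough that mistakes can be caught" can be sketched as a staged action with an undo window: a consequential write is queued, stays cancellable for a hold period, and only then executes. This is a minimal illustration under our own assumptions; `StagedAction` and its fields are hypothetical names, and a real system would also need persistence and error handling.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StagedAction:
    """A write that waits out an undo window before it runs."""
    description: str
    execute: Callable[[], None]
    hold_seconds: float
    staged_at: float = field(default_factory=time.monotonic)
    state: str = "staged"  # staged -> cancelled | committed

    def cancel(self) -> bool:
        # Undo is only possible while the action is still staged.
        if self.state == "staged":
            self.state = "cancelled"
            return True
        return False

    def commit_if_due(self, now: Optional[float] = None) -> bool:
        # Run the action once the hold period has elapsed and it was not cancelled.
        now = time.monotonic() if now is None else now
        if self.state == "staged" and now - self.staged_at >= self.hold_seconds:
            self.execute()
            self.state = "committed"
            return True
        return False
```

Reads would bypass this path entirely; only actions classified as writes need to be staged.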

Reversibility is also about configuration. If changing a setting has cascading effects — say, switching the default model for all future sessions — the product should tell you what will change before you confirm, not after.

Proactive Concern: warn before harm, not after

An honest product says "this message is about to consume a large chunk of your balance" before the call, not in a post-hoc receipt. It says "this workflow step has no error handler; if the upstream tool fails, the run will halt silently" during the build, not when an incident happens at 2 a.m.

Proactive concern means the product models what could go wrong from the user's perspective and surfaces it at the decision point. This is harder than it sounds because it requires the product to understand user intent, not just execute commands. A user asking for a 200-image batch generation probably wants to know the estimated cost before it starts, even if they did not explicitly ask.
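The batch example above can be sketched as a pre-flight check that runs before any call is made. The function name, the per-unit rate, and the 25% warning threshold are all assumptions for illustration; the right threshold is a product decision.

```python
from decimal import Decimal
from typing import Optional


def preflight_warning(
    units: int,
    rate_per_unit: Decimal,
    balance: Decimal,
    warn_fraction: Decimal = Decimal("0.25"),  # assumed threshold
) -> Optional[str]:
    """Return a warning to show before the call, or None if no warning is needed."""
    estimate = units * rate_per_unit
    if estimate > balance:
        return f"Estimated cost {estimate} exceeds your balance of {balance}."
    if balance > 0 and estimate / balance >= warn_fraction:
        pct = (estimate / balance * 100).quantize(Decimal("1"))
        return f"Estimated cost {estimate} is about {pct}% of your balance."
    return None
```

For a 200-image batch at a hypothetical rate of 0.04 per image and a balance of 20, this returns a warning that the job would use about 40% of the balance, at the point where the user can still decide not to run it.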

The failure mode here is optimizing for conversion — suppressing cost warnings because they cause hesitation. Short-term, that might increase activation. Long-term, it trains users not to trust you.

Epistemic Honesty: say what you do not know

AI models confabulate. Every team building AI products knows this. The dishonest design response is to present model output with uniform confidence and let the user discover the problem downstream. The honest response is to build explicit signal paths for uncertainty.

This means: when a model cites sources, the citations must be real and linkable. When a model says "as of my training data," that hedge must be visible in the UI, not stripped for cleaner copy. When retrieval-augmented answers fall back to parametric knowledge because no relevant chunk was found, the response should say so rather than presenting the parametric answer as grounded.
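The retrieval fallback case is the easiest of these to make explicit in code. Below is a minimal sketch that labels an answer as grounded only when at least one retrieved chunk clears a relevance threshold. The types, the `min_score` default, and the notice wording are hypothetical; how relevance is scored is left to the retrieval system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source_url: str
    score: float  # relevance score from the retriever (assumed 0..1)


@dataclass
class LabeledAnswer:
    text: str
    grounded: bool
    sources: List[str]
    notice: str  # shown in the UI; empty when the answer is grounded


def label_answer(text: str, chunks: List[Chunk], min_score: float = 0.5) -> LabeledAnswer:
    """Attach grounding status to a model answer instead of hiding the fallback."""
    relevant = [c for c in chunks if c.score >= min_score]
    if relevant:
        return LabeledAnswer(text, True, [c.source_url for c in relevant], "")
    return LabeledAnswer(
        text,
        False,
        [],
        "No relevant source was retrieved. This answer comes from the model's "
        "general knowledge and may be out of date.",
    )
```

The UI can then render citations only when `grounded` is true, which prevents a parametric answer from borrowing the visual authority of a sourced one.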

Epistemic honesty extends to the product itself. "This feature is experimental" is not a disclaimer to hide — it is information the user needs to calibrate trust. If a confidence score is available but low, show it. If the routing pipeline fell back to a secondary model because the primary was unavailable, say that in the call metadata.

Learning Loop: close the feedback cycle

The most insidious UX failure is a product that never gets better at serving you because it never observes where it frustrated you. Clicking a button four times in frustration because it looks disabled when it is not: that is a signal. Repeatedly cancelling generations halfway through is a signal. Writing the same correction after every response is a signal.

An honest platform captures these signals with disclosure, uses them to improve routing and defaults, and surfaces what it learned. Not "we use your data to improve the product" in a terms-of-service footer — but a visible feedback loop: "we noticed you often switch from the auto-selected model to a specific model for code tasks; do you want to set that as your default?"
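The model-switching example can be sketched as a small function over disclosed override events. Everything here is an assumption for illustration: the event shape, the function name, and the thresholds (at least three overrides, and at least 60% of them to the same model).

```python
from collections import Counter
from typing import List, Optional, Tuple


def suggest_default(
    overrides: List[Tuple[str, str]],  # (task_type, model the user switched to)
    task: str,
    min_count: int = 3,      # assumed: ignore patterns seen fewer times than this
    min_share: float = 0.6,  # assumed: the pick must dominate the user's overrides
) -> Optional[str]:
    """Return a suggestion to show the user, or None if there is no clear pattern."""
    picks = Counter(model for t, model in overrides if t == task)
    if not picks:
        return None
    model, count = picks.most_common(1)[0]
    if count >= min_count and count / sum(picks.values()) >= min_share:
        return f"You often switch to {model} for {task} tasks. Set it as your default?"
    return None
```

The output is a question, not a silent change: the platform shows what it learned and leaves the decision with the user.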

The learning loop also means acknowledging when the platform made a bad call. If a model produced a factually wrong answer that the user corrected, that correction is worth more than a thousand thumbs-up signals. Treating it as such is an act of honesty toward the user and toward the product's own calibration.

The Bar

None of these five patterns is technically difficult to implement. They are difficult to prioritize because they occasionally reduce short-term engagement metrics. Users who see cost warnings hesitate. Users who see uncertainty hedges trust individual answers less. Users who can undo are less locked in.

The bet honest design makes is that these frictions build durable trust, and that durable trust is the only foundation a product in a category as consequential as AI can build on. The platforms that skip the honest-UX bar are not building relationships — they are running a slow extraction. The distinction shows in churn, in support volume, in the kind of word-of-mouth that actually compounds.

We audit every ToRun surface against these five patterns before shipping. We still miss. When we do, the audit trail is public enough that users can tell us.
