A Catalog of Interaction Patterns, Visualization, and Interface Design for Agentic AI
Draft v0.1
May 2026
About This Catalog
This is the thirteenth volume in a catalog of the working vocabulary of agentic AI, and the third one I added knowing it wasn’t a major missing piece. Volume 11 sat adjacent to Volume 8 as compliance-officer-facing governance. Volume 12 sat adjacent to Volume 8 in a different direction as security-engineer-facing infrastructure. This volume sits adjacent to Volume 7 (Human-in-the-Loop): where Volume 7 covered the engineering of approval gates, intervention mechanisms, and observability for human-supervised agent runs, this volume covers the design discipline of how users actually interact with agents through interfaces. Different audience (designers, design engineers, frontend developers). Different artifacts (interfaces, interaction specifications, component libraries). Same problem space.
The pattern of these three governance-adjacent and discipline-adjacent volumes is now explicit: most production agent products need engineering layers plus complementary disciplines that consume the engineering’s outputs. Volume 11 documents the compliance discipline that consumes Volume 8’s safety engineering. Volume 12 documents the security discipline that protects the infrastructure Volume 8’s safety mechanisms run on. Volume 13 documents the design discipline that determines how end users experience the agents Volume 7’s HITL mechanisms supervise. The catalog series has always treated engineering layers as primary; the adjacent volumes recognize that engineering produces outputs for specific audiences and the disciplines those audiences practice deserve explicit treatment.
Agent UX is a young and rapidly evolving discipline. The patterns documented in this volume were largely invented between 2023 and 2026 as agent products moved from research demos to production deployments and design teams discovered that traditional software UX patterns don’t handle agents well. Streaming UX as a first-class pattern emerged when ChatGPT shipped in late 2022. Generative UI as a coherent pattern emerged around the Vercel AI SDK’s generative UI primitives in 2024. Anthropic Artifacts and OpenAI Canvas introduced the side-panel pattern in 2024. Component libraries (assistant-ui, AI Elements, CopilotKit) emerged between 2024 and 2026 to package the patterns for reuse. The discipline is younger than the engineering disciplines documented in Volumes 1–10; the patterns are correspondingly less stable. What this volume captures is the working state of practice as of mid-2026, with the explicit understanding that the patterns will evolve faster than the engineering substrates do.
Scope
Coverage:
- Streaming and progressive disclosure: token streaming UX, intermediate state display (thinking, tool use).
- Generative UI and artifact patterns: Anthropic Artifacts, OpenAI Canvas, Vercel AI SDK generative UI.
- Approval and confirmation UX: pre-action approval patterns, undo-first design alternatives.
- Citation, sources, and confidence: source attribution patterns, uncertainty display.
- Long-running task UX: status and progress patterns, notification and resume patterns.
- Component libraries and SDKs: Vercel AI SDK UI, assistant-ui, AI Elements (shadcn-style), CopilotKit.
- Multi-agent visualization and dashboards.
- Discovery and design community resources.
Out of scope:
- Volume 7’s engineering of HITL approval gates and observability mechanisms. This volume cites them as inputs.
- Visual design fundamentals --- typography, color theory, layout grids. The established design literature covers these comprehensively.
- Accessibility for AI specifically. Important and emerging as a discipline; deserves separate treatment beyond this volume’s scope.
- Brand and personality design for AI products. Cited where it intersects with interaction patterns; not the primary focus.
- Mobile-specific agent UX. The patterns largely transfer from desktop; the mobile-specific dimensions warrant their own treatment.
- Voice and multimodal interfaces beyond text. The patterns are emerging but not yet stable enough for catalog treatment.
How to read this catalog
Part 1 (“The Narratives”) is conceptual orientation: why this volume sits adjacent to Volume 7; the five characteristics that distinguish agent UX from traditional software UX; the chat-is-the-UI fallacy and the working alternatives; transparency vs. cognitive load and the progressive disclosure pattern that resolves it; and the approval-undo-resume pattern that structures most agent-user interactions. Five diagrams sit in Part 1.
Part 2 (“The Substrates”) is reference material organized by section. Each section opens with a short essay on what its entries have in common. Representative substrates appear in the Fowler-style template established by the prior twelve volumes, adapted for design patterns and component libraries, which have different characteristic attributes than engineering substrates.
Part 1 — The Narratives
Five short essays frame the design space for agent UX. The reference entries in Part 2 assume the vocabulary established here.
Chapter 1. Why This Volume Sits Adjacent to Volume 7
Volume 7 covered Human-in-the-Loop: the engineering of approval gates, intervention APIs, queue management, and observability instrumentation for agent runs that humans supervise. The artifacts were code, configuration, and architecture decisions. The audience was engineers building agent runtimes. The question the volume answered was how to build a system that humans can supervise.
This volume covers a different question with overlapping concerns: how do humans actually work with the agent? The mechanisms Volume 7 documents --- the approval queues, the intervention APIs, the trace pipelines --- produce interfaces that users see and interact with. The interface design is its own discipline, with its own practitioners, its own vocabulary, and its own substrates (component libraries, design patterns, interaction specifications). The engineering team builds the substrate; the design team determines what users experience. Both functions need each other; neither produces a complete product alone.
The pattern echoes Volumes 11 and 12: these three governance-and-discipline-adjacent volumes recognize that engineering work has audiences beyond the engineering team itself. Volume 11 documents the compliance audience consuming Volume 8’s safety engineering. Volume 12 documents the security audience protecting the infrastructure that Volume 8’s mechanisms run on. Volume 13 documents the design audience that turns Volume 7’s HITL engineering into user experiences. In each case, the engineering layer is necessary but not sufficient; the adjacent discipline turns the engineering into something users actually benefit from. The catalog’s growth into these adjacent volumes reflects production reality: the teams that ship successful agent products invest in both engineering and adjacent disciplines, and the disciplines deserve explicit vocabulary.
This volume’s entries are interaction patterns and component libraries rather than infrastructure substrates. The Fowler-style template adapts: products like Anthropic Artifacts, the Vercel AI SDK, and assistant-ui have intent, motivating problems, mechanism, and applicability the way infrastructure tools do; patterns like streaming UX or progressive disclosure have those properties differently but recognizably. The reader of this volume is expected to coordinate with designers and frontend engineers, knowing what patterns and tools they’re working with and what trade-offs the patterns embody. The vocabulary documented here supports that coordination.
Chapter 2. What’s Different About Agent UX
Five characteristics distinguish agent UX from traditional software UX. Each requires patterns that don’t have direct equivalents in non-AI software. Understanding the characteristics is a prerequisite for understanding why agent UX needed new patterns and why the patterns documented in Part 2 take the shape they do.
Nondeterminism is the first characteristic. Traditional software is deterministic: the same input produces the same output. Users build mental models of software behavior on this assumption. AI models are stochastic: the same prompt produces different outputs across runs, sometimes substantially different. UX patterns must communicate this uncertainty rather than promise consistency. Confidence indicators, the option to regenerate responses, explicit disclaimers about variability, plurality (showing multiple alternatives rather than a single answer when appropriate) are emerging patterns. The traditional pattern of presenting software output as authoritative --- the calculator’s answer, the search engine’s top result --- doesn’t fit agents, where the output is one possible response from a distribution of plausible responses.
Latency is the second characteristic. Traditional software responds in milliseconds; users expect immediate feedback for direct manipulation. Agent responses take 1–60+ seconds for typical operations, longer for complex multi-step tasks. The latency is visible to users in ways that fast software’s latency isn’t. UX patterns fill the gap: streaming responses (showing tokens as they generate), intermediate state display (showing thinking, tool calls, retrieval), progress indicators for longer tasks, asynchronous handoff (“I’ll work on this in the background”) for tasks that exceed reasonable wait times. The traditional pattern of presenting only the final result, with brief loading indicators for the wait, doesn’t fit agents where the wait is the dominant experience and what happens during the wait shapes the user’s perception.
Decision-making is the third characteristic. Traditional software performs actions the user explicitly directs: open this file, send this message, run this calculation. AI agents make decisions: which tool to call, what to retrieve, what to include in the response, how to phrase the answer. Users build mental models of agent behavior partly by seeing the decisions; surfacing decisions is a UX requirement traditional software didn’t have. Tool-use visibility, citation of sources, exposure of reasoning steps, plan-before-execute UI patterns all serve this function. The right level of decision exposure depends on the audience and context; Chapter 4 covers the transparency-vs-cognitive-load trade-off in detail.
Fallibility is the fourth characteristic. Traditional software fails in known ways with explicit error states; users learn to recognize and recover from the failure modes. AI agents are routinely wrong in ways that aren’t flagged as failures: hallucinations confidently delivered, misunderstandings of what the user wanted, wrong tool selection, incorrect synthesis of correct retrieved information. UX patterns must support correction (the user disagrees and wants the agent to revise), undo (the agent took an action the user wants reversed), error states that distinguish agent failure modes (model error, tool error, refusal, hallucination), and graceful degradation (the agent encountered partial failure but produced some useful output). Section C covers approval and undo patterns; the broader fallibility-aware design extends beyond explicit approval gates.
Agency is the fifth characteristic. Traditional software doesn’t act on its own; it does what the user directly commands and stops between commands. AI agents have agency: they call tools, modify state, take actions in external systems, sometimes act without specific user direction. UX patterns must communicate boundaries --- what the agent can do, what it cannot do, what it has done, what it intends to do next. The line between user and agent shifts; the UX must make the current position of that line legible. Sandbox visualization, capability display (“I can access your email but not your bank”), action preview (“I’m about to send this email to these three people”) all serve this function. The traditional pattern of treating the software as a passive instrument the user wields doesn’t fit agents whose agency is part of their value proposition.
Chapter 3. The Chat-Is-The-UI Fallacy
The default interface for AI agents has been chat: a text input where the user types, a scrolling conversation history where the response appears, optionally some affordances for new conversations or settings. ChatGPT established the pattern in late 2022; most consumer-facing AI products since have used variations of it. The chat interface is the default for understandable reasons --- it’s simple to build, flexible across many use cases, and familiar from messaging applications. The pattern also turns out to be the wrong interface for many tasks, and the design discipline’s recognition of this is shaping the next generation of agent products.
Chat works well for tasks that naturally fit linear dialog. Open-ended questions, brainstorming and ideation, iterative refinement of text, quick lookups and explanations, exploratory data analysis, conversational customer service --- these tasks have a linear back-and-forth structure where each turn builds on the previous one. The chat metaphor isn’t a constraint; it’s a fit. Users with these tasks have largely successful experiences with chat-based AI; the interface doesn’t fight the work.
Chat fails for tasks needing structure. Tabular data --- spreadsheets, lists, comparisons --- is poorly served by a linear text format that requires the user to scroll back through chat history to compare items. Spatial information --- maps, diagrams, layouts --- doesn’t fit a linear format at all. Structured editing --- forms, configurations, code with explicit fields --- wants direct manipulation rather than describing changes to be made. Persistent reference material --- documents, dashboards, ongoing work artifacts --- needs to stay visible while the user iterates rather than scroll off as the conversation continues. Multi-step workflows with parallel state --- multiple sub-tasks proceeding simultaneously --- don’t serialize into a single linear conversation cleanly. Complex code with file-tree structure --- most real software projects --- needs the file tree visible alongside the code. In each case, chat’s linear text format fights the task, and users feel the friction even when they can’t articulate it.
The working alternatives recognize that the interface shape should match the task. Generative UI --- where the AI generates structured interfaces (forms, tables, charts, custom widgets) rather than just text --- fits tasks where the answer has structure the AI can express in interface elements directly. The Vercel AI SDK’s generative UI primitives, OpenAI’s structured outputs, Anthropic’s tool use with structured arguments all enable this pattern. Side panels (Anthropic Artifacts, OpenAI Canvas) keep persistent work artifacts visible alongside the chat where iteration happens. Inline action affordances embed suggested actions directly in chat responses, reducing the friction of moving from “the agent suggested this” to “do this.” Embedded agents place the AI inside the existing application (CopilotKit’s pattern, the embedded agent in many SaaS products) rather than in a separate chat interface, so the work artifacts and the AI coexist.
The design discipline’s position in 2026: choose the interface shape from the task, not from the chat default. Many production agent products use chat as one of several interface modes rather than the primary or only interface. The pattern is still evolving; the chat-only pattern is identifiably the 2023 design; production deployments are moving past it as the patterns documented in Section B mature.
Chapter 4. Transparency vs. Cognitive Load
Agents do many things behind the scenes. They think (chain-of-thought reasoning, sometimes explicit, sometimes hidden). They plan (deciding which steps to take). They call tools (searching, retrieving, executing code). They evaluate (checking results, deciding if more work is needed). They synthesize (combining retrieved information, prior context, and reasoning into the response). Each of these is a candidate for the UI to show or hide. The design choice between showing and hiding produces a trade-off the discipline calls transparency vs. cognitive load.
Hide-everything fails because users don’t trust outcomes they can’t inspect. “The AI said so” isn’t a satisfying answer for any non-trivial task; users want to know how the AI arrived at the answer, especially when the answer matters. Errors are invisible until they cause visible harm. Trust erodes over time as users discover cases where the AI was wrong but appeared confident. The chat interface’s default in 2023–2024 was close to hide-everything: just show the final response, with optional access to the reasoning behind expandable controls. The pattern doesn’t survive production deployment in domains where the agent’s decisions matter.
Show-everything fails for a different reason. The full agent trace is voluminous: reasoning tokens, tool calls with arguments, retrieved documents, intermediate hypotheses, self-corrections. Showing everything overwhelms users; signal gets lost in noise. Cognitive load exceeds the capacity of most users to extract useful information. Power users with specific debugging needs benefit from the full trace; general users are harmed by it. The early developer-tools-for-AI products tended toward show-everything (LangSmith’s default trace view, Phoenix’s detailed spans); the user-facing products have moved away from this pattern.
Progressive disclosure is the dominant pattern that resolves the trade-off. Summary always (the final answer is the primary content), reasoning on demand (one click reveals the thinking and tool calls), full trace for power users (a deeper click or toggle reveals the complete trace). Anthropic Claude’s extended-thinking blocks default to collapsed; users can expand them when they want to see the reasoning. OpenAI ChatGPT’s reasoning is similarly visible behind a click. Claude.ai’s side-panel artifacts work with the chat’s conversation, providing structured work outputs alongside the chat’s linear response. The pattern lets different audiences get different levels of disclosure from the same interface: casual users see clean answers; power users dig in; debuggers get the full trace.
The pattern requires choosing what counts as summary, what counts as on-demand detail, and what counts as power-user trace. The choices vary by domain. For coding agents, the summary is the proposed code change; on-demand detail is the explanation of why; power-user trace is the full sequence of tool calls (file reads, test runs, search queries). For research agents, the summary is the synthesized answer; on-demand detail is the list of sources with their relevance scores; power-user trace is each individual search query and reranking step. For customer service agents, the summary is the response to the customer; on-demand detail is the policy lookup and the customer history; power-user trace is the full reasoning chain. The design discipline’s working observation: production agent UIs typically support multiple disclosure levels users can toggle, rather than picking one disclosure level for all users.
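The three disclosure levels can be sketched as a small filter over the items an agent run produces. This is a minimal illustration, not any product's actual data model: the type names, the `codingAgentPolicy` mapping, and the item kinds are all assumptions chosen to mirror the coding-agent breakdown above.

```typescript
// Hypothetical names throughout; a sketch of progressive disclosure as a filter.
type DisclosureLevel = 'summary' | 'detail' | 'trace';

// Levels are ordered: each level shows everything the previous one shows.
const LEVEL_RANK: Record<DisclosureLevel, number> = { summary: 0, detail: 1, trace: 2 };

interface TraceItem {
  kind: string; // e.g. 'code_change', 'explanation', 'tool_call'
  text: string;
}

// A per-domain policy: the lowest level at which each kind of item is shown.
type DisclosurePolicy = Record<string, DisclosureLevel>;

// Example policy for a coding agent, following the breakdown in the text.
const codingAgentPolicy: DisclosurePolicy = {
  code_change: 'summary',
  explanation: 'detail',
  tool_call: 'trace',
};

function visibleItems(
  items: TraceItem[],
  policy: DisclosurePolicy,
  level: DisclosureLevel,
): TraceItem[] {
  return items.filter(item => {
    // Unknown kinds default to the deepest level so they never leak into the summary.
    const minLevel = policy[item.kind] ?? 'trace';
    return LEVEL_RANK[minLevel] <= LEVEL_RANK[level];
  });
}

const run: TraceItem[] = [
  { kind: 'tool_call', text: 'read_file src/app.ts' },
  { kind: 'tool_call', text: 'run_tests' },
  { kind: 'explanation', text: 'The handler never awaited the promise.' },
  { kind: 'code_change', text: 'Add await before fetchUser()' },
];

// The same run rendered for three audiences.
const forCasualUser = visibleItems(run, codingAgentPolicy, 'summary');
const forCuriousUser = visibleItems(run, codingAgentPolicy, 'detail');
const forDebugger = visibleItems(run, codingAgentPolicy, 'trace');
```

Keeping the policy as data rather than hard-coding it in components is one way to let the same interface serve different domains; a research agent would supply a different mapping over the same three levels.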
Chapter 5. The Approval-Undo-Resume Pattern
Three phases of agent-user interaction recur across virtually every production agent UX: approval before action, undo after action, resume after pause. The three phases have different design considerations and trade-offs; the right design depends on the specific application but the framework recurs.
Approval before action is the pattern Volume 7’s HITL engineering serves: the user reviews what the agent intends before the action executes. The UX patterns are richer than Volume 7’s engineering suggests. Pre-action approval (the simplest pattern: agent proposes an action, user clicks confirm before it runs) works for low-frequency high-impact operations. Batched approval (the user reviews N proposed actions together rather than one at a time) reduces approval fatigue for medium-frequency operations. Progressive trust (the agent acts without approval for patterns the user has explicitly approved in prior sessions; new patterns trigger approval) adapts to the user’s growing trust over time. Risk-scaled approval (only sensitive operations require explicit approval; routine operations execute without it) reduces friction for the common case while protecting the high-stakes operations. The right pattern depends on the operation’s reversibility, the user’s expertise level, and the frequency of operations.
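The approval variants above can be combined in a single routing decision. The sketch below is one possible policy, not a prescribed one: the three risk tiers, the rule that irreversible actions always ask, and all names (`ApprovalPolicy`, `partitionForReview`, the `pattern` keys) are assumptions made for illustration.

```typescript
// Hypothetical types; the risk tiers and the trust rule are illustrative policy choices.
type Risk = 'routine' | 'sensitive' | 'irreversible';

interface ProposedAction {
  pattern: string; // a stable key for "this kind of action", e.g. 'calendar.create_event'
  risk: Risk;
}

class ApprovalPolicy {
  // Patterns the user has explicitly approved in earlier sessions.
  private trusted = new Set<string>();

  requiresApproval(action: ProposedAction): boolean {
    if (action.risk === 'routine') return false;     // risk-scaled: routine actions run freely
    if (action.risk === 'irreversible') return true; // never auto-approve what cannot be undone
    return !this.trusted.has(action.pattern);        // progressive trust for sensitive actions
  }

  // Call when the user confirms an action and opts to trust the pattern going forward.
  recordApproval(action: ProposedAction): void {
    if (action.risk !== 'irreversible') this.trusted.add(action.pattern);
  }
}

// Batched approval: split proposals into those that can run now
// and those the user reviews together in one step.
function partitionForReview(actions: ProposedAction[], policy: ApprovalPolicy) {
  const autoRun: ProposedAction[] = [];
  const review: ProposedAction[] = [];
  for (const action of actions) {
    (policy.requiresApproval(action) ? review : autoRun).push(action);
  }
  return { autoRun, review };
}
```

In this sketch the first sensitive action of a given pattern lands in the review batch; after `recordApproval`, later actions with the same pattern run without a prompt, while irreversible actions keep asking regardless.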
Undo after action is often a better pattern than approval before action. Undo-first design (act fast; allow rollback when something goes wrong) produces lower friction for routine operations than approval-first design. The user doesn’t have to think about every action; only the rare action that needs reversal interrupts the flow. The pattern also forces engineering of reversibility --- the engineering team has to think about how to undo each operation --- which catches design issues that approval-first design lets slide. Snapshot-and-restore patterns work for stateful operations where the state before the action can be captured and restored. Soft delete with grace period (the action is staged and visible as undoable for a window of time before becoming permanent) gives users a low-stakes way to reverse without explicit approval gates. Action history with selective revert allows users to undo specific actions from a list rather than just the most recent one.
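Action history with selective revert can be sketched over a simple key-value store. The class and field names here are hypothetical, and the conflict rule (refuse to revert when a later action has since changed the same key) is one assumption among several reasonable choices.

```typescript
// Hypothetical sketch: each change records a snapshot of the prior value
// so that any single action can be reverted later.
type Settings = Record<string, string>;

interface HistoryEntry {
  id: number;
  key: string;
  before: string | undefined; // snapshot of the value prior to the action
  after: string;
  reverted: boolean;
}

class UndoableSettings {
  private state: Settings = {};
  private history: HistoryEntry[] = [];
  private nextId = 1;

  // Act immediately, and record what is needed to reverse the action.
  set(key: string, value: string): number {
    const entry: HistoryEntry = {
      id: this.nextId++,
      key,
      before: this.state[key],
      after: value,
      reverted: false,
    };
    this.state[key] = value;
    this.history.push(entry);
    return entry.id;
  }

  // Selective revert: undo one specific action, not just the most recent one.
  revert(id: number): boolean {
    const entry = this.history.find(e => e.id === id);
    if (!entry || entry.reverted) return false;
    // Refuse if a later action has since changed the same key; restoring the
    // old snapshot would silently discard that newer change.
    if (this.state[entry.key] !== entry.after) return false;
    if (entry.before === undefined) delete this.state[entry.key];
    else this.state[entry.key] = entry.before;
    entry.reverted = true;
    return true;
  }

  get(key: string): string | undefined {
    return this.state[key];
  }

  listHistory(): readonly HistoryEntry[] {
    return this.history;
  }
}
```

Writing `revert` forces the question the paragraph raises: what does reversal mean when later actions depend on the one being undone? This sketch answers by refusing; a real product might instead offer a cascading undo or a merge prompt.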
Resume after pause is the pattern for long-running tasks and sessions that span multiple user visits. Conversation persistence across sessions is the baseline: the user can return to a conversation they started yesterday and continue. Notification on long-running task completion bridges the gap between starting a task and seeing the result when the wait is too long for synchronous interaction. Resume-from-checkpoint for failed runs lets the user pick up after a failure rather than starting over. Shareable session links for handoff support collaboration where one user starts a session and another continues it. What gets persisted determines what’s resumable: conversation history is the floor; tool state and partial results are valuable for many tasks; user preferences within the session are nice-to-have for personalization.
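Resume-from-checkpoint can be sketched as a runner that records how far it got and what it had produced when a step failed. The `Checkpoint` shape, the `runFrom` function, and the three example steps are assumptions for illustration; a real system would persist the checkpoint rather than hold it in memory.

```typescript
// Hypothetical sketch of resume-from-checkpoint for a multi-step task.
interface Checkpoint {
  nextStep: number;  // index of the first step that has not completed
  results: string[]; // partial results from the steps that did complete
}

// A step receives the results so far and returns its own result, or throws.
type Step = (previous: string[]) => string;

interface RunOutcome {
  done: boolean;
  checkpoint: Checkpoint;
  error?: string;
}

function runFrom(steps: Step[], checkpoint: Checkpoint): RunOutcome {
  const results = [...checkpoint.results];
  let index = checkpoint.nextStep;
  for (const step of steps.slice(index)) {
    try {
      results.push(step(results));
    } catch (e) {
      // Stop at the failing step; everything before it is preserved.
      const message = e instanceof Error ? e.message : String(e);
      return { done: false, checkpoint: { nextStep: index, results }, error: message };
    }
    index++;
  }
  return { done: true, checkpoint: { nextStep: index, results } };
}

// Example: the second step fails once, then succeeds on resume.
let searchCalls = 0;
let failOnce = true;
const researchSteps: Step[] = [
  () => { searchCalls++; return 'sources'; },
  () => {
    if (failOnce) { failOnce = false; throw new Error('timeout'); }
    return 'summary';
  },
  previous => previous.join('+'),
];

const firstAttempt = runFrom(researchSteps, { nextStep: 0, results: [] });
const resumed = runFrom(researchSteps, firstAttempt.checkpoint);
```

The point of the example is that the search step runs once across both attempts: what gets persisted in the checkpoint determines what the user does not have to wait for again.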
The patterns compose in production. Most production agent UX uses approval for irreversible operations (sending external emails, deleting records, making purchases), undo for reversible ones (editing documents, changing settings, drafting), and resume for long-running tasks (research that takes minutes to hours, complex multi-step workflows). The discipline isn’t to pick one pattern; it’s to match each operation type to the right pattern. The design failures occur when teams use the same pattern uniformly: approval-first for everything produces friction; undo-first for irreversible operations produces disasters; ignoring resume leaves users losing work when sessions end.
Part 2 — The Substrates
Eight sections follow. Each opens with a short essay on what its entries have in common. Representative substrates are presented in the Fowler-style template, adapted for design patterns and component libraries.
Sections at a glance
- Section A --- Streaming and progressive disclosure
- Section B --- Generative UI and artifact patterns
- Section C --- Approval and confirmation UX
- Section D --- Citation, sources, and confidence
- Section E --- Long-running task UX
- Section F --- Component libraries and SDKs
- Section G --- Multi-agent and visualization
- Section H --- Discovery and design communities
Section A — Streaming and progressive disclosure
Token streaming UX and intermediate state display
Agent responses take time. Token streaming --- displaying response tokens as they generate rather than waiting for the complete response --- was the canonical UX response when ChatGPT shipped in late 2022, and it remains the baseline pattern. The discipline has matured beyond simple token streaming: intermediate state display now shows thinking tokens, tool calls, retrieved documents, and other process steps in addition to the final response. The design challenge is choosing what intermediate state to show and how, balancing transparency with cognitive load (Chapter 4).
Token streaming UX
Source: Server-Sent Events (SSE) over HTTP; WebSockets; vendor SDKs (Anthropic Messages API streaming, OpenAI Chat Completions streaming)
Classification: The baseline pattern: display response tokens as they generate rather than waiting for completion.
Intent
Reduce perceived latency for agent responses by streaming tokens to the UI as they generate, producing the feeling of an agent typing in real time rather than waiting silently and then dumping a complete response.
Motivating Problem
Agent responses can take seconds to a minute for typical operations. Waiting silently for the complete response, then displaying it all at once, produces a worse user experience than showing the response as it generates, even when the total wall-clock time is identical. Users tolerate streaming latency much better than silent latency: the visible token-by-token generation reads as progress; the silent wait reads as a frozen application. Token streaming was an immediate consensus pattern in 2022–2023 and remains the baseline for any agent UI with response times longer than a second.
How It Works
Server-side: the agent’s response is generated token-by-token (the way LLMs natively work) and pushed to the client over a streaming connection. Server-Sent Events (SSE) is the standard transport for HTTP-based streaming --- a long-lived HTTP connection where the server can push events to the client. WebSockets is the alternative for bidirectional communication.
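The SSE wire format itself is small enough to sketch end to end. The example below shows a server-side encoder and an incremental client-side parser for the `event:` and `data:` fields. It is a simplified illustration: it handles only LF line endings and ignores other fields such as `id:` and `retry:`, and the names `SseEvent`, `encodeSse`, and `SseParser` are hypothetical. In a browser, the built-in `EventSource` or a vendor SDK would normally do this work.

```typescript
// Simplified sketch of text/event-stream framing (LF line endings only).
interface SseEvent {
  event: string;
  data: string;
}

// Server side: serialize one event. Multi-line data becomes multiple "data:" lines,
// and a blank line marks the end of the event.
function encodeSse(e: SseEvent): string {
  const dataLines = e.data.split('\n').map(line => `data: ${line}`).join('\n');
  return `event: ${e.event}\n${dataLines}\n\n`;
}

// Client side: network chunks can split an event anywhere, so incomplete input
// is buffered until the blank line that terminates the event arrives.
class SseParser {
  private buffer = '';

  push(chunk: string): SseEvent[] {
    this.buffer += chunk;
    const events: SseEvent[] = [];
    let end = this.buffer.indexOf('\n\n');
    while (end !== -1) {
      const block = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      let event = 'message'; // default type when no "event:" field is present
      const data: string[] = [];
      for (const line of block.split('\n')) {
        if (line.startsWith('event:')) event = line.slice(6).trim();
        else if (line.startsWith('data:')) data.push(line.slice(5).replace(/^ /, ''));
      }
      // Events with no data lines are not dispatched.
      if (data.length > 0) events.push({ event, data: data.join('\n') });
      end = this.buffer.indexOf('\n\n');
    }
    return events;
  }
}
```

A token stream then becomes a sequence of small events, each carrying a fragment of the response, and the client appends each fragment to the rendered message as it is parsed.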
Client-side: the UI receives tokens as they arrive and appends them to the rendered response. The user sees the response building up character-by-character (or word-by-word, depending on the granularity). Markdown formatting is applied progressively (a streaming markdown renderer handles incomplete markdown gracefully).
Vendor SDK support: Anthropic’s Messages API and OpenAI’s Chat Completions API both support streaming as a first-class mode. The SDKs (anthropic-sdk-python, openai-python, ai (Vercel’s SDK)) handle the streaming protocol details; application code typically calls a streaming method and iterates over the resulting async stream.
Granularity choices: streaming at the token level produces a fast-typewriter effect; streaming at the word boundary produces smoother output; streaming at the sentence boundary trades the typewriter feel for cleaner partial output. Most production UIs stream at the token level, debounced or smoothed in the UI to avoid jitter.
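Word-boundary streaming can be sketched as a small buffer between the token stream and the renderer. The class name `WordBoundaryBuffer` and its two methods are hypothetical; the sketch assumes whitespace is an adequate word delimiter, which does not hold for every language.

```typescript
// Hypothetical sketch: hold back the trailing partial word until it is complete.
class WordBoundaryBuffer {
  private pending = '';

  // Accept a raw token; return the text that is now safe to display (possibly empty).
  push(token: string): string {
    this.pending += token;
    // Index of the last whitespace character, or -1 if there is none yet.
    const lastSpace = this.pending.search(/\s\S*$/);
    if (lastSpace === -1) return '';
    const cut = lastSpace + 1;
    const ready = this.pending.slice(0, cut);
    this.pending = this.pending.slice(cut);
    return ready;
  }

  // Call when the stream ends to release whatever is still held back.
  flush(): string {
    const rest = this.pending;
    this.pending = '';
    return rest;
  }
}

// Tokens rarely align with words; the buffer re-chunks them.
const tokens = ['Str', 'eam', 'ing ', 'is ', 'smo', 'oth', '.'];
const wordBuffer = new WordBoundaryBuffer();
const displayed = tokens.map(t => wordBuffer.push(t)).filter(chunk => chunk !== '');
const tail = wordBuffer.flush();
```

The trade-off is visible in the example: the user sees nothing until the first space arrives, in exchange for never seeing a half-rendered word.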
Edge cases: errors mid-stream require graceful handling (the user has seen partial output and the error means more won’t arrive); cancellation requires the client to be able to close the stream and the server to detect the closure; reconnection after network interruption is rarely implemented because most cases are short enough that retry-from-scratch is acceptable.
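One way to keep these edge cases manageable on the client is to model the stream as a small state machine, so that partial output survives an error or cancellation and late events are ignored. The state shape and event names below are assumptions for illustration, not any SDK's types.

```typescript
// Hypothetical sketch: a reducer that tracks the lifecycle of one streamed response.
type StreamStatus = 'streaming' | 'done' | 'error' | 'cancelled';

interface StreamState {
  status: StreamStatus;
  text: string;          // partial or complete response text
  errorMessage?: string; // set only when status is 'error'
}

type StreamEvent =
  | { type: 'token'; text: string }
  | { type: 'finish' }
  | { type: 'error'; message: string }
  | { type: 'cancel' };

const initialState: StreamState = { status: 'streaming', text: '' };

function reduceStream(state: StreamState, event: StreamEvent): StreamState {
  // Once the stream has ended for any reason, late events are ignored.
  if (state.status !== 'streaming') return state;
  switch (event.type) {
    case 'token':
      return { ...state, text: state.text + event.text };
    case 'finish':
      return { ...state, status: 'done' };
    case 'error':
      // Keep the partial text so the UI can show it alongside the error.
      return { ...state, status: 'error', errorMessage: event.message };
    case 'cancel':
      return { ...state, status: 'cancelled' };
  }
}

// A stream that fails after two tokens; the trailing token arrives too late to count.
const failedStream: StreamEvent[] = [
  { type: 'token', text: 'Hel' },
  { type: 'token', text: 'lo' },
  { type: 'error', message: 'connection reset' },
  { type: 'token', text: ' world' },
];
const finalState = failedStream.reduce(reduceStream, initialState);
```

In a browser client the `cancel` event would typically be dispatched alongside aborting the underlying request (for example with an `AbortController` passed to `fetch`), which is what lets the server detect the closure.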
When to Use It
Essentially all user-facing agent interfaces. The pattern is so universally expected that its absence reads as a defect. Even applications that primarily produce structured output (generative UI, tool outputs) typically stream the natural-language portions of responses.
Alternatives --- non-streaming responses where the response is structured data rendered atomically (a generated JSON object, a complete chart) without intermediate text. Asynchronous patterns (Section E) where the response time is long enough that the UI shifts to background-task mode entirely.
Sources
- docs.claude.com/en/docs/build-with-claude/streaming
- platform.openai.com/docs/api-reference/streaming
- sdk.vercel.ai (Vercel AI SDK streaming primitives)
Example artifacts
Code.
// Token streaming with the Vercel AI SDK (React)
// Import paths and the useChat return shape have changed across AI SDK
// major versions; check them against the installed version.
'use client';

import { useChat } from 'ai/react';

export default function ChatComponent() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });

  return (
    <div className="chat">
      {messages.map(m => (
        <div key={m.id} className={`message ${m.role}`}>
          {/* m.content updates token-by-token as the response streams */}
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

// Server-side (Next.js Route Handler), in a separate file:
// app/api/chat/route.ts
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';

export async function POST(req: Request) {
  // The client sends the full message history on every request.
  const { messages } = await req.json();
  const result = await streamText({
    model: anthropic('claude-opus-4-7'),
    messages,
  });
  // Streams the response back in the format useChat expects.
  return result.toDataStreamResponse();
}
Intermediate state display (thinking, tool use, retrieval)
Source: Anthropic Claude thinking blocks; OpenAI reasoning visibility; LangChain/LangGraph streaming events; custom implementations
Classification: Display of the agent’s process (reasoning, tool calls, retrieval) in addition to the final response.
Intent
Show users what the agent is doing during the response generation --- not just the final answer, but the reasoning, tool calls, retrieved sources, and other process steps --- in a way that supports trust and debugging without overwhelming with cognitive load.
Motivating Problem
Pure token streaming shows the final response building up but not the process behind it. Users wait through silent gaps when the agent is thinking, retrieving, or calling tools. The gaps erode the streaming UX’s benefit and leave users uncertain whether the agent is making progress. Intermediate state display fills these gaps: the UI shows thinking tokens, tool call invocations, retrieved documents, evaluation steps as they happen, alongside the final response that streams when it’s ready. The design challenge is presenting the intermediate state without overwhelming the primary response --- progressive disclosure (Chapter 4) is the dominant pattern.
How It Works
Thinking blocks (Anthropic’s pattern): the agent generates reasoning tokens explicitly marked as thinking. The UI renders these in a visually distinct way (typically collapsed by default, in a muted color when expanded, with a label like “Thinking” or “Reasoning”). The thinking is part of the streaming response --- it streams in real time --- but it’s distinguished from the final answer the user is meant to consume.
Tool call visibility: when the agent calls a tool, the UI shows the tool invocation (which tool, with what arguments) and the tool’s result. Different products handle this differently. ChatGPT shows tool calls as expandable cards inline with the response. Claude.ai shows tool calls as smaller text in the response stream with the result inline. Cursor and other coding assistants show file-system operations and code edits as their own UI elements distinct from the chat. The right pattern depends on the audience and the tool’s nature.
Retrieved source display (RAG agents): when the agent retrieves documents, the UI shows what was retrieved --- typically a card or list of retrieved sources with titles, snippets, and relevance scores. The sources are then cited in the response (Section D). The display lets users verify the retrieved content makes sense for the question and gives them a starting point for digging deeper.
Plan display: for agents that plan before executing, the UI shows the plan --- the sequence of steps the agent intends to take --- before executing them. Users can review and override; the agent updates the plan as execution proceeds. The pattern is most useful for multi-step tasks where the plan has meaningful structure; it’s overhead for simple single-step tasks.
Streaming event protocols: the Vercel AI SDK’s data stream protocol, OpenAI’s streaming API events, and Anthropic’s streaming Messages API all support emitting structured events alongside the text stream. The events carry typed information (tool use, thinking, citations, errors; the exact event names differ by protocol) that the UI uses to render the appropriate intermediate state element.
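A minimal sketch of how a UI might fold typed stream events into renderable intermediate state. The event union (`AgentStreamEvent`), the view shape (`MessageView`), and `reduceStreamEvent` are hypothetical names for illustration; they are not the wire format of the Vercel, OpenAI, or Anthropic protocols, each of which defines its own events.

```typescript
// Hypothetical, simplified event union; real protocols use their own event names.
type AgentStreamEvent =
  | { type: 'thinking_delta'; text: string }
  | { type: 'tool_call'; id: string; tool: string; args: Record<string, unknown> }
  | { type: 'tool_result'; id: string; result: string }
  | { type: 'text_delta'; text: string };

interface ToolCallView {
  id: string;
  tool: string;
  args: Record<string, unknown>;
  result?: string; // undefined while the call is still running
}

interface MessageView {
  thinking: string; // rendered collapsed by default
  toolCalls: ToolCallView[]; // rendered as expandable cards
  answer: string; // the primary response
}

const emptyMessageView: MessageView = { thinking: '', toolCalls: [], answer: '' };

// Pure reducer: each incoming event produces a new view for the UI to render.
function reduceStreamEvent(view: MessageView, ev: AgentStreamEvent): MessageView {
  switch (ev.type) {
    case 'thinking_delta':
      return { ...view, thinking: view.thinking + ev.text };
    case 'tool_call':
      return {
        ...view,
        toolCalls: [...view.toolCalls, { id: ev.id, tool: ev.tool, args: ev.args }],
      };
    case 'tool_result':
      // Attach the result to the matching call; other calls are left untouched.
      return {
        ...view,
        toolCalls: view.toolCalls.map((c) =>
          c.id === ev.id ? { ...c, result: ev.result } : c
        ),
      };
    case 'text_delta':
      return { ...view, answer: view.answer + ev.text };
  }
}
```

Keeping the reducer pure means the same function can drive a live stream and a replay of a stored trace; the rendering layer decides which of the three fields to show expanded.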
When to Use It
Production agent UIs where users benefit from understanding the agent’s process. Coding agents (showing file operations matters). Research agents (showing retrieved sources matters). Complex multi-step agents (showing plans matters). High-stakes agents where trust depends on inspectability.
Alternatives --- hidden intermediate state for simple consumer applications where the user wants just the answer (a recipe suggestion, a quick lookup). The trade-off is that users who later want to debug why the agent gave a wrong answer have less to work with; production deployments often expose intermediate state as an opt-in (developer mode, debug view) rather than a default.
Sources
- docs.claude.com/en/docs/build-with-claude/extended-thinking
- sdk.vercel.ai/docs/ai-sdk-ui/streaming-data
- platform.openai.com/docs/guides/reasoning
Section B — Generative UI and artifact patterns
Anthropic Artifacts, OpenAI Canvas, Vercel AI SDK generative UI --- beyond chat
Chapter 3 argued that chat is the wrong interface for many tasks. The working alternatives emerged between 2024 and 2026: generative UI (the AI produces structured interface elements, not just text) and artifact panels (persistent work artifacts displayed alongside the chat where iteration happens). Two products and one developer toolkit dominate the space. Anthropic Artifacts established the side-panel pattern for documents, code, and structured outputs. OpenAI Canvas provided a similar pattern with different specific design choices. Vercel AI SDK’s generative UI primitives turned the pattern into a developer primitive: agents stream React components into the UI as part of the response, mixing prose and structured widgets fluidly.
Anthropic Artifacts
Source: claude.ai (consumer product feature, launched 2024)
Classification: Side-panel pattern for persistent work artifacts (documents, code, diagrams, interactive HTML).
Intent
Provide a dedicated UI surface for substantial work products the agent generates --- documents, code, interactive HTML, diagrams --- displayed alongside the chat where the user iterates on them, distinct from the conversation history where chat-style interaction happens.
Motivating Problem
Generated content like documents, code, and visualizations doesn’t fit the linear chat format. Users want to see the work artifact prominently, iterate on it, copy it, edit it; users don’t want to scroll through the chat history to find it after a few more turns. The artifact pattern resolves this: the chat stays on one side for the conversation; the artifact stays visible on the other side, updated as the user requests changes, with affordances for preview, code view, copy, and download. The pattern has become widely adopted; OpenAI Canvas, generative UI patterns, and many other implementations follow Anthropic’s original design.
How It Works
Trigger: the model decides when content warrants an artifact (typically: substantial documents over a few hundred words, code blocks longer than 20 lines, structured content like tables or diagrams, interactive content like HTML/React/SVG). The decision is partly model-driven and partly application-controlled; system prompts can adjust the threshold.
Display: the artifact appears in a side panel (typically on the right, occupying roughly 60% of the viewport on desktop) while the chat remains in a narrower column on the left. The artifact is the focus when generated; the chat scrolls below the original message that produced it without re-displaying the artifact content inline.
Iteration: subsequent turns can update the artifact. The agent regenerates it or applies targeted edits. The artifact panel shows the latest version; version history may be available depending on the product. The user can edit the artifact directly in some cases (text documents, code) or only through chat in others.
Preview vs source view: artifacts that render visually (HTML, SVG, React components) show in preview mode by default with a toggle for source view. Code artifacts show source by default with a copy button. Document artifacts render with formatting.
Sharing and persistence: artifacts can be shared via link, downloaded, or published in some cases (Claude.ai’s public artifacts feature allows hosting interactive artifacts on the web). The artifact becomes a first-class object beyond the ephemeral chat conversation.
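The trigger and iteration behavior described above can be sketched from the application side. This is an illustration only: the thresholds, type names, and functions below are assumptions (the 20-line figure comes from the text, the 300-word figure stands in for "a few hundred words"), and none of it reflects how Claude.ai implements Artifacts internally.

```typescript
type GeneratedKind = 'prose' | 'code' | 'html' | 'svg' | 'table';

interface GeneratedContent {
  kind: GeneratedKind;
  body: string;
}

// Assumed thresholds; a real product would tune these or defer to the model.
const ARTIFACT_MIN_WORDS = 300;
const ARTIFACT_MIN_CODE_LINES = 20;

// Application-controlled half of the trigger decision.
function shouldOpenArtifact(content: GeneratedContent): boolean {
  switch (content.kind) {
    case 'html':
    case 'svg':
    case 'table':
      return true; // renderable or structured content always gets the panel
    case 'code':
      return content.body.split('\n').length > ARTIFACT_MIN_CODE_LINES;
    case 'prose':
      return content.body.trim().split(/\s+/).length > ARTIFACT_MIN_WORDS;
  }
}

// The panel shows the last entry; earlier entries form the version history.
interface ArtifactState {
  id: string;
  versions: string[];
}

// Each later turn appends a version instead of overwriting the previous one.
function updateArtifact(artifact: ArtifactState, newBody: string): ArtifactState {
  return { ...artifact, versions: [...artifact.versions, newBody] };
}

function latestVersion(artifact: ArtifactState): string {
  return artifact.versions[artifact.versions.length - 1] ?? '';
}
```

Appending versions rather than replacing the body keeps history available at little cost, which matters once users start asking the agent to "go back to the earlier draft".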
When to Use It
Substantial generated content the user wants to iterate on or use elsewhere: documents to copy into other tools, code to integrate into a project, interactive demonstrations, structured outputs. Conversational AI products where chat is the right primary interface but generated content needs its own home.
Alternatives --- inline content within the chat for short generated outputs that fit naturally there. Fully generative UI (Vercel AI SDK pattern) where the entire interface adapts to the response rather than a fixed chat-plus-side-panel layout. Embedded agents (CopilotKit pattern) where the AI lives inside the existing application rather than producing artifacts in a separate AI interface.
Sources
- claude.ai (product)
- anthropic.com/news/artifacts (announcement)
OpenAI Canvas
Source: chatgpt.com (consumer product feature, launched 2024)
Classification: OpenAI's implementation of the side-panel artifact pattern.
Intent
Provide a dedicated workspace for document and code editing within ChatGPT, with chat-driven and direct-manipulation editing both supported, designed to let users iterate on substantial work products without losing them in chat history.
Motivating Problem
The same problem Anthropic Artifacts addresses: substantial generated content doesn’t fit chat well, and users want a dedicated workspace for it. OpenAI’s implementation makes different design choices in some places while sharing the overall pattern.
How It Works
Activation: Canvas opens automatically for substantial documents or code, or when the user explicitly requests it. Like Artifacts, the trigger is partly model-driven; system prompts adjust the threshold.
Layout: when active, a workspace area takes over most of the view and the chat condenses to a narrower column beside it. The workspace shows the document or code with editing affordances.
Direct editing: Canvas allows direct editing of the workspace content (typing in the document, modifying code) alongside chat-driven editing. This makes it more like a collaborative editor than purely a display surface.
Targeted edits: the model can apply targeted edits to specific parts of the workspace rather than regenerating the whole document. This pattern reduces friction for iterative refinement and matches how human writers and developers actually work.
Inline suggestions: for some content types, Canvas shows inline suggestions (proposed edits the user can accept or reject one at a time) rather than only chat-driven changes. The pattern echoes Cursor’s code editing UX applied to documents.
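Targeted edits and accept/reject suggestions can be modeled as offset-based replacements over the workspace text. The sketch below is a generic illustration, not Canvas's actual data model; `EditSuggestion` and `applyAcceptedSuggestions` are hypothetical names, and the code assumes suggestions do not overlap.

```typescript
interface EditSuggestion {
  id: string;
  start: number; // inclusive character offset in the original document
  end: number; // exclusive character offset
  replacement: string;
}

// Apply only the suggestions the user accepted. Edits are applied from the end
// of the document backwards so that earlier offsets stay valid.
// Assumes suggestions do not overlap.
function applyAcceptedSuggestions(
  doc: string,
  suggestions: EditSuggestion[],
  acceptedIds: string[]
): string {
  const accepted = suggestions
    .filter((s) => acceptedIds.indexOf(s.id) !== -1)
    .sort((a, b) => b.start - a.start);
  let result = doc;
  for (const s of accepted) {
    result = result.slice(0, s.start) + s.replacement + result.slice(s.end);
  }
  return result;
}
```

Rejected suggestions are simply left out of `acceptedIds`, so the rest of the document is never regenerated; that is the friction reduction the targeted-edit pattern aims for.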
When to Use It
ChatGPT-based workflows for document and code creation where iterative editing matters. Use cases similar to Anthropic Artifacts but within the OpenAI product ecosystem.
Alternatives --- Artifacts in the Anthropic Claude product. Pure chat for short outputs. Direct integrations into existing tools (Word, Google Docs, VS Code) where the editor is the primary surface.
Sources
- chatgpt.com (product)
- openai.com/index/introducing-canvas/ (announcement)
Vercel AI SDK generative UI
Source: sdk.vercel.ai (open source, TypeScript/React/Svelte/Vue/Solid)
Classification: Developer primitive for streaming React components from the server as part of agent responses.
Intent
Enable applications to stream interactive UI components from server-side AI responses rather than only text (React Server Components in the pattern described here; the SDK’s other UI bindings cover Svelte, Vue, and Solid), so the agent can return structured interface elements (forms, cards, charts, custom widgets) that the user interacts with directly.
Motivating Problem
The artifact pattern from Anthropic and OpenAI is product-specific --- it works within Claude.ai or ChatGPT. Developers building their own AI products need primitives for generating structured UI from agent responses without building the artifact panel infrastructure themselves. Vercel AI SDK’s generative UI fills this gap: server-side AI calls return React components alongside text; the components stream to the client and render inline; users interact with them as native UI elements rather than parsing JSON or markdown.
How It Works
Server-side: the application defines tools or function calls that return React Server Components (RSC) rather than text. When the AI decides to invoke one of these (“the user wants to see a chart of their orders”), the server renders the component and streams it to the client.
Client-side: the client uses Vercel AI SDK’s UI primitives (useUIState, useActions) to manage streaming React components in the chat flow. Components render inline with the conversation; users interact with them like any other React UI element.
Composability: generated UI can compose with the rest of the application. A streaming chart can use the same charting library the rest of the app uses; a generated form can submit to the same backend the rest of the app submits to. The pattern reduces the inside-the-AI-bubble feeling of standalone chat interfaces.
Interaction patterns: the AI can generate not just static UI but interactive UI --- forms that submit back to the AI, charts users can drill into, configurators users adjust. The interaction triggers new AI responses, producing a richer back-and-forth than text-only chat permits.
Trade-off: generative UI requires more upfront design effort than chat. The developer has to define what components the AI can generate and what they do; the design of those components determines the user experience. The benefit is interfaces that match tasks (Chapter 3); the cost is that the developer has to think through the interface design rather than letting chat’s linear text be the default.
When to Use It
Applications building agent interfaces from scratch where chat alone is insufficient. Cases where the AI’s outputs have structure that maps to UI components naturally (charts, forms, lists, custom widgets). React/Vue/Svelte applications where the UI framework supports the generative pattern.
Alternatives --- chat-with-artifacts (Anthropic/OpenAI pattern) for cases where the product is primarily AI-conversational with occasional structured outputs. Embedded AI in existing applications (CopilotKit pattern) for cases where the application is primary and AI augments specific surfaces. Pure chat for cases where the conversation is the product.
Sources
- sdk.vercel.ai/docs/ai-sdk-ui/generative-user-interfaces
- vercel.com/blog/introducing-vercel-ai-sdk-3
- github.com/vercel/ai
Example artifacts
Code.
// Generative UI with Vercel AI SDK (Next.js + RSC)
// app/actions.tsx
'use server';

// streamUI is the RSC helper that streams components; newer SDK releases
// ship it in '@ai-sdk/rsc', so check your installed version.
import { streamUI } from 'ai/rsc';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
import { OrderChart } from '@/components/order-chart';
// Assumed app-specific data helper; not part of the SDK.
import { fetchOrderData } from '@/lib/orders';

export async function submitMessage(userMessage: string) {
  const result = await streamUI({
    model: anthropic('claude-opus-4-7'),
    messages: [{ role: 'user', content: userMessage }],
    // Plain text from the model is wrapped in a component too
    text: ({ content }) => <p>{content}</p>,
    tools: {
      showOrderChart: {
        description: "Display a chart of the user's orders.",
        parameters: z.object({
          period: z.enum(['7d', '30d', '90d']),
        }),
        // generate (rather than execute) returns a React component, not text
        generate: async function* ({ period }) {
          yield <p>Loading orders...</p>;
          const data = await fetchOrderData(period);
          return <OrderChart data={data} period={period} />;
        },
      },
    },
  });
  // result.value is the streamed UI; the client renders it
  // inline with the agent's response
  return result.value;
}
Section C — Approval and confirmation UX
Pre-action approval, undo-first design, progressive trust
Volume 7 covered the engineering of HITL approval gates: the queue mechanisms, intervention APIs, callback hooks. The UX side has its own discipline. Different approval patterns produce different user experiences with different trade-offs. The wrong approval pattern produces friction (confirmation fatigue, slow workflows) or risk (irreversible actions executed without review); the right pattern matches the operation’s reversibility, the user’s expertise level, and the frequency of operations.
Pre-action approval patterns
Source: Pattern documented across many production agent UIs; engineering substrate covered in Volume 7
Classification: UX patterns for reviewing agent intentions before execution.
Intent
Provide users with visibility into what the agent is about to do, with explicit opportunity to confirm, modify, or cancel before the action executes, scaled to the operation’s significance and frequency.
Motivating Problem
For operations the agent shouldn’t execute without user knowledge --- sending external communications, modifying important data, making purchases, deleting content --- the user needs an opportunity to review what the agent intends before it happens. The naive design (confirm every action) produces friction that destroys the agent’s value proposition. The more nuanced design scales approval to the operation’s significance: routine read operations execute without approval; consequential operations get explicit review; sensitive operations may require step-up authentication. The design discipline involves deciding which operations get approval gates and what those gates look like.
How It Works
Pre-action confirmation: the agent describes what it’s about to do (“I’ll send this email to alice@example.com with this subject and body”); the user clicks confirm to proceed or modify to revise. The pattern is clearest for irreversible operations with high blast radius. Most production agent UIs use this pattern for at least some operations.
Batched approval: instead of confirming each action separately, the agent presents N proposed actions together and the user approves the batch. The pattern reduces approval fatigue for medium-frequency operations where one-at-a-time confirmation would be tedious. Cursor’s code edit approval shows multiple file changes in a diff view the user accepts together. The pattern works when the actions are independent enough that batch approval is meaningful and not so consequential that they each need individual review.
Progressive trust: the agent auto-approves patterns the user has explicitly approved in prior sessions; new patterns trigger approval. The pattern adapts to the user’s growing trust over time. ChatGPT’s memory feature uses something similar: facts the user has explicitly let the AI remember don’t require re-confirmation in future sessions. The design challenge is making the trust adjustments visible (the user can see what’s been auto-approved) and revocable (the user can withdraw trust).
Risk-scaled approval: only sensitive operations require explicit approval; routine operations execute without it. The agent classifies operations by risk (often using a separate classifier or rule-based heuristics); the UI shows the classification (a badge or label indicating sensitivity) so the user understands why approval was or wasn’t required. The pattern works when the risk classification is reliable; misclassification of sensitive operations as routine creates the same risk as no approval gates.
Approval interface design: the approval UI itself matters. Showing what will happen, in concrete detail, is more useful than abstract descriptions (“Send email to alice@example.com” vs. “Send email”). Showing what the user can change in the approval step (“edit the message before sending”) reduces the friction of “approve or restart entirely.” Showing the consequences of approving and not approving (“if approved, the message goes; if denied, nothing happens”) makes the choice concrete.
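Risk-scaled approval and progressive trust combine naturally into one decision function. The sketch below is a rule-based stand-in for whatever classifier a product actually uses; the risk levels, the `ProposedAction` fields, and the rules themselves are assumptions chosen for illustration.

```typescript
type RiskLevel = 'routine' | 'consequential' | 'sensitive';
type ApprovalDecision = 'auto_execute' | 'confirm' | 'confirm_with_step_up';

interface ProposedAction {
  kind: string; // e.g. 'read_calendar', 'send_email'
  reversible: boolean;
  external: boolean; // touches systems outside the product
  financial: boolean;
}

// Hypothetical rule-based classifier; the rules are illustrative only.
function classifyRisk(action: ProposedAction): RiskLevel {
  if (action.financial) return 'sensitive';
  if (!action.reversible || action.external) return 'consequential';
  return 'routine';
}

// trustedKinds lists action kinds the user has explicitly chosen to
// auto-approve (progressive trust). Sensitive actions are never auto-approved.
function decideApproval(
  action: ProposedAction,
  trustedKinds: string[]
): { risk: RiskLevel; decision: ApprovalDecision } {
  const risk = classifyRisk(action);
  if (risk === 'sensitive') {
    return { risk, decision: 'confirm_with_step_up' };
  }
  if (risk === 'consequential' && trustedKinds.indexOf(action.kind) === -1) {
    return { risk, decision: 'confirm' };
  }
  return { risk, decision: 'auto_execute' };
}
```

Returning the risk level alongside the decision lets the UI show the badge that explains why approval was or was not requested, and storing `trustedKinds` as plain data keeps the trust list easy to display and revoke.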
When to Use It
Irreversible operations with significant blast radius. Operations affecting external systems where the user can’t easily undo through the agent. Sensitive operations (financial transactions, content with privacy implications). High-stakes operations where the cost of a wrong action exceeds the cost of friction.
Alternatives --- undo-first design for reversible operations. Auto-execution for low-stakes routine operations. The discipline is matching the pattern to the operation, not picking one pattern for everything.
Sources
- Production agent UIs (Cursor, Claude Code, ChatGPT, Claude.ai documented patterns)
- Volume 7 (engineering substrate)
Undo-first design
Source: Pattern documented in agent and non-agent software design; cited as alternative to approval-first in agent UX literature
Classification: UX pattern of allowing fast action with rollback rather than slow confirmed action.
Intent
Let the agent act on routine operations without explicit approval, with a clear path to undo if the action is wrong. Reduce friction for the common case while preserving the user’s control over outcomes.
Motivating Problem
Approval-first design produces friction proportional to the frequency of operations. For high-frequency low-stakes operations, requiring approval for each one destroys the agent’s productivity benefit; users learn to click confirm reflexively (which defeats the purpose of confirmation) or stop using the agent for these operations. Undo-first design takes the opposite approach: act fast, allow rollback, design the system so reversal is easy and well-supported. The pattern produces better UX for operations where reversal is feasible; it requires engineering investment in the reversibility itself.
How It Works
Snapshot-and-restore: before performing a stateful operation, the agent captures the relevant state. After the operation, the user can choose to restore the previous state if the result wasn’t what they wanted. The pattern works for document edits (preserve the prior version), settings changes (capture the prior value), and other operations with capturable state.
Soft delete with grace period: destructive operations (delete, remove) are staged rather than executed immediately. The UI shows the item as removed but actually preserves it for a window (minutes to days) during which the user can restore. Email systems have long used this pattern (the trash folder, and the “Undo Send” feature, which holds a message briefly before delivery); agent UIs apply it to broader operations.
Action history with selective revert: the agent maintains a log of actions taken; the user can review the log and revert specific actions selectively. The pattern allows undoing not just the most recent action but earlier ones too, which is valuable when the user doesn’t notice the wrong action immediately.
Engineering investment: undo-first design requires the operations to be reversible, which is engineering work the team has to do. Some operations are inherently irreversible (external emails sent, payments made, content posted to public channels); for these, approval-first is the only viable pattern. The discipline is identifying which operations are reversible and engineering the reversibility well; the design follows from the engineering capability.
Combining patterns: most production agent UIs use undo-first for reversible operations and approval-first for irreversible ones, with the classification often visible to users (“you can undo this” vs. “this cannot be undone”). The combination handles both cases well; using either pattern exclusively produces failures at the boundary.
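Snapshot-and-restore and action history with selective revert can share one small data structure. The sketch below uses a settings store as the example; `UndoableSettings` and its methods are hypothetical, and the conflict rule (refuse to revert if a later action changed the same key) is one possible policy among several.

```typescript
interface LoggedAction {
  id: number;
  description: string;
  key: string;
  before: string | undefined; // snapshot taken before the change
  after: string;
  reverted: boolean;
}

class UndoableSettings {
  private state: { [key: string]: string } = {};
  private log: LoggedAction[] = [];
  private nextId = 1;

  read(key: string): string | undefined {
    return this.state[key];
  }

  // Snapshot the prior value, then apply the change immediately (no confirmation).
  change(key: string, value: string, description: string): number {
    const action: LoggedAction = {
      id: this.nextId++,
      description,
      key,
      before: this.state[key],
      after: value,
      reverted: false,
    };
    this.state[key] = value;
    this.log.push(action);
    return action.id;
  }

  // Selective revert: restore one action's snapshot, but only if the current
  // value is still the one that action wrote. Otherwise a later change would
  // be silently overwritten, so the revert is refused.
  revert(actionId: number): boolean {
    const action = this.log.filter((a) => a.id === actionId)[0];
    if (!action || action.reverted) return false;
    if (this.state[action.key] !== action.after) return false;
    if (action.before === undefined) delete this.state[action.key];
    else this.state[action.key] = action.before;
    action.reverted = true;
    return true;
  }

  // Copy of the log for an action-history view.
  history(): LoggedAction[] {
    return this.log.slice();
  }
}
```

The `description` field is what the history view would show ("Changed theme to dark"), and a refused revert is the point where the UI should explain the conflict rather than fail silently.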
When to Use It
Reversible operations where the engineering of reversibility is feasible. High-frequency operations where approval friction would destroy the agent’s value proposition. Document editing, settings management, content creation where the user iterates and doesn’t want every iteration to require confirmation.
Alternatives --- approval-first design for irreversible operations. Hybrid patterns where the action executes immediately with a brief undo window before becoming permanent.
Sources
- Jef Raskin, The Humane Interface (the original undo-first articulation)
- Email Undo Send feature design retrospectives
- Modern agent UX literature citing the pattern as preferred for reversible operations
Section D — Citation, sources, and confidence
Source attribution and uncertainty display --- making agents inspectable
Agent outputs come from somewhere: training data, retrieved documents, tool results, prior conversation context. Citation patterns make the somewhere visible. The discipline draws on academic citation practice but adapts to AI specifics: citations need to be clickable to the source (not just bibliographic references), citations may apply to specific claims rather than entire responses, and the agent may not always know which source supported which claim. Confidence and uncertainty display extends citation: when the agent doesn’t have a confident answer, the UI must communicate this rather than presenting uncertain claims as authoritative.
Citation patterns for AI responses
Source: Pattern across Anthropic Claude (citations in responses), OpenAI ChatGPT (sources in search results), Perplexity (citations as first-class UX), Google Gemini
Classification: UX patterns for attributing AI claims to specific sources.
Intent
Make the sources of agent claims visible and verifiable, so users can check the agent’s work, understand which claims are supported by retrieved sources vs. model knowledge, and dig deeper into specific sources when needed.
Motivating Problem
Agent responses synthesize information from multiple sources: retrieved documents, training data, tool results. Without explicit citation, users can’t distinguish supported claims from hallucinated ones; can’t check the agent’s synthesis against the originals; can’t dig deeper into specific points. Citation makes the sources visible and verifiable. The design challenge is presenting citations without overwhelming the response --- inline footnote-style references, hover-to-preview source snippets, source lists with relevance indicators all serve the function with different trade-offs.
How It Works
Inline citation markers: the response includes superscript numbers or links at specific points; the citations resolve to the sources that supported the claim at that point. Perplexity’s UX is the canonical implementation; many other products follow similar patterns. The user can hover or click for source detail without leaving the response.
Source lists: the response is followed (or accompanied) by a list of sources with titles, snippets, URLs. The user can see at a glance what informed the response and explore specific sources. Anthropic’s citations in Claude API responses use this structured pattern; the application renders the list in whatever format fits the product.
Span-level citation: more granular than inline markers, span-level citation marks the specific text spans in the response that came from each source. The pattern is more accurate but visually denser; production UIs typically use it as an opt-in (toggle to see span-level citations) rather than a default.
Citation reliability: the citation’s accuracy depends on the agent’s ability to attribute claims correctly to sources. Modern foundation models (Claude with citations enabled, GPT-4 with structured output for citations) have improved reliability; older patterns where the model generated free-form citations often produced citations to documents that didn’t actually support the claim. The pattern works when the engineering produces reliable citations; it fails badly when the citations are wrong because users trust them.
Citation in retrieved content: the position of the supporting passage within the source matters. “Source X says Y” should ideally point to the specific passage in X that supports Y, not just X as a whole. The deep-link pattern (cite to the specific paragraph or passage) provides this; it requires the retrieval system to preserve location information.
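Span-level citation data can drive both inline markers and a source list. The sketch below assumes the agent or API has already produced character offsets for each cited span; the `CitationSpan` and `CitedSource` shapes are hypothetical and do not mirror any specific provider's citation format.

```typescript
interface CitedSource {
  title: string;
  url: string;
}

interface CitationSpan {
  start: number; // inclusive offset into the response text
  end: number; // exclusive offset
  sourceIndex: number; // index into the sources array
}

// Insert a [n] marker at the end of each cited span. Spans are processed from
// the last one backwards so earlier offsets remain valid after insertion.
function renderInlineMarkers(text: string, spans: CitationSpan[]): string {
  const ordered = spans.slice().sort((a, b) => b.end - a.end);
  let out = text;
  for (const s of ordered) {
    out = out.slice(0, s.end) + '[' + (s.sourceIndex + 1) + ']' + out.slice(s.end);
  }
  return out;
}

// Numbered source list matching the inline markers.
function renderSourceList(sources: CitedSource[]): string {
  return sources
    .map((s, i) => '[' + (i + 1) + '] ' + s.title + ' - ' + s.url)
    .join('\n');
}
```

Because the spans keep `start` as well as `end`, the same data supports the opt-in span-level view (highlight the range) and the default marker view (show only the number).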
When to Use It
RAG-based agents where the agent’s value comes partly from synthesizing retrieved sources. Research and analysis tools where source verification matters. Customer-facing agents where users need to trust the agent’s claims. Regulated domains (legal, medical, financial) where claims need provenance.
Alternatives --- no citation for purely conversational agents where the model’s general knowledge is the source. Source lists without inline citations for cases where the response is a synthesis from multiple sources rather than discrete claims attributable to specific ones.
Sources
- perplexity.ai (UX reference)
- docs.claude.com/en/docs/build-with-claude/citations
- Academic citation practice (general background)
Confidence and uncertainty display
Source: Pattern across multiple products; emerging discipline less consolidated than citation
Classification: UX patterns for communicating agent uncertainty about claims and decisions.
Intent
Communicate to users when the agent is uncertain about claims, decisions, or actions, so users can apply appropriate skepticism and verify or override as needed.
Motivating Problem
Agents often produce confident-sounding output regardless of whether the underlying reasoning is well-supported. Users learn to read all AI output as authoritative because the model’s confidence calibration in generated text doesn’t reliably distinguish certain claims from uncertain ones. Explicit uncertainty display --- “I’m not sure about this part”, “This is my best guess but I’d verify”, confidence levels for specific claims --- helps users apply appropriate skepticism. The pattern is less developed than citation; the engineering of reliable uncertainty estimation in LLMs is still maturing, and UX patterns for displaying uncertainty are correspondingly less consolidated.
How It Works
Explicit uncertainty language: the simplest pattern is the model expressing uncertainty in natural language (“I’m not certain, but…”, “my best guess is…”, “I don’t know”). The pattern depends on the model’s calibration --- if it expresses uncertainty randomly rather than tracking actual confidence, the cue is unreliable. Modern foundation models are reasonably well-calibrated for this pattern; system prompts can reinforce it.
Confidence scores: numeric confidence indicators alongside specific claims. The pattern requires the agent to produce confidence estimates, which is harder than producing uncertain-sounding language. Specific tooling (perplexity-based confidence, token-probability-based confidence, ensemble approaches) can produce numeric confidence; the UX displays it as a percentage, a scale, or a categorical indicator (“high confidence”, “medium”, “low”).
Disagreement display: when the agent has multiple plausible answers, showing the disagreement is more honest than picking one and presenting it as confident. Patterns include showing multiple alternatives (with their relative likelihoods), explicit disagreement statements (“sources differ on this”), or branching responses that let the user explore alternatives.
Refusal patterns: when the agent shouldn’t answer (insufficient information, safety constraint, out-of-scope question), the refusal itself is a UX element. Production patterns: clear acknowledgment of what the user asked, explanation of why the agent isn’t answering, suggestions for what the user could do instead. The pattern is well-established in safety-critical contexts (medical, legal); it’s less developed in general consumer agents.
Verification suggestions: when the agent is uncertain, suggesting how the user could verify --- “check the official documentation,” “confirm with your accountant,” “search for recent updates” --- turns uncertainty into actionable guidance. The pattern combines uncertainty acknowledgment with practical next steps.
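As one illustration of the token-probability approach to confidence scores, the sketch below turns per-token log-probabilities into a categorical indicator. The cut-offs are assumptions, and mean token probability is only a rough proxy for correctness; a production system would calibrate any such mapping against labeled outcomes before showing it to users.

```typescript
type ConfidenceBand = 'high' | 'medium' | 'low';

// Geometric mean of token probabilities, computed from log-probabilities.
// Returns 0 for an empty input so that "no evidence" maps to low confidence.
function meanTokenProbability(tokenLogprobs: number[]): number {
  if (tokenLogprobs.length === 0) return 0;
  let sum = 0;
  for (const lp of tokenLogprobs) sum += lp;
  return Math.exp(sum / tokenLogprobs.length);
}

// Assumed cut-offs for illustration; real thresholds need calibration.
function toConfidenceBand(probability: number): ConfidenceBand {
  if (probability >= 0.9) return 'high';
  if (probability >= 0.6) return 'medium';
  return 'low';
}

// Convenience wrapper for the UI layer.
function confidenceForClaim(tokenLogprobs: number[]): ConfidenceBand {
  return toConfidenceBand(meanTokenProbability(tokenLogprobs));
}
```

Showing the band rather than the raw number avoids implying more precision than the estimate has, which is the same reasoning behind the categorical indicators mentioned above.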
When to Use It
Agents handling high-stakes claims where misplaced confidence has costs (medical advice, legal guidance, financial decisions). Research and analysis tools where users need to apply judgment to AI output. Customer-facing agents where overconfident wrong answers damage trust more than acknowledged uncertainty does.
Alternatives --- no explicit uncertainty for low-stakes conversational agents where the costs of wrong answers are bounded. Implicit uncertainty (the model’s natural language tends to express uncertainty when present) without explicit numeric or categorical indicators.
Sources
- Calibration literature in ML
- Production agent UX in regulated domains (medical, legal AI products)
Section E — Long-running task UX
Status and progress patterns; notification and resume
Some agent tasks take minutes or hours: deep research that spans many tool calls, code generation across many files, complex multi-step workflows. Synchronous waiting doesn’t fit; the user shouldn’t stare at a loading spinner for 20 minutes. The UX patterns for long-running tasks resolve this: status indicators that communicate progress without forcing the user to wait, notifications that surface completion when the user has moved on, resume patterns that let the user pick up tasks after a pause. The discipline draws on patterns from non-AI long-running task UX (CI/CD pipelines, video rendering, batch processing) and adapts them to agent specifics.
Status and progress UX for long-running agent tasks
Source: Pattern across deep research products (Anthropic Claude with web search and research, OpenAI Deep Research, Perplexity Pro), agent platforms (LangGraph Studio, agent dashboards)
Classification: UX patterns for displaying agent progress during long-running tasks.
Intent
Communicate progress on tasks that take longer than reasonable synchronous wait times, in ways that let users monitor without forcing constant attention and that surface relevant state without overwhelming with raw trace data.
Motivating Problem
When agent tasks take more than 10 to 20 seconds, the synchronous wait pattern breaks down. Users either stare at the screen unhappily or task-switch and forget about the agent. Status and progress UX bridges this gap: the user can monitor the task without watching it, switch contexts and return later, see what’s happening at the level of detail appropriate for the moment. The design challenge is choosing what level of detail to surface and how to communicate progress when the agent’s work doesn’t have a natural percentage-complete metric.
How It Works
Step-level status: the agent’s work is broken into named steps (planning, searching, reading, analyzing, writing); the UI shows which step is currently active and which have completed. The pattern works when the steps have clear boundaries and stable structure. Anthropic’s Research feature and similar products use this pattern: “Researching… reading 5 sources… analyzing findings… writing summary…”.
Activity feed: a scrolling feed of what the agent is doing in real time --- tool calls, retrievals, decisions --- typically at higher granularity than step-level status. The pattern lets users dip in for detail; the feed isn’t the primary content but is available. LangGraph Studio and developer-focused agent UIs use this pattern.
Progress estimates: when the task has predictable structure, the UI can estimate progress (“step 3 of 5”, “approximately 60% complete”, “estimated time remaining: 2 minutes”). The estimates are imprecise for agent tasks where the work depends on what the agent discovers; communicating the imprecision (“approximately”, “estimated”) is important to set expectations.
Streaming partial results: even when the final result isn’t ready, the agent can stream partial results as they become available. A research agent can stream early findings while continuing to research; a code agent can stream completed files while still working on others. The pattern keeps the user engaged with the work as it builds up rather than waiting for atomic completion.
Background mode: for tasks too long for any reasonable wait, the UI shifts to background mode. The task continues; the user can navigate elsewhere; the UI shows the task in a list of in-progress work; completion triggers notification. The pattern is essential for hour-scale tasks; it works less well for minute-scale tasks where the user would naturally wait.
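A minimal TypeScript sketch of step-level status combined with a hedged progress estimate. The names (`AgentStep`, `buildSteps`, `describeProgress`) are hypothetical and not taken from any product; the step labels reuse the examples above, and weighting every step equally is an assumption that real agent tasks rarely satisfy.

```typescript
// Hypothetical types for illustration; not taken from any specific product.
type StepState = "pending" | "active" | "done";

interface AgentStep {
  label: string;
  state: StepState;
}

// Mark the step at activeIndex as active, earlier steps done, later steps pending.
function buildSteps(labels: string[], activeIndex: number): AgentStep[] {
  return labels.map((label, i): AgentStep => ({
    label,
    state: i < activeIndex ? "done" : i === activeIndex ? "active" : "pending",
  }));
}

// Produce a one-line status. The wording is deliberately hedged ("approximately")
// because equal step weighting is only a rough proxy for real progress.
function describeProgress(steps: AgentStep[]): string {
  const done = steps.filter((s) => s.state === "done").length;
  const active = steps.filter((s) => s.state === "active")[0];
  if (!active) {
    return done === steps.length ? "Complete" : "Waiting to start";
  }
  const percent = Math.round((done / steps.length) * 100);
  return `Step ${done + 1} of ${steps.length}: ${active.label} (approximately ${percent}% complete)`;
}

const researchSteps = buildSteps(
  ["Planning", "Searching", "Reading", "Analyzing", "Writing"],
  2
);
console.log(describeProgress(researchSteps));
// → Step 3 of 5: Reading (approximately 40% complete)
```

The same step list can back both a compact status line and an expandable activity feed; only the level of detail rendered changes.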
When to Use It
Agent tasks taking longer than ~20 seconds. Deep research workflows. Complex code generation across many files. Multi-step automation tasks. Anywhere the synchronous wait pattern would damage UX.
Alternatives --- synchronous wait with streaming for tasks short enough that the user reasonably waits. Pure asynchronous (fire-and-forget) for batch tasks where no progress display is meaningful.
Sources
- Anthropic Claude research feature documentation
- OpenAI Deep Research product UX
- LangGraph Studio (developer agent observability)
Notification and resume patterns for sessions
Source: Pattern across consumer AI products (ChatGPT, Claude.ai, Gemini), professional tools (Cursor, agent platforms)
Classification: UX patterns for completion notification and continuation across sessions.
Intent
Let users start agent tasks and return to them later --- across sessions, across devices, after the task completes --- with the agent’s state preserved and the user’s context restored.
Motivating Problem
Volume 6 covered the engineering of memory and state persistence. The UX dimension determines whether users actually benefit from the persistence. Notification on completion lets users move on while a task runs and surfaces the result when ready. Conversation persistence lets users return to in-progress conversations across days. Resume-from-checkpoint lets users pick up after failures. Shareable session links support collaboration. The patterns transform agents from chat-now sessions into work-over-time tools.
How It Works
Completion notifications: when a long-running task completes, the user gets notified through their preferred channel --- in-app banner, push notification, email, sometimes SMS for high-stakes tasks. The notification surfaces enough information to know what completed without requiring the user to open the app. Production patterns: “Your research on X is ready”, “Claude finished refactoring 12 files”, “Your analysis completed with these findings.”
Conversation persistence: conversations stay available indefinitely; the user can return to a conversation from days or weeks ago and continue. The pattern depends on engineering (Volume 6) but the UX matters too: how is the conversation surfaced (sidebar, history page, search), how is it titled (auto-generated from content, editable by the user), how is it organized (chronological, by topic, by project)?
Resume-from-checkpoint: when a task fails or is interrupted, the user can resume from where it left off rather than starting over. The UX surfaces the partial state (“this task got through 6 of 10 steps before failing”) and offers explicit resume actions. The engineering depends on state persistence (Volume 6) and idempotent steps (Volume 1 patterns).
Shareable session links: a session can be linked, shared via URL, and continued by another user. The pattern supports collaboration (one team member starts research, another continues) and handoff scenarios. ChatGPT’s shared conversations and Claude.ai’s shared chats both implement this pattern. The design considerations: what gets shared (the conversation, the artifacts, the underlying tools), what permissions apply (read-only, fork, continue), how the recipient sees the shared content.
Cross-device continuity: starting a session on one device and continuing on another. The pattern is increasingly important as users move between phone, tablet, and desktop. The UX requires careful attention to interface adaptation (what works in the phone form factor may not match what works on desktop) but the underlying engineering is conversation persistence applied across devices.
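The resume-from-checkpoint surface described above can be sketched as a pure function from persisted state to the message and action the UI offers. This is a hedged illustration: `TaskCheckpoint`, `ResumeOffer`, and `buildResumeOffer` are hypothetical names, and the sketch assumes steps are idempotent so that re-running the first unfinished step is safe.

```typescript
// Hypothetical persisted shape; real state stores will carry more detail.
type CheckpointStepState = "completed" | "failed" | "pending";

interface TaskCheckpoint {
  taskId: string;
  steps: { label: string; state: CheckpointStepState }[];
}

interface ResumeOffer {
  message: string;
  canResume: boolean;
  resumeFromIndex: number; // index of the first step to re-run, or -1 if none
}

function buildResumeOffer(checkpoint: TaskCheckpoint): ResumeOffer {
  const total = checkpoint.steps.length;
  const completed = checkpoint.steps.filter((s) => s.state === "completed").length;
  const anyFailed = checkpoint.steps.filter((s) => s.state === "failed").length > 0;

  // Resume at the first step that has not completed (failed or never started).
  let resumeFromIndex = -1;
  checkpoint.steps.forEach((s, i) => {
    if (resumeFromIndex === -1 && s.state !== "completed") resumeFromIndex = i;
  });

  if (resumeFromIndex === -1) {
    return { message: `All ${total} steps completed.`, canResume: false, resumeFromIndex };
  }

  const reason = anyFailed ? "failing" : "being interrupted";
  return {
    message: `This task got through ${completed} of ${total} steps before ${reason}.`,
    canResume: true,
    resumeFromIndex,
  };
}
```

Keeping the offer as plain data lets the same result drive an in-app banner, a push notification, or an email without duplicating the wording logic.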
When to Use It
Any agent product where users have tasks longer than a single session. Professional tools where work spans multiple work sessions. Consumer products with research, analysis, or creation workflows that benefit from across-session continuity. Collaboration scenarios where multiple users interact with the same agent session.
Alternatives --- ephemeral sessions for low-commitment use cases (quick lookups, casual queries). The trade-off is users can’t return to previous work; for the casual use cases that’s often fine.
Sources
- Production AI products (ChatGPT, Claude.ai, Gemini) documented patterns
- Cursor and Claude Code (professional agent tools) UX
Section F — Component libraries and SDKs
Vercel AI SDK UI, assistant-ui, AI Elements, CopilotKit
Building agent UIs from scratch requires implementing many of the patterns this volume documents: streaming, generative UI, citation, approval, status. Component libraries and SDKs package the patterns into reusable primitives, accelerating development and standardizing the user experience across products. Four substrates dominate as of 2026. Vercel AI SDK’s UI primitives are the most widely adopted, packaged with the AI SDK’s server-side capabilities. assistant-ui is a React library focused specifically on AI chat UIs with thread management, attachments, and tool calls. AI Elements (shadcn-style) provides composable React components for AI interfaces following the shadcn pattern. CopilotKit is a React framework for embedding AI inside existing applications rather than standalone chat.
The choice among these depends on the application shape. Vercel AI SDK UI fits Next.js applications and applications already using the AI SDK’s server-side primitives. assistant-ui fits applications wanting a polished standalone chat experience with rich features. AI Elements fits teams already using shadcn and wanting AI-specific extensions in the same style. CopilotKit fits applications where AI is embedded into existing UIs rather than the primary interface.
Vercel AI SDK UI primitives
Source: sdk.vercel.ai (open source, npm: ai, @ai-sdk/react, @ai-sdk/vue, @ai-sdk/svelte, @ai-sdk/solid)
Classification: UI primitives for AI applications, packaged with Vercel AI SDK's server-side capabilities.
Intent
Provide React, Vue, Svelte, and Solid hooks and components for building AI applications, with first-class support for streaming, tool calling, generative UI, and multi-turn conversations.
Motivating Problem
Building AI UIs requires implementing many patterns from scratch: streaming state management, tool call rendering, generative UI composition, message history persistence. The Vercel AI SDK’s UI primitives package these into framework-specific hooks and components: useChat for chat interfaces, useCompletion for single-turn completion, useObject for streaming structured objects, and useUIState/useActions (from the SDK’s React Server Components module) for generative UI. Each hook handles a category of AI UI work that would otherwise require significant from-scratch implementation.
How It Works
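useChat hook: the canonical primitive. In the SDK’s API through version 4, it returns messages (the conversation history), input (the current input value with controlled updates), handleSubmit (form submission), isLoading (whether a response is in progress), and stop (cancel an in-progress response). The hook handles the streaming protocol, message state, and error handling. AI SDK 5 changed this return shape: the hook no longer manages input state, and sendMessage and status take the place of handleSubmit and isLoading, so check the reference for the installed version.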
useCompletion hook: simpler variant for single-turn completions that don’t need conversation history. Returns completion (the streaming output), input, handleSubmit, isLoading, stop. The pattern fits applications where the AI is consulted for specific tasks rather than carrying on conversations.
useObject hook: streams structured objects (validated against a Zod schema) rather than free-form text. The pattern fits cases where the AI’s output has known structure that the UI can render progressively as it streams.
Generative UI primitives: useUIState and useActions support the generative UI pattern from Section B. Server-side actions return React components; the components stream to the client and render inline; the UI state tracks the conversation including the generated components. These primitives live in the SDK’s React Server Components module and are React-specific, unlike the hooks above.
Framework support: the SDK supports React (Next.js, plain React), Vue, Svelte, and Solid. The hooks and components have parallel implementations in each framework, with consistent APIs across them. The pattern reduces the cost of building AI UIs in non-React frameworks where AI library support has historically been weaker.
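To make concrete what a chat hook manages on the application's behalf, here is a framework-free sketch of the state transitions behind fields like `messages`, `input`, and `isLoading`. It is an illustration only, not the SDK's implementation or API: `ChatSketchState`, `ChatSketchEvent`, and `chatReducer` are hypothetical names, and a real hook also handles the network stream, errors, and cancellation.

```typescript
interface SketchMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
}

interface ChatSketchState {
  messages: SketchMessage[];
  input: string;
  isLoading: boolean;
}

type ChatSketchEvent =
  | { type: "input"; value: string } // user typed in the composer
  | { type: "submit" }               // user sent the current input
  | { type: "token"; text: string }  // one streamed chunk arrived
  | { type: "finish" }               // stream ended normally
  | { type: "stop" };                // user cancelled the response

const initialChatState: ChatSketchState = { messages: [], input: "", isLoading: false };

function chatReducer(state: ChatSketchState, event: ChatSketchEvent): ChatSketchState {
  switch (event.type) {
    case "input":
      return { ...state, input: event.value };
    case "submit": {
      // Ignore empty submissions and submissions while a response is streaming.
      if (state.isLoading || state.input.trim() === "") return state;
      const n = state.messages.length;
      return {
        messages: [
          ...state.messages,
          { id: `m${n}`, role: "user", content: state.input },
          { id: `m${n + 1}`, role: "assistant", content: "" }, // filled as tokens stream
        ],
        input: "",
        isLoading: true,
      };
    }
    case "token": {
      if (!state.isLoading) return state; // drop tokens that arrive after stop/finish
      const text = event.text;
      const lastIndex = state.messages.length - 1;
      return {
        ...state,
        messages: state.messages.map((m, i) =>
          i === lastIndex ? { ...m, content: m.content + text } : m
        ),
      };
    }
    case "finish":
    case "stop":
      return { ...state, isLoading: false };
  }
}

const chatEvents: ChatSketchEvent[] = [
  { type: "input", value: "hi" },
  { type: "submit" },
  { type: "token", text: "Hel" },
  { type: "token", text: "lo" },
  { type: "finish" },
];
const finalChatState = chatEvents.reduce(chatReducer, initialChatState);
console.log(finalChatState.messages.map((m) => `${m.role}: ${m.content}`).join(" | "));
// → user: hi | assistant: Hello
```

Every one of these transitions is code a team would otherwise write and test per framework, which is the cost the packaged hooks remove.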
When to Use It
Next.js applications building AI interfaces (the SDK is Vercel-native and integrates particularly well). React, Vue, Svelte, Solid applications wanting framework-native AI primitives. Cases where the application uses Vercel AI SDK’s server-side capabilities and wants matching UI capabilities.
Alternatives --- assistant-ui for cases where a more polished standalone chat experience is the goal. CopilotKit for embedded AI patterns. Direct integration with vendor SDKs (Anthropic, OpenAI) for cases where the SDK’s abstractions don’t fit.
Sources
- sdk.vercel.ai
- github.com/vercel/ai
assistant-ui
Source: github.com/Yonom/assistant-ui (open source, npm: @assistant-ui/react)
Classification: React component library focused on AI chat interfaces with rich features.
Intent
Provide a polished, batteries-included React component library for building AI chat interfaces, with thread management, attachments, tool calls, message editing, and many other features that production chat UIs need.
Motivating Problem
For applications where AI chat is the primary interface, building all the features users expect (thread switching, message editing, attachment upload, tool call rendering, copy/regenerate/edit affordances) requires significant work. assistant-ui packages these features as React components that integrate with various AI backends (Anthropic, OpenAI, AI SDK, custom). The library is more opinionated about the chat experience than Vercel AI SDK’s primitives, providing more out-of-the-box at the cost of less flexibility for non-chat patterns.
How It Works
Composable components: the library is built as composable React components rather than a monolithic chat widget. Thread, Composer, Message, ToolCall, and other components compose into the chat UI; teams can customize specific components without rebuilding the whole interface.
Backend adapters: the library connects to various AI backends through adapters. Anthropic, OpenAI, and AI SDK adapters are included; custom adapters can be written for other backends. The pattern decouples the UI from the specific AI provider.
Features: thread management (multiple conversations, switching, persisting), message editing (edit prior messages and regenerate), attachments (file upload, image input), tool call rendering (expandable cards showing tool invocations and results), copy/regenerate/like/dislike affordances on messages, markdown rendering with code highlighting, and many more.
Styling: built on top of Radix UI primitives with optional Tailwind styling. Teams can use the library’s default styling, customize through Tailwind, or replace styling entirely while keeping the component logic.
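The backend-adapter idea can be sketched independently of any library. The interface below is hypothetical and is not assistant-ui's adapter API; it uses synchronous callbacks to stay self-contained, where a real adapter would stream asynchronously from a provider.

```typescript
interface AdapterMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical adapter contract: the UI only ever talks to this interface.
interface ChatBackendAdapter {
  providerName: string;
  streamReply(
    history: AdapterMessage[],
    onChunk: (text: string) => void,
    onDone: () => void
  ): void;
}

// Stand-in backend that echoes the last user message word by word.
const echoAdapter: ChatBackendAdapter = {
  providerName: "echo",
  streamReply(history, onChunk, onDone) {
    const lastUser = history.filter((m) => m.role === "user").pop();
    const words = lastUser ? lastUser.content.split(" ") : [];
    words.forEach((w, i) => onChunk(i === 0 ? w : " " + w));
    onDone();
  },
};

// UI-side code: provider-agnostic, so swapping adapters needs no UI changes.
function runTurn(
  adapter: ChatBackendAdapter,
  history: AdapterMessage[],
  render: (partial: AdapterMessage, done: boolean) => void
): void {
  let content = "";
  adapter.streamReply(
    history,
    (text) => {
      content += text;
      render({ role: "assistant", content }, false); // re-render on every chunk
    },
    () => render({ role: "assistant", content }, true)
  );
}

const adapterRenders: string[] = [];
runTurn(echoAdapter, [{ role: "user", content: "hello agent world" }], (partial, done) => {
  adapterRenders.push(done ? `[done] ${partial.content}` : partial.content);
});
console.log(adapterRenders.join(" / "));
// → hello / hello agent / hello agent world / [done] hello agent world
```

Because `runTurn` depends only on the interface, a second adapter for another provider can be dropped in without touching rendering code.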
When to Use It
Applications where AI chat is the primary interface and the team wants a polished experience without building everything from scratch. Teams already using React and Tailwind. Cases where the library’s opinions about chat UX match what the product needs.
Alternatives --- Vercel AI SDK UI for more flexibility at the cost of more from-scratch UI work. AI Elements (shadcn) for shadcn-style composition. CopilotKit for embedded patterns. Custom implementations when the library’s opinions don’t fit.
Sources
- github.com/Yonom/assistant-ui
- assistant-ui.com
AI Elements (shadcn-style) and CopilotKit
Source: ai-sdk.dev/elements (Vercel AI Elements); github.com/CopilotKit/CopilotKit
Classification: Component libraries for AI UIs and embedded AI in existing applications.
Intent
Cover two complementary substrates: AI Elements as shadcn-style composable AI components for teams already using shadcn, and CopilotKit as a framework for embedding AI inside existing applications rather than as standalone chat.
Motivating Problem
Different applications need different patterns. Teams using shadcn want AI components that follow shadcn’s composition style: copy the component code into your project, customize freely. Teams with existing applications want to embed AI as a sidekick or co-pilot inside the application, not as a separate chat interface. AI Elements and CopilotKit serve these different patterns; they’re documented together because the choice between them is one of the foundational architectural decisions in agent UX.
How It Works
AI Elements pattern: shadcn-style component library where users copy components into their project and modify them, rather than installing a package and using opaque components. The library covers AI-specific components (messages, tool calls, citations, chat input) in the shadcn aesthetic. The benefit is full control over component code; the cost is more code to maintain in the user’s project. Fits teams already using shadcn for the rest of their UI.
CopilotKit pattern: a React framework for embedded AI rather than standalone chat. The agent runs inside the existing application; AI features manifest as side panels, inline suggestions, command palettes, autocomplete-with-AI rather than a dedicated chat screen. The pattern fits applications where AI is augmenting existing functionality rather than being the primary interface.
CopilotKit specifics: provides hooks for embedding AI capabilities (useCopilotAction, useCopilotChat, useCopilotReadable) that integrate with React applications. Actions let the AI invoke application functions; readable lets the AI see application state; chat provides an in-app conversational interface. The combination supports AI that knows about the user’s context (what page they’re on, what data is displayed) and can take actions in the application directly.
Comparison: AI Elements is for chat-shaped AI products where the team wants component-level control. CopilotKit is for AI-augmented existing products where AI lives inside the application surface. assistant-ui sits between them as a polished chat library. Vercel AI SDK UI primitives sit beneath all of them as lower-level building blocks.
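The embedded pattern rests on two registrations: state the AI may read and functions the AI may call. The sketch below illustrates that idea in plain TypeScript; it is not CopilotKit's API, and `EmbeddedCopilotRegistry`, `buildContext`, and `invokeAction` are hypothetical names chosen for the example.

```typescript
interface AppAction {
  actionName: string;
  description: string;
  run(args: { [key: string]: string }): string;
}

interface EmbeddedCopilotRegistry {
  readables: { description: string; getValue: () => string }[];
  actions: AppAction[];
}

function createRegistry(): EmbeddedCopilotRegistry {
  return { readables: [], actions: [] };
}

// Build the context text that would accompany the user's prompt to the model.
function buildContext(registry: EmbeddedCopilotRegistry): string {
  return registry.readables.map((r) => `${r.description}: ${r.getValue()}`).join("\n");
}

// Dispatch a model-requested action by name; unknown names are rejected, not guessed.
function invokeAction(
  registry: EmbeddedCopilotRegistry,
  actionName: string,
  args: { [key: string]: string }
): { ok: boolean; result: string } {
  const match = registry.actions.filter((a) => a.actionName === actionName)[0];
  if (!match) return { ok: false, result: `Unknown action: ${actionName}` };
  return { ok: true, result: match.run(args) };
}

// Example wiring inside an existing application (all values are illustrative).
let currentPage = "invoices";
const copilotRegistry = createRegistry();
copilotRegistry.readables.push({ description: "Current page", getValue: () => currentPage });
copilotRegistry.actions.push({
  actionName: "navigate",
  description: "Open a page by name",
  run(args) {
    currentPage = args["page"] || currentPage;
    return `Navigated to ${currentPage}`;
  },
});
```

Reading state through a getter rather than a snapshot means the context reflects whatever the user is looking at when the prompt is sent.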
When to Use It
AI Elements: teams using shadcn for their UI who want AI components in the same style. Cases where component-level control matters more than installable convenience.
CopilotKit: applications where AI augments existing functionality rather than being the primary interface. SaaS products adding AI to existing surfaces (Notion-style AI in document editors, AI in dashboards, AI in CRMs).
Alternatives --- the other libraries in this section for cases where the patterns fit better. Custom implementations when no library matches the application’s needs.
Sources
- ai-sdk.dev/elements
- github.com/CopilotKit/CopilotKit
- copilotkit.ai
Section G — Multi-agent and visualization
UX patterns for systems with multiple coordinating agents
Volume 9 covered the engineering of multi-agent coordination. The UX side has its own patterns: how do users understand what multiple agents are doing, intervene when needed, debug when things go wrong, and trust the coordination they can’t fully observe? The patterns are less consolidated than single-agent UX; multi-agent products are relatively few as of 2026, and the design practice is correspondingly less mature. What this section documents is the working state of practice with explicit acknowledgment that the patterns will evolve as multi-agent products proliferate.
Multi-agent UX patterns and visualization
Source: Patterns across LangGraph Studio, agent observability platforms, multi-agent products (Manus, Devin, others)
Classification: UX patterns for multi-agent system visibility and control.
Intent
Provide users with visibility into multi-agent systems --- what each agent is doing, how they’re coordinating, where intervention is possible --- and intervention surfaces appropriate to the system’s complexity.
Motivating Problem
Multi-agent systems multiply the visibility problem of single agents. Where a single agent’s tool calls and reasoning can be shown sequentially in chat, multi-agent activity happens in parallel, with agents communicating with each other and producing intermediate state that affects other agents. The chat-as-UI pattern breaks down entirely. Visualization patterns --- graphs of agent activity, swim lanes showing parallel work, dashboard views of agent state --- emerge as the working alternatives. The patterns are still evolving; what works for two coordinated agents may not work for ten.
How It Works
Agent dashboards: a panel showing each active agent with its current state, recent activity, and pending tasks. The dashboard updates in real time as agents work. The pattern fits cases with a small number of agents (2 to 5) where each agent has meaningful identity from the user’s perspective.
Graph visualization: agents and their interactions rendered as a graph. Nodes are agents; edges are messages or task delegations between them. The graph updates as the system runs. The pattern fits complex coordination patterns where the relationships between agents matter; tools like LangGraph Studio implement variations on this pattern.
Swim lanes: parallel work streams shown as horizontal lanes with time flowing left to right. Each lane corresponds to an agent (or a task within the multi-agent system); events are placed on the lane at the time they occurred. The pattern fits cases where temporal coordination matters and users want to see what happened when.
Aggregate progress: when there are many agents and individual visibility would overwhelm, an aggregate progress view shows the overall state (“5 of 12 agents complete, 4 in progress, 3 pending”) without per-agent detail. The pattern fits cases with many similar agents (parallel processing patterns) where individual identity doesn’t matter from the user’s perspective.
Intervention surfaces: where can the user intervene? Per-agent pause/cancel buttons, ability to send messages to specific agents, ability to override the coordinator’s decisions, ability to add new agents to the system. The intervention design depends on how much agency the user wants to retain; many multi-agent products limit intervention significantly because allowing arbitrary intervention can break the coordination patterns.
Audit and replay: the ability to scroll back through what happened, replay the system from a checkpoint, or inspect specific message flows. The pattern is essential for debugging when the multi-agent system produces unexpected outcomes; the engineering depends on full state capture which Volume 7 and Volume 12 cover.
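Two of the views above reduce to small data transformations over an event log. The sketch below shows an aggregate progress summary and a swim-lane grouping; the types and function names are hypothetical, and the `at` field is assumed to be milliseconds since the run started.

```typescript
type AgentRunState = "pending" | "in_progress" | "complete";

interface AgentActivity {
  agentId: string;
  at: number; // assumed: milliseconds since the run started
  summary: string;
}

// Aggregate view: overall counts without per-agent detail.
function summarizeAgents(states: AgentRunState[]): string {
  const count = (target: AgentRunState): number =>
    states.filter((s) => s === target).length;
  return `${count("complete")} of ${states.length} agents complete, ` +
    `${count("in_progress")} in progress, ${count("pending")} pending`;
}

// Swim-lane view: one lane per agent, each lane ordered by time.
// Lanes appear in the order their agent first acted.
function toSwimLanes(
  events: AgentActivity[]
): { agentId: string; events: AgentActivity[] }[] {
  const lanes: { agentId: string; events: AgentActivity[] }[] = [];
  events
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.at - b.at)
    .forEach((e) => {
      const lane = lanes.filter((l) => l.agentId === e.agentId)[0];
      if (lane) {
        lane.events.push(e);
      } else {
        lanes.push({ agentId: e.agentId, events: [e] });
      }
    });
  return lanes;
}
```

The same log feeds both views, so a product can default to the aggregate summary and let users drill into lanes when they need to see what happened when.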
When to Use It
Multi-agent products where users have legitimate reasons to monitor the multi-agent activity. Debugging-oriented UIs for developers building multi-agent systems. Operations dashboards for production multi-agent deployments. Use cases where users need to trust the coordination they can’t fully observe.
Alternatives --- hidden multi-agent (the user sees only a single conversational interface; the multi-agent nature is implementation detail) for cases where multi-agent is an internal architectural decision rather than a user-visible feature. The trade-off is users can’t intervene effectively when something goes wrong; production deployments often start with hidden multi-agent and expose more visibility as users encounter situations where they need it.
Sources
- LangGraph Studio (developer observability)
- Multi-agent product UIs (Manus, Devin, others) for working examples
Section H — Discovery and design communities
Resources for tracking AI UX patterns as they evolve
Agent UX is a young discipline. The patterns documented in this volume will see significant evolution within 12 months as new products ship, designers learn from production deployments, and the techniques mature. Tracking the evolution requires monitoring the design community’s output: design system documentation from major AI products, blog posts and case studies from product designers, conference talks at design and AI conferences, the broader UX design community as it engages with AI-specific challenges.
AI UX design resources and community tracking
Source: Various design publications, conference proceedings, product design documentation
Classification: Discovery and tracking resources for AI UX design developments.
Intent
Provide pointers to the active sources of AI UX design knowledge: vendor documentation of design decisions, design publications covering AI specifically, conferences where AI UX patterns are presented, and the broader design community as it engages with AI.
Motivating Problem
AI UX patterns evolve as new products ship and designers learn from production. Static reference material ages quickly. Effective AI UX work in 2026 depends on continuous tracking: monitoring vendor design documentation for new patterns, following case studies of production AI products, engaging with the design community’s evolving conversation about AI-specific challenges.
How It Works
Vendor design documentation: Anthropic, OpenAI, Google publish design rationale for their AI products. Anthropic’s blog (anthropic.com/news) includes posts on UX decisions in Claude (artifacts, computer use, the chat interface). OpenAI’s blog covers Canvas, ChatGPT design, and related products. The documentation isn’t comprehensive but provides authoritative explanations of specific design decisions.
Design publications: Nielsen Norman Group, A List Apart, Smashing Magazine, and other design publications increasingly cover AI UX patterns. The coverage is uneven (some publications engage deeply, others publish more surface-level pieces) but tracking the major design publications surfaces relevant case studies and pattern documentation.
Conference talks: design conferences (UX Week, Interaction Conference, IxDA events) and AI conferences with design tracks (Anthropic’s Builder Day, NeurIPS workshops on human-AI interaction) include talks on AI UX. The talk recordings and slides become reference material; the conferences themselves are venues for the community to engage.
Academic HCI: the human-computer interaction research community (CHI conference, TOCHI journal) publishes research on AI interaction. The research is more rigorous than industry case studies but slower to surface; tracking the academic literature complements the faster-moving industry coverage.
Open-source design systems: some AI products, including a number of AI startups, publish their design system documentation, which provides concrete pattern documentation. The practice is less consolidated than for general design systems (Material Design, Apple HIG) but emerging.
AI UX-specific resources: emerging blogs, newsletters, and communities specifically focused on AI UX. The space is recent enough that no single resource dominates; tracking multiple sources is the working pattern.
When to Use It
Any organization with AI UX design responsibilities. Designers transitioning from general software design to AI specifically. Frontend engineers implementing AI features who want to understand the patterns. Product managers shaping AI products who need to understand the design space.
Alternatives --- internal pattern documentation for organizations with mature AI UX practice. Direct copying of established product patterns (Anthropic Claude, OpenAI ChatGPT, others) when those patterns fit. The combination of external tracking and internal documentation is the working pattern for most organizations.
Sources
- anthropic.com/news (design rationale posts)
- openai.com/blog (design and product posts)
- Nielsen Norman Group AI coverage (nngroup.com)
- CHI conference proceedings (chi.acm.org)
- Various AI UX blogs and newsletters
Appendix A --- Pattern Reference Table
Cross-reference of the major patterns covered in this volume, what each solves, and representative implementations.
| Pattern | Solves | Representative implementations | Section |
|---|---|---|---|
| Streaming UX | Latency feedback | Token streaming, SSE/WebSockets | Section A |
| Intermediate state display | Process transparency | Thinking blocks, tool call visibility | Section A |
| Side-panel artifacts | Persistent work artifacts | Anthropic Artifacts, OpenAI Canvas | Section B |
| Generative UI | Task-shaped interfaces | Vercel AI SDK generative UI | Section B |
| Pre-action approval | Reversibility safeguards | Confirmation gates for irreversible operations | Section C |
| Undo-first design | Low-friction reversibility | Snapshot-and-restore, soft delete | Section C |
| Inline citation | Source attribution | Perplexity-style citations, Claude citations | Section D |
| Confidence display | Uncertainty communication | Calibrated language, confidence indicators | Section D |
| Step-level status | Long-task visibility | Named steps with progress indicators | Section E |
| Conversation persistence | Across-session continuity | Conversation history, resume from checkpoint | Section E |
| Multi-agent dashboards | Multi-agent visibility | Per-agent state, graph visualization | Section G |
Appendix B --- The Thirteen-Volume Series
This catalog joins the twelve prior volumes to form a thirteen-layer vocabulary for agentic AI.
- Volume 1 --- Patterns of AI Agent Workflows --- the timing of agent runs.
- Volume 2 --- The Claude Skills Catalog --- model instructions in packaged form.
- Volume 3 --- The AI Agent Tools Catalog --- the function-calling primitives.
- Volume 4 --- The AI Agent Events & Triggers Catalog --- the activation layer.
- Volume 5 --- The AI Agent Fabric Catalog --- the infrastructure substrate.
- Volume 6 --- The AI Agent Memory Catalog --- the state and context layer.
- Volume 7 --- The Human-in-the-Loop Catalog --- HITL engineering for human-supervised agents.
- Volume 8 --- The Evaluation & Guardrails Catalog --- LLM-internal safety mechanisms.
- Volume 9 --- The Multi-Agent Coordination Catalog --- the agent-to-agent communication layer.
- Volume 10 --- The Retrieval & Knowledge Engineering Catalog --- finding the right information in a corpus.
- Volume 11 --- The AI Compliance & Regulatory Catalog --- compliance-facing governance.
- Volume 12 --- The AI Infrastructure Security Catalog --- security around the AI system.
- Volume 13 --- The Agent UX Patterns Catalog (this volume) --- design discipline for agent interaction.
The series now has three adjacent-to-engineering volumes. Volume 8 covers LLM-internal safety as engineering, while Volumes 11, 12, and 13 cover three complementary disciplines that consume engineering outputs: compliance, security, and design. The pattern of these adjacent volumes is consistent: each documents a discipline with its own audience, artifacts, and vocabulary, complementary to but distinct from the engineering substrate it sits alongside. The catalog’s growth into these adjacent volumes reflects production reality: shipping successful agent products requires the engineering plus the adjacent disciplines done well, and the disciplines deserve explicit vocabulary.
The series can be read as engineering layers (Volumes 1 through 10, with Volume 8 as the engineering-of-safety layer) plus three discipline complements (Volumes 11 through 13). The compliance discipline (Volume 11) reads outputs from Volume 8 to produce documentation auditors and regulators accept. The security discipline (Volume 12) protects the infrastructure all the engineering layers run on. The design discipline (Volume 13) turns the engineering into interfaces end users experience. Together the thirteen volumes describe what shipping agentic AI products entails as of mid-2026.
Appendix C --- The Eight Agent UX Anti-Patterns
Eight recurring mistakes that distinguish thoughtful agent UX from improvised AI interfaces. Avoiding these is most of the practical wisdom in the field:
- Chat as the default interface for every task. Chat works for some tasks and fights others. Tasks with structure (tables, diagrams, forms, persistent reference) suffer in chat’s linear format. Choose the interface shape from the task; generative UI and side panels are working alternatives.
- Hiding the agent’s process entirely. Users can’t trust outcomes they can’t inspect. Show summaries always; make reasoning and tool calls available on demand. The progressive disclosure pattern --- collapsed details that users can expand --- resolves the transparency-vs-cognitive-load trade-off for most users.
- Showing the entire trace by default. Power users want full traces; general users are overwhelmed by them. The dominant pattern in 2026 is summary primary with on-demand detail, not full trace as default. Save full traces for debug modes and power users who explicitly request them.
- Approval-first design for every operation. Confirmation fatigue destroys agent productivity. Use approval for irreversible high-stakes operations; use undo-first design for reversible routine operations; use risk-scaled approval where the agent classifies operations and only sensitive ones gate.
- Citing without making citations verifiable. Citation that just lists sources without letting users check whether the source actually supports the claim erodes trust when the citations turn out to be wrong. Span-level citation, hover-to-preview snippets, clickable source links are the patterns that make citations useful rather than performative.
- Synchronous wait for long-running tasks. Tasks longer than 20 seconds need different UX than tasks shorter than 20 seconds. Status indicators, partial result streaming, notification on completion, background mode --- the patterns from Section E. Forcing users to stare at loading spinners for minutes is a UX failure.
- Ignoring conversation persistence. Sessions that disappear when the user closes the tab feel disposable. Conversation history, resume across sessions, shareable links --- these patterns transform agents from chat-now sessions into work-over-time tools. Users notice when the persistence is missing.
- Building all the patterns from scratch. Component libraries and SDKs (Section F) package the working patterns. Building everything from scratch produces UI that’s less polished, takes longer to ship, and lacks the affordances users expect from agent products. Use the libraries unless the application has specific reasons not to.
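The remedy for the approval-first anti-pattern can be expressed as a small gating policy. The sketch below is one hedged reading of it: `PlannedOperation`, `chooseGate`, and `countPrompts` are hypothetical names, and confirming every irreversible operation (even a low-stakes one) is an assumption of this example rather than a rule stated in the list.

```typescript
// Hypothetical operation descriptor; in practice the agent or a policy layer
// would classify each operation before it runs.
interface PlannedOperation {
  label: string;
  reversible: boolean; // can the effect be undone after the fact?
  sensitive: boolean;  // classified as high-stakes by the agent or policy
}

type GateDecision = "ask_before" | "run_with_undo";

function chooseGate(op: PlannedOperation): GateDecision {
  // Assumption: anything irreversible is confirmed first, even if low-stakes.
  if (!op.reversible) return "ask_before";
  // Risk-scaled approval: reversible but sensitive operations still gate.
  if (op.sensitive) return "ask_before";
  // Reversible routine work runs immediately and relies on undo.
  return "run_with_undo";
}

// Count how many confirmations a batch would trigger, a rough proxy for fatigue.
function countPrompts(ops: PlannedOperation[]): number {
  return ops.filter((op) => chooseGate(op) === "ask_before").length;
}
```

Running `countPrompts` over a representative batch of agent operations gives a quick estimate of how often users would be interrupted under a given classification.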
Appendix D --- Discovery and Standards
Resources for tracking agent UX patterns as they evolve:
- Anthropic blog (anthropic.com/news) --- design rationale for Claude, Artifacts, Computer Use.
- OpenAI blog (openai.com/blog) --- design and product posts for ChatGPT, Canvas, related products.
- Vercel AI SDK documentation (sdk.vercel.ai) --- reference implementations of many patterns documented here.
- Nielsen Norman Group AI coverage (nngroup.com) --- design research perspective on AI UX.
- CHI conference proceedings --- academic HCI research on AI interaction.
- Component library documentation: assistant-ui, AI Elements, CopilotKit --- working pattern documentation.
- Product design retrospectives published by major AI products as they iterate --- practitioner perspective on what worked and didn’t.
- Design community discussions: AI-focused Slack and Discord communities, design Twitter, design conferences with AI tracks.
Two practical recommendations. First, read the design system documentation of major AI products you don’t build for. Anthropic Claude’s design decisions inform OpenAI ChatGPT’s alternatives; both inform what your application should consider. The cross-pollination across products accelerates the discipline’s maturation. Second, prototype patterns before committing. Agent UX patterns are non-obvious; what reads well on paper may feel wrong in practice; what feels wrong in early prototypes may turn out to be the right pattern after iteration. Prototyping fidelity matters more for AI UX than for traditional UX because the interaction qualities (streaming feel, response latency, decision visibility) only manifest when actually using the prototype.
Appendix E --- Omissions
This catalog covers about 14 substrates across 8 sections. The wider agent UX landscape is larger; a non-exhaustive list of what isn’t here:
- HITL engineering (approval queues, intervention APIs, observability). Covered in Volume 7.
- Visual design fundamentals (typography, color theory, layout grids). The established design literature covers these.
- Accessibility for AI specifically (screen reader behavior for streaming responses, keyboard navigation for chat interfaces, cognitive accessibility for AI). Important and emerging; deserves dedicated treatment.
- Brand and personality design for AI products (voice and tone, persona design, anthropomorphization choices). Cited briefly; not the focus.
- Mobile-specific agent UX (touch interactions, keyboard considerations, small-screen layouts for AI). The patterns largely transfer; mobile-specific dimensions warrant separate treatment.
- Voice and multimodal interfaces (voice input, voice output, video input, multimodal output). Emerging but not yet stable enough for catalog treatment.
- Embodied AI UX (robots, AR/VR, ambient computing with AI). Different design discipline with overlap; out of scope for this volume.
- Specific product analysis (deep dives on individual products) beyond the substrate-defining entries. Many products implement variants of the patterns; comprehensive coverage isn’t feasible.
- Design system documentation conventions for AI components. Emerging but not standardized.
Appendix F --- A Note on a Young Discipline
Agent UX is a young discipline. The patterns documented in this volume were largely invented between 2023 and 2026. Token streaming as a first-class UX pattern is roughly four years old. Generative UI as a coherent pattern is roughly two years old. Side-panel artifacts emerged in 2024. Multi-agent UX is still being figured out as multi-agent products ship. The component libraries that package the patterns (Vercel AI SDK UI, assistant-ui, AI Elements, CopilotKit) have all been released or substantially developed within the past 18 months. Compared to the engineering disciplines documented in Volumes 1–10, the UX discipline is recent and the patterns are correspondingly less stable.
What this means for how to use this volume: the patterns documented here represent the working state of practice in mid-2026, not a settled discipline. Some patterns will mature into long-lasting conventions (token streaming probably; progressive disclosure probably). Some patterns will be replaced by better alternatives as the field learns (specific component-library APIs probably; specific multi-agent visualization techniques probably). The structural insights --- nondeterminism requires uncertainty UX, agents have agency that needs boundary communication, chat is wrong for tasks with structure, transparency and cognitive load trade off via progressive disclosure --- should hold up better than the specific implementations. Treat the volume as a starting point that anchors current practice while remaining open to the patterns evolving.
The catalog’s series-level reflection from Volume 12’s Appendix F applies here too: the series could continue. Adjacent areas where comparable treatment would be valuable include cost engineering for AI (the operational discipline of managing inference costs at scale); model lifecycle management beyond evaluation (versioning, deprecation, replacement); and the integration patterns between AI systems and existing enterprise systems (ERP, CRM, SaaS, identity, communication). Each could be a volume; none is currently a major gap; each would add structural vocabulary to the catalog’s existing layers. Whether to extend further is a judgment about diminishing returns. Thirteen volumes cover what an architect needs to design, build, operate, and ship agentic AI products responsibly across the engineering, governance, security, and design dimensions as of mid-2026.
Thirteen volumes. Patterns, Skills, Tools, Events, Fabric, Memory, Human-in-the-Loop, Evaluation & Guardrails, Multi-Agent Coordination, Retrieval & Knowledge Engineering, AI Compliance & Regulatory, AI Infrastructure Security, Agent UX Patterns. The structural vocabulary covers the working design space of agentic AI as of mid-2026. The products and patterns will keep evolving; the structural understanding should hold up; that’s the catalog’s value proposition. Thirteen volumes in, the proposition still holds.
--- End of The Agent UX Patterns Catalog v0.1 ---