GEO · Jan 28, 2026 · 12 min read

ChatGPT Atlas Agent Compatibility Audit: WAI-ARIA Patterns That Help And Hurt

Capconvert Team

Content Strategy

TL;DR

Atlas and Operator parse pages through the accessibility tree, which means your WAI-ARIA implementation directly determines whether the agent can navigate, transact, and report on your site. Pages with semantic landmarks, labeled inputs, and proper button-versus-link usage are agent-friendly. Pages with div-soup, missing form labels, and custom interactive elements without ARIA semantics are not. This audit walks the patterns that help and hurt, ends with a 25-item checklist, and includes the testing workflow that surfaces issues before they hit production.

When a user asks ChatGPT to book a flight, find the cheapest specific product on three vendor sites, or extract pricing information from a competitor's docs, the request lands inside Atlas (OpenAI's native browser) or Operator (the agent that runs on top of it). The agent does not see your page the way a human reader sees it. It sees the page through the accessibility tree, which is the same data structure that screen readers use to describe a web interface to users who cannot see the screen visually. The agent uses ARIA roles, labels, and state attributes to figure out what every element on the page actually does, then drives the page through the same accessibility surface to complete the task.

For most publishers this is an invisible interaction. The agent visits, parses, possibly transacts, and leaves, and the site owner never knows it happened beyond a server-log entry. But the difference between sites that agents can navigate cleanly and sites that defeat them is observable and growing in commercial importance. As more buyers offload research and transactional tasks to agents, brands whose product pages, pricing pages, and purchase flows are agent-readable get the conversions. Brands whose pages render correctly to humans but confuse the agent lose those conversions to competitors whose pages happen to be semantically clean.

This audit is the practical lens on agent compatibility: the patterns that help, the patterns that hurt, the checklist that surfaces both, and the testing workflow that lets you verify your site is readable by the next generation of programmatic buyers before those buyers arrive.

Why Agents Use The Accessibility Tree, Not The DOM

A modern web page contains two parallel representations of the same content. The first is the Document Object Model (DOM), the tree of HTML elements that the browser builds from your markup and that JavaScript can manipulate. The second is the accessibility tree, a derived structure that the browser builds from the DOM by applying accessibility semantics: roles, labels, states, relationships. The accessibility tree is smaller and more abstract than the DOM. It is what assistive technology consumes and what programmatic agents consume.

The reason agents prefer the accessibility tree is practical. The DOM tells the agent that there is a div nested inside another div. The accessibility tree tells the agent that there is a button labeled "Add to cart" inside a region called "Product details." The accessibility tree gives the agent what it needs to act intelligently. The DOM gives the agent a wall of generic markup that requires inference to interpret.

OpenAI's Operator and Atlas implementations consume the accessibility tree directly. When the agent needs to click a button, it asks the page for elements with role="button" and a matching label. When the agent needs to fill a form field, it looks for inputs with associated labels. When the agent needs to navigate between sections, it uses landmark roles. The agent's actions translate directly into accessibility-tree queries, which means the quality of your accessibility implementation is the binding constraint on what the agent can accomplish.

This is the deeper insight worth internalizing. Agent compatibility and accessibility compliance are the same engineering work, applied to different consumers. A page built for blind users with screen readers is by definition a page built for AI agents. The two audiences benefit from identical implementation patterns. The old argument that accessibility is a niche concern no longer holds in 2026, because agent traffic that depends on accessibility semantics is a fast-growing share of commercial web activity.

The Atlas-Specific Behavior

Atlas is OpenAI's native browser launched in late 2025 to handle agentic tasks that benefit from full browser-equivalent capabilities (JavaScript execution, login state, multi-step flows). Atlas runs the same accessibility-tree consumption logic as Operator does on third-party browsers, but with deeper integration into ChatGPT's task-management system. The result is that Atlas can perform longer agent runs across more sites in a single session without the user staying engaged. Pages that Atlas can read cleanly become candidates for unattended agent traffic. Pages that defeat Atlas's parsing get skipped or fail mid-task, and competitor pages get the conversion. The companion piece on the SEO implications of Atlas covers the broader strategic context.

Landmark Roles That Atlas Looks For

The first thing an agent does on a new page is establish geography. Where is the main content? Where is the navigation? Where is the form to fill? Landmark roles tell the agent the answer to these questions directly. Without landmarks, the agent has to infer the structure from heuristic cues like text density and link ratio, which is slower and less reliable.

The HTML5 elements that auto-generate landmark roles are:

<main>     → role="main"
<nav>      → role="navigation"
<header>   → role="banner" (when top-level)
<footer>   → role="contentinfo" (when top-level)
<aside>    → role="complementary"
<section>  → role="region" (when labeled)
<form>     → role="form" (when labeled)

Sites that wrap their content in semantic HTML5 elements get landmark roles for free. Sites that use generic divs for everything have to add the roles explicitly:

<div role="main">...</div>
<div role="navigation" aria-label="Primary">...</div>

The aria-label on the navigation is important when multiple navs exist on the same page. A site with a primary nav, a secondary nav, and a footer nav should label each one (Primary, Secondary, Footer) so the agent can distinguish between them.
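A minimal sketch of that labeling (the label values are illustrative; pick names that match your information architecture):

```html
<nav aria-label="Primary">...</nav>
<nav aria-label="Secondary">...</nav>
<nav aria-label="Footer">...</nav>
```

Each nav lands in the accessibility tree as a navigation landmark with a distinct accessible name, so an agent asked to "use the footer links" can target the right one directly.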

For the agent's purposes, the most important landmarks on a transactional page are the main content region (containing the product, article, or checkout form) and the navigation region (allowing the agent to move between site sections). Pages without these two landmarks force the agent to guess, and the guesses are wrong often enough to break tasks.

The Most Common Landmark Failure

A surprising number of modern frameworks generate output without semantic landmarks. Some Next.js templates wrap everything in nested divs with class names but no roles. Some headless CMS themes do the same. The audit takes 30 seconds: open your page in Chrome DevTools, switch to the Accessibility panel, look at the Accessibility Tree. If you do not see "banner," "navigation," "main," and "contentinfo" as top-level entries, your landmarks are missing and the agent has to do extra work to figure out the page.
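The same check can be expressed programmatically. The sketch below is a simplified stand-in: it takes a list of landmark roles you observed on a page (from DevTools or an audit tool) rather than reading a live accessibility tree, and reports which of the four expected top-level landmarks are absent.

```javascript
// Sketch: flag missing top-level landmarks, given the set of landmark
// roles observed on a page (a stand-in for the real accessibility tree).
const EXPECTED = ['banner', 'navigation', 'main', 'contentinfo'];

function missingLandmarks(observedRoles) {
  return EXPECTED.filter((role) => !observedRoles.includes(role));
}

// Example: a page exposing only nav and main
console.log(missingLandmarks(['navigation', 'main']));
// → [ 'banner', 'contentinfo' ]
```

Anything the function returns is a landmark the agent has to infer on its own.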

The Form Input Patterns That Actually Work

Forms are where agent compatibility most often breaks. The agent's job on a form is to fill the right value into the right input. If the agent cannot determine which input is which, the task fails or fills the wrong fields.

The minimum viable form input is an input element with an associated label:

<label for="email">Email address</label>
<input type="email" id="email" name="email">

The for attribute on the label and the id on the input create the association. The agent reads the label, identifies the input as the email field, and fills it correctly.

Variations that also work:

<label>Email address <input type="email" name="email"></label>

<input type="email" name="email" aria-label="Email address">

<input type="email" name="email" aria-labelledby="email-label">
<span id="email-label">Email address</span>

Variations that do not work for agents:

<input type="email" name="email" placeholder="Email">

<div class="form-label">Email</div>
<input type="email" name="email">

Placeholder text alone is not a label. The placeholder disappears when the user starts typing, and the agent often does not consume placeholders as labels in the first place. The placeholder is a hint, not a name.

A separate div with text near the input is also not a label. The agent has no way to know the div is meant to describe the input below it. The browser has no way to associate them. The visual layout creates an apparent relationship that does not exist programmatically.

For complex inputs (select, combobox, datepicker, multi-select), the agent depends on the ARIA pattern used to build the widget. The ARIA Authoring Practices Guide documents the patterns for every interactive widget type. A combobox that follows the APG pattern is agent-readable. A combobox built as a custom div tree with onclick handlers but no ARIA semantics is not.
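A minimal sketch of a select-only combobox in the spirit of the APG pattern (ids and option values are illustrative, and the keyboard and click handlers the pattern requires are omitted here):

```html
<label id="size-label">Size</label>
<div role="combobox" tabindex="0" aria-labelledby="size-label"
     aria-expanded="false" aria-controls="size-listbox">Medium</div>
<ul role="listbox" id="size-listbox" aria-labelledby="size-label" hidden>
  <li role="option" id="size-s" aria-selected="false">Small</li>
  <li role="option" id="size-m" aria-selected="true">Medium</li>
</ul>
```

The agent can read the widget's name ("Size"), its current value, whether it is expanded, and which option is selected, none of which is recoverable from an unlabeled div tree.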

Required Fields And Validation State

The agent benefits from knowing which fields are required and whether the current input is valid. The HTML attributes that carry this information:

<input required>
<input aria-required="true">
<input aria-invalid="true">
<input aria-describedby="email-error">
<div id="email-error">Please enter a valid email address.</div>

The required and aria-required attributes both work. The aria-invalid attribute lets the agent know a previous attempt failed validation. The aria-describedby pointer connects the error message to the input, so the agent can report what went wrong if it cannot fix the issue itself.

Buttons Versus Links: Why The Agent Cares

One of the most consequential ARIA distinctions for agent behavior is the difference between a button and a link. To a human reader, both are clickable. To the agent, they mean different things.

A link (a element with href) is a navigation action. Clicking takes the user (or agent) to a new URL. The agent treats it as a page transition and updates its model of the site accordingly.

A button (button element or role="button") is an action within the current page. Clicking might submit a form, open a modal, toggle state, or trigger an interaction. The agent treats it as a state-changing action and watches for the resulting page change before proceeding.

The distinction matters because the agent's planning logic depends on it. If your "Add to cart" element is implemented as a link, the agent expects a page transition and waits for the cart page to load. If the cart actually opens as a modal on the same page, the agent's expectation breaks and the task may fail mid-flow.

The right rule is straightforward: use button for actions, use a (link) for navigation. If a click stays on the page and changes state, button. If a click leaves for another URL, a. Custom div implementations that wrap onclick handlers without specifying role="button" produce neither pattern correctly, and the agent has to guess what the element does.
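The three cases side by side (the markup is illustrative; the retrofit case also needs Enter/Space keyboard handling to be fully equivalent to a native button):

```html
<!-- Action that changes state on the current page -->
<button type="button">Add to cart</button>

<!-- Navigation to another URL -->
<a href="/cart">View cart</a>

<!-- Retrofit for a div you cannot replace; still needs a keyboard handler -->
<div role="button" tabindex="0">Add to cart</div>
```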

The MDN accessibility reference for buttons and links covers the formal distinction in detail and is the reference to consult when implementing a custom interactive element that does not fit the native HTML elements cleanly.

The Cypress-Style Tests That Catch This

A simple audit on any page: get-by-role queries. Cypress, Playwright, and Testing Library all support querying elements by their accessibility role. If you can write getByRole('button', { name: 'Add to cart' }) and find the right element, the agent can too. If the query returns nothing or finds the wrong element, your implementation is invisible to agents and the same fix that makes the test pass makes the agent compatible.

Modals, Drawers, And State That Confuses Agents

Interactive widgets that change page state (modals, drawers, accordions, tabs) are where agent compatibility most often degrades on otherwise well-built sites. The agent needs to know what state each widget is in and how to change it.

For modals, the right ARIA pattern is role="dialog" or role="alertdialog" with aria-modal="true" and an aria-label or aria-labelledby pointing to the modal's title. When the modal opens, focus moves into it. When it closes, focus moves back to the trigger. The pattern is documented in the ARIA Authoring Practices Guide and supported by most modern UI libraries (Headless UI, Radix, Material UI) out of the box.
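The markup skeleton of that pattern looks like this (ids and copy are illustrative; the focus-management behavior lives in JavaScript, not the markup):

```html
<div role="dialog" aria-modal="true" aria-labelledby="cart-modal-title">
  <h2 id="cart-modal-title">Added to cart</h2>
  ...
</div>
```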

For drawers and side panels, the same pattern applies with slight variations. The role can be "dialog" if the drawer captures interaction, or "region" with a label if it is more of an info panel.

For accordions, the pattern uses button elements with aria-expanded="true" or "false" to indicate state, paired with aria-controls pointing to the panel that the button toggles. The agent can read the expanded state and decide whether to click the button to expand the section.
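A minimal accordion section under that pattern (ids illustrative):

```html
<button aria-expanded="false" aria-controls="shipping-panel">Shipping details</button>
<div id="shipping-panel" hidden>...</div>
```

When the user (or agent) clicks the button, the script toggles both aria-expanded and the hidden attribute together.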

For tabs, the pattern uses role="tablist" containing role="tab" elements with aria-selected="true" on the active tab. Each tab has aria-controls pointing to a role="tabpanel" element. The agent can read which tab is active and switch between tabs by clicking the role="tab" buttons.
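A two-tab sketch of that structure (ids and labels are illustrative; the APG pattern also specifies arrow-key navigation between tabs):

```html
<div role="tablist" aria-label="Product information">
  <button role="tab" id="specs-tab" aria-selected="true"
          aria-controls="specs-panel">Specs</button>
  <button role="tab" id="reviews-tab" aria-selected="false"
          aria-controls="reviews-panel" tabindex="-1">Reviews</button>
</div>
<div role="tabpanel" id="specs-panel" aria-labelledby="specs-tab">...</div>
<div role="tabpanel" id="reviews-panel" aria-labelledby="reviews-tab" hidden>...</div>
```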

The unifying principle: state is communicated through aria-expanded, aria-selected, aria-checked, aria-pressed, and aria-current. Without these attributes, the agent sees a tree of generic elements and cannot tell which widget is in which state.

State Changes Without ARIA Updates

The single most common modal failure is a modal that visually opens but does not update its ARIA state. The modal element gets a CSS class change that makes it visible, but the aria-hidden attribute stays "true" or the aria-modal attribute is never set. To a screen reader (or an agent), the modal does not exist. The user perceives a modal. The agent does not. The fix is to update the ARIA state alongside the visual state, which most modern UI libraries do automatically and many hand-rolled implementations forget.
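A sketch of the fix, as a hypothetical helper that keeps both states in one code path so they cannot drift apart (`modal` is any element-like object exposing classList and setAttribute, so the same logic applies to a real DOM element in the browser):

```javascript
// Sketch (hypothetical helper): update ARIA state alongside visual state.
function setModalOpen(modal, open) {
  modal.classList.toggle('is-open', open);           // visual state
  modal.setAttribute('aria-hidden', String(!open));  // what the agent sees
  modal.setAttribute('aria-modal', String(open));
}
```

Routing every open/close through one function like this is what most modern UI libraries do internally, and it is the discipline hand-rolled modals tend to skip.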

The Atlas Compatibility Checklist

A 25-item audit that surfaces the most common agent-compatibility issues on a content or transactional page:

  1. Page has exactly one main element or one element with role="main".
  2. Primary navigation uses nav element or role="navigation" with an aria-label.
  3. Footer uses footer element or role="contentinfo".
  4. Heading levels are sequential (no jumping from h1 to h4).
  5. Each interactive widget has a single accessible name (aria-label, aria-labelledby, or visible text).
  6. All inputs have associated labels via label, aria-label, or aria-labelledby.
  7. Placeholders are not used as the only label.
  8. Required inputs are marked with required or aria-required="true".
  9. Validation errors are connected to inputs via aria-describedby.
  10. Buttons use the button element or role="button".
  11. Links use the a element with href, not button or div with onclick.
  12. Custom-built buttons have role="button" and a keyboard handler.
  13. Modals use role="dialog" with aria-modal="true" and a labeled title.
  14. Modals trap focus when open and restore focus when closed.
  15. Accordions use buttons with aria-expanded reflecting open/closed state.
  16. Tabs use role="tablist" / role="tab" / role="tabpanel" with aria-selected.
  17. Images that convey information have alt text. Decorative images have alt="".
  18. Icons that are clickable have aria-label or visually-hidden text.
  19. SVGs that convey information have role="img" and a title or aria-label; interactive SVGs carry the appropriate widget role (button, link) instead.
  20. Color is not the only signal for state (error, success, required).
  21. Loading states use aria-busy="true" or live regions to announce changes.
  22. Skip-to-content links are present at the top of the page.
  23. Focus order matches visual order. No tabindex values greater than 0.
  24. Custom select widgets follow the ARIA Combobox or Listbox pattern.
  25. Page works without JavaScript for read-only browsing (server-side rendered HTML).

This list is not exhaustive. WCAG 2.2 contains 86 success criteria across four principles and the full audit takes hours, not minutes. The 25 items above are the high-leverage subset that catches the patterns most likely to break agent flows. The patterns are equally applicable to preparing your site for AI agents broadly and not specific to Atlas.

The Audit Cadence

Run the 25-item checklist once during a quarterly site audit and once after any major template or framework change. The audit takes 30-90 minutes per page template, not per page; once you confirm the template is compatible, every page rendered from that template inherits the compatibility. The work compounds across the site rather than scaling linearly with content volume.

Testing Your Site Through Atlas's Eyes

The fastest way to see what Atlas actually sees is to inspect the accessibility tree directly. Chrome DevTools has a built-in Accessibility panel that renders the tree as the browser computes it.

The workflow:

  1. Open the page in Chrome.
  2. Open DevTools (Cmd+Option+I or F12).
  3. Open the Elements panel.
  4. Select an element of interest in the DOM tree.
  5. Open the Accessibility pane at the bottom.
  6. Read the computed accessible name, role, and attributes.

If the role is "generic" or the accessible name is empty, the element is invisible to the agent. The fix is to add the missing semantics (label, role, aria attribute) until the panel shows what you want.

For a fuller view, the axe DevTools browser extension (free) runs an automated audit against WCAG and ARIA best practices, producing a per-element issue list with remediation guidance. The audit is more comprehensive than the 25-item checklist and works as a complement to it; the checklist defines what you care about most, and axe surfaces the issues at every level.

For programmatic verification, Playwright and Cypress both support get-by-role queries that return only elements that are accessible-tree-discoverable. Writing tests against role queries means your tests catch the same compatibility issues that affect agents. Test pass means agent-readable.

For end-to-end agent verification, the most direct method is to run Atlas itself or use Operator to perform a representative task on the page. Open ChatGPT, instruct the agent to perform the same task a human would (extract pricing, complete the form, click through to a specific section), and observe whether the agent completes the task. If the agent succeeds, your site is agent-readable for that flow. If the agent fails, the failure mode often surfaces exactly which element confused the parsing logic.

The Audit Output

After running the test workflow on a page template, the deliverable is a list of issues with severity and remediation. The severity scale we use: blocker (agent cannot complete the canonical task on the page), degrader (agent can complete but with reduced confidence or extra steps), polish (cosmetic accessibility improvement). Blockers go into the next sprint. Degraders go into the next quarter. Polish items go into the backlog. This prioritization keeps accessibility work proportional to its impact on actual outcomes rather than treating every audit finding as equally urgent.

Frequently Asked Questions

Does Atlas have its own user agent string I can detect?

Atlas operates as a Chromium-based browser and presents a user agent that resembles standard Chrome with additional identifying information indicating the OpenAI client. The user agent is documented in OpenAI's developer materials and changes occasionally as Atlas evolves. If you need to detect Atlas specifically, the OpenAI bot documentation is the canonical reference. For most use cases, treating Atlas as a standard browser is the right default; the agent benefits from the same rendering your human users get.

What is the difference between WAI-ARIA and WCAG?

WAI-ARIA (Web Accessibility Initiative - Accessible Rich Internet Applications) is the specification for the roles, states, and properties you add to your HTML to make custom widgets accessible. WCAG (Web Content Accessibility Guidelines) is the broader standard that defines what accessible content looks like, with success criteria spanning content, design, and code. ARIA is one tool to meet WCAG. WCAG includes ARIA-based criteria plus many others (color contrast, text alternatives, time-based media, navigation patterns). For agent compatibility, ARIA is the more immediately relevant of the two.

Do I need to install Atlas to test agent compatibility?

No. The accessibility tree that Atlas reads is the same one any modern browser computes. Chrome's DevTools Accessibility panel, axe DevTools, and the role-based query APIs in Playwright/Cypress all give you the same view Atlas uses. Running Atlas itself is the highest-fidelity verification but not a prerequisite for fixing the issues.

Can I keep my visual design while fixing accessibility?

Yes. Accessibility implementation is almost entirely invisible to sighted users. Adding ARIA attributes, semantic HTML elements, and proper label associations does not change how the page looks. The interventions live in the markup and the JavaScript that updates state. The visual presentation is unchanged. Some interventions (focus indicators, skip-to-content links) are visible but designed to be subtle and only apparent during keyboard navigation.

What is the relationship between Atlas, Operator, and ChatGPT?

ChatGPT is the user-facing chat product. Operator is OpenAI's agent layer that performs multi-step tasks on the user's behalf. Atlas is the OpenAI-built browser that Operator runs inside for tasks requiring full browser capabilities. When a user asks ChatGPT to "find the cheapest Patagonia jacket on three sites," Operator orchestrates the task and Atlas does the browsing. The agent's interaction with your site happens through Atlas. The companion piece on Operator and agentic browsing covers the architecture and the broader implications.

Agent traffic is growing. The brands whose pages are agent-readable get the conversions. The brands whose pages are not lose them quietly to competitors. The work to be agent-compatible is the same work to be accessible, which is work most sites should be doing anyway and most sites are doing incompletely. Closing the gap between the visual presentation users see and the semantic representation agents consume is one of the highest-leverage technical investments a publisher can make in 2026.

If your team wants the full audit (the per-template review, the prioritized issue list, and the regression tests that prevent the issues from coming back), that work sits inside our generative engine optimization program. The patterns are well-defined. The interventions are mechanical. The payoff is access to a buyer surface that did not exist three years ago and increasingly drives the share of buyer-research traffic that converts.

Ready to optimize for the AI era?

Get a free AEO audit and discover how your brand shows up in AI-powered search.

Get Your Free Audit