Always-on monitoring of OpenAI's training crawler. GPTBot decides what ChatGPT knows about your brand when it answers without invoking live search. Sentry catches Cloudflare AI Audit blocks, accidental WAF interference, and policy mismatches. Cortex handles the fix.
Continuous audits of GPTBot's accessibility to your site against the 8 things that determine whether ChatGPT learns from you. Cloudflare AI Audit blocks GPTBot by default on most plans; Sentry catches it before your brand disappears from ChatGPT's default knowledge. Cortex fixes it.
robots.txt does not Disallow GPTBot, either in an explicit User-agent: GPTBot group or through a wildcard User-agent: * group that applies when no GPTBot group exists. OpenAI honors the robots.txt token strictly per their published bot policy.
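A minimal sketch of how this rule could be reproduced locally with Python's standard-library `urllib.robotparser`. The function name `gptbot_allowed` and the sample robots.txt bodies are illustrative assumptions, and the standard-library matcher may differ in edge cases from the parser OpenAI actually runs.

```python
from urllib.robotparser import RobotFileParser


def gptbot_allowed(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt body lets the GPTBot token fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


# Hypothetical robots.txt bodies for illustration only.
WILDCARD_BLOCK = "User-agent: *\nDisallow: /\n"
EXPLICIT_ALLOW = (
    "User-agent: *\nDisallow: /\n\n"
    "User-agent: GPTBot\nAllow: /\n"
)

if __name__ == "__main__":
    print("wildcard only:", gptbot_allowed(WILDCARD_BLOCK))
    print("explicit GPTBot group:", gptbot_allowed(EXPLICIT_ALLOW))
```

The second sample shows why an explicit GPTBot group matters: it takes precedence over the wildcard group, so a site can stay closed to unknown crawlers and still admit GPTBot.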
Cloudflare's 'Block AI Scrapers and Crawlers' managed rule is either off or has GPTBot in its allowlist. The rule is on by default on most Cloudflare plans and blocks GPTBot silently, regardless of robots.txt.
Recent visitor IPs claiming the GPTBot UA match OpenAI's published gptbot.json range list. Filters spoofed UAs and confirms real OpenAI traffic.
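The IP check above can be sketched with Python's `ipaddress` module. This assumes the range list is a JSON document with a `prefixes` array whose entries carry an `ipv4Prefix` or `ipv6Prefix` CIDR string; that schema is an assumption to verify against the live file. The sample ranges below are reserved documentation addresses, not real OpenAI ranges.

```python
import ipaddress
import json


def load_networks(range_doc: str) -> list:
    """Parse a range document (assumed schema: {"prefixes": [{"ipv4Prefix": ...}]})."""
    data = json.loads(range_doc)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def is_listed_ip(ip: str, networks: list) -> bool:
    """Return True if ip falls inside any published network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Placeholder document using reserved documentation ranges (not OpenAI's).
SAMPLE_DOC = json.dumps(
    {"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}
)

if __name__ == "__main__":
    nets = load_networks(SAMPLE_DOC)
    for candidate in ("192.0.2.10", "198.51.100.7"):
        print(candidate, "listed" if is_listed_ip(candidate, nets) else "possible spoof")
```

A request whose user-agent claims GPTBot but whose source IP is not in the list is a candidate spoof and can be left out of crawl statistics.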
Critical content (title, h1, body) appears in the initial server-rendered HTML. OpenAI does not publish a JS-render commitment for GPTBot; SSR is the safe path for inclusion.
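One way to approximate this check is to parse the raw HTML response without executing any JavaScript and see whether the title, h1, and body text are already there. The sketch below uses the standard-library `html.parser`; the class and function names are hypothetical, and "body has any visible text" is a deliberately crude stand-in for a real content check.

```python
from html.parser import HTMLParser


class CriticalContentParser(HTMLParser):
    """Collect text found inside title, h1, and body, ignoring script-like tags."""

    TRACKED = ("title", "h1", "body")
    IGNORED = ("script", "style", "noscript", "template")

    def __init__(self):
        super().__init__()
        self.depth = {tag: 0 for tag in self.TRACKED + self.IGNORED}
        self.found = {tag: "" for tag in self.TRACKED}

    def handle_starttag(self, tag, attrs):
        if tag in self.depth:
            self.depth[tag] += 1

    def handle_endtag(self, tag):
        if tag in self.depth and self.depth[tag] > 0:
            self.depth[tag] -= 1

    def handle_data(self, data):
        # Text inside script/style/noscript/template is not rendered content.
        if any(self.depth[tag] for tag in self.IGNORED):
            return
        for tag in self.TRACKED:
            if self.depth[tag]:
                self.found[tag] += data


def missing_critical(html: str) -> list:
    """Return the tracked tags that have no text in the un-rendered HTML."""
    parser = CriticalContentParser()
    parser.feed(html)
    parser.close()
    return [tag for tag, text in parser.found.items() if not text.strip()]


# Hypothetical responses: a server-rendered page and a client-rendered shell.
SSR_PAGE = "<html><head><title>Acme</title></head><body><h1>Widgets</h1><p>Hello</p></body></html>"
CSR_SHELL = "<html><head><title>Acme</title></head><body><div id='root'></div><script>render()</script></body></html>"

if __name__ == "__main__":
    print("SSR page missing:", missing_critical(SSR_PAGE))
    print("CSR shell missing:", missing_critical(CSR_SHELL))
```

An empty list means a crawler that never runs JavaScript still sees the content that matters.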
No active WAF or CDN rule returns 4xx/5xx to the GPTBot/1.3 user-agent fingerprint. WAF rules that block 'non-browser' fingerprints commonly catch GPTBot.
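A rough way to spot a user-agent block is to request the same URL twice, once with a browser-like user-agent and once with a GPTBot-like one, and compare status codes. In the sketch below both user-agent strings are assumptions (check OpenAI's documentation for the current GPTBot string), and `status_for` and `classify` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Assumed user-agent strings for illustration; verify the GPTBot one against OpenAI's docs.
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
    "compatible; GPTBot/1.3; +https://openai.com/gptbot"
)
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def status_for(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Fetch url with the given user-agent and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx/5xx responses arrive as exceptions


def classify(browser_status: int, gptbot_status: int) -> str:
    """Compare the two status codes and label the likely cause."""
    if gptbot_status < 400:
        return "pass"
    if browser_status < 400:
        return "blocked-by-user-agent"
    return "site-error"


if __name__ == "__main__":
    target = "https://example.com/"  # placeholder URL
    print(classify(status_for(target, BROWSER_UA), status_for(target, GPTBOT_UA)))
```

A probe sent from your own machine does not come from an OpenAI IP, so a WAF that validates source addresses may reject it even though real GPTBot traffic passes. Treat a "blocked-by-user-agent" result as a prompt to inspect the rule, not as proof.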
robots.txt directives for GPTBot are not bundled with those for OAI-SearchBot. Blocking GPTBot does not block ChatGPT Search visibility (and vice versa). Publishers who want to appear in ChatGPT Search but stay out of training data should give each bot its own directive group.
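To illustrate what "bundled" means here, the sketch below splits a robots.txt body into user-agent groups and reports whether GPTBot and OAI-SearchBot share one. It is a simplified reading of robots.txt grouping (consecutive User-agent lines form one group), not a full parser, and the function names are hypothetical.

```python
def user_agent_groups(robots_txt: str) -> list:
    """Return a list of groups, each a list of lowercased user-agent tokens."""
    groups = []
    current = []
    after_rule = True  # the first User-agent line always opens a new group
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if after_rule:
                current = []
                groups.append(current)
                after_rule = False
            current.append(value.lower())
        else:
            after_rule = True  # any rule line closes the run of User-agent lines
    return groups


def bundled(robots_txt: str, first: str = "gptbot", second: str = "oai-searchbot") -> bool:
    """Return True if both tokens appear in the same user-agent group."""
    return any(first in group and second in group for group in user_agent_groups(robots_txt))


# Hypothetical robots.txt bodies for illustration only.
BUNDLED = "User-agent: GPTBot\nUser-agent: OAI-SearchBot\nDisallow: /\n"
SEPARATE = (
    "User-agent: GPTBot\nDisallow: /\n\n"
    "User-agent: OAI-SearchBot\nAllow: /\n"
)

if __name__ == "__main__":
    print("bundled sample:", bundled(BUNDLED))
    print("separate sample:", bundled(SEPARATE))
```

The `SEPARATE` sample is the search-but-not-training setup: GPTBot is disallowed in its own group and OAI-SearchBot is allowed in another.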
/llms.txt is an emerging standard. No major AI engine has publicly committed to consuming it yet, but it is recommended as a forward-compatible signal: an LLM-readable description of your site.
Page-level `noai` and `noimageai` meta tags reflect the site's intended training-data stance. No contradiction between robots.txt allowance and meta-level signals.
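A sketch of how the meta-level signal could be compared with the robots.txt stance. It assumes the tags follow the common `<meta name="robots" content="noai, noimageai">` convention, which is not a formal standard, and it takes the robots.txt result as a plain boolean. All names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect comma-separated tokens from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.tokens = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for token in attributes.get("content", "").split(","):
                if token.strip():
                    self.tokens.add(token.strip().lower())


def ai_meta_tokens(html: str) -> set:
    """Return the subset of {noai, noimageai} declared on the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.tokens & {"noai", "noimageai"}


def contradicts(robots_allows_gptbot: bool, html: str) -> bool:
    """Flag pages where robots.txt allows GPTBot but the page says noai."""
    return robots_allows_gptbot and "noai" in ai_meta_tokens(html)


# Hypothetical page markup for illustration only.
OPT_OUT_PAGE = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
NEUTRAL_PAGE = '<html><head><meta name="robots" content="index, follow"></head></html>'

if __name__ == "__main__":
    print("opt-out page contradicts allow:", contradicts(True, OPT_OUT_PAGE))
    print("neutral page contradicts allow:", contradicts(True, NEUTRAL_PAGE))
```

Only the allow-plus-`noai` combination is flagged. A page that opts out in both places is consistent, just closed.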
Paste your homepage URL. Sentry verifies robots.txt allowance, Cloudflare AI Audit status, OpenAI IP-range fidelity, SSR reachability, WAF posture, and policy alignment, then ships a per-rule report. No signup, instant results, always free.
Sentry fetches your site, runs every GPTBOT rule, and renders the full result page before your next sip of coffee.
Each failed rule ships with a prescription paragraph. Hand it to engineering and the gap is closed before lunch.
Add your site to the daily Sentry sweep with one click. New regressions get caught the next morning.
8 rules in the GPTBOT Sentry. Daily 3:30 AM ET sweep.
One brain. Thirty-six pairs of eyes. Sentries monitor every visibility signal that decides whether search engines, AI engines, and ad platforms show you. Cortex reads what they see, weighs it against a unified corpus of platform documentation, and acts. Every move follows a defined decision protocol: action stated, reason given, impact named.