AI Bot Policy

How Capconvert sets expectations for AI training crawlers, AI search retrieval crawlers, and general-purpose bots across capconvert.com.

Capconvert, LLC publishes search marketing content across capconvert.com (the marketing site, the Learn knowledge surface, the blog, author pages, case studies, audit reports, and the Cortex Research publication). This page sets expectations for AI training crawlers, AI search retrieval crawlers, general-purpose search crawlers, and miscellaneous bots that access these properties.

It is the human-readable companion to our robots.txt. Where this page and our technical signals conflict, the technical signals control for the request in question; this page documents the intent.

Scope

This policy applies to:

Marketing surface: capconvert.com root and all marketing pages (services, industries, audit, case studies, about, contact, results).
Knowledge + blog: /learn, /authors, /author, and the blog at /learn/blog.
Cortex marketing + product: capconvert.com/cortex, /corpus, and all Cortex application surfaces (see also /cortex/ai-bot-policy for stricter rules on the Cortex-specific surface).
Capconvert outputs wherever they may appear, including outputs published by Capconvert customers on their own properties.

AI Training Crawlers - Disallowed on /cortex and /corpus

The following user agents are NOT permitted to crawl capconvert.com/cortex or /corpus for the purpose of training, fine-tuning, distilling, or evaluating any AI model: GPTBot, ClaudeBot, anthropic-ai, Google-Extended, Applebot-Extended, CCBot, Bytespider, FacebookBot, Meta-ExternalAgent, cohere-ai, DuckAssistBot, YouBot, Diffbot, ImagesiftBot, Amazonbot. This list is non-exhaustive; any user agent whose published purpose includes “training,” “model improvement,” “dataset creation,” or “corpus construction” for an AI/ML system is disallowed on /cortex and /corpus even if not named above.

On other Capconvert marketing surfaces (the homepage, the blog, /learn, /authors, and case studies), our standard robots.txt directives apply, including to training crawlers that honor Google-Extended-style opt-out signals. We currently permit general indexing of the public marketing site so that it remains discoverable in AI search overviews and citations.
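The training-crawler rules above could be expressed in robots.txt roughly as follows. This is a minimal sketch, not a copy of the live file: the helper names build_robots_txt and load_parser are hypothetical, and the single catch-all "Allow: /" group is an assumption standing in for the standard directives. It uses Python's standard urllib.robotparser to check what a compliant crawler would conclude.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens this policy names as training crawlers.
TRAINING_AGENTS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
    "Applebot-Extended", "CCBot", "Bytespider", "FacebookBot",
    "Meta-ExternalAgent", "cohere-ai", "DuckAssistBot", "YouBot",
    "Diffbot", "ImagesiftBot", "Amazonbot",
]

# Path prefixes closed to training crawlers.
RESTRICTED_PREFIXES = ["/cortex", "/corpus"]


def build_robots_txt(agents=TRAINING_AGENTS, prefixes=RESTRICTED_PREFIXES):
    """Build robots.txt text: one group for training agents, one default group."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines += [f"Disallow: {prefix}" for prefix in prefixes]
    # Assumed default group; the real file may carry more rules.
    lines += ["", "User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def load_parser(robots_txt):
    """Parse robots.txt text the way a compliant crawler might."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = load_parser(build_robots_txt())
    checks = [
        ("GPTBot", "https://capconvert.com/cortex"),
        ("GPTBot", "https://capconvert.com/learn/blog"),
        ("Googlebot", "https://capconvert.com/cortex"),
    ]
    for agent, url in checks:
        print(agent, url, parser.can_fetch(agent, url))
```

A robots.txt group blocks a user agent outright on the listed paths, whatever its stated purpose, so it is a coarser instrument than the purpose-based rule in this policy. Agents whose purpose includes training but that are not named in the list are covered by the policy text, not by this file.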

AI Retrieval / Search Crawlers - Allowed

Crawlers that retrieve current content to answer live user queries (without retaining content as training data) are generally allowed across the public Capconvert surface, including the Cortex marketing pages, so long as they identify themselves, respect robots.txt and rate limits, and do not retain content beyond the retrieval window. Examples include OAI-SearchBot (ChatGPT search retrieval), PerplexityBot in live-search retrieval mode, and Google-CloudVertexBot for indexed-results retrieval.

These crawlers must not access authenticated Cortex surfaces (dashboards, APIs, customer-data pages) or the per-client preview URLs under /optimization/content/*.

General-Purpose Search Crawlers - Allowed

Mainstream search-engine crawlers (Googlebot, Bingbot, DuckDuckBot, YandexBot, Baiduspider, Slurp, Sogou Spider, Applebot in indexing mode) are welcome on public capconvert.com pages. We expect them to honor robots.txt, respect rate limits, and respond to noindex and nofollow meta directives.

Authenticated Surfaces - All Bots Disallowed

No automated agent, regardless of stated purpose, may crawl or retrieve content from any authenticated Capconvert surface. This includes the Cortex application, the capconvert-pm dashboard at /ops, per-customer preview URLs, API endpoints (documented and undocumented), and any export or share URL that requires authentication or a per-request token.

Robots.txt directives, request-level authentication, rate limits, and per-account behavior monitoring all enforce this. Bypassing those controls is a violation of the Acceptable Use Policy, the Computer Fraud and Abuse Act (18 U.S.C. § 1030), and equivalent statutes in other jurisdictions.
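Read together, the crawler categories in this policy form a small decision table: authenticated surfaces are closed to every bot, /cortex and /corpus are closed to training crawlers, and other public pages are open. The sketch below illustrates that precedence only. The function name decide, the helper under, and the two authenticated prefixes are assumptions for illustration; the prefix list is deliberately incomplete, because this page does not enumerate every authenticated path.

```python
# Training-crawler tokens named in this policy, lowercased for comparison.
TRAINING_AGENTS = frozenset(
    token.lower()
    for token in [
        "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
        "Applebot-Extended", "CCBot", "Bytespider", "FacebookBot",
        "Meta-ExternalAgent", "cohere-ai", "DuckAssistBot", "YouBot",
        "Diffbot", "ImagesiftBot", "Amazonbot",
    ]
)

# Paths closed to training crawlers.
TRAINING_RESTRICTED = ("/cortex", "/corpus")

# Illustrative, incomplete list of authenticated prefixes (assumption):
# only the two paths this policy names explicitly.
AUTHENTICATED_PREFIXES = ("/ops", "/optimization/content")


def under(path, prefix):
    """True if path is the prefix itself or sits below it on a segment boundary."""
    return path == prefix or path.startswith(prefix + "/")


def decide(agent_token, path):
    """Return a policy decision string for a crawler token and a URL path."""
    # Authenticated surfaces take precedence over every other rule.
    if any(under(path, prefix) for prefix in AUTHENTICATED_PREFIXES):
        return "deny: authenticated surface"
    # Training crawlers are closed out of /cortex and /corpus.
    if agent_token.lower() in TRAINING_AGENTS and any(
        under(path, prefix) for prefix in TRAINING_RESTRICTED
    ):
        return "deny: training crawler on restricted path"
    # Everything else on the public site is allowed, subject to robots.txt
    # and rate limits, which this sketch does not model.
    return "allow"


if __name__ == "__main__":
    for agent, path in [
        ("OAI-SearchBot", "/cortex"),
        ("GPTBot", "/cortex"),
        ("Googlebot", "/ops"),
    ]:
        print(agent, path, "->", decide(agent, path))
```

The sketch matches on a declared crawler token, so it cannot detect an unnamed crawler whose purpose is training. That case is governed by the policy text and by the enforcement measures described on this page.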

Text & Data Mining Opt-Out

Capconvert reserves all rights under Article 4(3) of EU Directive 2019/790, and under equivalent provisions in other jurisdictions, to opt out of text-and-data mining of our publicly available content for commercial AI training purposes. The opt-out is expressed both on this page and through technical signals (robots.txt, meta tags, and TDMRep where implemented). It applies regardless of whether a crawler identifies itself as a training crawler at the time of access.
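Where TDMRep is implemented, the reservation might be published along these lines. This is a sketch under stated assumptions: it follows the field names of the TDM Reservation Protocol community report (a JSON array served at /.well-known/tdmrep.json, and a tdm-reservation response header), the function names are hypothetical, and the optional policy URL is left to the caller because this page does not state one.

```python
import json


def tdmrep_document(locations, policy_url=None):
    """Return the JSON text for a TDMRep well-known file reserving TDM rights."""
    rules = []
    for location in locations:
        # A tdm-reservation value of 1 signals that rights are reserved.
        rule = {"location": location, "tdm-reservation": 1}
        if policy_url is not None:
            rule["tdm-policy"] = policy_url
        rules.append(rule)
    return json.dumps(rules, indent=2)


def tdm_response_headers(policy_url=None):
    """Return the per-response HTTP headers that carry the same reservation."""
    headers = {"tdm-reservation": "1"}
    if policy_url is not None:
        headers["tdm-policy"] = policy_url
    return headers


if __name__ == "__main__":
    # Reserve rights for the whole site; a narrower path list also works.
    print(tdmrep_document(["/"]))
    print(tdm_response_headers())
```

The well-known file covers a site in one place, while the header travels with each response; publishing both is a design choice, not a requirement stated on this page.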

Enforcement

We monitor access logs, AI-bot request fingerprints, and downstream AI output watermark patterns. Bots that violate this policy may have their requests blocked, rate-limited, served deceptive content, or referred to their operator’s abuse contact. Persistent or large-scale violation may result in legal action (including injunctive relief and damages) against the operator and any party that knowingly relies on the extracted content.

Contact

AI crawler operators with a legitimate use case not covered by this policy can request permission at help@capconvert.com. Security researchers should follow the Vulnerability Disclosure Policy.

Last updated: May 26, 2026