LIVE|BYTESPIDER|v.1

TikTok & Doubao Discoverability

Govern Visibility To ByteDance's AI Surfaces

Always-on monitoring of ByteDance's crawler. Bytespider feeds TikTok recommendation training and Doubao's foundation model. ByteDance publishes no official docs and no IP ranges, so third-party UA fingerprinting is the only verification path. Sentry catches accidental blocks and accidental consent. Cortex handles the fix.

sentry.bytespider.live ● 22 min ago
03:30:00 GET https://capconvert.com/
03:30:00 200 OK · text/html · Bytespider UA
03:30:01 Parsing 6 Bytespider rules...
03:30:02 FAIL robots_decision_made (implicit consent via wildcard)
03:30:03 PASS waf_posture_intentional
03:30:04 WARN rate_limit_isolated (3 bursts > 200 req/s last 7d)
03:30:05 PASS geo_reachable
03:30:06 PASS policy_aligned
03:30:07 PASS fingerprint_documented
03:30:08 Score 4/6 · Grade C · critical
ByteDance Crawl Governance

Continuous Bytespider Posture Monitoring

Continuous audits of Bytespider's access to your site against the six checks that matter for ByteDance's AI surfaces. Bytespider is the only crawler in our coverage without first-party documentation or a published IP list, so third-party fingerprinting is the only verification path. Sentry catches both explicit and implicit decisions. Cortex fixes what Sentry flags.

RULE · 1

robots_decision_made

Explicit Bytespider rule

robots.txt contains an explicit Allow or Disallow for User-agent: Bytespider. When no Bytespider-specific group exists and only the wildcard (User-agent: *) group applies, ByteDance treats that absence as consent, according to third-party crawler-log analysis. Make the decision intentional.
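A minimal sketch of how the explicit-rule check could be approximated, assuming a simplified reading of robots.txt groups. The function name `has_explicit_bytespider_rule` is hypothetical and is not Sentry's implementation; it only reports whether a group that names Bytespider carries at least one Allow or Disallow line.

```python
def has_explicit_bytespider_rule(robots_txt: str, token: str = "bytespider") -> bool:
    """Return True when a group naming `token` has at least one Allow/Disallow line.

    Hypothetical helper: a simplified group parser, not a full robots.txt implementation.
    """
    agents: list[str] = []  # user-agents named by the group currently being read
    in_rules = False        # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rule lines starts a new group.
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if token in agents:
                return True
    return False
```

A file that only has a `User-agent: *` group returns False here, which corresponds to the implicit-consent case this rule flags.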

RULE · 2

waf_posture_intentional

WAF allows or blocks consistently

Cloudflare or Akamai bot management is not silently rejecting Bytespider's Singapore-origin requests when the publisher intends to allow it, nor accepting them when the publisher intends to block. The robots.txt directive and the WAF rule agree.
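The agreement between intent and observed behavior can be sketched as a small comparison. This is an illustration under stated assumptions: `waf_posture_agrees` is a hypothetical name, and treating any 2xx or 3xx status as "served" is a simplification, since a WAF may also return a challenge page with a 200 status.

```python
def waf_posture_agrees(intent: str, observed_status: int) -> bool:
    """Compare the publisher's intent with what a Bytespider-UA request received.

    intent: "allow" or "block" (the stance expressed in robots.txt).
    observed_status: HTTP status returned to a request carrying the Bytespider UA.
    Assumption: 2xx/3xx means the edge served the request; anything else means it was rejected.
    """
    if intent not in ("allow", "block"):
        raise ValueError(f"unknown intent: {intent!r}")
    served = 200 <= observed_status < 400
    return served if intent == "allow" else not served
```

For example, an "allow" intent paired with a 403 at the edge would fail this check, as would a "block" intent paired with a 200.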

RULE · 3

rate_limit_isolated

Bursts not cascading to others

Bytespider's observed bursting pattern is not triggering blanket WAF rate limits that catch other legitimate crawlers (Googlebot, Bingbot) in the crossfire. WAF rules match on UA + IP, not raw connection count.
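To illustrate why keying on UA + IP isolates a burst, here is a minimal sliding-window limiter sketch. The class name `KeyedRateLimiter`, its parameters, and the IP addresses in the usage note are assumptions made for illustration; real WAF products express this through their own rule syntax.

```python
from collections import defaultdict, deque


class KeyedRateLimiter:
    """Sliding-window limiter that counts requests per (UA token, IP) pair.

    Hypothetical sketch: because each pair has its own window, a burst from
    one crawler does not consume the budget of another.
    """

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits: dict[tuple[str, str], deque] = defaultdict(deque)

    def allow(self, ua_token: str, ip: str, now: float) -> bool:
        hits = self._hits[(ua_token, ip)]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With `KeyedRateLimiter(limit=2, window_s=1.0)`, a third Bytespider request from the same address inside one second is refused, while a Googlebot request from a different address in that same second is still allowed.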

RULE · 4

geo_reachable

Singapore traffic reachable

The site is reachable from Bytespider's Singapore-routed crawler infrastructure. No country block or geo-WAF rule is accidentally hiding the site from ByteDance.

RULE · 5

policy_aligned

Rule reflects publisher intent

The Bytespider directive matches the publisher's stated stance on ByteDance training (Doubao LLM + TikTok recommendation). Most publishers have never consciously made this decision; the rule should be intentional.

RULE · 6

fingerprint_documented

Verification heuristic documented

Since ByteDance publishes no IP range list, the team documents the heuristic used to identify Bytespider (UA token + spider-feedback@bytedance.com email signature + Singapore ASN observation). Future audits can re-apply consistently.
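One way to write the heuristic down so that later audits can re-apply it is as a small function that reports each of the three signals separately. This is a sketch under assumptions: the names `Fingerprint` and `fingerprint_bytespider` are hypothetical, and `documented_asns` stands for whatever set of ASNs the team has recorded from its own logs. No ByteDance ASN is asserted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    ua_token: bool         # "Bytespider" appears in the User-Agent
    email_signature: bool  # spider-feedback@bytedance.com appears in the User-Agent
    asn_match: bool        # request ASN is in the team's documented set

    @property
    def matched(self) -> int:
        """Number of signals that matched (0 to 3)."""
        return sum((self.ua_token, self.email_signature, self.asn_match))


def fingerprint_bytespider(
    user_agent: str, asn: int, documented_asns: frozenset[int]
) -> Fingerprint:
    """Evaluate the three heuristic signals for a single request (hypothetical helper)."""
    ua = user_agent.lower()
    return Fingerprint(
        ua_token="bytespider" in ua,
        email_signature="spider-feedback@bytedance.com" in ua,
        asn_match=asn in documented_asns,
    )
```

Keeping the signals separate rather than collapsing them into a single yes/no lets the team decide, and record, how many must match before a request is treated as Bytespider. A UA string alone is easy to spoof, so that threshold is a policy choice.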

Bytespider Posture

Free Bytespider Checker

Paste your homepage URL. Sentry verifies the Bytespider directive, WAF posture, burst-pattern footprint, and geographic reachability from ByteDance's Singapore crawl infrastructure, then ships a per-rule report. No signup, instant results, always free.

Comprehensive audit · Instant results · Completely free
Instant

Audit in under a minute

Sentry fetches your site, runs every BYTESPIDER rule, and renders the full result page before your next sip of coffee.

Actionable

Every failure gets a fix

Each failed rule ships with a prescription paragraph. Hand it to engineering and the gap is closed before lunch.

Ongoing

Locked in for the long haul

Add your site to the daily Sentry sweep with one click. New regressions get caught the next morning.

6 rules in the BYTESPIDER Sentry. Daily 3:30 AM ET sweep.

Govern ByteDance Access

Stop Guessing. Start Seeing. Get Cortex.

One brain. Thirty-six pairs of eyes. Sentries monitor every visibility signal that decides whether search engines, AI engines, and ad platforms show you. Cortex reads what they see, weighs it against a unified corpus of platform documentation, and acts. Every move follows a defined decision protocol: action stated, reason given, impact named.

250 Ranking signals
30 Sentries
60 Platforms
Daily · Always-on
llms.txt · ai-citations · ai-crawlers · brand-pulse · title-tags · meta-desc · structured · sitemap · core-vitals · page-speed · accessibility · domain-age · https · about · authors · backlinks · mentions · reviews · internal · gbp · nap · yelp · trust · helpful · topical · first-hand · cro · navboost · freshness · hreflang · image-seo · page-quality · cannibal · llm-output · tracking · ssl-tls
CORTEX · decision engine