The Traffic Light System for Robots.txt
Implement a strategic, tiered approach to bot access. Learn the complete robots.txt configuration for blocking training bots while allowing high-value search and commerce agents.
Key Takeaways
- Complete robots.txt templates for each tier
- Which bots belong in Red, Yellow, and Green categories
- How to implement Crawl-delay for Yellow tier bots
- A production-ready robots.txt configuration
The Traffic Light Framework
Instead of treating all bots the same, implement a three-tier system based on value exchange. Red (Block), Yellow (Monitor), and Green (Allow) create a strategic access framework.
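One way to keep the three tiers maintainable is to hold them as data and render the file from that data. The sketch below is a minimal illustration, not a prescribed workflow: the bot tokens are the ones this lesson assigns to each tier, while `TIERS`, `TIER_RULES`, and `render_robots_txt` are hypothetical names introduced only for this example.

```python
# Hypothetical tier table; bot tokens are the ones this lesson assigns to each tier.
TIERS = {
    "red": ["CCBot", "GPTBot", "Google-Extended", "ClaudeBot", "Bytespider", "Amazonbot"],
    "yellow": ["Applebot-Extended", "Anthropic-AI"],
    "green": ["Googlebot", "OAI-SearchBot", "PerplexityBot", "GoogleAgent-Mariner", "Google-Shopping"],
}

# Directives emitted for every bot in a tier (assumed to mirror the lesson's templates).
TIER_RULES = {
    "red": ["Disallow: /"],
    "yellow": ["Crawl-delay: 10", "Allow: /"],
    "green": ["Allow: /"],
}


def render_robots_txt(tiers=TIERS, rules=TIER_RULES):
    """Render one User-agent group per bot, followed by a catch-all group."""
    lines = []
    for tier, bots in tiers.items():
        lines.append(f"# === {tier.upper()} LIGHT ===")
        for bot in bots:
            lines.append(f"User-agent: {bot}")
            lines.extend(rules[tier])
            lines.append("")  # blank line separates groups
    lines += ["# Default: allow everything else", "User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_txt())
```

Moving a bot between tiers then becomes a one-line change to the table rather than a hand edit of the deployed file.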
RED LIGHT: BLOCK
Criteria: Bots that scrape content for model training without driving traffic back to your site. These take your intellectual property without providing value.
# RED LIGHT - BLOCK Training Scrapers
User-agent: CCBot
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Amazonbot
Disallow: /
YELLOW LIGHT: MONITOR
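Criteria: Bots with mixed value or uncertain purposes. Allow access with throttling to monitor impact before full authorization. Note that Crawl-delay is a nonstandard directive that is not part of RFC 9309; some crawlers honor it, and others (Googlebot among them) ignore it.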
# YELLOW LIGHT - MONITOR with Throttling
User-agent: Applebot-Extended
Crawl-delay: 10
User-agent: Anthropic-AI
Crawl-delay: 10
GREEN LIGHT: ALLOW
Criteria: High-value search and commerce agents that drive traffic and enable transactions. These should have full access.
# GREEN LIGHT - ALLOW Revenue Drivers
User-agent: Googlebot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: GoogleAgent-Mariner
Allow: /
User-agent: Google-Shopping
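Allow: /
Important: Group order does not decide which rules apply to a compliant crawler. Under RFC 9309, a crawler follows the group that names its own product token and falls back to the * group only when no named group matches. Some older parsers evaluate groups in file order, so listing specific groups before the * group remains a safe habit. Always test your configuration before deploying, for example by reviewing it in the robots.txt report in Google Search Console, the successor to Google's retired robots.txt Tester.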
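A quick local check of the tiers can be sketched with Python's standard-library `urllib.robotparser`. This is an illustration under assumptions: `SAMPLE` is an abbreviated file with one bot per tier, `load_rules` and `tier_report` are hypothetical helper names, and the URL is a placeholder. Python's parser is only one interpretation of the format, so a passing check here does not guarantee that every real crawler behaves the same way, particularly for Crawl-delay.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated sample: one bot per tier from this lesson, plus the catch-all group.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: Applebot-Extended
Crawl-delay: 10
Allow: /

User-agent: Googlebot
Allow: /

User-agent: *
Allow: /
"""


def load_rules(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def tier_report(parser, bots, url="https://yoursite.com/any-page"):
    """Return {bot: (allowed, crawl_delay)} for each user-agent token."""
    return {bot: (parser.can_fetch(bot, url), parser.crawl_delay(bot)) for bot in bots}


if __name__ == "__main__":
    rules = load_rules(SAMPLE)
    bots = ["GPTBot", "Applebot-Extended", "Googlebot", "SomeOtherBot"]
    for bot, (allowed, delay) in tier_report(rules, bots).items():
        print(f"{bot}: allowed={allowed}, crawl_delay={delay}")
```

Running the same report against the full production file before each deployment is a cheap way to catch a bot that has slipped into the wrong tier.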
Complete Production Configuration
Here is a complete robots.txt configuration implementing the Traffic Light System. Customize based on your specific needs and competitive landscape.
# VectorGap Traffic Light System for Robots.txt
# Last updated: 2026
# === RED LIGHT: BLOCK ===
# Training scrapers that take content without traffic return
User-agent: CCBot
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Amazonbot
Disallow: /
# === YELLOW LIGHT: MONITOR ===
# Mixed value, throttled access
User-agent: Applebot-Extended
Crawl-delay: 10
Allow: /
User-agent: Anthropic-AI
Crawl-delay: 10
Allow: /
# === GREEN LIGHT: ALLOW ===
# Search and commerce agents that drive revenue
User-agent: Googlebot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: GoogleAgent-Mariner
Allow: /
User-agent: Google-Shopping
Allow: /
# Default: Allow standard crawlers
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
Practitioner assets
Turn this lesson into a repeatable GEO workflow
Use the checklist, sources, templates, and assessment prompts to move from theory to a client-ready diagnostic or implementation step.
- High: Define the prompt set, user intent, market, persona or vertical scenario for this lesson.
- High: Capture current AI answer evidence with provider, date, excerpt, citations and competitor mentions.
- High: Identify the likely root cause: content gap, authority gap, technical access, source inconsistency, review signal or policy risk.
- Medium: Create the visible page, proof block, profile update, policy clarification or report artifact that resolves the gap.
- Medium: Assign owner, due date, expected impact and remeasurement window before calling the work complete.
- Google Search Central: Robots.txt introduction (Google Search Central, 2025)
- Google Search Central: Intro to structured data (Google Search Central, 2025)
- Schema.org vocabulary (Schema.org, 2025)
- Traffic-Light System for AI Access Work Product Template: A repeatable worksheet for applying Traffic-Light System for AI Access to a real brand or client account.
- Before/After Answer Proof: A reporting format for showing how AI answer quality changed after the improvement shipped.
This lesson includes 5 assessment questions to reinforce the concepts before you apply them to a real GEO audit.