AI Bot Protection

Bot Protection allows you to block AI crawlers and bots from accessing your website content.

What is Bot Blocking?

AI companies use automated crawlers (bots) to scrape website content for training their AI models. Bot Protection identifies and blocks these crawlers based on their IP addresses.
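
As a rough illustration of IP-based identification, the sketch below checks a client address against a set of CIDR ranges. It is not Atomic Edge's implementation: the provider name, the ranges (reserved documentation addresses), and the `match_bot` function are all hypothetical.

```python
import ipaddress

# Hypothetical ranges for illustration only. These are reserved
# documentation addresses (RFC 5737), not real crawler IPs.
BOT_RANGES = {
    "example-bot": ["192.0.2.0/24", "198.51.100.0/25"],
}


def match_bot(client_ip, ranges=BOT_RANGES):
    """Return the provider whose range contains client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == net.version and addr in net:
                return provider
    return None


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, "->", match_bot(ip))
```

Because matching uses the source address rather than the User-Agent header, a request that only claims to be a bot in its headers would not match in this sketch.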

Why Block AI Bots?

  • Protect your content from being used to train AI models without permission
  • Reduce server load from aggressive crawling
  • Maintain competitive advantage by keeping your content proprietary
  • Comply with terms of service that prohibit AI training

Supported Bot Providers

Atomic Edge maintains up-to-date IP lists for major AI bot providers:

OpenAI (GPTBot)

OpenAI’s crawler, which collects content for training the models behind ChatGPT and other products.

Anthropic (Claude-Web)

Anthropic’s web crawler for training Claude.

Google AI (Google-Extended)

Google’s crawler for Gemini (formerly Bard) and other AI products, separate from regular Google Search.

DeepSeek

DeepSeek AI’s web crawler for training their models.

GitHub Copilot

GitHub’s crawler for training code completion models.

Perplexity AI

Perplexity’s crawler for their AI search engine.

How to Enable Bot Blocking

  1. Enable Bot Blocking: Turn the feature on
  2. Select Bots to Block: Check the boxes for the bot providers you want to block
  3. Choose Response Code:
    • 403 Forbidden: Standard "Access Denied" response
    • 404 Not Found: Responds as if the content doesn’t exist (stealth mode)
    • 451 Unavailable For Legal Reasons: Legal/compliance blocking
  4. Save: Changes take effect immediately

Response Codes Explained

403 Forbidden

  • Most transparent option
  • Bot knows it’s being blocked
  • Good for clear communication

404 Not Found

  • Stealth mode – bot thinks content doesn’t exist
  • May cause bot to stop crawling entirely
  • Good for hiding content

451 Unavailable For Legal Reasons

  • Indicates legal/policy blocking
  • Appropriate for terms of service enforcement
  • Good for compliance scenarios
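
To make the three options concrete, the sketch below builds the HTTP status line a blocked bot would receive for each code, using the standard reason phrases from Python's `http.HTTPStatus`. The `block_response` function and `ALLOWED_BLOCK_CODES` set are hypothetical names, and the sketch assumes only these three codes can be selected.

```python
from http import HTTPStatus

# Assumption: only the three codes offered in the settings are valid.
ALLOWED_BLOCK_CODES = {403, 404, 451}


def block_response(code):
    """Return the HTTP/1.1 status line for a blocked request."""
    if code not in ALLOWED_BLOCK_CODES:
        raise ValueError(f"unsupported block code: {code}")
    status = HTTPStatus(code)
    return f"HTTP/1.1 {status.value} {status.phrase}"


if __name__ == "__main__":
    for code in sorted(ALLOWED_BLOCK_CODES):
        print(block_response(code))
```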

IP List Updates

Atomic Edge automatically updates bot IP lists from public sources. You don’t need to manually maintain these lists.

Update frequency: Daily

Coverage: Includes known IP ranges published by bot providers
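
For readers curious what a published range list can look like, here is a minimal sketch that parses one. The JSON layout (a `prefixes` array with `ipv4Prefix` and `ipv6Prefix` keys) is an assumption modelled on feeds some providers publish. It is not necessarily the format Atomic Edge consumes, and the sample data uses reserved documentation ranges.

```python
import ipaddress
import json

# Sample feed in an assumed layout; the addresses are documentation
# ranges, not real crawler IPs.
SAMPLE_FEED = """
{
  "creationTime": "2024-01-01T00:00:00",
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"},
    {"ipv4Prefix": "not-a-cidr"}
  ]
}
"""


def parse_prefixes(feed_text):
    """Return the valid networks listed in the feed, skipping bad entries."""
    data = json.loads(feed_text)
    networks = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr is None:
            continue
        try:
            networks.append(ipaddress.ip_network(cidr, strict=False))
        except ValueError:
            # Ignore malformed entries rather than failing the whole update.
            continue
    return networks


if __name__ == "__main__":
    for net in parse_prefixes(SAMPLE_FEED):
        print(net)
```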

Important Notes

Legitimate Search Engines

  • Bot Protection does NOT block legitimate search engines such as Google Search and Bing
  • Only AI training crawlers are blocked
  • Your SEO is not affected

VPN/Proxy Users

  • Bot blocking is based on known bot IP ranges
  • Regular users (even with VPNs) are not affected
  • Only identified bot IPs are blocked

Global Whitelist

  • IPs in your global whitelist bypass bot blocking
  • Use this if you need to allow specific bot access
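
The precedence described above can be sketched as follows: the whitelist is checked first, so a whitelisted address is allowed even when it falls inside a blocked bot range. The `decide` function and the sample ranges are hypothetical and only illustrate the ordering.

```python
import ipaddress


def decide(client_ip, whitelist, bot_ranges):
    """Return "allow" or "block" for client_ip; the whitelist wins."""
    addr = ipaddress.ip_address(client_ip)

    def contains(cidrs):
        return any(addr in ipaddress.ip_network(c) for c in cidrs)

    if contains(whitelist):
        return "allow"   # whitelisted IPs bypass bot blocking
    if contains(bot_ranges):
        return "block"   # known bot range, not whitelisted
    return "allow"       # ordinary traffic


if __name__ == "__main__":
    whitelist = ["192.0.2.10/32"]      # hypothetical allowed bot IP
    bot_ranges = ["192.0.2.0/24"]      # hypothetical blocked range
    for ip in ("192.0.2.10", "192.0.2.11", "203.0.113.1"):
        print(ip, "->", decide(ip, whitelist, bot_ranges))
```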

Best Practices

  1. Block all AI bots by default unless you have a reason to allow them
  2. Use the 404 response if you prefer stealth mode
  3. Monitor your logs to see blocked bot attempts
  4. Review blocked bots periodically – new bots emerge regularly
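
If you export your access logs, a short script can summarize blocked requests. The sketch below assumes lines in the common "combined" access log format, which may differ from the format of your logs. The `count_blocked` function and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Assumed format: combined access log (IP, timestamp, request, status,
# size, referrer, user agent). Adjust the pattern to your real logs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)

BLOCK_CODES = {"403", "404", "451"}

SAMPLE = [
    '192.0.2.17 - - [10/Oct/2024:13:55:36 +0000] "GET /post HTTP/1.1" 403 153 "-" "GPTBot/1.0"',
    '192.0.2.18 - - [10/Oct/2024:13:55:40 +0000] "GET /post HTTP/1.1" 403 153 "-" "GPTBot/1.0"',
    '203.0.113.5 - - [10/Oct/2024:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]


def count_blocked(lines):
    """Count requests answered with a block code, grouped by user agent."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") in BLOCK_CODES:
            counts[match.group("agent") or "unknown"] += 1
    return counts


if __name__ == "__main__":
    for agent, total in count_blocked(SAMPLE).most_common():
        print(agent, total)
```

Note that if you chose the 404 response, ordinary "page not found" requests will be counted too, so filter by known bot IP ranges or user agents before drawing conclusions.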

Troubleshooting

Bot blocking not working

  • Verify the feature is enabled
  • Check that bot providers are selected
  • Ensure DNS is properly configured
  • Note: Some bots may crawl from IP ranges their operators have not published; those requests cannot be matched by IP

Legitimate traffic being blocked

  • Check if the IP is incorrectly classified
  • Add the IP to your global whitelist
  • Contact support if the issue persists

Legal Considerations

Blocking AI bots is generally legal and within your rights as a website owner. However:

  • Check your hosting provider’s terms of service
  • Consider your own terms of service
  • Be aware of any contractual obligations with AI companies
  • Consult legal counsel if you have specific concerns

Frequently Asked Questions