AI Bot Protection
Bot Protection allows you to block AI crawlers and bots from accessing your website content.
What is Bot Blocking?
AI companies use automated crawlers (bots) to scrape website content for training their AI models. Bot Protection identifies and blocks these crawlers based on their IP addresses.
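As a rough illustration of IP-based identification, the sketch below checks a client address against per-provider CIDR ranges. It is not Atomic Edge's implementation: the provider names, the `BLOCKED_RANGES` table, and the `match_provider` helper are hypothetical, and the ranges are documentation-reserved networks rather than real crawler addresses.

```python
import ipaddress
from typing import Optional

# Hypothetical sample data: documentation-reserved networks, NOT real crawler ranges.
BLOCKED_RANGES = {
    "example-bot-a": ["192.0.2.0/24", "2001:db8:1::/48"],
    "example-bot-b": ["198.51.100.0/25"],
}


def match_provider(client_ip: str) -> Optional[str]:
    """Return the provider whose range contains client_ip, or None if no range matches."""
    addr = ipaddress.ip_address(client_ip)
    for provider, cidrs in BLOCKED_RANGES.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == net.version and addr in net:
                return provider
    return None


if __name__ == "__main__":
    for ip in ("192.0.2.77", "198.51.100.200", "2001:db8:1::1"):
        print(ip, "->", match_provider(ip))
```

A request whose address falls outside every listed range is left alone, which is why ordinary visitors are unaffected by this kind of check.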
Why Block AI Bots?
- Protect your content from being used to train AI models without permission
- Reduce server load from aggressive crawling
- Maintain competitive advantage by keeping your content proprietary
- Comply with terms of service that prohibit AI training
Supported Bot Providers
Atomic Edge maintains up-to-date IP lists for major AI bot providers:
OpenAI (GPTBot)
The crawler used by OpenAI to train ChatGPT and other models.
Anthropic (Claude-Web)
Anthropic’s web crawler for training Claude AI.
Google AI (Google-Extended)
Google’s crawler for Gemini (formerly Bard) and other AI products (separate from regular Google Search).
DeepSeek
DeepSeek AI’s web crawler for training their models.
GitHub Copilot
GitHub’s crawler for training code completion models.
Perplexity AI
Perplexity’s crawler for their AI search engine.
How to Enable Bot Blocking
- Enable Bot Blocking: Toggle on the feature.
- Select Bots to Block: Check the boxes for the bot providers you want to block.
- Choose Response Code:
  - 403 Forbidden: Standard "Access Denied" response
  - 404 Not Found: Pretend the content doesn’t exist (stealth mode)
  - 451 Unavailable For Legal Reasons: Legal/compliance blocking
- Save: Changes take effect immediately.
Response Codes Explained
403 Forbidden
- Most transparent option
- Bot knows it’s being blocked
- Good for clear communication
404 Not Found
- Stealth mode – bot thinks content doesn’t exist
- May cause bot to stop crawling entirely
- Good for hiding content
451 Unavailable For Legal Reasons
- Indicates legal/policy blocking
- Appropriate for terms of service enforcement
- Good for compliance scenarios
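For reference, the three options correspond to standard HTTP status codes. The sketch below looks up each code's reason phrase with Python's standard `http.HTTPStatus`; the `ALLOWED_BLOCK_CODES` tuple and the `block_response` helper are hypothetical names used only for illustration, not part of Atomic Edge.

```python
from http import HTTPStatus
from typing import Tuple

# The three response codes offered in the settings described above.
ALLOWED_BLOCK_CODES = (403, 404, 451)


def block_response(code: int) -> Tuple[int, str]:
    """Return (status code, reason phrase) for a configured block response."""
    if code not in ALLOWED_BLOCK_CODES:
        raise ValueError(f"unsupported block response code: {code}")
    status = HTTPStatus(code)
    return status.value, status.phrase


if __name__ == "__main__":
    for code in ALLOWED_BLOCK_CODES:
        print(block_response(code))
```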
IP List Updates
Atomic Edge automatically updates bot IP lists from public sources. You don’t need to manually maintain these lists.
Update frequency: Daily
Coverage: Includes known IP ranges published by bot providers
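To give a sense of what consuming a published range list can involve, here is a minimal sketch that parses a JSON prefix feed and merges adjacent ranges. The feed layout (`prefixes`, `ipv4Prefix`, `ipv6Prefix`) is an assumption loosely modeled on lists some providers publish, the sample data uses documentation-reserved networks, and `load_ranges` is a hypothetical helper; how Atomic Edge actually ingests its sources is not described here.

```python
import ipaddress
import json

# Hypothetical feed content using documentation-reserved networks.
SAMPLE_FEED = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/25"},
    {"ipv4Prefix": "192.0.2.128/25"},
    {"ipv6Prefix": "2001:db8::/32"},
    {"ipv4Prefix": "198.51.100.0/24"}
  ]
}
"""


def load_ranges(feed_text: str) -> list:
    """Parse the feed and return merged networks (IPv4 first, then IPv6)."""
    data = json.loads(feed_text)
    v4, v6 = [], []
    for entry in data.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            if key in entry:
                net = ipaddress.ip_network(entry[key], strict=False)
                (v4 if net.version == 4 else v6).append(net)
    # collapse_addresses cannot mix IP versions, so each family is merged separately.
    return list(ipaddress.collapse_addresses(v4)) + list(ipaddress.collapse_addresses(v6))


if __name__ == "__main__":
    for network in load_ranges(SAMPLE_FEED):
        print(network)
```

Merging the two adjacent /25 entries into one /24 keeps the lookup table small without changing which addresses match.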
Important Notes
Legitimate Search Engines
- Bot Protection does NOT block legitimate search engines such as Google Search and Bing
- Only AI training crawlers are blocked
- Your SEO is not affected
VPN/Proxy Users
- Bot blocking is based on known bot IP ranges
- Regular users (even with VPNs) are not affected
- Only identified bot IPs are blocked
Global Whitelist
- IPs in your global whitelist bypass bot blocking
- Use this if you need to allow specific bot access
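The bypass rule above amounts to an order of evaluation: the whitelist is consulted before the bot ranges. The following sketch shows that ordering under stated assumptions; the `decide` function and the sample ranges are hypothetical and do not reflect Atomic Edge's internal logic.

```python
import ipaddress
from typing import Iterable


def decide(client_ip: str, whitelist: Iterable[str], bot_ranges: Iterable[str]) -> str:
    """Return "allow" or "block"; whitelist entries take precedence over bot ranges."""
    addr = ipaddress.ip_address(client_ip)

    def in_any(cidrs: Iterable[str]) -> bool:
        return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

    if in_any(whitelist):
        return "allow"  # whitelisted IPs bypass bot blocking
    if in_any(bot_ranges):
        return "block"  # known bot range and not whitelisted
    return "allow"      # everything else is untouched


if __name__ == "__main__":
    whitelist = ["192.0.2.10/32"]   # hypothetical: one partner crawler you want to allow
    bot_ranges = ["192.0.2.0/24"]   # hypothetical blocked range
    for ip in ("192.0.2.10", "192.0.2.11", "203.0.113.1"):
        print(ip, "->", decide(ip, whitelist, bot_ranges))
```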
Best Practices
- Block all AI bots by default unless you have a reason to allow them
- Use the 404 response if you prefer stealth mode
- Monitor your logs to see blocked bot attempts
- Review blocked bots periodically – new bots emerge regularly
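One way to monitor blocked attempts is to count log entries that carry your configured block response code, grouped by User-Agent. The sketch below assumes access logs in the common "combined" format, which may not match the logs you have available; `LINE_RE`, `SAMPLE_LOG`, and `count_blocked` are hypothetical names, and the sample entries are invented for illustration.

```python
import re
from collections import Counter

# Assumed "combined" log format: ip ident user [time] "request" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Invented sample entries using documentation-reserved addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 403 0 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 403 0 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2025:00:00:04 +0000] "GET /b HTTP/1.1" 451 0 "-" "OtherBot/2.0"',
]


def count_blocked(lines, block_code: int) -> Counter:
    """Count requests per User-Agent whose status equals the configured block code."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and int(match.group("status")) == block_code:
            counts[match.group("ua")] += 1
    return counts


if __name__ == "__main__":
    print(count_blocked(SAMPLE_LOG, 403).most_common())
```

If you use the 404 option, ordinary missing pages produce the same status, so counts by status code alone will overstate blocked bot traffic.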
Troubleshooting
Bot blocking not working
- Verify the feature is enabled
- Check that bot providers are selected
- Ensure DNS is properly configured
- Note: Some bots may use undisclosed IP ranges
Legitimate traffic being blocked
- Check if the IP is incorrectly classified
- Add the IP to your global whitelist
- Contact support if the issue persists
Legal Considerations
Blocking AI bots is generally legal and within your rights as a website owner. However:
- Check your hosting provider’s terms of service
- Consider your own terms of service
- Be aware of any contractual obligations with AI companies
- Consult legal counsel if you have specific concerns
Frequently Asked Questions
What AI bots can Atomic Edge block?
Atomic Edge can block crawlers from OpenAI (GPTBot, ChatGPT-User), Anthropic (ClaudeBot, anthropic-ai), Google AI (Google-Extended), Meta AI (Meta-ExternalFetcher), Microsoft/GitHub Copilot, Amazon (Amazonbot), Apple (Applebot-Extended), DeepSeek, Perplexity, and other major AI training bots.
Why should I block AI bots from crawling my site?
AI companies scrape websites to train their models, often without permission or compensation. Blocking AI bots prevents your content from being used to train models, protects your intellectual property, reduces server load from aggressive crawlers, and gives you control over how your content is used.
How does AI bot blocking work?
Atomic Edge identifies AI bots by their User-Agent strings and known IP ranges. When a request matches a blocked bot, it receives the response code you configured (403 Forbidden, 404 Not Found, or 451 Unavailable For Legal Reasons). This happens at the edge, before the request reaches your server, so no content is served to the bot.
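The User-Agent half of that matching can be sketched as a case-insensitive token search. The token table below uses crawler names mentioned in this FAQ, but the table itself and the `match_user_agent` helper are hypothetical, and the sample User-Agent strings are illustrative. Because a User-Agent header can be set to anything by the client, a check like this is normally paired with IP range matching.

```python
from typing import Optional

# Hypothetical token table built from crawler names mentioned in this FAQ.
BLOCKED_TOKENS = {
    "GPTBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "ClaudeBot": "Anthropic",
    "anthropic-ai": "Anthropic",
}


def match_user_agent(user_agent: str) -> Optional[str]:
    """Return the provider whose token appears in the User-Agent, or None."""
    ua_lower = user_agent.lower()
    for token, provider in BLOCKED_TOKENS.items():
        if token.lower() in ua_lower:
            return provider
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
    ]
    for ua in samples:
        print(match_user_agent(ua), "<-", ua)
```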
Will blocking AI bots affect my SEO?
No, AI training bots are separate from search engine crawlers. Googlebot (search), Bingbot, and other SEO-related crawlers are NOT blocked. Google-Extended (AI training) is separate from Googlebot (search indexing). Your search rankings are unaffected by blocking AI bots.
How do I enable AI bot blocking?
Navigate to your site’s Bot Protection section, enable the ‘Enable Bot Blocking’ toggle, then select which AI providers to block. Click Save to apply changes. The blocking takes effect immediately across all Atomic Edge edge servers.
What if a legitimate service is being blocked?
If a legitimate service is incorrectly classified as an AI bot, add its IP address to your global whitelist. Whitelisted IPs bypass all bot blocking. Contact Atomic Edge support if you believe a service is being misclassified.
Is blocking AI bots legal?
Blocking AI bots is generally legal and within your rights as a website owner. You control who accesses your content. However, check your hosting provider’s terms of service and any contractual obligations. Consult legal counsel if you have specific concerns.
Can AI bots circumvent the blocking?
Most reputable AI companies respect blocking. However, some may use undisclosed IP ranges or modified User-Agents. Atomic Edge regularly updates bot signatures. For maximum protection, combine bot blocking with rate limiting, CAPTCHA challenges, and access control measures.
