Troubleshooting bot access issues
Overview
This guide helps you troubleshoot issues with search engine bots (like Googlebot, Bingbot, or Yahoo Slurp) not being able to access your website. If you’re relying on SEO (Search Engine Optimization) to improve your site’s visibility, ensuring bot access is critical.
Common causes
Bots are typically blocked for one of two reasons:
- robots.txt is denying access.
- The bot’s IP address is blocked by the server firewall (CSF).
1. Check “robots.txt” permissions
Start by reviewing your robots.txt file. It’s usually located at:
/home/USERNAME/public_html/robots.txt
View file contents
Open the file in a text editor:
vim /home/USERNAME/public_html/robots.txt
Look for entries like this:
User-agent: Googlebot
Disallow: /
If robots.txt disallows a bot that you want to crawl the site, remove or modify those lines.
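On a long robots.txt, scanning by eye for a blanket block is easy to get wrong. The sketch below lists every user-agent group that contains a bare "Disallow: /". The function name find_full_disallows is hypothetical, and the parsing is a simplification that assumes each group starts with one or more consecutive User-agent lines.

```shell
#!/bin/sh
# List every User-agent group in a robots.txt file that contains "Disallow: /".
# Assumption: a group is introduced by one or more User-agent lines in a row.
find_full_disallows() {
    awk '
        { sub(/\r$/, ""); sub(/[ \t]*#.*$/, "") }    # drop CR and trailing comments
        tolower($0) ~ /^[ \t]*user-agent[ \t]*:/ {
            if (!in_agents) agents = ""              # a new group starts here
            val = $0
            sub(/^[^:]*:[ \t]*/, "", val)            # keep only the agent name
            sub(/[ \t]+$/, "", val)
            agents = agents (agents == "" ? "" : ",") val
            in_agents = 1
            next
        }
        tolower($0) ~ /^[ \t]*disallow[ \t]*:[ \t]*\/[ \t]*$/ {
            print "blocked: " agents
        }
        NF { in_agents = 0 }                         # any rule line ends the agent list
    ' "$1"
}

# Usage:
#   find_full_disallows /home/USERNAME/public_html/robots.txt
```

A line such as "blocked: *" means every crawler is shut out, while "blocked: Googlebot" points at a rule aimed at one bot.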
Check file permissions
Ensure the file is world-readable, so the web server can serve it:
ls -l /home/USERNAME/public_html/robots.txt
Example output:
-rwxrwxr-x 1 user user 23 Aug 21 2012 /home/USERNAME/public_html/robots.txt
Test access in browser
Visit:
https://example.com/robots.txt
If the file loads as plain text, bots should be able to read it. You can also ask your customer to test via Google Search Console.
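The same check can be run from the command line, which is handy when you only have SSH access. This is a minimal sketch: fetch_status and explain_status are hypothetical helper names, and sending a Googlebot user-agent header only imitates the header, so it will not reproduce an IP-based block.

```shell
#!/bin/sh
# User-agent string published for Googlebot; used here only to imitate the header.
BOT_UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'

# Print only the HTTP status code for a URL, requested with the bot user agent.
fetch_status() {
    curl -s -o /dev/null -w '%{http_code}' -A "$BOT_UA" "$1"
}

# Turn a status code into a short hint about where to look next.
explain_status() {
    case "$1" in
        200) echo "ok: robots.txt is readable" ;;
        403) echo "forbidden: check permissions, ownership, or a firewall/ModSecurity rule" ;;
        404) echo "not found: no robots.txt is being served" ;;
        000) echo "no response: connection failed or was dropped" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Usage (needs network access):
#   explain_status "$(fetch_status https://example.com/robots.txt)"
```

If the status differs between this request and one sent without the -A option, a user-agent-based rule is a likely suspect.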
If the file doesn’t load, recheck ownership and permissions.
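To recheck permissions without reading the mode string by eye, something like the following may help. It is a sketch that assumes GNU stat (standard on Linux servers); is_world_readable is a hypothetical name.

```shell
#!/bin/sh
# Succeed if "other" users can read the file. The web server falls into this
# class whenever it does not run as the file's owner or group.
is_world_readable() {
    perms=$(stat -c '%A' "$1") || return 2    # e.g. -rw-r--r--
    case "$perms" in
        ???????r*) return 0 ;;                # 8th character is the "other" read bit
        *)         return 1 ;;
    esac
}

# Usage:
#   is_world_readable /home/USERNAME/public_html/robots.txt \
#       || echo "robots.txt is not world-readable"
```

If the check fails, setting the mode to 644 (chmod 644) and making sure the file is owned by the account user is the usual fix on a typical shared-hosting layout.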
2. Check for IP blocks (CSF/ModSecurity)
When CSF is installed, its lfd daemon can block a bot’s IP address after the bot repeatedly triggers ModSecurity (mod_security) rules.
Search deny lists
Run the following to check if common bots are blocked:
grep .googlebot.com /etc/csf/csf.deny; grep .googlebot.com /var/lib/csf/csf.tempban
grep .crawl.yahoo.net /etc/csf/csf.deny; grep .crawl.yahoo.net /var/lib/csf/csf.tempban
grep .search.msn.com /etc/csf/csf.deny; grep .search.msn.com /var/lib/csf/csf.tempban
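The three commands above can be folded into one loop, sketched below. The function name find_blocked_bots is hypothetical; the domain list and file paths are the ones used in this guide. Using grep -F treats each domain as a literal string, so the leading dot is not read as a regex wildcard.

```shell
#!/bin/sh
# Crawler domains to look for; extend the list as needed.
BOT_DOMAINS='.googlebot.com .crawl.yahoo.net .search.msn.com'

# Search CSF's permanent and temporary deny lists for the crawler domains.
# Arguments: files to search (defaults to the standard CSF locations).
find_blocked_bots() {
    [ "$#" -gt 0 ] || set -- /etc/csf/csf.deny /var/lib/csf/csf.tempban
    for domain in $BOT_DOMAINS; do
        # -H prefixes each hit with the file it was found in.
        grep -FH -- "$domain" "$@" 2>/dev/null
    done
    return 0
}

# Usage:
#   find_blocked_bots
```

Each hit is printed with its file name, which shows whether the block is permanent (csf.deny) or temporary (csf.tempban).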
Example result:
66.249.73.40 # lfd: (mod_security) mod_security triggered by 66.249.73.40...
Unblock the IP
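Before unblocking, it may be worth confirming that the IP really belongs to the crawler, since scrapers often fake bot user agents. A common approach is a reverse DNS lookup followed by a forward lookup that must return the same IP. The sketch below assumes the host utility is installed; both function names are hypothetical, and the suffix list simply mirrors the domains used in this guide (Google notes that its crawlers may also resolve under google.com).

```shell
#!/bin/sh
# Trusted crawler domains, mirroring the ones used in this guide.
TRUSTED_SUFFIXES='.googlebot.com .crawl.yahoo.net .search.msn.com'

# Succeed if a hostname ends in one of the trusted suffixes.
is_trusted_bot_host() {
    host_name=${1%.}                          # strip the trailing dot from DNS output
    for suffix in $TRUSTED_SUFFIXES; do
        case "$host_name" in
            *"$suffix") return 0 ;;
        esac
    done
    return 1
}

# Reverse-resolve the IP, check the suffix, then forward-resolve the name and
# confirm it maps back to the same IP. Needs the host utility and working DNS.
verify_bot_ip() {
    ptr=$(host "$1" | awk '/domain name pointer/ { print $NF; exit }')
    [ -n "$ptr" ] || return 1
    is_trusted_bot_host "$ptr" || return 1
    host "$ptr" | awk -v ip="$1" '/has address/ && $NF == ip { found = 1 } END { exit !found }'
}

# Usage:
#   verify_bot_ip 66.249.73.40 && echo "genuine crawler"
```

The forward lookup matters because anyone who controls reverse DNS for their own IP range can publish a PTR record that ends in a trusted domain.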
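If you find a legitimate bot IP, remove it from the firewall. For a temporary block (an entry in csf.tempban):
csf -tr 66.249.73.40
For a permanent block (an entry in csf.deny):
csf -dr 66.249.73.40
Prevent future blocks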
Add trusted bot hostnames to the CSF ignore list:
Edit /etc/csf/csf.rignore and add:
.googlebot.com
.crawl.yahoo.net
.search.msn.com
Then restart CSF and lfd so the change takes effect (lfd is the process that reads csf.rignore):
csf -ra
Summary
This guide helps troubleshoot issues where legitimate bots (like Googlebot) can’t access a website, which can negatively impact SEO. It covers two main causes: blocks in the robots.txt file and IP blocks by CSF due to mod_security triggers. The guide includes steps to review permissions, test bot access, remove firewall blocks, and prevent future issues by whitelisting trusted bots.