
Troubleshooting bot access issues

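When search engine bots cannot reach a site, the cause is usually a robots.txt rule or a CSF firewall block triggered by ModSecurity. This guide covers checking robots.txt and its permissions, unblocking bot IPs, and whitelisting trusted bots.

Overview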

This guide helps you troubleshoot issues with search engine bots (like Googlebot, Bingbot, or Yahoo Slurp) not being able to access your website. If you’re relying on SEO (Search Engine Optimization) to improve your site’s visibility, ensuring bot access is critical.

Common causes

Bots are typically blocked for one of two reasons:

  1. A rule in robots.txt denies the bot access.
  2. The bot’s IP address is blocked by the server firewall (CSF).

1. Check “robots.txt” permissions

Start by reviewing your robots.txt file. It’s usually located at:

/home/USERNAME/public_html/robots.txt

View file contents

Open the file in a text editor:

vim /home/USERNAME/public_html/robots.txt

Look for entries like this:

User-agent: Googlebot
Disallow: /

If your robots.txt disallows bots you actually want, remove or modify those lines.
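On a long robots.txt, blanket blocks can be easy to miss by eye. The sketch below is one possible way to list every user agent whose group contains a bare `Disallow: /`. The function name `find_blanket_disallow` is hypothetical, and the parsing is deliberately simple (it ignores wildcard patterns and `Allow` overrides), so treat its output as a starting point for a manual review.

```shell
# Hypothetical helper: print each user agent whose group contains "Disallow: /".
find_blanket_disallow() {
    # $1: path to a robots.txt file
    awk '
    {
        line = $0
        sub(/\r$/, "", line)           # tolerate Windows line endings
        sub(/[ \t]*#.*$/, "", line)    # drop trailing comments
        lower = tolower(line)
    }
    lower ~ /^[ \t]*user-agent[ \t]*:/ {
        # A User-agent line that follows rule lines starts a new group.
        if (in_rules) { agents = ""; in_rules = 0 }
        sub(/^[^:]*:[ \t]*/, "", line)
        sub(/[ \t]+$/, "", line)
        agents = (agents == "" ? line : agents "\n" line)
        next
    }
    lower ~ /^[ \t]*(allow|disallow)[ \t]*:/ { in_rules = 1 }
    lower ~ /^[ \t]*disallow[ \t]*:[ \t]*\/[ \t]*$/ && agents != "" { print agents }
    ' "$1"
}

# Usage:
#   find_blanket_disallow /home/USERNAME/public_html/robots.txt
```

An agent of `*` in the output means every bot without its own group is blocked from the whole site.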

Check file permissions

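Ensure the file is world-readable so the web server can serve it:

ls -l /home/USERNAME/public_html/robots.txt

Example output:

-rwxrwxr-x 1 user user 23 Aug 21 2012 /home/USERNAME/public_html/robots.txt

The last three characters of the mode string (r-x here) show that all users can read the file, which is what matters. The execute bits are not needed for a text file; 644 (-rw-r--r--) is the usual mode.

Test access in browser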

Visit:

https://example.com/robots.txt

If the file loads as plain text, bots should be able to read it. You can also ask your customer to test via Google Search Console.
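The same check can be made from a shell with curl, which is handy when no browser is available. This is a minimal sketch: `fetch_status` and `explain_status` are hypothetical helper names, and the hint wording is illustrative. Sending Googlebot's user-agent string only imitates the request header, so it can reveal user-agent-based rules but says nothing about IP-based firewall blocks.

```shell
# User-agent string published for Googlebot; this imitates the header only, not Google's IPs.
GOOGLEBOT_UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'

# Print the HTTP status code returned for a URL when requested with the given user agent.
fetch_status() {
    # $1: URL, $2: User-Agent string
    curl -s -o /dev/null -A "$2" -w '%{http_code}' "$1"
}

# Translate a status code into a troubleshooting hint (illustrative wording).
explain_status() {
    case "$1" in
        200) echo "OK: robots.txt is being served" ;;
        403) echo "Forbidden: check file permissions and ModSecurity rules" ;;
        404) echo "Not found: no robots.txt at this URL" ;;
        *)   echo "Unexpected status: $1" ;;
    esac
}

# Usage:
#   explain_status "$(fetch_status https://example.com/robots.txt "$GOOGLEBOT_UA")"
```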

If the file doesn’t load, recheck ownership and permissions.
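If the permissions do turn out to be the problem, resetting the mode might look like the sketch below. The helper name `make_world_readable` is hypothetical, and the `chown` line in the usage comment assumes the file should belong to the account user, as in the example listing earlier.

```shell
# Hypothetical helper: set the file to 644 and print the resulting mode string.
make_world_readable() {
    # $1: path to robots.txt
    chmod 644 "$1" || return 1
    ls -l "$1" | cut -c1-10
}

# Usage (run as root or as the account owner):
#   make_world_readable /home/USERNAME/public_html/robots.txt
#   chown USERNAME:USERNAME /home/USERNAME/public_html/robots.txt   # assumed owner
```

The function prints `-rw-r--r--` when the change succeeds.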

2. Check for IP blocks (CSF/ModSecurity)

When CSF is installed, a bot’s IP address can end up blocked after it repeatedly triggers ModSecurity (mod_security) rules.

Search deny lists

Run the following to check if common bots are blocked:

grep .googlebot.com /etc/csf/csf.deny; grep .googlebot.com /var/lib/csf/csf.tempban
grep .crawl.yahoo.net /etc/csf/csf.deny; grep .crawl.yahoo.net /var/lib/csf/csf.tempban
grep .search.msn.com /etc/csf/csf.deny; grep .search.msn.com /var/lib/csf/csf.tempban

Example result:

66.249.73.40 # lfd: (mod_security) mod_security triggered by 66.249.73.40...
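Because user-agent strings are easy to fake, it can be worth confirming that a blocked address really belongs to the crawler before unblocking it. A common approach is a reverse DNS lookup followed by a forward lookup that must map back to the same IP. The sketch below assumes the `host` utility is installed; the function names are hypothetical, and the domain list is limited to the three domains used in this guide.

```shell
# Return success if a hostname ends in one of the crawler domains named in this guide.
is_trusted_bot_host() {
    case "$1" in
        *.googlebot.com|*.crawl.yahoo.net|*.search.msn.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Reverse-resolve an IP, check its domain, then forward-resolve to confirm it maps back.
verify_bot_ip() {
    # $1: IPv4 address taken from csf.deny or csf.tempban
    ptr=$(host "$1" | awk '/domain name pointer/ { print $NF; exit }')
    ptr=${ptr%.}    # strip the trailing dot from the PTR record
    is_trusted_bot_host "$ptr" || return 1
    host "$ptr" | grep -qxF "$ptr has address $1"
}

# Usage:
#   verify_bot_ip 66.249.73.40 && echo "verified crawler" || echo "not verified"
```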
Unblock the IP

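If you find a legitimate bot IP, remove it from the firewall. For a temporary block (listed in csf.tempban), use:

csf -tr 66.249.73.40

For a permanent block (listed in csf.deny), use:

csf -dr 66.249.73.40

Prevent future blocks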

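Add the reverse DNS domains of trusted bots to the CSF ignore list. Each entry starts with a dot so that it matches any hostname ending in that domain.

Edit /etc/csf/csf.rignore and add: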

.googlebot.com
.crawl.yahoo.net
.search.msn.com
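When the same change has to be made on several servers, the entries can be appended from a script instead of an editor. This is a minimal sketch with a hypothetical helper name, `add_rignore_entry`; it only appends an entry when the exact line is not already present, so it is safe to run more than once.

```shell
# Hypothetical helper: append an entry to an ignore file unless that exact line exists.
add_rignore_entry() {
    # $1: path to the ignore file, $2: entry such as .googlebot.com
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as root):
#   for d in .googlebot.com .crawl.yahoo.net .search.msn.com; do
#       add_rignore_entry /etc/csf/csf.rignore "$d"
#   done
```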

Then restart CSF and lfd so the change takes effect:

csf -ra

Summary

This guide helps troubleshoot issues where legitimate bots (like Googlebot) can’t access a website, which can negatively impact SEO. It covers two main causes: blocks in the robots.txt file and IP blocks by CSF due to mod_security triggers. The guide includes steps to review permissions, test bot access, remove firewall blocks, and prevent future issues by whitelisting trusted bots.
