Block AI Bots Server-Wide on Plesk or Other ModSecurity-Powered Servers

Introduction

AI bots can crawl aggressively and overload a server by consuming excessive CPU, memory, and bandwidth. Their main purpose is to collect your content to train machine learning models, often without your consent. This traffic can degrade performance, cause downtime, and reduce your site's availability. This guide explains how to block these unwanted AI bots server-wide using ModSecurity on Plesk-managed servers.

Step 1: Enable ModSecurity in Plesk

Log in to Plesk as an administrator.
Go to Tools & Settings > Web Application Firewall (ModSecurity).
Ensure that ModSecurity is enabled (mode set to On) and select the Fast or Tradeoff configuration preset, depending on your server's configuration and performance needs.

Step 2: Add Custom Rules

In the same ModSecurity settings page, scroll to Custom directives.
Add the following custom rule:

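# Block AI bots by User-Agent
SecRule REQUEST_HEADERS:User-Agent "@rx GPTBot|Amazonbot|Meta" "phase:1,id:100002,deny,status:403,log,msg:'AI bot blocked by User-Agent'"

Rule Breakdown

SecRule REQUEST_HEADERS:User-Agent "@rx GPTBot|Amazonbot|Meta": Matches requests whose User-Agent header contains any of these strings. The pattern is a case-sensitive regular expression, so a short token such as Meta also matches any other User-Agent that contains that text.
phase:1: Executes during the request headers phase, before the request body is read.
id:100002: Unique rule identifier. It must not collide with the ID of any other rule loaded on the server.
deny,status:403: Denies the request and returns HTTP 403 Forbidden.
log: Logs the match in the web server error log and the ModSecurity audit log.
msg: Descriptive message written to the logs.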

Step 3: Save and Apply

Click Save to apply the custom rule.

Restart Nginx and Apache (or httpd) if needed:

systemctl restart nginx
systemctl restart apache2  # Or httpd on CentOS
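
A syntax error in a custom directive can stop the web server from starting, so it can help to validate the configuration before restarting. The following is a minimal sketch, not part of the original procedure: the helper name restart_if_valid is hypothetical, and it assumes the standard nginx -t and apachectl -t syntax checks are available on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: restart a service only if its config test passes.
#   $1 = config test command (for example "nginx -t")
#   $2 = systemd service name (for example "nginx")
restart_if_valid() {
  local test_cmd="$1" service="$2"
  # $test_cmd is intentionally unquoted so "nginx -t" splits into command + flag.
  if $test_cmd; then
    systemctl restart "$service"
  else
    echo "Config test failed for $service; not restarting." >&2
    return 1
  fi
}
```

For example, restart_if_valid "nginx -t" nginx followed by restart_if_valid "apachectl -t" apache2 (use httpd as the service name on CentOS).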

Testing the Rule

Run a test using curl:

curl -I "https://yourdomain.com" -A "GPTBot"

You should receive a 403 Forbidden response:

HTTP/1.1 403 Forbidden
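
To confirm that the rule blocks only the intended bots, it may be useful to compare several User-Agents in one run. This is a rough sketch: the function names status_for and check_block are hypothetical, and the browser-like User-Agent is just a placeholder for ordinary traffic.

```shell
#!/usr/bin/env bash
# Print only the HTTP status code returned for a given User-Agent.
# -s hides progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints the status code.
status_for() {
  local url="$1" ua="$2"
  curl -s -o /dev/null -w '%{http_code}' -A "$ua" "$url"
}

# Compare blocked bot names with an ordinary browser-like User-Agent.
check_block() {
  local url="$1" ua
  for ua in "GPTBot" "Amazonbot" "Mozilla/5.0"; do
    printf '%s -> %s\n' "$ua" "$(status_for "$url" "$ua")"
  done
}
```

Running check_block "https://yourdomain.com" should report 403 for the two bot names and your site's normal status code for the last entry.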

Logs Verification

Check if requests are logged:

tail -f /var/log/modsec_audit.log
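
Instead of watching the log live, you can count how often the rule has fired. The sketch below assumes that audit log entries record the rule as [id "100002"], which is the usual ModSecurity message format; the function name count_rule_hits is hypothetical.

```shell
#!/usr/bin/env bash
# Count audit log lines that mention a given rule ID.
#   $1 = rule ID, $2 = path to the audit log
count_rule_hits() {
  local id="$1" logfile="$2"
  # -F treats the pattern as a fixed string, -c prints the number of matching lines.
  grep -cF "[id \"$id\"]" "$logfile"
}
```

For example: count_rule_hits 100002 /var/log/modsec_audit.log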

Additional Tips

Add more bots by extending the |-separated list in the rule's User-Agent pattern.
Review the list regularly and update it as new unwanted crawlers emerge.
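
As the list grows, editing the pattern by hand becomes error-prone. One possible approach is to generate the directive from a list of names, as in this sketch. The function name build_rule is hypothetical; it assumes the names contain no regular expression metacharacters and uses REQUEST_HEADERS:User-Agent, the ModSecurity variable for that header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a SecRule that blocks the given User-Agent substrings.
#   $1 = rule ID, remaining arguments = bot names (no regex metacharacters)
build_rule() {
  local id="$1"; shift
  local pattern
  # Join the remaining arguments with "|" to form the regex alternation.
  pattern="$(IFS='|'; printf '%s' "$*")"
  printf "SecRule REQUEST_HEADERS:User-Agent \"@rx %s\" \"phase:1,id:%s,deny,status:403,log,msg:'AI bot blocked by User-Agent'\"\n" "$pattern" "$id"
}

# Prints a directive to paste into the Custom directives field.
build_rule 100002 GPTBot Amazonbot Meta
```

Paste the printed line into the Custom directives field in place of the existing rule, keeping the rule ID unique.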

Eseperio/block-ai-bots-server-wide.md