Protect your WordPress site against bad bots

Self-hosting your WordPress website gives you a great deal of freedom in how you set up and run your site. Unfortunately, that freedom comes at a cost: you have to ensure your site is protected. You will have to keep your site safe from hackers, comment spam and brute force attacks. Perhaps you have noticed bad bots looking for themes and plugins on your website? They are looking for vulnerabilities to exploit. You may not have spotted them, or perhaps you are lucky enough that your website sits behind a web application firewall. Either way, you should do everything you can to protect your WordPress site, or you risk extended downtime and the ire of your web host.

I’ve recently had to deal with several attacks on some client sites. This post shares some of the things I am using to prevent these folks from damaging those sites or slowing down the servers that those sites are hosted on.

Why do you need to protect your WordPress site?

Your website can be attacked in several ways. The most common are:

  • Brute force attacks on your login page – Bots try different usernames and passwords to gain access to your website.
  • Comment spam – As the name says, bots try to post spam comments, even on blog posts where commenting is disabled!
  • Sniffing for unsafe themes and plugins – A bot requests miscellaneous files on your website to find out which themes and plugins are installed.
  • Indexing your blog – A bot crawls all your pages, just like Googlebot does. These kinds of bots are often operated by companies that collect your data/content for statistics, link profiles, etc.

All these attacks often generate much more load on your web server than your regular visitors do. The problem with WordPress, or more precisely with the PHP code that is executed for every page view, is that it uses up memory (RAM). Every web server has limited memory, which is shared by all of the sites running on it. Bad bots can max out the server memory in a few minutes. Before you know it, your site is down, along with every other site on that server. The goal is therefore to limit the number of requests that reach PHP files and to keep unwanted bots off your website.

Install WP Super Cache and use mod_rewrite for file caching

The easiest way to limit the execution of PHP scripts is to use a cache plugin. A cache plugin stores a copy of each requested page or post and serves that cached file instead of generating the page all over again. I’m using WP Super Cache for all my WordPress websites, because it’s a well-maintained plugin that is very easy to install. On the WP Super Cache settings page, open the Advanced tab to find the three caching options: mod_rewrite, PHP and legacy caching. Don’t use the PHP option, because some PHP code still runs whenever a cached file is created or requested. The mod_rewrite option is much better: once the cached file has been created, no PHP code is needed to serve the cached page or post to the browser.
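
To illustrate why the mod_rewrite option avoids PHP entirely, here is a simplified sketch of the kind of rules involved. It is not the plugin’s exact output: WP Super Cache writes its own, longer rule set into .htaccess when you enable the option, and the cache path shown here assumes a default installation in the web root.

```apache
<IfModule mod_rewrite.c>
	RewriteEngine On
	RewriteBase /
	# skip form submissions and URLs that carry a query string
	RewriteCond %{REQUEST_METHOD} !POST
	RewriteCond %{QUERY_STRING} ^$
	# logged-in users and commenters always get a freshly generated page
	RewriteCond %{HTTP:Cookie} !(comment_author_|wordpress_logged_in|wp-postpass_)
	# a cached copy of the requested page exists on disk (path is an assumption)
	RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{SERVER_NAME}/$1/index.html -f
	# serve the static file directly; WordPress and PHP are never started
	RewriteRule ^(.*) /wp-content/cache/supercache/%{SERVER_NAME}/$1/index.html [L]
</IfModule>
```

If no cached file exists yet, the conditions fail and the request falls through to the normal WordPress rules, which generate the page once so that it can be cached.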

Keep those bad (and stupid) bots outside

As with a web application firewall, it’s much better to block bad bots before they can reach your website or your PHP files. You can do this with the 5G Blacklist, a set of smart rules that you copy and paste into your website’s .htaccess file. These rules detect and block bad bots by their user-agent name, malicious query strings and URL patterns. It’s not 100% protection for your WordPress site, but it will help. If you use the 5G Blacklist together with WP Super Cache and mod_rewrite file caching, you need to remove or disable this rule:

RedirectMatch 403 (\,|//|\)\+|/\,/|\{0\}|\(/\(|\.\.\.|\+\+\+|\|)

Otherwise it will break your website and return a 403 error for every sub-page! Another way to keep bad bots and spam bots off your website is the WordPress plugin AVH First Defense Against Spam. This plugin is not only effective against spam, it’s also a great way to block bad bots in general before they can “touch” your website. It uses the IP blacklists from Stop Forum Spam, Spamhaus and Project Honey Pot. The configuration is a bit complex because you need two API keys, which are freely available from Stop Forum Spam and Project Honey Pot.
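
The idea of rejecting a bot by its user-agent name before any PHP runs can be sketched in a few lines. This is a minimal illustration and not an excerpt from the 5G Blacklist: the bot names `examplebot` and `badcrawler` are placeholders that you would replace with names found in your own logs.

```apache
<IfModule mod_rewrite.c>
	RewriteEngine On
	# placeholder names, matched case-insensitively anywhere in the User-Agent header
	RewriteCond %{HTTP_USER_AGENT} (examplebot|badcrawler) [NC]
	# F = answer with 403 Forbidden; no PHP code is executed
	RewriteRule .* - [F]
</IfModule>
```

Rules like these belong above the default WordPress block in .htaccess, so that they are evaluated before the request is handed to WordPress.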

Protect your site against Brute Force Attacks (login)

The login page is also a PHP script that needs memory to execute. I’m using Protect (formerly BruteProtect), which is now a Jetpack module, because it also relies on a cloud-based blacklist. While many CAPTCHA or JavaScript-based plugins still let bots reach the PHP login script, Protect turns a bot away before most of the PHP code in WordPress is executed.

Hide the wp-login.php page and wp-admin directory

Another effective method is the WP Cerber plugin. With this plugin you can change the WordPress login URL to anything you like. Besides this feature, it’s also possible to “hide” the wp-admin directory. The plugin protects you in more than one way: the standard login URL no longer works, and an IP address is blocked after a couple of hacking attempts. WP Cerber is a great plugin for most websites with a known group of users.
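
For a site with a small, known group of users who connect from fixed IP addresses, a similar effect can be approximated at the server level. This is a different technique from what WP Cerber does and is shown only as a sketch: the address below comes from a documentation range and must be replaced with your own, and the syntax assumes Apache 2.4.

```apache
<IfModule mod_authz_core.c>
	<Files "wp-login.php">
		# hypothetical office address; every other address receives a 403
		Require ip 203.0.113.10
	</Files>
</IfModule>
```

Requests from any other address are rejected by Apache before the login script runs. This is not suitable for sites where users log in from changing addresses.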

Show a simple 404 – NOT FOUND error using .htaccess

The last protection for your WordPress site is necessary because of a general mod_rewrite rule that exists in (almost) every WordPress installation. It concerns these two mod_rewrite conditions in your .htaccess file, which WordPress creates when you set up SEO-friendly permalinks.

# condition to check that the requested file doesn't exist
RewriteCond %{REQUEST_FILENAME} !-f
# condition to check that the requested directory doesn't exist
RewriteCond %{REQUEST_FILENAME} !-d
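
For context, these two conditions sit inside the block that WordPress writes to .htaccess. For a site installed in the web root it typically looks like the following; the exact content can differ between WordPress versions and for installations in a subdirectory.

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# requests for index.php itself are left alone
RewriteRule ^index\.php$ - [L]
# anything that is neither an existing file nor an existing directory ...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ... is handed to index.php, which starts WordPress
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```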

These mod_rewrite conditions don’t check whether a requested file belongs to a post or page. If a bad bot is sniffing your site for files (php, css, txt, js, png…) and a file doesn’t exist, WordPress will generate a nice 404 page. This page has no cached file and needs all the database queries and PHP code. Imagine how much memory is used if a bot tries to access 100 missing files in a single minute! I use the following rules in my .htaccess file (paste them above the default code from WordPress) to prevent WordPress from generating these unwanted 404 pages:

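<IfModule mod_rewrite.c>
	RewriteEngine On
	RewriteBase /
	# the requested path is not an existing file ...
	RewriteCond %{REQUEST_FILENAME} !-f
	# ... and the URL ends in a static-file or PHP extension
	RewriteCond %{REQUEST_URI} \.(jpg|jpeg|png|gif|bmp|ico|css|js|swf|htm|html|txt|php)$ [NC]
	# answer with a 404 directly, before WordPress is loaded
	RewriteRule .* - [R=404]
</IfModule>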
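
With the `R=404` flag, Apache answers with whatever error document is configured for the site. If your host has pointed the 404 error document at a PHP script, the saving is lost, so it may be worth setting a lightweight one explicitly. A minimal sketch, assuming your host allows `ErrorDocument` in .htaccess:

```apache
# send a short plain-text body for 404 responses generated by Apache
ErrorDocument 404 "Not Found"
```

This should only affect 404 responses that Apache produces itself. Pages that WordPress reports as missing are, as far as I can tell, still rendered by your theme.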

It’s possible that this mod_rewrite condition and rule won’t work 100% in your specific situation, so it’s always better to try critical modifications like these in a test environment first. What I have learned from these attacks is that it’s much better to block these bots entirely. Often this is not possible, because the IP address of each bot must be blacklisted before you can filter it. Check your log files frequently and take action if you think that some activity on your website isn’t normal. You might be surprised how many bots are visiting your website every day. Website speed is essential these days, and it’s important to serve your pages quickly to your visitors (and Google).
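
Once the log files have revealed an offending address, blacklisting it takes only a few lines. A minimal sketch, assuming Apache 2.4; the addresses come from documentation ranges and stand in for whatever you find in your own logs:

```apache
<IfModule mod_authz_core.c>
	<RequireAll>
		Require all granted
		# single address of a hypothetical bad bot
		Require not ip 198.51.100.23
		# a whole hypothetical network
		Require not ip 203.0.113.0/24
	</RequireAll>
</IfModule>
```

If the same .htaccess file already contains other `Require` rules, merge them into one `<RequireAll>` block. Separate blocks at the same level are treated as alternatives, so a request only has to pass one of them.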

Published in: WordPress Development