I have had it with the onslaught of AI and indexing bots that keep my web servers in a constant state of near-crashes.
In particular git.slackware.nl is not handling the load well. It’s powered by cgit, and with tens of thousands of requests per hour to serve the details of git commits, cgit experiences a lot of segfaults. I already had to disable the download of compressed git-snapshot tarballs because all the “xz” processes that were running as a result of serving up “tar.xz” files were eating most of the server’s resources.
The filters that I had built into the Apache httpd server, as well as fail2ban taking care of the really obvious offenders, were not sufficient.
So. I have built a package for anubis, the self-proclaimed Web Application Firewall that protects web sites from AI scraping bots by challenging the visitor with a proof-of-work. This is essentially a calculation the client (a browser, or a web-scraping script) has to make before being allowed entry. That calculation takes the shape of a small math problem that is expensive to compute but easy to verify, like finding a value that, when hashed together with a given string, yields a hash starting with a given number of zeroes.
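To make the "expensive to compute, easy to verify" idea concrete, here is a minimal sketch in Python. It is not Anubis's actual code: the function names, the choice of SHA-256 over the challenge string plus a counter, and the difficulty value are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Client side (hypothetical): brute-force a nonce.

    Tries 0, 1, 2, ... until the SHA-256 hex digest of challenge+nonce
    starts with `difficulty` zero characters. Every extra zero makes the
    search roughly 16 times longer on average.
    """
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side (hypothetical): a single hash is enough to check the answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    challenge = "example-challenge"  # made-up challenge string
    nonce = solve(challenge, 4)
    print(nonce, verify(challenge, nonce, 4))
```

The asymmetry is the whole point: the visitor has to run the loop in solve() once, while the server only computes one hash in verify(). A scraper that hits thousands of pages without keeping the cookie has to pay that cost over and over.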
For scraping bots, the cost of these calculations will be big enough that they stop trying. Mere individuals like you and me, we will notice the Anubis loading page for a second and then it will stay out of our way as long as the cookie it places is valid.
Common Linux download programs like wget and curl are not affected by Anubis because it uses a sensible set of defaults in its filtering behavior with the intention to not infuriate the humans accessing the site.
The anubis package that I built will create an ‘anubis’ user and group when it is installed. It will also install a startup script in ‘/etc/rc.d/’ and add a block to ‘/etc/rc.d/rc.local’ so that Anubis will start on every boot of the computer.
Anubis can run in multiple separate instances. This is a necessity, because for each web site you want to protect you’ll have to run a separate instance, listening on a separate TCP port.
If there’s interest in the details of setting up Anubis on Slackware, let me know in the comments section below and then I’ll write up that documentation in a follow-up article.
If you experience issues accessing git.slackware.nl because of Anubis, also let me know below!
Cheers, Eric
