My thoughts on Slackware, life and everything

Anubis is now guarding the git server

I have had it with the onslaught of AI and indexing bots that keep my web servers in a constant state of near-crashes.
In particular, git.slackware.nl is not handling the load well. It’s powered by cgit, and with tens of thousands of requests per hour asking for the details of git commits, cgit experiences a lot of segfaults. I already had to disable the download of compressed git-snapshot tarballs, because all the “xz” processes spawned to serve up “tar.xz” files were eating most of the server’s resources.

The filters that I had built into the Apache httpd server, as well as fail2ban taking care of the really obvious offenders, were not sufficient.

So. I have built a package for anubis, the self-proclaimed Web Application Firewall that protects web sites from AI scraping bots by challenging the visitor with a proof-of-work. This is essentially a calculation the client (a browser, or a web-scraping script) has to make before it is allowed entry. That calculation takes the shape of a small math problem that is expensive to compute but easy to verify, like finding an input whose hash starts with a given number of zeroes.
For scraping bots, the cost of these calculations will be high enough that they stop trying. Mere individuals like you and me will notice the Anubis loading page for a second, and then it will stay out of our way for as long as the cookie it places is valid.
Common Linux download programs like wget and curl are not affected by Anubis, because it uses a sensible set of defaults in its filtering behavior, with the intention of not infuriating the humans accessing the site.
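To make the proof-of-work idea concrete, here is a rough shell sketch (my own illustration, not Anubis’ actual code) of what such a challenge boils down to: the client has to brute-force a nonce until the hash of challenge-plus-nonce starts with enough zeroes, while the server needs only a single hash computation to verify the answer.

#!/bin/sh
# Illustration only: find a nonce so that sha256(challenge + nonce)
# starts with 4 zero hex digits. This is expensive to compute...
CHALLENGE="example-challenge-from-server"
nonce=0
while :; do
  hash=$(printf '%s%s' "$CHALLENGE" "$nonce" | sha256sum | cut -d' ' -f1)
  case "$hash" in
    0000*) break ;;
  esac
  nonce=$((nonce + 1))
done
# ... but cheap to verify: the server recomputes just this one hash.
echo "nonce=$nonce hash=$hash"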

The anubis package that I built will create an ‘anubis’ user and group when it is installed. It also installs a startup script in ‘/etc/rc.d/’ and adds a block to ‘/etc/rc.d/rc.local’ so that Anubis starts on every boot of the computer.
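For reference, that block follows the usual Slackware convention and looks something like this (paraphrased; check your own /etc/rc.d/rc.local for the exact wording):

# Start Anubis:
if [ -x /etc/rc.d/rc.anubis ]; then
  /etc/rc.d/rc.anubis start
fi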

Anubis can run in multiple separate instances. That is a necessity, because for each web site you want to protect you’ll have to run a separate instance, listening on a separate TCP port.
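As a sketch of what that means in practice: every instance gets its own environment file telling Anubis where to listen and which backend to protect, and the web server then proxies the public site through that Anubis port. The BIND and TARGET variable names are taken from the Anubis documentation; the file name and port numbers below are made up for illustration:

# Hypothetical /etc/anubis/git.env for one instance:
BIND=:8923                    # TCP port this Anubis instance listens on
TARGET=http://127.0.0.1:8924  # the backend (for example cgit) that it protects

# And in the Apache vhost of the protected site, route visitors through Anubis:
ProxyPass        / http://127.0.0.1:8923/
ProxyPassReverse / http://127.0.0.1:8923/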

If there’s interest in the details of setting up Anubis on Slackware, let me know in the comments section below and then I’ll write up that documentation in a follow-up article.

If you experience issues accessing git.slackware.nl because of Anubis, also let me know below!

Cheers, Eric

11 Comments

  1. Tonus

    Great news! I think that would be a great addition to the server configuration series you already gave us on your blog.
    Regards

  2. Fabick75

    I didn’t know this piece of software, because all the WAFs that I know of are closed source. I think that having some details on the configuration and the steps to put it in place would be very interesting!

  3. Vladimir Vist

    How would you rate Anubis’ performance? Have you noticed a change in server load?

    • alienbob

      I see a big difference. Far fewer requests end up at the cgit interface now, and most of the actual abuse is being stopped by Anubis. As a result, the responsiveness of the website has improved a lot.
      Some stats taken from Anubis:
      Amount of data received and processed since I activated it 4 days ago: 850 GB
      Number of challenges issued during that time: almost 550K
      Number of validated challenges (meaning Anubis won’t bother that client for a week, by setting a cookie): 390K

      I still see bots pass the Anubis filter, and I can increase the complexity of the challenge if needed, but all scraping bots that pass Anubis are stopped by my Apache bot-filter before they reach cgit.
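      For reference: the “complexity of the challenge” is the number of leading zeroes the proof-of-work hash must have. If I read the Anubis documentation correctly, it can be raised per instance via the DIFFICULTY variable in its environment file; the value below is just an example:

      # In the instance’s environment file, e.g. /etc/anubis/default.env:
      DIFFICULTY=5   # default is 4; every extra digit makes the challenge ~16x more expensive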

      • Vladimir Vist

        Good result. Thanks for sharing the information!

  4. Eeel

    Thanks for this package! It took me some time to figure out, but if I’m not mistaken this doesn’t work in /etc/anubis/default.env:
    SERVE_ROBOTS_TXT=1
    POLICY_FNAME="/etc/anubis/botPolicies.yaml"
    For me the solution was to use the rc.anubis option:
    ANUBIS_OPTS="-policy-fname /etc/anubis/botPolicies.yaml -serve-robots-txt 1"

    The botPolicies.yaml must contain at least one valid bots definition, otherwise Anubis silently loads its built-in default configuration. This one blocks browsers, for testing:

    bots:
      - name: block-browsers
        user_agent_regex: >-
          Mozilla|Opera
        action: DENY

    The GitHub data folder must be in /etc/anubis/ for:
    bots:
      - import: (data)/bots/_deny-pathological.yaml

    Adding logging to rc.anubis is a great help when working on policies:
    $daemon -S -u $command_user -F $pidfile -o /tmp/anubis_${instance}.log -- $command $command_args

    Documentation definitely could help.

    • alienbob

      Thanks Eeel. The suggestion to add a logging option to the rc.anubis script is a good one and I will implement it in the sources, so that the next package update will carry it.

      Indeed, any variable that Anubis can use but that is not explicitly mentioned in the default.env file should be passed in ANUBIS_OPTS, after checking the command-line parameter equivalents.
      I had not yet played with my own policy files, so thanks for the example and description.

      • Eeel

        Thanks for the package update Eric.
        I misunderstood the documentation; the GitHub data folder is not required in /etc/anubis/.

  5. Francisco

    Hi Eric,

    I am interested in your config procedure for this anubis package.

    I have used some of your useful articles for my VM in the cloud.

    I kindly suggest reorganizing these into a “web server series – basic entry level”, as you did with your “Cloud Server Series”, including the following topics I found particularly useful from your site:

    0. Apache set up (several articles from you are very good, basic things like: “Alien Tip: protected apache URL”)
    1. https support using letsencrypt.
    2. cgit setup
    3. fail2ban
    4. novnc for remote desktop access.
    5. Single Sign On using Keycloak for general purpose (adding new apps)
    6. Anubis.

    When you have some time… thanks in advance.

    And many thanks for all your contribution to Slackware.

  6. Anti-Trump

    Hi,

    Thank you for this valuable information.

    But why is this website running an old theme? Have you considered updating?

    Thanks,
    Sandeep

    • alienbob

      Should I care that the theme I use is ‘old’? Nope. As long as I don’t need the ‘new functionality’ that newer themes offer, why fix something that is not broken?
      The reason to say goodbye to the previous theme was a security issue, plus the fact that that theme was unmaintained.
      I did look at Anders’ newer themes and none of them makes this site look anything like the current one. I am really picky about how my websites present themselves, and I checked out a multitude of other free themes as well. So far I have found none that appeals to me.
