Loudwhisper

Basic Security for your Website

The bare minimum to protect your website with Crowdsec

I have shared the last couple of posts from this website on Lemmy and, despite my very small audience, I have noticed that since I (i.e., this blog) stopped being completely invisible and became only mostly invisible (fun fact: on Kagi.com this blog pops up on the first page when searching for a few things), I have started getting a little more junk traffic than usual.

By junk traffic I mean the kind of stuff that starts popping up the moment you get a public IP or create a new DNS record: scanners, web crawlers, bots, and smartasses who launch their tools against everything because they now "use" Kali.

On the very homepage of this website I mention that this blog is a static site, and I go into even more detail in this blog post about the technical setup. Even among static sites it is a minimal one: it has almost nothing at all, no JavaScript, and the whole website can run in an 11MB Docker image, of which 8.5MB are the web-server binary. I am mentioning this because this junk traffic poses essentially no risk to this website. The only way to compromise it is via a vulnerability in the web server: possible, but unlikely.

Despite all this, your setup might be different and your attack surface larger, so all these things might matter for you. I have therefore decided to implement some level of security for my website anyway, in order to write a guide (see a rant about security guides to keep me accountable) that can hopefully be useful to non-security specialists who still run a public server and a few websites.

This post will then cover the following:

This post will not cover the following:

Introduction - The Attack Surface

Before designing our security strategy, we need to understand what we are actually trying to protect, and from whom or what. Since I need to make some assumptions, I will assume that your setup is extremely simple (and kind of similar to mine): you have or rent a Linux box with a public IP, and you have a few websites running on top of it, which you want people to access globally (maybe a blog, your portfolio, etc.). I will also assume that you are not on the NSA watchlist and that no sophisticated attacker or state-sponsored group is after you. If one is, any suggestion in this blog is going to be useless and you should probably avoid the internet completely.

In cases where my assumptions hold, you should pretty much consider that your surface is the following:

Threat Actors and Threats

In terms of the who (threat actors) and what (threats) we are trying to protect ourselves from, I would say it is relatively simple:

Given the actors we are dealing with, the kind of attacks are pretty much:

As we can see, the threats are not sophisticated or particularly dangerous. However, the risk is not 0 and there is no reason not to apply equally basic security measures that can substantially reduce the risk of compromise (or at least of something breaking).

(In retrospect, I see that the above is very close to a threat model. Maybe I will make a follow-up post describing how to build simple threat models. I don't know, but if you are interested let me know, that might motivate me.)

Security Strategy - The Plan

Now that we have a clearer idea of the situation, we can look at a basic plan to follow.

The last two steps of the plan are going to be done using Crowdsec; everything else is going to be done using simple configuration changes.

SSH Hardening

The SSH service is generally a secure service to expose: it is meant to be exposed publicly and its attack surface is fairly limited. However, vulnerabilities still happen, and in fact less than a month ago a serious one was discovered.

This already places you in front of a choice: expose SSH directly or instead expose a VPN service and then expose SSH only through the VPN.

VPN services like WireGuard have historically had far fewer vulnerabilities discovered, their attack surface is smaller, and they can be considered more secure to expose than SSH (it also means you are now using an additional layer of authentication: the VPN credentials). However, this also means that you will only be able to access your server if you can establish the VPN connection, plus you need to maintain another service (a very low-maintenance one in this case).

Whatever you decide, consider the VPN connection an extra layer of security, not a replacement for any of the hardening measures we are going to discuss now.

Configuration Hardening

All the changes discussed here should be made to the /etc/ssh/sshd_config file.

There are more settings, and if you feel like it, you can simply follow a best-practice guide, since this is nothing creative and it's mostly about following the book. For example:

Some people also swear by other measures, like changing the SSH port to something else. Most people end up using 2222 because it is easy to remember. This is borderline useless, as you can see for yourself. We are going to implement brute-force protection later anyway, so in my opinion this is not worth the extra hassle of choosing and remembering a new port and configuring every client accordingly.

There are also many more options related to cryptographic settings. While these are important in general, none of the attackers in our current threat model is ever going to exploit weak Diffie-Hellman parameters or things like this, so they don't matter for our threat actors.

If you want to copy paste, these are all the options above in a single block:

PasswordAuthentication no
PubkeyAuthentication yes
KerberosAuthentication no
GSSAPIAuthentication no
ChallengeResponseAuthentication no
X11Forwarding no
PermitRootLogin no

Slap these in your /etc/ssh/sshd_config file, check that the configuration is valid with sshd -t, and then restart the service. Your SSH service is now protected against 90% of whatever you will ever face.
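If you want a second pair of eyes on the file, here is a minimal sketch of a checker for the options listed above. It is not an official tool: it assumes a flat config (no Match blocks or Include files), relies on the sshd_config rule that the first value obtained for a keyword wins, and the names EXPECTED, effective_settings and audit are made up for this example.

```python
# Sketch only: compares a config text against the hardening options from this post.
EXPECTED = {
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
    "kerberosauthentication": "no",
    "gssapiauthentication": "no",
    "challengeresponseauthentication": "no",
    "x11forwarding": "no",
    "permitrootlogin": "no",
}


def effective_settings(config_text):
    """Return {keyword: value}, keeping the first occurrence of each keyword."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line or keyword without a value
        keyword, value = parts[0].lower(), parts[1].strip().lower()
        settings.setdefault(keyword, value)  # keywords are case-insensitive
    return settings


def audit(config_text):
    """Return the expected options that are missing or set to another value."""
    actual = effective_settings(config_text)
    return {
        keyword: actual.get(keyword)
        for keyword, wanted in EXPECTED.items()
        if actual.get(keyword) != wanted
    }


if __name__ == "__main__":
    with open("/etc/ssh/sshd_config") as handle:
        for keyword, found in audit(handle.read()).items():
            print(f"{keyword}: expected {EXPECTED[keyword]!r}, found {found!r}")
```

An empty result means all seven options are set as expected. A misspelled keyword simply shows up as missing, which is the kind of mistake that is easy to overlook when copy-pasting.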

Protecting from Dumb Attackers with Crowdsec

Until recently I used fail2ban to do most of the job that I will be doing with Crowdsec, so I want to start with why you would use one or the other.

From my point of view, the pros/cons are the following:

Given the above, I decided to go with Crowdsec, so that's what I am going to cover here.

Crowdsec Basics - Architecture

Do you remember when I mentioned Crowdsec having a more modular architecture? Well, this also means it has a slightly steeper learning curve. Thankfully it also has excellent documentation, although not too many end-to-end examples (which is partially the reason why I decided to write all of this).

Crowdsec is composed of a bunch of pieces, which I will try to recap here in the way I understood them:

Following the installation instructions, most of the Crowdsec configuration lives in /etc/crowdsec. Here we can find:

tree -L 1
.
├── acquis.d
├── acquis.yaml
├── bouncers
├── collections
├── config.yaml
├── console
├── console.yaml
├── contexts
├── hub
├── local_api_credentials.yaml
├── notifications
├── online_api_credentials.yaml
├── parsers
├── patterns
├── profiles.yaml
├── scenarios
└── simulation.yaml

Note that there is an installation wizard that might create some of the stuff for you, based on your answers.

Looking at the above:

Crowdsec in Action - Traefik

In order to better illustrate all the different moving parts, I will run through the scenario that I had to implement myself: using Crowdsec against web-related attacks and enumeration attempts when Traefik is used as a reverse proxy.

This scenario is going to be very similar to what most people have, and the situation doesn't change much if, say, you are using nginx or Caddy as a reverse proxy.

By default there are no rules related to Traefik, which means when you run the configuration wizard Traefik won't be presented as an option. However, there is a "collection" for Traefik which is officially maintained.

In general, considering what we have discussed above, we will need:

The collection above in fact contains exactly what's needed:

The acquisition can be configured manually. In my case I have added a file in the acquis.d directory:

cat acquis.d/traefik.yaml 
filenames:
  - /$(PATH_TO_TRAEFIK)/traefik/logs/traefik.log
labels:
  type: traefik

Curating Scenarios

If you opt for using the collection to install everything related to Traefik (instead of installing the parser separately and then manually installing scenarios), you will end up with a number of scenario configurations in the scenarios directory.

Each scenario is pretty self-explanatory, but I will take one as a model so that you can create additional ones that fit your use-case.

# http-wordpress-scan.yaml
# Doc at https://doc.crowdsec.net/docs/scenarios/format
# Use a leaky bucket for the counter in this scenario. Other
# options are simple counters, direct triggers and more (see doc).
type: leaky
name: crowdsecurity/http-wordpress-scan
description: "Detect WordPress scan: vuln hunting"
# The filter only matches events where:
#  - the meta.service tag is set to http
#  - the log_type is set to http_access-log or http_error-log
#  - the status of the request (extracted by the parser) is 403 or 404
#  - the path of the URL contains the string "/wp-"
#  - the path of the URL ends with ".php"
filter: |
  evt.Meta.service == 'http' and
  evt.Meta.log_type in ['http_access-log', 'http_error-log'] and
  evt.Meta.http_status in ['404', '403'] and
  Lower(evt.Meta.http_path) contains "/wp-" and
  Lower(evt.Meta.http_path) endsWith ".php"
# Events are grouped by source IP (i.e., the bucket is partitioned per source IP)
groupby: evt.Meta.source_ip
# Deduplicate event in a bucket (partition) based on the path.
# 10 log lines for the same "wp-login.php" URL and from the same IP will only 
# enter once in the bucket
distinct: evt.Meta.http_path
# Bucket capacity. Once the bucket holds more events than this, it overflows and the alert triggers
capacity: 3
# The leak rate: one event leaks out of the bucket every 10 seconds.
leakspeed: "10s"
# Silence a bucket triggering for this amount of time. This is used to reduce
# spam from triggered bucket (e.g., the same IP overflowing the bucket 10 times
# in a few seconds). The IP should be anyway banned, so there is no value in
# re-triggering the alert.
blackhole: 5m
# Add context to alerts
labels:
  # Indicate that the offending IP should be banned
  remediation: true
  # ATT&CK classification
  classification:
    - attack.T1595
  # A specific behavior. Part of a fixed taxonomy
  # See https://github.com/crowdsecurity/hub/blob/scenario_taxonomy/taxonomy/behaviors.json
  behavior: "http:scan"
  label: "WordPress Vuln Hunting"
  # 0→3. 0 means the attacker could not be spoofing the attack, 3 means spoofable.
  spoofable: 0
  service: wordpress
  # 0→3. Confidence this scenario does not trigger false positives.
  # 0 means false-positive prone, 3 means high confidence in the alert
  confidence: 3
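If the interplay between groupby, distinct, capacity and leakspeed is not obvious, a toy model helps. The following is a simplified sketch of a leaky bucket, not Crowdsec's implementation: it assumes the bucket overflows when it holds more than capacity events, that one event leaks out per leakspeed interval, and that the distinct memory is reset once the bucket is empty. It ignores blackhole entirely, and the class and method names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    level: float = 0.0        # how many events are currently in the bucket
    last_seen: float = 0.0    # timestamp of the last event, in seconds
    seen: set = field(default_factory=set)  # values already counted (distinct)


class LeakyScenario:
    def __init__(self, capacity, leakspeed):
        self.capacity = capacity    # e.g. 3
        self.leakspeed = leakspeed  # seconds per leaked event, e.g. 10.0
        self.buckets = {}           # one bucket per groupby key (source IP)

    def pour(self, timestamp, source_ip, http_path):
        """Pour one event; return True if the bucket for source_ip overflows."""
        bucket = self.buckets.setdefault(source_ip, Bucket(last_seen=timestamp))
        # Leak whatever drained since the previous event.
        elapsed = timestamp - bucket.last_seen
        bucket.level = max(0.0, bucket.level - elapsed / self.leakspeed)
        bucket.last_seen = timestamp
        if bucket.level == 0.0:
            bucket.seen.clear()     # empty bucket: forget old paths
        if http_path in bucket.seen:
            return False            # distinct: same path is only counted once
        bucket.seen.add(http_path)
        bucket.level += 1
        if bucket.level > self.capacity:
            del self.buckets[source_ip]  # start from scratch after the alert
            return True
        return False


scenario = LeakyScenario(capacity=3, leakspeed=10.0)
paths = ["/wp-login.php", "/wp-admin/setup.php", "/wp-content/a.php", "/wp-includes/b.php"]
# One request per second from the same IP, each for a different path.
print([scenario.pour(t, "203.0.113.7", p) for t, p in enumerate(paths)])
# [False, False, False, True]
```

With these numbers, a scanner probing four different WordPress paths within a few seconds triggers on the fourth request, while a client hitting the same missing URL over and over, or one probing a new path only every 20 seconds, never fills the bucket.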

The configuration format for the scenarios also allows fetching lists remotely, for example:

# [...]
  any(File("trendy_cves.txt"), { Lower(evt.Meta.http_path) contains #})
# [...]
data:
  - source_url: https://hub-data.crowdsec.net/web/trendy_cves.txt
    dest_file: trendy_cves.txt
    type: string

In this case a file hosted by crowdsec is used to source paths that are indicative of recent CVEs being exploited.
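The any(...) expression reads a little strangely at first: the # stands for the current element of the list, so the filter is true if the lowercased path contains at least one line of the file. In Python terms it is roughly the following sketch, where the pattern list is invented and does not reflect the real contents of trendy_cves.txt.

```python
def load_patterns(text):
    """One pattern per line, ignoring empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def matches_any(http_path, patterns):
    """Equivalent of: any(File(...), { Lower(evt.Meta.http_path) contains # })"""
    lowered = http_path.lower()
    # Only the path is lowercased, so patterns are expected to be lowercase too.
    return any(pattern in lowered for pattern in patterns)


# Hypothetical list content, for illustration only.
patterns = load_patterns("/cgi-bin/example-vuln\n\n/api/v1/example-exploit\n")
print(matches_any("/CGI-BIN/example-vuln?cmd=id", patterns))
print(matches_any("/index.html", patterns))
```

The first call matches because the comparison happens on the lowercased path, the second one does not match anything in the list.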

Selecting Scenarios

For the use-case considered in this post, I would say that the basic scenarios included in the base-http-scenarios collection are already sufficient. Obviously you might be running a specific application that has its own set of relevant scenarios. In the hub there are quite a lot of community-provided configurations for a number of services. If there is none for yours, you might want to create your own.

Considering the threats we have identified above, it's important to catch:

As a bonus, there are also a few more rules that block known bad User-Agents. Most of these rules mainly reduce noise by blocking IPs that misbehave. Keep in mind that these rules are public, so feel free to change the time parameters to make it harder for attackers to throttle their traffic to stay right under the rule threshold. Also remember that this is not a fully fledged WAF, although Crowdsec also has some WAF capabilities for virtual patching (i.e., building a rule that blocks exploits rather than fixing the vulnerability) and uses ModSecurity rules (standard in the industry, but awful to work with, sorry!).

What about SSH?

We discussed SSH at length previously, so what about SSH brute-force and other attacks?

Well, this is quite easy to do with Crowdsec because everything is pre-made, so it's boring to discuss. However, it's important to have the ssh-bf and ssh-slow-bf scenarios and, since it's recent, also the ssh-cve-2024-6387 one (regarding the last, you should update OpenSSH rather than counting on this rule to protect you).

All of this is done automatically if you run the configuration wizard and select SSH as one of the possible targets. Alternatively, you can just install this collection.

Bouncing IPs

The (almost) final piece of the puzzle requires configuring the type of actions that you want to have taken in response to any of your scenarios triggering.

In my case, I chose the firewall bouncer, which simply adds the offending IPs (remember, the ones you encounter and all the ones gathered by the rest of the Crowdsec users!) to a blocklist in the host firewall.

Other options include Cloudflare Firewall, HAProxy ACLs, etc. I am probably going to have some fun writing a custom bouncer that adds IPs to Hetzner firewalls, even if I absolutely do not need it.

Installing the pre-made components is quite straightforward, but observing it in action is quite interesting! For example, using the iptables one, crowdsec curates an ipset, where IPs are continually added and removed. Note that these IPs are blocked fully!

 pkts bytes target     prot opt in     out     source               destination
  277 16352 DROP       0    --  *      *       0.0.0.0/0            0.0.0.0/0            match-set crowdsec-blacklists src

This means that if you block your own IP by mistake while testing a rule, you will also get kicked out of an established SSH session (ask me how I know!). If you don't want to wait for the ban to expire, you can use the CLI to unban yourself, for example with cscli decisions delete --ip followed by your address (you will need a console, for example through your provider's website).

To inspect the current banned IPs you can check the current "decisions":

cscli decisions list
╭────────┬──────────┬───────────────────┬───────────────────────────────────┬────────┬─────────┬───────────────────────┬────────┬────────────────────┬──────────╮
│   ID   │  Source  │    Scope:Value    │               Reason              │ Action │ Country │           AS          │ Events │     expiration     │ Alert ID │
├────────┼──────────┼───────────────────┼───────────────────────────────────┼────────┼─────────┼───────────────────────┼────────┼────────────────────┼──────────┤
│ 555412 │ crowdsec │ Ip:206.168.34.218 │ crowdsecurity/http-bad-user-agent │ ban    │ US      │ 398324 CENSYS-ARIN-01 │ 2      │ 3h49m22.407970114s │ 449      │
│ 540411 │ crowdsec │ Ip:107.173.200.14 │ crowdsecurity/http-bad-user-agent │ ban    │ US      │ 36352 AS-COLOCROSSING │ 2      │ 3h26m27.50365056s  │ 447      │
╰────────┴──────────┴───────────────────┴───────────────────────────────────┴────────┴─────────┴───────────────────────┴────────┴────────────────────┴──────────╯

These are the ones that were triggered on my own servers. However, this is not the whole list of banned IPs: remember that there are also the crowdsourced ones!

sudo ipset list crowdsec-blacklists | awk '{print $1}' | tail -n +9 | sort -u | wc -l
27215

We can also see the full list with cscli decisions list -a.

Keeping an Eye on Crowdsec

You can use cscli metrics to look at how rules and parsers are performing. These metrics are also exposed, see the main config.yaml:

prometheus:
  enabled: true
  level: full
  listen_addr: 127.0.0.1
  listen_port: 6060
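If you do not run a Prometheus server, you can still peek at that endpoint by hand. The sketch below assumes the metrics are served in the usual Prometheus text format under /metrics on the address and port configured above. The parser is deliberately naive (one sample per line, no timestamps), and the metric names in SAMPLE are invented so that the example runs without a live Crowdsec instance.

```python
from urllib.request import urlopen

# Assumption: address and port from the prometheus section of config.yaml.
METRICS_URL = "http://127.0.0.1:6060/metrics"


def parse_metrics(text):
    """Parse Prometheus text format into {"name{labels}": value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore anything this naive parser does not understand
    return samples


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Fetch and parse the metrics of a locally running instance."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Invented sample, only to show the shape of the data.
SAMPLE = '''# HELP demo_events_total A made-up counter.
# TYPE demo_events_total counter
demo_events_total{source="file:/var/log/demo.log"} 12
demo_events_total{source="file:/var/log/other.log"} 3
'''
parsed = parse_metrics(SAMPLE)
print(sum(parsed.values()))  # 15.0
```

On the server itself, calling fetch_metrics() returns the same kind of dictionary with the real series, which is enough for a quick cron-based sanity check.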

It's generally important to monitor how the rules are performing, and to add more scenarios over time, for example if a serious vulnerability is discovered in a specific technology you use.

Conclusion

This post ended up being a bit more generic than I expected, but that's the result of considering the basic scenarios that are relevant for most of us, and also of the fact that Crowdsec already covers many scenarios automatically, so there is not much that we need to handcraft unless we need to cover more corner cases.

I hope that it was still a useful introduction to the tool and to the general approach I recommend for implementing the bare minimum level of security for your websites.


If you find an error, want to propose a correction, or simply have any kind of comment or observation, feel free to reach out via email or via Mastodon. If you have any suggestions about other topics to cover, or you would like me to elaborate on something I have already written about, let me know!

Tags: #crowdsec #SSH #selfhosting #security Categories: #tech #guide