CIS Benchmark Hardening on Ubuntu 24.04 — Taking Nine Servers from 49% to 73% Compliance

A stock Ubuntu 24.04 server in our cluster scores roughly 49% against the CIS Ubuntu Linux benchmark. That means about half the security controls an industry-standard framework recommends are missing out of the box. We took nine production servers (six web nodes and three database nodes spread across Sydney, Brisbane, and Melbourne) and systematically hardened every one of them to 73%, the practical ceiling for our cloud-hosted VPS infrastructure. Here is what CIS benchmarks are, how we tested compliance automatically with Wazuh SCA, what we actually changed on each server, and what we learned doing it at scale.

What Are CIS Benchmarks?

The Center for Internet Security (CIS) publishes detailed hardening guides for every major operating system, cloud platform, and application. The CIS Ubuntu Linux 24.04 Benchmark contains over 180 individual security checks covering everything from SSH configuration to kernel parameters to filesystem permissions.

Each check is a specific, testable recommendation. Not vague advice like “secure your SSH” — concrete rules like “ensure MaxAuthTries is set to 4 or fewer” or “ensure IPv4 forwarding is disabled.” They are divided into two levels:

  • Level 1 — Essential security settings that can be applied to any server without breaking functionality. These are the baseline.
  • Level 2 — Deeper hardening for high-security environments that may restrict some functionality in exchange for stronger defence.

CIS benchmarks are not theoretical. They are the basis for compliance frameworks used by governments, financial institutions, and enterprises worldwide. When an auditor asks “is your infrastructure hardened?”, CIS compliance is the standard answer.

Why Not Just Follow a Hardening Blog Post?

Random hardening guides cover a handful of settings. CIS benchmarks are comprehensive — 182 checks for Ubuntu 24.04 alone. They cover areas most guides miss entirely: audit logging configuration, PAM authentication lockout policies, kernel module restrictions, cron permission hardening, and filesystem mount options. More importantly, every recommendation has a testable rule, which means you can verify compliance automatically rather than hoping you remembered everything.

What Is SCA Testing?

Security Configuration Assessment (SCA) is automated compliance scanning. Instead of manually checking 182 settings on each server, an SCA engine reads a policy file that describes every check as a machine-readable rule, runs those rules against the live system, and reports pass, fail, or not applicable for each one.

We use Wazuh as our SIEM (Security Information and Event Management) platform. Wazuh includes a built-in SCA engine that runs on every agent. Each agent loads the CIS benchmark policy file, evaluates every rule locally, and reports results back to the central Wazuh manager. This gives us a single dashboard showing compliance scores across every server in the cluster.

The SCA rules use three types of checks:

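| Type      | Syntax                 | What It Does                                          |
|-----------|------------------------|-------------------------------------------------------|
| Command   | c:sshd -T              | Runs a command and matches output against a regex     |
| File      | f:/etc/ssh/sshd_config | Reads a file and checks for specific content          |
| Directory | d:/etc/sudoers.d       | Scans all files in a directory for matching content   |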

For example, the check for SSH MaxAuthTries runs sshd -T, captures the output, and verifies that maxauthtries is set to 4 or fewer. If the value is 6 (the Ubuntu default), the check fails. If it is 4 or fewer, it passes. Every check works this way — objective, repeatable, and automated.
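The logic of that check can be sketched in a few lines of Python. This is only an illustration of the idea, not Wazuh's rule engine; the function name and the sample output strings are made up for the example.

```python
import re

def max_auth_tries_ok(sshd_t_output: str, limit: int = 4) -> bool:
    """Return True if `sshd -T` output reports maxauthtries <= limit."""
    # sshd -T prints the effective config as lowercase "keyword value" lines.
    match = re.search(r"^maxauthtries\s+(\d+)\s*$", sshd_t_output, re.MULTILINE)
    if match is None:
        return False  # treat a missing setting as a failure
    return int(match.group(1)) <= limit

# Illustrative snippets of `sshd -T` output, not captured from a real server.
sample_default = "port 22\nmaxauthtries 6\nloglevel INFO\n"
sample_hardened = "port 22\nmaxauthtries 4\nloglevel VERBOSE\n"
print(max_auth_tries_ok(sample_default))   # False
print(max_auth_tries_ok(sample_hardened))  # True
```

On a live system the input would come from running `sshd -T` as root, for example via `subprocess.run(["sshd", "-T"], capture_output=True, text=True).stdout`.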

Starting Point: 49% Across Nine Servers

Before any hardening, every server in our HA WordPress cluster scored 49% on the CIS benchmark — 89 checks passing, 92 failing, 1 invalid. That is a stock Ubuntu 24.04 server on BinaryLane with our application stack deployed.
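The percentage follows directly from those counts. The sketch below assumes the score is the number of passing checks over all 182 checks, rounded to the nearest whole number; Wazuh's own formula may treat invalid or not-applicable checks differently, so read this as arithmetic on the figures above rather than a description of the dashboard.

```python
def compliance_percent(passed: int, failed: int, invalid: int = 0) -> int:
    """Share of all checks that pass, rounded to a whole percentage."""
    total = passed + failed + invalid
    return round(100 * passed / total)

print(compliance_percent(89, 92, 1))  # 49
```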

Ninety-two failing checks sounds alarming, but it is normal. Ubuntu ships with sensible defaults for general-purpose use, not for a hardened production environment. The CIS benchmark represents what security should look like, and closing that gap is the work of hardening.

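| Server            | Role     | Region    | Before |
|-------------------|----------|-----------|--------|
| wp-web-1-syd      | Web      | Sydney    | 49%    |
| wp-web-2-syd      | Web      | Sydney    | 49%    |
| wp-web-3-bne      | Web      | Brisbane  | 49%    |
| wp-web-4-bne      | Web      | Brisbane  | 49%    |
| wp-web-5-mel      | Web      | Melbourne | 49%    |
| wp-web-6-mel      | Web      | Melbourne | 49%    |
| wp-db-primary     | Database | Sydney    | 49%    |
| wp-db-replica     | Database | Brisbane  | 49%    |
| wp-db-replica-mel | Database | Melbourne | 49%    |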

What We Hardened

We developed a comprehensive hardening script that addresses every fixable CIS check. Here is what it covers, grouped by domain:

SSH Hardening (14 Checks)

SSH is the front door to every server. The CIS benchmark is thorough about locking it down:

  • LogLevel VERBOSE — log detailed authentication information including key fingerprints
  • MaxAuthTries 4 — limit authentication attempts per connection to slow brute force attacks
  • PermitRootLogin prohibit-password — allow root login only with SSH keys, never passwords
  • Strong cryptography only — restrict ciphers to AES-GCM and AES-CTR, MACs to SHA-512 and SHA-256 ETM, key exchange to Curve25519 and DH Group 16/18
  • Disable unnecessary features — X11 forwarding, TCP forwarding, agent forwarding, GSSAPI, Kerberos
  • Session limits — 15-second keepalive interval, 60-second login grace time, 10:30:60 rate limiting on new connections
  • Access control — AllowUsers restricted to root only (these are infrastructure servers, not shared hosts)
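As a rough sketch, the settings above map onto sshd_config directives along these lines. The directive names are standard OpenSSH keywords, but the exact cipher, MAC, and key exchange lists are our reading of the families named above, not a copy of the hardening script, and the helper name is hypothetical.

```python
# Assumed mapping of the bullets above to sshd_config directives.
SSH_SETTINGS = {
    "LogLevel": "VERBOSE",
    "MaxAuthTries": "4",
    "PermitRootLogin": "prohibit-password",
    "Ciphers": "aes256-gcm@openssh.com,aes128-gcm@openssh.com,aes256-ctr,aes192-ctr,aes128-ctr",
    "MACs": "hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com",
    "KexAlgorithms": "curve25519-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512",
    "X11Forwarding": "no",
    "AllowTcpForwarding": "no",
    "AllowAgentForwarding": "no",
    "GSSAPIAuthentication": "no",
    "KerberosAuthentication": "no",
    "ClientAliveInterval": "15",
    "LoginGraceTime": "60",
    "MaxStartups": "10:30:60",
    "AllowUsers": "root",
}

def render_sshd_config(settings: dict) -> str:
    """Render one 'Directive value' line per setting, in insertion order."""
    return "".join(f"{key} {value}\n" for key, value in settings.items())

print(render_sshd_config(SSH_SETTINGS), end="")
```

Keeping the values in one mapping makes it easy to render the same fragment for every server and to diff it against what `sshd -T` reports afterwards.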

Kernel Hardening (16 Checks)

The Linux kernel has dozens of tuneable parameters that affect network security. Ubuntu ships most of them in their permissive defaults:

  • Disable IP forwarding — these are endpoints, not routers
  • Disable ICMP redirects — prevent route poisoning attacks on all interfaces
  • Enable SYN cookies — protect against SYN flood denial of service
  • Enable reverse path filtering — drop packets with spoofed source addresses
  • Log martian packets — record packets with impossible source addresses
  • Restrict dmesg and ptrace — prevent unprivileged users from reading kernel messages or debugging other processes
  • ASLR enabled — randomise memory layout to make exploitation harder
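One way to verify these parameters is to read them back from /proc/sys and compare against the expected values. The sketch below assumes a representative subset of sysctl keys for the bullets above; the actual hardening script may set more (for example the `default` and IPv6 variants), and the function names are hypothetical.

```python
from pathlib import Path

# Standard Linux sysctl keys; the exact set used in the script is an assumption.
EXPECTED_SYSCTLS = {
    "net.ipv4.ip_forward": "0",
    "net.ipv4.conf.all.accept_redirects": "0",
    "net.ipv4.conf.all.send_redirects": "0",
    "net.ipv4.tcp_syncookies": "1",
    "net.ipv4.conf.all.rp_filter": "1",
    "net.ipv4.conf.all.log_martians": "1",
    "kernel.dmesg_restrict": "1",
    "kernel.yama.ptrace_scope": "1",
    "kernel.randomize_va_space": "2",
}

def read_sysctl(key: str) -> str:
    """Read a sysctl value from /proc/sys (dots in the key become slashes)."""
    return Path("/proc/sys", *key.split(".")).read_text().strip()

def find_mismatches(expected: dict, reader=read_sysctl) -> dict:
    """Return {key: actual} for every sysctl whose value differs from expected."""
    mismatches = {}
    for key, want in expected.items():
        actual = reader(key)
        if actual != want:
            mismatches[key] = actual
    return mismatches
```

Calling `find_mismatches(EXPECTED_SYSCTLS)` on a hardened server should return an empty dict; the `reader` parameter exists so the comparison can be exercised without touching the live kernel.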

Filesystem and Module Restrictions (8 Checks)

  • Disable unused filesystem modules — cramfs, hfs, hfsplus, squashfs, udf, and USB storage cannot be loaded
  • Core dumps disabled — prevent sensitive memory contents from being written to disk
  • /tmp mounted with noexec — prevent execution of files from temporary directories
  • /dev/shm hardened — nodev, nosuid, noexec on shared memory
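The mount option requirements can be checked by parsing /proc/mounts. This is a minimal sketch with a hypothetical helper name; the sample line is illustrative rather than taken from one of the servers.

```python
REQUIRED_OPTIONS = {"nodev", "nosuid", "noexec"}

def missing_mount_options(mounts_text: str, mount_point: str,
                          required=REQUIRED_OPTIONS) -> set:
    """Return required options absent from mount_point (all of them if it is not mounted)."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))  # a later entry overrides an earlier one
    if found is None:
        return set(required)
    return set(required) - found

sample = "tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0\n"
print(missing_mount_options(sample, "/dev/shm"))  # {'noexec'}
```

On a real host the first argument would be the contents of /proc/mounts, and the same function works for /tmp.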

Authentication and Password Policy (12 Checks)

  • Password quality enforcement — minimum 14 characters, all four character classes required (uppercase, lowercase, digit, special)
  • Account lockout — lock accounts after 4 failed attempts within 15 minutes, auto-unlock after 10 minutes
  • Password aging — maximum 365 days, minimum 1 day between changes, 7-day warning before expiry
  • Inactive account lockout — accounts unused for 29 days are automatically disabled
  • Password reuse prevention — last 5 passwords are remembered and cannot be reused
  • su restriction — only members of a dedicated group can use su to switch users
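As an example of how one of these policies can be verified, the sketch below checks the password aging values in /etc/login.defs against the limits from the list above. The function names are hypothetical, and it only covers the defaults for new accounts; existing accounts carry their own aging values.

```python
# Limits taken from the password aging bullet above.
AGING_LIMITS = {"PASS_MAX_DAYS": 365, "PASS_MIN_DAYS": 1, "PASS_WARN_AGE": 7}

def parse_login_defs(text: str) -> dict:
    """Parse 'KEY value' lines from login.defs, skipping comments and blanks."""
    values = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) == 2:
            values[parts[0]] = parts[1]
    return values

def aging_failures(text: str) -> list:
    """Return the names of aging settings that are missing or outside the limits."""
    values = parse_login_defs(text)
    failures = []
    for key, limit in AGING_LIMITS.items():
        raw = values.get(key, "")
        if not raw.isdigit():
            failures.append(key)  # missing, negative, or not a number
            continue
        value = int(raw)
        # The maximum age is an upper bound; the other two are lower bounds.
        ok = 1 <= value <= limit if key == "PASS_MAX_DAYS" else value >= limit
        if not ok:
            failures.append(key)
    return failures

stock = "PASS_MAX_DAYS\t99999\nPASS_MIN_DAYS\t0\nPASS_WARN_AGE\t7\n"
print(aging_failures(stock))  # ['PASS_MAX_DAYS', 'PASS_MIN_DAYS']
```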

Audit Logging (11 Checks)

The Linux audit framework records security-relevant system events. CIS requires comprehensive audit rules covering:

  • Time changes — any modification to system time (adjtimex, settimeofday, clock_settime)
  • Identity changes — modifications to /etc/passwd, /etc/shadow, /etc/group
  • Network configuration — hostname changes, modifications to /etc/hosts, /etc/network
  • AppArmor policy changes — modifications to /etc/apparmor and /etc/apparmor.d
  • Privilege escalation — changes to /etc/sudoers, user emulation via execve with different effective UIDs
  • Session tracking — login/logout events via utmp, wtmp, btmp, lastlog, and faillock
  • Immutable audit rules — once loaded at boot, audit rules cannot be modified without a reboot
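The file watches in that list can be expressed as audit rules of the form `-w <path> -p wa -k <key>`. The sketch below generates such rules for the paths named above; the key labels and the helper name are illustrative, and the syscall-based rules (time changes, execve tracking) are left out for brevity.

```python
# Watched paths come from the list above; the key labels are illustrative.
WATCHES = {
    "identity": ["/etc/passwd", "/etc/shadow", "/etc/group"],
    "network": ["/etc/hosts", "/etc/network"],
    "apparmor": ["/etc/apparmor", "/etc/apparmor.d"],
    "sudoers": ["/etc/sudoers"],
}

def render_audit_rules(watches: dict, immutable: bool = True) -> str:
    """Render one watch rule per path, optionally ending with the immutable flag."""
    lines = []
    for key, paths in watches.items():
        for path in paths:
            # -w watches a path, -p wa records writes and attribute changes,
            # -k tags matching events so they can be searched later.
            lines.append(f"-w {path} -p wa -k {key}")
    if immutable:
        lines.append("-e 2")  # locks the ruleset until reboot, so it goes last
    return "\n".join(lines) + "\n"

print(render_audit_rules(WATCHES), end="")
```

Note that the paths are written without trailing slashes, which matches how the loaded rules are reported back.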

System Services and Packages (8 Checks)

  • Chrony for NTP — replace systemd-timesyncd with chrony for more robust time synchronisation
  • Apport disabled — crash reporting service sends data externally and is not needed on servers
  • Telnet and rsync removed — insecure remote access and unencrypted file transfer utilities purged
  • systemd-journal-remote installed and disabled — CIS requires it to be present but not active unless configured
  • Journald configured — persistent storage, compression enabled, forwarding to syslog
  • rsyslog file permissions — log files created with mode 0640 by default

Miscellaneous Hardening (5 Checks)

  • GRUB bootloader — audit enabled at boot, AppArmor enforced, audit backlog buffer sized for busy systems
  • Session timeout — idle shell sessions terminated after 15 minutes
  • Cron restricted — only explicitly allowed users can create scheduled jobs
  • Sudo logging — all sudo commands logged to a dedicated file for audit trail
  • File permissions — correct ownership and modes on /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow

The 73% Ceiling: What Cannot Be Fixed

After applying all possible fixes, every server scores 73%. The remaining 47 failing checks (48 on servers where an rsync policy bug adds one more, as explained in the results below) are structurally unfixable on BinaryLane cloud VPS infrastructure. That is not because we are cutting corners, but because the CIS benchmark assumes a physical server with full control over partitioning, firewall framework choice, and boot process.

| Category | Failing Checks | Why |
|----------|----------------|-----|
| Partition layout | 18 | CIS wants separate partitions for /var, /var/tmp, /var/log, /home with mount options. BinaryLane provisions a single root partition — repartitioning would require data migration and is not supported. |
| Firewall framework conflicts | 13 | CIS checks all three firewall frameworks (ufw, nftables, iptables) simultaneously. We use nftables exclusively. The ufw and iptables checks fail because those tools are not installed. |
| SCA policy bugs | 5 | The CIS policy adapted from Ubuntu 22.04 has regex bugs that make certain checks impossible to pass on 24.04 regardless of configuration. |
| NTP implementation conflicts | 3 | CIS checks all three NTP implementations. We use chrony. The systemd-timesyncd and ntpd checks fail because they are not installed. |
| AppArmor profiles | 2 | CIS wants all AppArmor profiles in enforce mode. Blanket enforcement breaks SSH and other critical services — we enforce selectively. |
| AIDE not installed | 2 | AIDE (file integrity checker) duplicates functionality already provided by Wazuh FIM. Adding it means maintaining two integrity monitoring systems. |
| Bootloader password | 1 | GRUB passwords prevent remote recovery via BinaryLane’s VNC console — the risk outweighs the benefit for cloud VPS. |
| SSH root login | 1 | CIS wants PermitRootLogin set to no. BinaryLane servers are root-only with key authentication — disabling root login locks out SSH entirely. |
| PAM hashing conflict | 1 | Two CIS checks contradict each other — one requires yescrypt in the PAM line, the other requires no hashing keyword. We keep yescrypt. |
| sshd_config permissions | 1 | CIS wants chmod 600. But the Wazuh agent needs to read sshd_config to evaluate 13 SSH checks. We use 640 — sacrificing 1 check to keep 13 passing. |

73% is the real-world maximum for this infrastructure. Every fixable check is fixed. Every unfixable check has a documented technical reason. That documentation matters for audits — it shows you understand the benchmark, made deliberate decisions, and can justify every exception.
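The documented exceptions can be tallied to confirm they account for every remaining failure. The counts below are copied from the table above; the score calculation assumes passing checks over all 182 checks, rounded to a whole number.

```python
EXCEPTIONS = {
    "Partition layout": 18,
    "Firewall framework conflicts": 13,
    "SCA policy bugs": 5,
    "NTP implementation conflicts": 3,
    "AppArmor profiles": 2,
    "AIDE not installed": 2,
    "Bootloader password": 1,
    "SSH root login": 1,
    "PAM hashing conflict": 1,
    "sshd_config permissions": 1,
}
TOTAL_CHECKS = 182
PASSING = 133

documented_failures = sum(EXCEPTIONS.values())
score = round(100 * PASSING / TOTAL_CHECKS)
print(documented_failures)  # 47
print(score)                # 73
```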

The VPC-Only Challenge: Database Server Hardening

Our six web servers have internet access and can install packages directly. The three database servers sit inside a private VPC with no internet connectivity — by design. They can only communicate with other servers on the 10.241.0.0/16 private network.

This means apt-get install does not work. Every package needed for CIS hardening had to be downloaded on the jumpbox (which has internet), transferred to each database server via SCP over the VPC, and installed with dpkg locally. That includes the packages themselves and all their dependencies:

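| Package                | Dependencies Required                                                              |
|------------------------|------------------------------------------------------------------------------------|
| chrony                 | tzdata-legacy (and systemd-timesyncd must be removed first — they conflict)        |
| libpam-pwquality       | libcrack2, cracklib-runtime, libpwquality-common, libpwquality1                    |
| systemd-journal-remote | libmicrohttpd12t64, libsystemd-shared (version must match)                         |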

Dependency resolution that apt handles automatically becomes manual work on air-gapped servers. Each missing dependency only reveals itself when dpkg fails with “dependency problems — leaving unconfigured.” You fix one, hit the next, fix that, and eventually the chain resolves. We cached all required .deb files on the jumpbox so subsequent servers could be hardened in a single pass.
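Once the dependency chain is known, the install order can be computed instead of rediscovered by trial and error. A minimal sketch using Python's standard-library `graphlib`, with the package-to-dependency mapping taken from the table above; any dependencies between the listed dependencies themselves are not modelled here, and the function name is hypothetical.

```python
from graphlib import TopologicalSorter

# Package -> packages that must be installed first (from the table above).
DEPENDENCIES = {
    "chrony": {"tzdata-legacy"},
    "libpam-pwquality": {"libcrack2", "cracklib-runtime",
                         "libpwquality-common", "libpwquality1"},
    "systemd-journal-remote": {"libmicrohttpd12t64", "libsystemd-shared"},
}

def install_order(dependencies: dict) -> list:
    """Return packages ordered so every dependency precedes its dependant."""
    return list(TopologicalSorter(dependencies).static_order())

order = install_order(DEPENDENCIES)
print(order)
```

The resulting list can drive a loop that installs each cached .deb in turn, so the second and later servers never hit a "dependency problems" failure.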

Why Not Just Give Database Servers Internet Access?

It would make package management easier, but it violates the security principle that put them on a private VPC in the first place. Database servers hold the most sensitive data in the stack. They should not be able to reach — or be reached from — the public internet. The inconvenience of manual dependency resolution is the correct tradeoff for a properly isolated database tier.

SCA Policy Patching: When the Tests Are Wrong

Not all failing checks represent actual security issues. The CIS SCA policy for Ubuntu 24.04 was adapted from the 22.04 version and contains several regex and logic bugs that cause checks to fail regardless of configuration. We identified and patched 12 bugs in the policy file on every agent:

  • Service state detection — Ubuntu 24.04 returns “not-found” for purged packages, but the SCA rules only expected “disabled” or “masked.” Fixed by adding |not-found to the regex patterns for autofs, apport, and rsync checks.
  • Audit rule matching — The kernel represents auid!=unset as auid!=-1 in the loaded ruleset. SCA patterns did not account for this representation.
  • Double-arrow typos — Two ClientAlive checks had -T -> -> instead of -T ->, causing the command parser to fail silently.
  • Trailing slash mismatches — SCA rules looked for /etc/network/ in audit output, but auditctl strips trailing slashes. The rules and the reality did not match.
  • ForwardToSyslog double-colon — A journald check had r::^ instead of r:^, making the regex anchor fail.

These patches are applied via a Python script that does exact string replacement in the YAML policy file. They correct the test, not the system — the security configuration is correct, the check logic was wrong.
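The approach can be sketched as follows. This is not the script itself: the two patch pairs are taken from the bugs listed above, the sample rule line is illustrative, and the function name is hypothetical. In a real policy file the search strings would include enough surrounding rule text to match only the intended check.

```python
# (old, new) pairs based on the double-arrow and double-colon bugs above.
PATCHES = [
    ("-T -> ->", "-T ->"),
    ("r::^", "r:^"),
]

def apply_patches(policy_text: str, patches) -> tuple:
    """Apply exact string replacements; return (patched_text, unmatched_old_strings)."""
    unmatched = []
    for old, new in patches:
        if old in policy_text:
            policy_text = policy_text.replace(old, new)
        else:
            unmatched.append(old)  # already patched, or the policy changed upstream
    return policy_text, unmatched

sample_rule = "c:sshd -T -> -> r:clientaliveinterval\n"  # illustrative rule text
patched, unmatched = apply_patches(sample_rule, PATCHES)
print(patched, end="")  # c:sshd -T -> r:clientaliveinterval
print(unmatched)        # ['r::^']
```

Reporting unmatched strings makes the script safe to rerun and flags the case where an updated policy file no longer contains the bug being patched.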

Results: Nine Servers, One Consistent Score

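| Server            | Role     | Region    | Before | After | Pass | Fail |
|-------------------|----------|-----------|--------|-------|------|------|
| wp-web-1-syd      | Web      | Sydney    | 49%    | 73%   | 133  | 47   |
| wp-web-2-syd      | Web      | Sydney    | 49%    | 73%   | 133  | 48   |
| wp-web-3-bne      | Web      | Brisbane  | 49%    | 73%   | 133  | 47   |
| wp-web-4-bne      | Web      | Brisbane  | 49%    | 73%   | 133  | 48   |
| wp-web-5-mel      | Web      | Melbourne | 49%    | 73%   | 133  | 48   |
| wp-web-6-mel      | Web      | Melbourne | 49%    | 73%   | 133  | 48   |
| wp-db-primary     | Database | Sydney    | 49%    | 73%   | 133  | 48   |
| wp-db-replica     | Database | Brisbane  | 49%    | 73%   | 133  | 48   |
| wp-db-replica-mel | Database | Melbourne | 49%    | 73%   | 133  | 48   |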

133 checks passing consistently across nine servers. The minor variation (47 vs 48 failures) is due to a single SCA policy bug affecting the rsync check on servers where the package was purged vs removed — functionally identical security, different systemctl output that the buggy regex handles inconsistently.

Ongoing Compliance: Drift Detection

Hardening a server once is necessary but not sufficient. Configuration drift — where settings gradually revert due to package updates, manual changes, or automation errors — is the real threat to sustained compliance. Our Wazuh SCA agents run continuous compliance checks and report any score changes immediately.

We also run weekly Ansible-based drift detection that re-evaluates the full hardening configuration in check mode (read-only) and alerts if any setting has changed from the expected state. Between Wazuh’s real-time SCA monitoring and Ansible’s weekly drift checks, configuration changes are caught within hours at most.
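In check mode, any task Ansible reports as "changed" is a setting that no longer matches the expected state. One possible way to turn that into an alert is to parse the PLAY RECAP at the end of the run. The sketch below assumes plain-text output from `ansible-playbook --check`; the function name and sample recap are illustrative and not taken from our tooling.

```python
import re

# Matches recap lines such as "host : ok=42 changed=2 unreachable=0 failed=0".
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)",
    re.MULTILINE,
)

def drifted_hosts(ansible_output: str) -> dict:
    """Map host -> number of tasks that would change, for hosts with any drift."""
    drift = {}
    for match in RECAP_LINE.finditer(ansible_output):
        changed = int(match.group("changed"))
        if changed > 0:
            drift[match.group("host")] = changed
    return drift

sample = """PLAY RECAP *********
wp-web-1-syd : ok=42 changed=0 unreachable=0 failed=0
wp-db-primary : ok=42 changed=2 unreachable=0 failed=0
"""
print(drifted_hosts(sample))  # {'wp-db-primary': 2}
```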

Lessons Learned at Scale

Hardening one server is straightforward. Hardening nine servers across three cities, with two different network profiles (internet-connected and VPC-isolated), surfaced issues that only appear at scale:

  • Package manager locks are real — unattended-upgrades runs automatically on Ubuntu and holds the dpkg lock. On VPC-only servers without internet, it gets stuck indefinitely trying to reach repositories. The fix is to mask the service, but if it is already running when you start hardening, you have to wait for the lock to clear.
  • Identical scripts, different results — the same hardening script produced different failure counts on different servers because they had different starting states. One server had telnet installed, another did not. One had apport active, another had it masked. Idempotent scripts that check before modifying are essential.
  • SCA cache survives reboots — Wazuh caches SCA results and does not always re-evaluate after changes. You must manually clear /var/ossec/queue/sca and restart the agent to get accurate scores after hardening.
  • Reboots are mandatory — audit rules with the immutable flag (-e 2) only take effect after a reboot. GRUB changes only apply after update-grub and a reboot. There is no shortcut.
  • Test your SSH config before restarting — always run sshd -t before systemctl restart ssh. A syntax error in SSH config means you cannot reconnect after the service restarts. On a VPC-only database server with no console access, that is a very bad day.
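The last lesson is easy to automate. A minimal sketch, assuming the service is named `ssh` as on Ubuntu and that the script runs as root; the function name is hypothetical and the `run` parameter exists only so the logic can be tested without touching a real server.

```python
import subprocess

def restart_ssh_if_valid(run=subprocess.run) -> bool:
    """Restart ssh only when `sshd -t` accepts the configuration."""
    check = run(["sshd", "-t"], capture_output=True, text=True)
    if check.returncode != 0:
        # Leave the running daemon alone so existing sessions keep working.
        print(f"sshd -t failed, not restarting: {check.stderr.strip()}")
        return False
    run(["systemctl", "restart", "ssh"], check=True)
    return True
```

Keeping an existing SSH session open while testing a new login in a second terminal is a sensible extra safeguard, since `sshd -t` validates syntax but not whether the new settings still let you in.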

What This Means for Security

A 73% CIS score does not mean the servers are 73% secure. Security is not a percentage. What it means is:

  • 133 out of 182 industry-standard security controls are implemented and verified
  • Every exception is documented with a technical justification
  • Compliance is continuously monitored, not manually checked
  • The same configuration is enforced consistently across all nine servers

This is the difference between “we think our servers are secure” and “we can prove our servers meet the CIS benchmark, here are the checks, here are the results, here are the documented exceptions.” The first is a hope. The second is an audit trail.

💡 Try It Yourself

The infrastructure tools that made this possible are open-source.

Start with a single server, install the Wazuh agent, run an SCA scan, and see your baseline score. Then start fixing checks one by one. The benchmark tells you exactly what to change and why.