Security drift is silent. A server that was locked down at deploy slowly accumulates stale accounts, unpatched packages, and forgotten services. The problem is rarely malice — it is entropy. Configs get tweaked during incidents and never reverted. New hires get sudo and never lose it. A firewall rule gets added “temporarily.”
Giant quarterly audits catch drift eventually, but they are slow, disruptive, and late. A short checklist run weekly catches problems while they are still small. The twelve checks below take roughly 15 minutes on a typical server. None of them require downtime. All of them have caught real problems in production.
12 Checks
1. Disable root SSH login
Direct root login over SSH is the single most brute-forced vector on any public-facing box. Confirm it is off:
grep -i "^PermitRootLogin" /etc/ssh/sshd_config
You want PermitRootLogin no. If you see prohibit-password, that blocks password auth but still allows key-based root login — acceptable in some setups, but no is safer. After changing it, reload the daemon:
sudo systemctl reload sshd
2. Disable password authentication (keys only)
Passwords get brute-forced. Keys do not. Verify password auth is disabled:
grep -i "^PasswordAuthentication" /etc/ssh/sshd_config
The output should be PasswordAuthentication no. Also check for overrides in /etc/ssh/sshd_config.d/ — drop-in files there take precedence and can silently re-enable passwords:
grep -ri "PasswordAuthentication" /etc/ssh/sshd_config.d/
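Grepping only shows what is written in the files; the daemon's own test mode shows what is actually in effect after every Include and drop-in is merged. A quick way to check the effective values (requires root; the guard is there so the snippet is safe to drop into a fleet-wide script):

```shell
# Print the effective settings the daemon will actually use, after all
# drop-in files are merged; precedence surprises show up here.
if command -v sshd >/dev/null 2>&1; then
  # "|| true" keeps a no-match grep from failing a weekly script
  sudo sshd -T 2>/dev/null | grep -iE '^(permitrootlogin|passwordauthentication)' || true
else
  echo "sshd not installed on this host"
fi
```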
3. Restrict SSH users and tighten SSH config
Limit who can log in over SSH with AllowUsers or AllowGroups in sshd_config. While you are in there, confirm these hardening options:
MaxAuthTries 3
LoginGraceTime 30
X11Forwarding no
AllowAgentForwarding no
AllowUsers deploy admin
The AllowUsers directive is a whitelist — anyone not listed is denied. This is the most effective way to prevent surprise logins from service accounts or forgotten users.
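One caution before applying any of these edits: a typo in sshd_config that takes effect on reload can lock you out. A small guard worth scripting (a sketch; safe_reload_sshd is a hypothetical helper):

```shell
# Validate-then-reload sketch: "sshd -t" exits non-zero and names the
# offending line when the config is malformed, so reload only on success.
safe_reload_sshd() {
  if sudo sshd -t 2>/dev/null; then
    sudo systemctl reload sshd
  else
    echo "sshd_config has errors; not reloading" >&2
    return 1
  fi
}
```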
4. Confirm firewall policy and allowed ports
Check what your firewall actually allows. For ufw:
sudo ufw status verbose
For firewalld:
sudo firewall-cmd --list-all
For raw iptables / nftables:
sudo iptables -L -n --line-numbers
sudo nft list ruleset
Look for rules you do not recognize, overly broad ACCEPT rules, and ports that should have been closed after a migration. The default policy should be DROP or REJECT for incoming traffic.
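Eyeballing rules works once; detecting drift week over week is easier against a saved known-good baseline. A minimal sketch (check_baseline and the baseline path are hypothetical; capture the live ruleset with nft list ruleset or iptables -S):

```shell
# Baseline-diff sketch: compare a captured ruleset against a saved
# known-good copy; any difference means the firewall drifted.
check_baseline() {  # check_baseline <ruleset-text> <baseline-file>
  if [ ! -f "$2" ]; then
    echo "no baseline at $2; save one first"; return 2
  fi
  if [ "$1" = "$(cat "$2")" ]; then
    echo "PASS firewall matches baseline"
  else
    echo "FAIL firewall drifted from baseline"; return 1
  fi
}
# Weekly run (requires root):
#   check_baseline "$(nft list ruleset)" /var/lib/firewall.baseline
```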
5. Apply pending security updates
Check what is waiting. On Debian/Ubuntu:
sudo apt update && apt list --upgradable 2>/dev/null | grep -i security
On RHEL-family systems:
sudo dnf check-update --security
If security updates are pending, apply them. If you cannot apply immediately, note the CVEs and schedule a maintenance window. Stale security patches are the most common finding in penetration tests.
6. Remove unused packages and services
Every installed package is attack surface. List running services and ask whether each one belongs:
systemctl list-units --type=service --state=running
Common offenders: cups (printing on a headless server), avahi-daemon (mDNS), rpcbind (NFS you are not using), and postfix (if nothing sends mail). Disable and mask what you do not need:
sudo systemctl disable --now cups
sudo systemctl mask cups
7. Check open listening ports
Cross-reference what the firewall allows with what is actually listening:
sudo ss -tlnp
Look at the Local Address column. Services bound to 0.0.0.0 or :: are reachable from the network. If a service only needs to serve localhost (database, cache), bind it to 127.0.0.1. If a port is listening and you cannot identify the process, investigate immediately — that is a red flag.
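The cross-referencing itself can be scripted: pull the listening ports out of ss and flag anything not on an expected allowlist. A sketch (flag_unexpected and the EXPECTED list are hypothetical; adjust per host):

```shell
# Flag listening TCP ports that are not in an expected allowlist.
flag_unexpected() {  # flag_unexpected <expected-ports> <port>...
  expected="$1"; shift
  for port in "$@"; do
    case " $expected " in
      *" $port "*) : ;;  # known port, ignore
      *) echo "UNEXPECTED listener on port $port" ;;
    esac
  done
}
# Weekly run: feed in the ports ss reports (strip header, keep port number)
#   flag_unexpected "22 80 443" \
#     $(ss -tln | awk 'NR>1 {sub(/.*:/,"",$4); print $4}' | sort -u)
```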
8. Verify Fail2Ban (or equivalent)
Confirm Fail2Ban is running and has active jails:
sudo fail2ban-client status
sudo fail2ban-client status sshd
You should see the sshd jail active with a non-zero number of currently banned IPs (on any internet-facing box, bots are constant). If the banned count is always zero, check that the jail is pointing at the correct log path and that the filter matches your SSH log format. Also verify the ban time is reasonable — 10 minutes is too short; 1 hour or more is better.
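To raise the ban time, put the override in jail.local rather than editing jail.conf, which package upgrades overwrite. Illustrative values, assuming fail2ban's standard jail options:

```ini
# /etc/fail2ban/jail.local
[sshd]
enabled  = true
maxretry = 5
findtime = 10m
bantime  = 1h
```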
9. Review sudoers and admin group memberships
Check who has elevated privileges:
# Users in sudo/wheel group
getent group sudo wheel
# Custom sudoers rules
sudo sh -c 'cat /etc/sudoers.d/*'
# Users with UID 0
awk -F: '$3 == 0 {print $1}' /etc/passwd

Only root should have UID 0. Every user in the sudo or wheel group should be a current team member. Watch for NOPASSWD rules in sudoers drop-ins — they are sometimes needed for automation but should be scoped to specific commands, never ALL.
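The NOPASSWD rules in particular can be surfaced in one pass (requires root, since the drop-in directory is usually not world-readable):

```shell
# List every NOPASSWD grant across sudoers and its drop-ins; broad
# grants like "NOPASSWD: ALL" should stand out immediately.
sudo grep -rn 'NOPASSWD' /etc/sudoers /etc/sudoers.d/ 2>/dev/null \
  || echo "no NOPASSWD rules found (or not run as root)"
```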
10. Ensure unattended security updates are enabled
Manual patching does not scale. Verify automatic security updates are active. On Ubuntu/Debian:
cat /etc/apt/apt.conf.d/20auto-upgrades
You should see:
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
On RHEL-family systems, check that dnf-automatic is installed and its timer is active:
systemctl is-enabled dnf-automatic-install.timer
Unattended upgrades should cover security patches at minimum. Whether you auto-apply all updates or just security ones depends on your risk tolerance, but security updates should never wait for a human.
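On the RHEL side, the behavior lives in /etc/dnf/automatic.conf. The two settings that matter, with illustrative values:

```ini
# /etc/dnf/automatic.conf
[commands]
upgrade_type  = security   # only pull security errata
apply_updates = yes        # install updates, don't just download them
```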
11. Check logs for anomalies and retention
Glance at auth logs for unusual activity:
# Recent auth failures
journalctl -u sshd --since "7 days ago" | grep -i "failed\|invalid"
# Successful logins from unexpected sources
last -ai | head -20
# Check log retention
journalctl --disk-usage
ls -lh /var/log/auth.log* /var/log/secure* 2>/dev/null
Look for login attempts from unexpected IPs, successful logins at odd hours, and sudo usage by accounts that should not need it. Also confirm logs are being retained long enough for incident response — 90 days minimum. If journalctl --disk-usage shows only a few megabytes, your journal may be rotating too aggressively.
12. Verify backup integrity and restore path
A backup you have never tested is not a backup. Verify:
- Recency: When did the last backup complete? Check timestamps in your backup tool's logs or destination directory.
- Completeness: Does the backup include everything needed for a full restore — config files, databases, application data, TLS certificates?
- Integrity: Can you actually read the backup? Decompress a tarball, list files in a snapshot, or run a checksum validation.
- Restore path: Do you have a written, tested procedure for restoring from backup? Who can execute it? How long does it take?
# Example: check most recent backup timestamp
ls -lt /var/backups/ | head -5
# Example: verify a tarball is readable
tar -tzf /var/backups/latest.tar.gz | tail -5
If you cannot answer all four questions confidently, your backup strategy has gaps. Schedule a restore drill.
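The recency and integrity questions lend themselves to a weekly script. A sketch (check_backup, the directory, and the tarball naming are assumptions; adapt to your backup tool):

```shell
# Verify the newest backup tarball is recent and actually readable.
check_backup() {  # check_backup <backup-dir> <max-age-days>
  latest=$(ls -t "$1"/*.tar.gz 2>/dev/null | head -1)
  if [ -z "$latest" ]; then
    echo "FAIL no backups in $1"; return 1
  fi
  if [ -n "$(find "$latest" -mtime +"$2" 2>/dev/null)" ]; then
    echo "FAIL $latest older than $2 days"; return 1
  fi
  if tar -tzf "$latest" >/dev/null 2>&1; then
    echo "PASS $latest is recent and readable"
  else
    echo "FAIL $latest is unreadable"; return 1
  fi
}
# Weekly run: check_backup /var/backups 2
```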
Making it stick: weekly cadence and automation
Fifteen minutes once is useful. Fifteen minutes every week is a security practice. Here is how to make it sustainable:
- Pick a day and stick to it. Monday morning or Friday afternoon — whatever works for your team. Put it on the calendar.
- Script what you can. Checks 1–4 and 7–10 can be a single shell script that outputs pass/fail for each item. Run it across your fleet with Ansible, SSH loops, or your config management tool.
- Track results. A shared spreadsheet, a wiki page, or a ticket per run. The goal is to notice trends: the same check failing three weeks in a row is a process problem, not a server problem.
- Rotate the reviewer. Do not let this become one person's job. Rotating builds team knowledge and catches things that familiarity blinds you to.
- Automate alerting for the critical items. Fail2Ban down, firewall disabled, or root SSH enabled should page someone, not wait for a weekly check.
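The pass/fail idea is simple to start: one line of output per check, greppable across a fleet. A sketch covering three of the checks (paths and check names are illustrative; SSHD_CONFIG is overridable for testing, and the remaining checks follow the same pattern):

```shell
#!/usr/bin/env bash
# Weekly drift sketch: each check prints PASS or FAIL per host.
# Paths assume Debian/Ubuntu defaults.

SSHD_CONFIG="${SSHD_CONFIG:-/etc/ssh/sshd_config}"

report() {  # report <check-name> <exit-status>
  if [ "$2" -eq 0 ]; then echo "PASS $1"; else echo "FAIL $1"; fi
}

# Check 1: root SSH login disabled
st=0; grep -qiE '^PermitRootLogin[[:space:]]+no' "$SSHD_CONFIG" 2>/dev/null || st=$?
report "root-ssh-disabled" "$st"

# Check 2: password authentication disabled
st=0; grep -qiE '^PasswordAuthentication[[:space:]]+no' "$SSHD_CONFIG" 2>/dev/null || st=$?
report "password-auth-disabled" "$st"

# Check 8: fail2ban service active
st=0; systemctl is-active --quiet fail2ban 2>/dev/null || st=$?
report "fail2ban-running" "$st"
```

Run it across the fleet with Ansible or a plain SSH loop and grep the combined output for FAIL.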
Security is not a state you achieve. It is a practice you maintain. Twelve checks, fifteen minutes, every week. That is the habit that keeps your servers hardened.
This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License.