Linux System Administration

systemd & Service Management

On nearly every mainstream Linux distribution you will operate in production, systemd runs as PID 1: it is the init system that boots userspace, manages every service lifecycle, and is the ancestor of every other userspace process. Understanding systemd deeply is not optional for a DevOps engineer; it is the foundation on which you build reliable, observable services at any scale.

Units: The Building Blocks of systemd

systemd organises everything it manages into units. A unit is just a structured text file with a .service, .socket, .timer, .mount, .path, .target, or other suffix. You will spend most of your time with .service and .target units.

Unit files live in three main locations. When the same unit name exists in more than one of them, the copy in the highest-priority location wins. From lowest to highest priority:

/lib/systemd/system/: shipped by packages (on many distributions this is /usr/lib/systemd/system/; never edit these files directly)
/run/systemd/system/: runtime units, gone after a reboot (ephemeral)
/etc/systemd/system/: local overrides and custom units (your playground)

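Key idea: A file in /etc/systemd/system/ with the same name as one in /lib/systemd/system/ completely replaces the package-provided file. This lets you override vendor defaults without touching package files, and your copy survives package upgrades. The cost is that you also stop receiving the vendor's updates to that unit, so for small changes prefer a drop-in override. (Do not confuse this with systemctl mask, which links a unit to /dev/null so that it cannot be started at all.)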
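The "highest-priority copy wins" rule can be sketched as a simple first-match search. The helper name find_unit_file is hypothetical, and the sketch covers only the three directories discussed here; real systemd searches more locations, so on a live host treat systemctl show -p FragmentPath nginx as the authoritative answer.

```shell
# Hypothetical helper: print the first existing copy of a unit file,
# searching the given directories in order (highest priority first).
find_unit_file() {
    unit_name=$1
    shift
    for unit_dir in "$@"; do
        if [ -e "$unit_dir/$unit_name" ]; then
            printf '%s\n' "$unit_dir/$unit_name"
            return 0
        fi
    done
    return 1
}

# Usage on a real host (simplified search path, highest priority first):
# find_unit_file nginx.service /etc/systemd/system /run/systemd/system /lib/systemd/system
```

If the command prints a path under /etc/systemd/system/, a local copy is replacing the packaged unit.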

systemctl: The Control Interface

All day-to-day interaction with systemd goes through systemctl. Here are the commands you will use constantly in production:

# --- Lifecycle management ---
systemctl start nginx          # Start a unit now (not persistent across reboots)
systemctl stop nginx           # Stop gracefully (SIGTERM by default, then SIGKILL if still running after TimeoutStopSec=)
systemctl restart nginx        # Stop then start
systemctl reload nginx         # Run the unit's ExecReload= (often a SIGHUP) to re-read config without a restart
systemctl status nginx         # Rich status: active state, PID, last 10 log lines, exit code

# --- Boot persistence ---
systemctl enable nginx         # Create symlink so unit starts at boot
systemctl disable nginx        # Remove symlink
systemctl enable --now nginx   # Enable AND start in one command (preferred in automation)
systemctl is-enabled nginx     # Prints enabled / disabled / static / masked

# --- Introspection ---
systemctl list-units --type=service           # All loaded service units
systemctl list-units --type=service --failed  # Services that have failed
systemctl cat nginx                           # Print the effective unit file (including overrides)
systemctl show nginx                          # Dump all properties as key=value
systemctl show nginx -p Restart,RestartUSec   # Filter specific properties (show reports RestartSec= as RestartUSec)

# --- Dependencies ---
systemctl list-dependencies nginx             # Tree of what nginx requires/wants
systemctl list-dependencies --reverse nginx   # What depends ON nginx

Pro tip: In production automation (Ansible, cloud-init, deploy scripts), prefer systemctl enable --now over two separate commands. It is idempotent (running it twice is safe), though not atomic: systemd still enables the unit first and then starts it, so check the exit status.
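In a deploy script, the enable-and-start step is usually followed by a check that the unit really is running. A minimal sketch, assuming a hypothetical helper name ensure_enabled_and_active; it uses only systemctl enable --now and systemctl is-active --quiet:

```shell
# Hypothetical deploy-script helper: enable and start a unit, then confirm
# that systemd reports it as active. Safe to run repeatedly.
ensure_enabled_and_active() {
    systemctl enable --now "$1" || return 1
    systemctl is-active --quiet "$1"
}

# Usage (as root):
# ensure_enabled_and_active nginx.service || echo "nginx did not come up" >&2
```

The second call matters because a Type=simple service can be reported as started and then crash a moment later; a real pipeline would typically add a health check on top of this.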

Writing a Service Unit from Scratch

Knowing how to write a correct, production-grade unit file is the most important systemd skill. Below is a real-world example for a Node.js API, annotated with the reasoning behind each directive.

# /etc/systemd/system/myapi.service

[Unit]
Description=My Production API Service
Documentation=https://internal-wiki.company.com/myapi
# Hard dependency: fail to start if network is not up
Requires=network-online.target
# Order: start after this target is reached
After=network-online.target
# Soft dependency on PostgreSQL: try to start after it, but do not fail if absent
Wants=postgresql.service
After=postgresql.service
# Start rate limit (these two settings belong in [Unit], not [Service]):
# allow at most 5 starts in 60 s, then give up and leave the unit failed
StartLimitIntervalSec=60s
StartLimitBurst=5

[Service]
# systemd has no trailing comments: a "#" after a value becomes part of the
# value, so every comment in this file sits on its own line.
# The process signals systemd when it is READY (preferred for daemons).
# The app must call sd_notify(); if it cannot, use Type=exec instead
Type=notify
# Run as a non-root, dedicated system user
User=myapi
Group=myapi
WorkingDirectory=/opt/myapi
# Load secrets from a file (not the unit itself)
EnvironmentFile=/etc/myapi/env
ExecStart=/usr/bin/node /opt/myapi/server.js
ExecReload=/bin/kill -HUP $MAINPID

# Restart policy: production standard
# Restart on non-zero exit, signal or timeout, but NOT on clean stop
Restart=on-failure
# Wait 5 seconds before attempting restart
RestartSec=5s

# Resource limits: always set these to prevent runaway processes
LimitNOFILE=65536
MemoryMax=512M
CPUQuota=80%

# Hardening: deny what the service does not need
NoNewPrivileges=true
ProtectSystem=strict
PrivateTmp=true
ReadWritePaths=/var/lib/myapi /var/log/myapi

[Install]
# Start during normal multi-user boot
WantedBy=multi-user.target

After writing or editing any unit file you must reload the daemon before systemctl will see your changes:

systemctl daemon-reload
systemctl enable --now myapi.service

# Verify it started cleanly
systemctl status myapi.service
journalctl -u myapi.service -n 50 --no-pager

Production pitfall: Forgetting systemctl daemon-reload after editing a unit file means systemd keeps running the old definition in memory. Your edits have no effect until you reload, and the only hint is a one-line warning in the systemctl status output that the unit file changed on disk, which is easy to miss. Make it muscle memory: edit → daemon-reload → restart/reload.
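Scripts can detect a pending reload instead of relying on memory. The sketch below reads the NeedDaemonReload property that systemctl show exposes for each unit; the helper name needs_daemon_reload is hypothetical, and the --value flag assumes a reasonably recent systemd.

```shell
# Hypothetical helper: succeed if the unit's files on disk differ from the
# definition systemd currently has loaded (a daemon-reload is pending).
needs_daemon_reload() {
    [ "$(systemctl show -p NeedDaemonReload --value "$1")" = "yes" ]
}

# Usage:
# if needs_daemon_reload myapi.service; then
#     systemctl daemon-reload
# fi
```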

Service Types

The Type= directive tells systemd how to determine when a service has finished starting up. Choosing the wrong type causes silent race conditions at boot:

Type=simple (default): systemd considers the service started as soon as the main process has been forked, before the binary has even been executed. Correct only if the process stays in the foreground and nothing ordered after it needs it to be ready immediately.
Type=notify: the process calls sd_notify() with READY=1 when truly ready, and systemd waits for that message. Use this for any service that takes time to initialise (database connections, cache warming). Requires the app to support it.
Type=forking: legacy; for old-school daemons that double-fork. Pair it with PIDFile= so systemd can track the main process. Avoid in new code.
Type=oneshot: for scripts that run and exit. Combine with RemainAfterExit=yes so systemd considers it "active" after it finishes.
Type=exec: like simple but waits for the execve() call to succeed before considering the unit started. A safer default than simple for most new services.
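As an illustration of the oneshot case, here is a sketch of a unit for a run-once migration script. The unit name myapi-migrate.service and the script path /opt/myapi/bin/migrate.sh are hypothetical. The target directory is a parameter so you can try the sketch in a scratch directory first; on a real host it would be /etc/systemd/system.

```shell
# Hypothetical example: write a oneshot unit into the directory given as $1.
write_oneshot_unit() {
    cat > "$1/myapi-migrate.service" <<'EOF'
[Unit]
Description=Run myapi database migrations (hypothetical example)

[Service]
Type=oneshot
# Keep the unit "active" after the script exits, so units ordered After=
# it can rely on the work having been done
RemainAfterExit=yes
ExecStart=/opt/myapi/bin/migrate.sh

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
# write_oneshot_unit /etc/systemd/system && systemctl daemon-reload
```

Without RemainAfterExit=yes the unit returns to "inactive" as soon as the script finishes, which makes its state harder to reason about in dependency chains.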

Targets and the Boot Dependency Graph

A target is a synchronisation point: a named milestone in the boot sequence. Targets group services and establish ordering. They replace the old SysV runlevels.

[Figure: systemd boot target chain. Services declare which target they belong to via WantedBy=.]

The key targets to know for server workloads:

sysinit.target — filesystem mounts, kernel modules, early device setup
basic.target — sockets, timers, paths — the minimum for a working system
network-online.target — network interfaces are up AND have an IP. Always use this (not network.target) as a dependency for services that make outbound connections at startup
multi-user.target — the normal "server running, all services active" state; equivalent to SysV runlevel 3
graphical.target — multi-user.target plus a display manager; irrelevant on headless servers

network.target vs network-online.target: network.target is reached when the network configuration has been applied, not when the network is actually usable. A service depending only on network.target can start before DHCP assigns an IP. Always use network-online.target for services that connect to databases, message brokers, or any remote endpoint at startup.
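One caveat worth checking on a host: network-online.target only delays anything if a "wait-online" service is enabled for the network stack in use, and a service must pull the target in with Wants= or Requires= (After= alone is not enough). The sketch below checks the two most common wait-online services; the helper name wait_online_enabled is hypothetical, and other network managers may ship their own equivalent.

```shell
# Hypothetical helper: succeed if at least one common wait-online service is
# enabled, which is what makes network-online.target actually wait.
wait_online_enabled() {
    for svc in systemd-networkd-wait-online.service NetworkManager-wait-online.service; do
        if systemctl is-enabled "$svc" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# Usage:
# wait_online_enabled || echo "warning: network-online.target will not wait for the network" >&2
```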

Drop-in Overrides — The Safe Way to Customise Vendor Units

When you need to tweak a vendor-provided unit (say, increase the open-file limit for nginx) without replacing the entire file, use a drop-in. Drop-ins are surgically merged on top of the base unit by systemd:

# Create the drop-in directory and file
mkdir -p /etc/systemd/system/nginx.service.d/

cat > /etc/systemd/system/nginx.service.d/override.conf <<'EOF'
[Service]
LimitNOFILE=100000
EOF

systemctl daemon-reload
systemctl restart nginx

# Verify the merged result — shows base file, then "--- /etc/.../override.conf ---"
systemctl cat nginx

# The shortcut: opens $EDITOR with the right file automatically
systemctl edit nginx

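Pro tip: For interactive changes, systemctl edit nginx is the preferred workflow. It creates the drop-in directory and file for you and reloads the systemd configuration when you save, which avoids typos in paths. You still have to restart (or reload) the service for the change to take effect. In automation, write the drop-in file yourself and run daemon-reload.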
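Drop-ins have one behaviour that often surprises people: directives that accumulate, such as ExecStart=, are appended to rather than replaced, so a drop-in must first clear the inherited value with an empty assignment. A sketch for the myapi.service example from earlier follows; the helper name and the --port 8080 argument are hypothetical, and the base directory is a parameter so the sketch can be tried outside /etc/systemd/system.

```shell
# Hypothetical example: write a drop-in under $1 that replaces ExecStart=.
write_execstart_override() {
    mkdir -p "$1/myapi.service.d"
    cat > "$1/myapi.service.d/override.conf" <<'EOF'
[Service]
# An empty assignment clears the ExecStart= inherited from the base unit
ExecStart=
# Replacement command line (the --port flag is a hypothetical app option)
ExecStart=/usr/bin/node /opt/myapi/server.js --port 8080
EOF
}

# Usage (as root):
# write_execstart_override /etc/systemd/system
# systemctl daemon-reload && systemctl restart myapi.service
```

Single-value settings such as LimitNOFILE= do not need the empty line; the last assignment simply wins.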

Dependency Directives: Requires vs Wants

The most common source of boot-time bugs in custom unit files is misusing dependency directives. Here are the precise semantics:

Requires= is a hard dependency. Starting this unit also starts the required unit. If the required unit fails to start and this unit is ordered After= it, this unit is not started; if the required unit is later stopped explicitly, this unit is stopped too. Use sparingly.
Wants= is a soft dependency. systemd tries to start the wanted unit, but this unit continues even if the wanted unit fails. Preferred for most service dependencies.
After= / Before= control ordering only and create no dependency. Without these, systemd starts units in parallel. Always pair a dependency directive with an ordering directive.
BindsTo= is like Requires= but even stricter: if the bound unit stops for any reason (not just startup failure), this unit is also stopped immediately. Use for units that are truly meaningless without another (e.g., a service tied to a network interface).

The most common pattern for application services: Wants= + After= for soft dependencies (database, cache), and Requires= + After= only for hard infrastructure (network, required mounts).
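The "pair every dependency with an ordering directive" rule is easy to check mechanically. The following is a rough lint sketch, not a systemd tool: the function name deps_without_ordering is hypothetical, and it reads a single unit file only, ignoring drop-ins and any Before= declared by other units.

```shell
# Hypothetical lint: print every unit named in Requires=, Wants= or BindsTo=
# that has no matching After= entry in the unit file given as $1.
deps_without_ordering() {
    awk -F= '
        /^(Requires|Wants|BindsTo)=/ {
            n = split($2, names, " ")
            for (i = 1; i <= n; i++) dep[names[i]] = 1
        }
        /^After=/ {
            n = split($2, names, " ")
            for (i = 1; i <= n; i++) ordered[names[i]] = 1
        }
        END {
            for (d in dep) if (!(d in ordered)) print d
        }
    ' "$1" | sort
}

# Usage:
# deps_without_ordering /etc/systemd/system/myapi.service
```

An empty result means every declared dependency is also ordered. Any name printed is a dependency that systemd would start in parallel with the unit rather than before it.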