Robust Scripts: Errors & Safety
Robust Scripts: Errors & Safety
The difference between a hobbyist shell script and one that runs unattended in production at 3 AM is not cleverness — it is defensive engineering. Bash's default behavior is dangerously permissive: it continues executing after a command fails, silently expands unset variables to empty strings, and lets a failed command inside a pipeline hide behind the exit code of the last command. None of that is acceptable when your script is deleting old backups, rotating secrets, or deploying to production. This lesson covers the canonical set of guards that every professional shell script must have, and the tooling that enforces them automatically.
The Big Three: set -euo pipefail
These three flags, combined on one line, transform Bash from a forgiving interpreter into a strict one. Place them immediately after the shebang — no exceptions.
Here is what each flag does and why it matters:
-e(errexit): Exit immediately when any command returns a non-zero status. Without this flag, Bash happily executesrm -rf /var/dataeven if the precedingmkdir /var/datasilently failed. With-e, the script stops at the point of failure rather than cascading into a worse state.-u(nounset): Treat any reference to an unset variable as an error. The classic disaster looks like:rm -rf "${DEPLOY_DIR}/"whereDEPLOY_DIRis a typo or was never exported. Without-u, this silently expands torm -rf "/". With-u, the script aborts with DEPLOY_DIR: unbound variable.-o pipefail(pipefail): Make a pipeline return the exit status of the rightmost command that failed, rather than always returning the exit status of the last command. Without this,false | trueexits 0 — a silent failure swallowed by the pipeline. Withpipefail, it exits 1.
-e and subshells: set -e does not propagate into subshells spawned with ( ) or command substitutions unless you also set it there. Always test failure paths, not just the happy path.
Traps: Guaranteed Cleanup
A trap is a handler that the shell executes when a specified signal or pseudo-signal occurs. The two traps every production script needs are EXIT and ERR.
EXIT fires whenever the script terminates — whether it exits normally, hits a set -e failure, or receives a signal. Use it to clean up temporary files, release locks, or log completion. ERR fires specifically when a command fails (works in concert with set -e), making it the right place to emit a structured error message.
Several points deserve attention. First, cleanup captures $? immediately — any subsequent command would overwrite it. Second, the function checks that the temp directory variable is non-empty and that the path actually exists before attempting deletion; a guard against the case where the script failed before mktemp ran. Third, cleanup re-exits with the original exit code so that whatever invoked your script (CI runner, systemd, cron) receives the right status.
flock or by writing the PID to /var/run/myscript.pid; remove it in the EXIT trap. This pattern is used in production at scale by tools like mysqld_safe and nginx.
Defensive Variable Handling
Beyond -u, Bash provides expansion operators that let you express intent precisely and fail fast with a meaningful message.
The :? form is the idiomatic guard at the top of any script that depends on environment variables injected by a CI system or secrets manager. When DATABASE_URL is missing, the script stops immediately with a clear message rather than passing an empty string to a downstream command that produces a cryptic error pages later.
Safe Temporary Files
Hardcoded temp paths like /tmp/my-script.tmp are a security vulnerability (symlink attacks) and a concurrency bug (two instances collide). Always use mktemp.
Static Analysis with ShellCheck
ShellCheck is a static analysis tool that finds bugs in shell scripts before you run them. It catches unquoted variables, wrong conditional syntax, POSIX-vs-bash incompatibilities, and dozens of other common mistakes. At big-tech companies, ShellCheck runs in the CI pipeline as a mandatory lint gate — a shell script that does not pass ShellCheck does not merge.
Install ShellCheck and run it locally before committing:
ShellCheck integrates with VS Code (shellcheck extension), Vim (ALE), and GitHub Actions (ludeeus/action-shellcheck). A typical CI step looks like:
set -euo pipefail, a cleanup trap, an error trap, and validated required variables. Keep such a template in your team's internal toolbox and enforce it through a linter. Scripts that deviate require an explicit written justification, not just a comment.
Combining Everything: A Safe Script Skeleton
Here is the canonical skeleton that combines all the techniques in this lesson. Copy it as the starting point for every new production script.
The main "$@" pattern — placing all logic in a main function and calling it at the end — ensures that the entire script is parsed before any code runs, which prevents subtle bugs caused by calling a function before it is defined. It also makes the script easier to test in isolation and to source safely from other scripts.