Playbooks
If ad-hoc commands are Ansible's screwdriver, playbooks are its architectural blueprint. A playbook is a YAML file that declares what should be true about a set of hosts: which packages are installed, which services are running, which config files contain which content. Ansible reads the playbook and drives the hosts to that state, idempotently and in order. In large operations teams, playbooks encode years of accumulated operational knowledge and can take a fleet from a freshly installed operating system to fully configured in a single run.
This lesson covers the four structural elements that make up every professional playbook: plays, tasks, handlers, and the notify directive that connects them. Get these four right and you have the mental model for everything else Ansible does.
Playbook Anatomy: The Play
A playbook is a list of plays. Each play maps a set of hosts to a set of tasks and defines the execution context for those tasks. A playbook can have one play or fifty — each play runs sequentially, and all hosts in a play run in parallel (up to the forks limit, default 5).
The essential keys of a play:
- name: human-readable label; shows up in output and is the single best documentation for what the play does.
- hosts: inventory pattern (a group name, a glob, all, or a comma-separated list).
- become: escalate to root via sudo for the entire play. Can be overridden per task.
- gather_facts: default true; collects system facts (OS, IP, memory) before tasks run. Disable with false when facts are unused and you need speed.
- vars: play-scoped variables (covered in depth in lesson 5).
- tasks: ordered list of actions to execute on the matched hosts.
- handlers: tasks triggered only when notified (covered below).
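As an illustration of how these keys fit together, here is a minimal sketch of a single play. The group name webservers and the variable motd_text are assumptions made for the example, not names from a real inventory.

```yaml
---
# Sketch of one play; the group name and variable are hypothetical.
- name: Configure web servers
  hosts: webservers            # assumed inventory group
  become: true                 # escalate privileges for every task in this play
  gather_facts: false          # no task below uses facts, so skip collection
  vars:
    motd_text: "Managed by Ansible"   # hypothetical play-scoped variable
  tasks:
    - name: Write the message of the day
      ansible.builtin.copy:
        content: "{{ motd_text }}\n"
        dest: /etc/motd
        owner: root
        group: root
        mode: "0644"
```

The play has no handlers key because nothing in it needs a follow-up action; the key is optional.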
ansible-playbook invocation.
Tasks: The Unit of Work
A task is one call to an Ansible module. Each task has a name (required in production — never skip it), a module key, and the module's arguments. Tasks run top-to-bottom within a play. If a task fails, Ansible stops the play on that host by default (fail-fast per host; other hosts in the play continue unless you set any_errors_fatal: true).
Here is a complete, production-quality playbook that installs Nginx, drops a configuration file, and ensures the service is running and enabled:
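A minimal sketch of such a playbook might look like the following. It assumes a hypothetical inventory group named webservers and a template at templates/nginx.conf.j2 next to the playbook; adjust both to your environment.

```yaml
---
# Sketch only: the group name and template path are assumptions.
- name: Install and configure Nginx
  hosts: webservers
  become: true
  tasks:
    - name: Install the nginx package
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Deploy the nginx configuration
      ansible.builtin.template:
        src: templates/nginx.conf.j2     # hypothetical template
        dest: /etc/nginx/nginx.conf
        owner: root
        group: root
        mode: "0644"
        validate: nginx -t -c %s         # reject a config nginx cannot parse

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Every task here is declarative: a second run against an already configured host should report ok for all three tasks.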
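Run it with ansible-playbook, passing your inventory and the playbook file: ansible-playbook -i <inventory> <playbook>.yml
Always run with --check --diff before running against production. Check mode runs the playbook without making changes and reports what would change; --diff shows the before/after diff of file content. Together they are your dry run. Keep in mind that command and shell tasks are generally skipped in check mode, so the dry run can understate what a real run will do. At scale, enforce this in CI: run --check on every PR, apply only on merges to main.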
Handlers and notify: Event-Driven Side Effects
Handlers are tasks that run at the end of a play — but only if at least one task notified them during that play's execution. This pattern solves a fundamental operations problem: you want to restart a service only when its configuration actually changed, not on every run.
The mechanics:
- A task declares notify: <Handler Name>. The name must match the handler's name field exactly (case-sensitive).
- If that task reports changed (only changed, not ok or skipped), Ansible marks the named handler as pending.
- After all tasks complete, Ansible flushes pending handlers once, in the order they are declared in the handlers section (not in the order they were notified).
- Even if ten tasks notify the same handler, it runs exactly once.
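The mechanics above can be sketched as follows. The group name and both template paths are assumptions for illustration; the point is that two tasks notify one handler, which runs at most once at the end of the play.

```yaml
---
# Sketch only: group name and template paths are hypothetical.
- name: Reload nginx only when its configuration changes
  hosts: webservers
  become: true
  tasks:
    - name: Deploy the main configuration
      ansible.builtin.template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        mode: "0644"
      notify: Reload nginx             # must match the handler name exactly

    - name: Deploy the site configuration
      ansible.builtin.template:
        src: templates/site.conf.j2
        dest: /etc/nginx/conf.d/site.conf
        mode: "0644"
      notify: Reload nginx             # same handler, still runs only once

  handlers:
    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
```

If neither template task reports changed, the handler is never marked pending and nginx is left alone.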
Handler Pitfalls in Production
Handlers are elegant but have sharp edges that trip up engineers in production:
- Handlers do not run if the play fails mid-way. If Task 3 errors on a host before its tasks complete, that host's pending handlers are skipped. Use meta: flush_handlers as a task to force handlers to run at a specific point in the play, for example right after deploying a config file but before starting dependent services. To run notified handlers even after a failure, pass --force-handlers or set force_handlers: true on the play.
- Handler name must match exactly. A typo in notify goes unnoticed on every run where the task reports ok, because the handler is only looked up when the task reports changed. At that point Ansible, by default, fails with a "handler not found" error. Exercise the changed path in testing, and run with -vv to see the NOTIFIED HANDLER lines in the output.
- Handler order is declaration order, not notification order. If Handler B depends on Handler A completing first, declare A before B.
- One reload, not ten. Ten tasks all changing config snippets can all notify the same handler, and it runs once at the end. This is the correct pattern for assembling config from multiple sources.
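As a sketch of forcing handlers to run mid-play, the example below flushes handlers right after a config change so the following task sees the restarted service. The group name, template, config path, and service name are assumptions chosen for illustration.

```yaml
---
# Sketch only: group, template, config path, and service name are hypothetical.
- name: Apply database configuration before dependent steps
  hosts: dbservers
  become: true
  tasks:
    - name: Deploy the PostgreSQL configuration
      ansible.builtin.template:
        src: templates/postgresql.conf.j2
        dest: /etc/postgresql/postgresql.conf   # path varies by distribution
        mode: "0644"
      notify: Restart postgresql

    - name: Run pending handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Wait for the database port before continuing
      ansible.builtin.wait_for:
        port: 5432
        timeout: 60

  handlers:
    - name: Restart postgresql
      ansible.builtin.service:
        name: postgresql
        state: restarted
```

Without the flush, the wait_for task would run against the old process and the restart would happen only after every task had finished.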
The Complete Multi-Play Playbook Pattern
Real infrastructure playbooks orchestrate multiple tiers. Here is the canonical pattern for a two-tier app stack deployment — databases first, then app servers:
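A minimal sketch of that two-play layout follows. The groups dbservers and appservers, the service myapp, and the template path are all hypothetical names used for illustration.

```yaml
---
# Sketch only: group names, the myapp service, and paths are assumptions.
# Play 1 finishes on all database hosts before play 2 starts.
- name: Configure database servers
  hosts: dbservers
  become: true
  tasks:
    - name: Install PostgreSQL
      ansible.builtin.package:
        name: postgresql
        state: present

    - name: Ensure PostgreSQL is running and enabled
      ansible.builtin.service:
        name: postgresql
        state: started
        enabled: true

- name: Configure application servers
  hosts: appservers
  become: true
  tasks:
    - name: Deploy the application configuration
      ansible.builtin.template:
        src: templates/app.conf.j2
        dest: /etc/myapp/app.conf
        mode: "0640"
      notify: Restart myapp

    - name: Ensure the application is running and enabled
      ansible.builtin.service:
        name: myapp
        state: started
        enabled: true

  handlers:
    - name: Restart myapp
      ansible.builtin.service:
        name: myapp
        state: restarted
```

Handlers are scoped to the play that declares them, so the Restart myapp handler is visible only to tasks in the second play.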
Always use fully qualified collection names (FQCNs): ansible.builtin.package, not just package. FQCNs make it unambiguous which collection a module comes from, prevent a newly installed collection from silently shadowing a short module name, and are checked by linters such as ansible-lint. Many teams mandate FQCNs in shared playbook codebases.
Idempotency: The Contract You Must Uphold
Every task in a professional playbook must be idempotent — running the playbook ten times produces the same result as running it once. The package, service, and template modules are idempotent by design. Shell and command tasks are not — use creates, removes, or changed_when guards to make them idempotent or to suppress false-positive change reports:
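The sketch below shows one way to apply each guard. The scripts, paths, and the appservers group are hypothetical, and the creates example assumes the init script itself writes the marker file.

```yaml
---
# Sketch only: scripts, paths, and group name are hypothetical.
- name: Idempotent command and shell tasks
  hosts: appservers
  become: true
  tasks:
    - name: Initialize the application database once
      ansible.builtin.command:
        cmd: /opt/myapp/bin/init-db            # assumed to create the marker below
        creates: /var/lib/myapp/.initialized   # skip when this file exists

    - name: Remove the legacy cache only if it is present
      ansible.builtin.command:
        cmd: rm -rf /var/cache/myapp-legacy
        removes: /var/cache/myapp-legacy       # skip when this path is absent

    - name: Read the installed application version
      ansible.builtin.command:
        cmd: /opt/myapp/bin/myapp --version
      register: myapp_version
      changed_when: false                      # read-only, never report changed
```

With these guards in place, a second run should report ok or skipped for all three tasks instead of changed.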
Breaking idempotency is one of the most common production Ansible bugs. A playbook that flips a config on every run will continually restart services, generate change-event noise in your CMDB, and make it impossible to tell from the run report whether something actually changed. Treat every CHANGED line in a playbook run as a real event that should require explanation.