Configuration Management with Ansible

Playbooks

18 min Lesson 4 of 30

If ad-hoc commands are Ansible's screwdriver, playbooks are its architecture blueprint. A playbook is a YAML file that declares what should be true about a set of hosts: which packages are installed, which services are running, which config files contain which content. Ansible reads the playbook and drives the hosts to that state, idempotently and in order. In production teams, playbooks encode years of accumulated operational knowledge and can take a fleet from a freshly provisioned operating system to fully configured in a single run.

This lesson covers the four structural elements that make up every professional playbook: plays, tasks, handlers, and the notify directive that connects them. Get these four right and you have the mental model for everything else Ansible does.

Playbook Anatomy: The Play

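A playbook is a list of plays. Each play maps a set of hosts to a set of tasks and defines the execution context for those tasks. A playbook can have one play or fifty; plays run sequentially in the order they are written. Within a play, the default linear strategy runs each task on all matched hosts in parallel (up to the forks limit, default 5) and waits for every host to finish that task before starting the next one.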

The essential keys of a play:

  • name — human-readable label; shows up in output and is the single best documentation for what the play does.
  • hosts — inventory pattern (a group name, a glob, all, or a comma-separated list).
  • become — escalate to root via sudo for the entire play. Can be overridden per task.
  • gather_facts — default true; collects system facts (OS, IP, memory) before tasks run. Disable with false when facts are unused and you need speed.
  • vars — play-scoped variables (covered in depth in lesson 5).
  • tasks — ordered list of actions to execute on the matched hosts.
  • handlers — tasks triggered only when notified (covered below).
One playbook, multiple plays: A single YAML file can contain plays targeting different host groups in sequence. For example: play 1 configures database servers, play 2 configures app servers (which may depend on the databases being ready). This lets you orchestrate multi-tier deployments in one file with a single ansible-playbook invocation.
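
As a rough sketch of how these keys fit together, the skeleton below uses each of them in one play. The group name webservers, the variable app_port, and the marker file path are placeholders chosen for illustration, not names from a real inventory.

```yaml
---
# Skeleton play illustrating the keys listed above.
# "webservers", "app_port", and the marker path are hypothetical placeholders.
- name: Illustrate the essential play keys
  hosts: webservers          # inventory pattern
  become: true               # escalate privileges for every task in the play
  gather_facts: false        # skip fact collection; no task below uses facts

  vars:
    app_port: 8080           # play-scoped variable

  tasks:
    - name: Record the configured port in a marker file
      ansible.builtin.copy:
        content: "port={{ app_port }}\n"
        dest: /etc/app_port.marker
        mode: "0644"
      notify: Report marker change

    - name: Show the connecting user without privilege escalation
      ansible.builtin.command: whoami
      become: false          # per-task override of the play-level setting
      changed_when: false    # read-only command, never reports changed

  handlers:
    - name: Report marker change
      ansible.builtin.debug:
        msg: "Marker file was updated"
```

The second task shows the per-task override mentioned for become: the play escalates by default, but that one task runs as the connecting user.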

Tasks: The Unit of Work

A task is one call to an Ansible module. Each task has a name (required in production — never skip it), a module key, and the module's arguments. Tasks run top-to-bottom within a play. If a task fails, Ansible stops the play on that host by default (fail-fast per host; other hosts in the play continue unless you set any_errors_fatal: true).
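
To make the failure behaviour concrete, here is a minimal sketch of a play that opts out of the per-host default. The group dbservers, the migration script, and its marker file are assumptions made up for this example.

```yaml
---
# Sketch: abort the play on every host as soon as any host fails a task.
# "dbservers", the script path, and the marker file are hypothetical.
- name: Apply schema migration on all database hosts or none
  hosts: dbservers
  any_errors_fatal: true     # one failed host stops the play for all hosts

  tasks:
    - name: Run migration script
      ansible.builtin.command:
        cmd: /opt/myapp/bin/migrate      # hypothetical script
        creates: /opt/myapp/.migrated    # hypothetical marker the script writes

    - name: Runs only if the migration succeeded everywhere
      ansible.builtin.debug:
        msg: "Migration complete on {{ inventory_hostname }}"
```

Without any_errors_fatal, a host that fails the first task is dropped from the play and the remaining hosts carry on with the second task.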

Here is a complete, production-quality playbook that installs Nginx, drops a configuration file, and ensures the service is running and enabled:

    ---
    # site.yml: baseline web server configuration
    - name: Configure web servers
      hosts: webservers
      become: true
      gather_facts: true

      vars:
        nginx_worker_processes: "auto"
        nginx_worker_connections: 1024

      tasks:
        - name: Ensure Nginx is installed
          ansible.builtin.package:
            name: nginx
            state: present

        - name: Deploy Nginx main config
          ansible.builtin.template:
            src: templates/nginx.conf.j2
            dest: /etc/nginx/nginx.conf
            owner: root
            group: root
            mode: "0644"
            validate: "nginx -t -c %s"
          notify: Reload Nginx

        - name: Ensure Nginx is started and enabled
          ansible.builtin.service:
            name: nginx
            state: started
            enabled: true

      handlers:
        - name: Reload Nginx
          ansible.builtin.service:
            name: nginx
            state: reloaded

Run it with:

    # Dry-run first: see what would change without touching hosts
    ansible-playbook site.yml --check --diff

    # Real run against production inventory
    ansible-playbook -i inventories/production/hosts.ini site.yml

    # Limit to a single host for a canary test
    ansible-playbook -i inventories/production/hosts.ini site.yml --limit web01.prod.example.com

    # Verbose output
    ansible-playbook site.yml -v     # module results
    ansible-playbook site.yml -vvv   # connection + SSH details

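Always use --check --diff before running against production. Check mode runs the playbook without making changes and reports what would change; --diff shows the before/after diff of file content. Together they are your dry-run. Treat the result as an approximation: modules that do not support check mode are skipped, including command and shell tasks that have no creates or removes guard, so later tasks that depend on their results may behave differently in a real run. At scale, enforce this in CI: run --check on every PR, apply only on merges to main.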
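
One practical wrinkle with dry-runs is that a read-only command task is skipped in check mode, which leaves its registered variable without output. A sketch of one way to handle that, assuming a later task needs the command's result, is to mark the probe with check_mode: false so it always executes. The variable name nginx_version is chosen here for illustration.

```yaml
# Sketch: keep a read-only probe running during --check so later tasks
# still have real data to work with. Fragment for a play's tasks list.
- name: Read the installed Nginx version
  ansible.builtin.command: nginx -v
  register: nginx_version
  changed_when: false   # read-only, never reports changed
  check_mode: false     # run for real even under --check

- name: Show the version (nginx -v writes to stderr)
  ansible.builtin.debug:
    msg: "{{ nginx_version.stderr }}"
```

Only use check_mode: false on tasks that genuinely change nothing; otherwise the dry-run stops being a dry-run.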

Handlers and notify: Event-Driven Side Effects

Handlers are tasks that run at the end of a play — but only if at least one task notified them during that play's execution. This pattern solves a fundamental operations problem: you want to restart a service only when its configuration actually changed, not on every run.

The mechanics:

  1. A task declares notify: <Handler Name>. The name must match the handler's name field exactly (case-sensitive).
  2. If that task reports changed (not skipped, not ok — changed), Ansible marks the named handler as pending.
  3. After all tasks complete, Ansible flushes pending handlers once, in the order they are declared in the handlers section (not in the order they were notified).
  4. Even if ten tasks notify the same handler, it runs exactly once.

[Figure: Ansible playbook execution flow, tasks then handlers. In the play "Configure web servers", Task 1 (Install Nginx) and Task 3 (Start Nginx service) report OK, while Task 2 (Deploy nginx.conf) reports CHANGED and notifies the handler Reload Nginx. After all tasks, handlers are flushed and Reload Nginx runs once, even if notified by multiple tasks. If Task 2 reported OK instead of CHANGED, the handler would be skipped entirely.]
Tasks run sequentially; handlers fire once at the end — only when at least one task notified them with a CHANGED result.
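
The mechanics above can be sketched in a small play. This is an illustrative example rather than a recommended layout: the snippet files under files/ are hypothetical, and the Validate Nginx config handler is added only to show ordering.

```yaml
---
# Sketch of the notify mechanics. File names under files/ are hypothetical.
- name: Demonstrate handler de-duplication and ordering
  hosts: webservers
  become: true

  tasks:
    - name: Deploy gzip snippet
      ansible.builtin.copy:
        src: files/gzip.conf
        dest: /etc/nginx/conf.d/gzip.conf
        mode: "0644"
      notify:
        - Reload Nginx
        - Validate Nginx config   # listed second here, but declared (and run) first

    - name: Deploy TLS snippet
      ansible.builtin.copy:
        src: files/tls.conf
        dest: /etc/nginx/conf.d/tls.conf
        mode: "0644"
      notify: Reload Nginx        # second notification; the handler still runs once

  handlers:
    - name: Validate Nginx config   # declared first, so it runs first
      ansible.builtin.command: nginx -t
      changed_when: false

    - name: Reload Nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
```

If both copy tasks report changed, the end-of-play flush runs Validate Nginx config and then Reload Nginx, each exactly once. If neither reports changed, no handler runs at all.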

Handler Pitfalls in Production

Handlers are elegant but have sharp edges that trip up engineers in production:

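  • Handlers do not run if the play fails mid-way. If a task errors on a host before the task list completes, that host's pending handlers are skipped by default, even though the change that notified them has already been applied. Use meta: flush_handlers as a task to force handlers to run at a specific point in the play (for example, right after deploying a config file but before starting dependent services), or set force_handlers: true on the play (or pass --force-handlers) so notified handlers still run after a failure.
  • Handler name must match exactly. A misspelled name in notify goes unnoticed for as long as the notifying task reports ok, because the handler is only looked up when the task reports changed. With default settings Ansible then fails with a "handler not found" error (controlled by the error_on_missing_handler setting); if that setting is disabled, the typo only produces a warning and the handler never runs. Test against a host where the task actually changes something and look for the RUNNING HANDLER line in the output.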
  • Handler order is declaration order, not notification order. If Handler B depends on Handler A completing first, declare A before B.
  • One reload, not ten. Ten tasks all changing config snippets can all notify the same handler — it runs once at the end. This is the correct pattern for assembling config from multiple sources.
Never restart a service inside a task block to work around handlers. Engineers new to Ansible sometimes write a task that directly restarts a service unconditionally — to avoid learning handlers. This causes unnecessary downtime on every playbook run, even when nothing changed. Handlers exist precisely to solve this: only restart when something actually changed.
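
For the first pitfall in the list above, a minimal sketch of meta: flush_handlers might look like the fragment below. It assumes a Reload Nginx handler is declared in the same play; the smoke-test URL is a placeholder.

```yaml
# Sketch: flush pending handlers mid-play. Fragment for a play's tasks list;
# assumes a "Reload Nginx" handler is declared in the same play.
- name: Deploy Nginx main config
  ansible.builtin.template:
    src: templates/nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    mode: "0644"
  notify: Reload Nginx

- name: Run any pending handlers now instead of at the end of the play
  ansible.builtin.meta: flush_handlers

- name: Verify the site responds with the new configuration
  ansible.builtin.uri:
    url: http://localhost/     # placeholder URL, checked from the managed host
    status_code: 200
```

Because the reload happens before the check, a failure in the last task no longer leaves the new configuration on disk but unloaded.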

The Complete Multi-Play Playbook Pattern

Real infrastructure playbooks orchestrate multiple tiers. Here is the canonical pattern for a two-tier app stack deployment — databases first, then app servers:

    ---
    # full-stack.yml
    - name: Configure database tier
      hosts: dbservers
      become: true
      gather_facts: true

      tasks:
        - name: Install PostgreSQL
          ansible.builtin.package:
            name: postgresql
            state: present

        - name: Deploy pg_hba.conf
          ansible.builtin.template:
            src: templates/pg_hba.conf.j2
            dest: /etc/postgresql/15/main/pg_hba.conf
            owner: postgres
            group: postgres
            mode: "0640"
            # validate must be a command that takes the temp file path as %s.
            # The script below is a hypothetical custom validation script.
            validate: "/usr/local/bin/check_pg_hba %s"
          notify: Reload PostgreSQL

        - name: Ensure PostgreSQL is started and enabled
          ansible.builtin.service:
            name: postgresql
            state: started
            enabled: true

      handlers:
        - name: Reload PostgreSQL
          ansible.builtin.service:
            name: postgresql
            state: reloaded

    - name: Configure application tier
      hosts: appservers
      become: true
      gather_facts: true

      tasks:
        - name: Deploy application config
          ansible.builtin.template:
            src: templates/app.env.j2
            dest: /opt/myapp/.env
            owner: myapp
            group: myapp
            mode: "0600"
          notify: Restart application

        - name: Ensure application service is running
          ansible.builtin.service:
            name: myapp
            state: started
            enabled: true

      handlers:
        - name: Restart application
          ansible.builtin.service:
            name: myapp
            state: restarted

Use FQCN (Fully Qualified Collection Name) for all modules. Write ansible.builtin.package, not just package. FQCNs make it unambiguous which collection a module comes from, avoid silent shadowing when another installed collection ships a module with the same short name, and are checked by linters such as ansible-lint (its fqcn rule). Many teams require FQCNs in shared playbook codebases.

Idempotency: The Contract You Must Uphold

Every task in a professional playbook must be idempotent — running the playbook ten times produces the same result as running it once. The package, service, and template modules are idempotent by design. Shell and command tasks are not — use creates, removes, or changed_when guards to make them idempotent or to suppress false-positive change reports:

    # Idempotent command task: only run if the output file does not exist
    - name: Generate TLS certificate
      ansible.builtin.command:
        cmd: >-
          openssl req -x509 -newkey rsa:4096
          -keyout /etc/ssl/private/app.key
          -out /etc/ssl/certs/app.crt
          -days 365 -nodes -subj "/CN=app.internal"
        creates: /etc/ssl/certs/app.crt   # skip if this file already exists

    # Suppress spurious changed status for read-only checks
    - name: Check kernel parameter
      ansible.builtin.command: sysctl net.ipv4.ip_forward
      register: sysctl_result
      changed_when: false   # this task never changes state, so always report OK

Breaking idempotency is one of the most common production Ansible bugs. A playbook that flips a config on every run will continually restart services, generate change-event noise in your CMDB, and make it impossible to tell from the run report whether something actually changed. Treat every CHANGED line in a playbook run as a real event that should require explanation.
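
Where a purpose-built module exists, it is usually a simpler route to idempotency than guarding a command. As a sketch, setting the kernel parameter that the earlier example only reads could look like this. It assumes the ansible.posix collection is installed on the control node, which the lesson does not state.

```yaml
# Sketch: set a kernel parameter with a module instead of "command: sysctl -w ...".
# Assumes the ansible.posix collection is installed. Fragment for a tasks list.
- name: Enable IPv4 forwarding
  ansible.posix.sysctl:
    name: net.ipv4.ip_forward
    value: "1"
    state: present
    sysctl_set: true    # also apply the value to the running kernel
    reload: true        # reload sysctl configuration when the file changes
```

The module compares the desired value with the current one and should report changed only when they differ, so a second run against an already configured host is expected to show OK.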