Configuration Management with Ansible

Ad-Hoc Commands & Modules

18 min Lesson 3 of 30

Before you write your first playbook you need to understand the atomic unit of Ansible work: the module. Everything Ansible does (copying a file, restarting a service, installing a package, querying an API) is accomplished by running a module, normally on a managed node. Ad-hoc commands let you invoke a single module directly from your terminal, without a playbook, making them the fastest way to explore Ansible's module library and debug live infrastructure. On large fleets, ad-hoc commands are also a safe first step when you need to inspect hosts before you commit an automated change.

The Module Model

An Ansible module is a self-contained program, usually Python but also PowerShell, Ruby, or a shell script, that Ansible transfers to the managed node, executes, and then removes. Each module accepts a set of named parameters, performs one logical action, and returns a JSON result that includes a changed flag and any relevant data. For ordinary modules the control node does not run the module code itself; it orchestrates the transfer and collects the result. The exceptions are tasks that target localhost or use a local connection, as many network and API modules do, which execute on the control node.

This design lets a module hide platform differences. The package module detects whether a host uses apt, yum, dnf, or homebrew and delegates to the right tool automatically. You write one task and Ansible adapts to each node's OS family, although package names themselves can still differ between distributions (apache2 on Debian, httpd on Red Hat). This abstraction is much of what makes Ansible playbooks portable across heterogeneous fleets.

Key idea — module return value anatomy: Every module returns at minimum {"changed": false, "failed": false}. The changed key is what drives idempotency reporting and handler notifications. If a task returns changed: false, Ansible skips any notify targets — no spurious service restarts on repeated runs.
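As a rough illustration of how the changed flag gates handler notification, the sketch below reads two result payloads and decides whether a notify target would fire. The function name should_notify and the sample payloads are hypothetical, and this is a simplified model of the behaviour described above, not Ansible's actual implementation.

```python
import json


def should_notify(raw_result):
    """Return True when a module result would trigger its notify targets."""
    result = json.loads(raw_result)
    # Missing keys are treated as False, matching the minimal result shape.
    changed = bool(result.get("changed", False))
    failed = bool(result.get("failed", False))
    return changed and not failed


# Hypothetical payloads shaped like the minimal module result.
unchanged = '{"changed": false, "failed": false}'
modified = '{"changed": true, "failed": false, "dest": "/etc/nginx/nginx.conf"}'

print(should_notify(unchanged))  # False
print(should_notify(modified))   # True
```

A repeated run that finds nothing to change takes the first path, which is why handlers stay quiet on it.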

Idempotency: The Contract Every Module Must Keep

Idempotency means running the same Ansible task twice produces exactly the same end-state as running it once, with no side effects on the second run. This is not a nice-to-have property — it is the engineering contract that makes Ansible safe to run in automated pipelines, cron jobs, and remediation loops without human supervision.

A well-written module checks the current state before acting. The file module checks whether a file already has the target permissions before calling chmod. The user module checks whether an account already exists before calling useradd. If the desired state already matches reality, the module returns changed: false and exits immediately. When you build your own modules or shell-out tasks, you are responsible for implementing this check yourself — failing to do so turns Ansible into a destructive script runner rather than a declarative configuration tool.
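The check-then-act pattern can be sketched in a few lines. The example below is a minimal illustration, assuming a POSIX filesystem; ensure_mode is a hypothetical helper written for this lesson, not the file module's real code.

```python
import os
import stat
import tempfile


def ensure_mode(path, mode):
    """Set permission bits on path only if they differ from the desired mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    if current == mode:
        # Desired state already holds: report no change and do nothing.
        return {"changed": False, "failed": False, "mode": oct(mode)}
    os.chmod(path, mode)
    return {"changed": True, "failed": False, "mode": oct(mode)}


# Create a scratch file with a known starting mode.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    target = handle.name
os.chmod(target, 0o600)

first = ensure_mode(target, 0o644)   # state differs, so chmod runs
second = ensure_mode(target, 0o644)  # state matches, so nothing happens
print(first["changed"], second["changed"])
os.remove(target)
```

The first call reports a change and the second does not, which is the behaviour a custom module or shell-out task has to reproduce to stay idempotent.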

Production pitfall: shell and command are not idempotent by default. ansible all -m shell -a "useradd deploy" fails on every run after the first because useradd errors when the account already exists. Use the user module instead. Reserve shell/command for operations where no dedicated module exists, and add a creates or removes argument (or, in a playbook, a when condition backed by a fact) to restore idempotency.
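The idea behind a creates guard can be sketched as follows: run the command only when a marker path is absent. The helper run_once and its result keys are hypothetical and only approximate how the guard behaves; the demo command simply creates the marker file using the current Python interpreter.

```python
import os
import subprocess
import sys
import tempfile


def run_once(argv, creates):
    """Run argv unless the marker path already exists."""
    if os.path.exists(creates):
        # Marker present: skip the command and report no change.
        return {"changed": False, "skipped": True, "rc": 0}
    completed = subprocess.run(argv, check=False)
    return {"changed": True, "skipped": False, "rc": completed.returncode}


with tempfile.TemporaryDirectory() as workdir:
    marker = os.path.join(workdir, "done")
    # Stand-in for a real one-time command: it creates the marker file.
    command = [sys.executable, "-c",
               "import sys; open(sys.argv[1], 'w').close()", marker]
    first = run_once(command, marker)
    second = run_once(command, marker)

print(first["changed"], second["changed"])
```

The guard only proves that the marker exists, not that the earlier run completed correctly, so pick a path the command writes last.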

Ad-Hoc Command Syntax

The general form of an ad-hoc command is:

ansible <pattern> -m <module> -a "<module_args>" [options]

The <pattern> is any inventory group, hostname, or glob. Common options you will use daily:

-i — path to an inventory file or directory (default: /etc/ansible/hosts)
-u — remote user (default: current local user)
-b — become (escalate to sudo)
--become-method — escalation method (sudo, su, doas)
-f — fork count / parallelism (default: 5; set to 50+ for large fleets)
-v / -vvv — verbosity; -vvvv shows SSH debug output
--check — dry-run mode; modules report what they would do without making changes
--diff — show before/after diff for file-modifying tasks
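To see how these options compose into one command line, the sketch below assembles an argument list and prints it in shell-quoted form. The build_adhoc helper is hypothetical; it only uses the flags listed above and does not run Ansible.

```python
import shlex


def build_adhoc(pattern, module, args="", inventory=None,
                become=False, forks=None, check=False):
    """Assemble the argv for an ad-hoc command from the options above."""
    argv = ["ansible", pattern, "-m", module]
    if args:
        # Module arguments travel as ONE argument, hence the quoting.
        argv += ["-a", args]
    if inventory:
        argv += ["-i", inventory]
    if become:
        argv.append("-b")
    if forks is not None:
        argv += ["-f", str(forks)]
    if check:
        argv += ["--check", "--diff"]
    return argv


argv = build_adhoc("webservers", "package", "name=nginx state=present",
                   inventory="inventory/prod", become=True, check=True)
print(shlex.join(argv))
# ansible webservers -m package -a 'name=nginx state=present' -i inventory/prod -b --check --diff
```

Passing the list to subprocess.run instead of a single string would avoid shell quoting problems in the module arguments.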

The Modules You Will Use Every Day

Ansible ships thousands of modules across its collections. In practice, most day-to-day infrastructure work is covered by a core dozen. Here is each one with a real, production-representative command:

# --- CONNECTIVITY CHECK ---
# ping is not ICMP; it verifies Python is reachable and SSH auth works.
# Use this as your first command against any new fleet.
ansible webservers -i inventory/prod -m ping

# --- PACKAGE MANAGEMENT ---
# Install nginx on all web servers (idempotent: skips if already installed)
ansible webservers -i inventory/prod -m package -a "name=nginx state=present" -b

# Install a pinned version (critical in prod; never allow floating versions).
# The name-version form is yum/dnf syntax; apt expects name=version instead.
ansible webservers -i inventory/prod \
  -m package -a "name=nginx-1.26.1 state=present" -b

# Remove a package
ansible legacy -i inventory/prod -m package -a "name=telnetd state=absent" -b

# --- SERVICE MANAGEMENT ---
# Ensure nginx is started and enabled at boot
ansible webservers -i inventory/prod \
  -m service -a "name=nginx state=started enabled=true" -b

# Restart nginx (use sparingly in prod — prefer handlers in playbooks)
ansible webservers -i inventory/prod -m service -a "name=nginx state=restarted" -b

# --- FILE OPERATIONS ---
# Create a directory with specific ownership and mode
ansible all -i inventory/prod \
  -m file -a "path=/var/app/releases state=directory owner=deploy group=deploy mode=0755" -b

# Remove a file
ansible all -i inventory/prod -m file -a "path=/tmp/old-config.conf state=absent" -b

# --- COPY A FILE TO REMOTE HOSTS ---
# Copy a local file; the module compares SHA-1 checksums and skips if identical
ansible webservers -i inventory/prod \
  -m copy -a "src=files/nginx.conf dest=/etc/nginx/nginx.conf owner=root mode=0644 backup=yes" -b

# --- FETCH A FILE FROM REMOTE HOSTS ---
# Pull the log from each host into ./fetched/<hostname>/var/log/app/error.log
ansible webservers -i inventory/prod \
  -m fetch -a "src=/var/log/app/error.log dest=fetched/ flat=no" -b

# --- RUN A SHELL COMMAND (with idempotency guard) ---
# 'creates' tells Ansible to skip if the path already exists
ansible dbservers -i inventory/prod \
  -m shell -a "pg_basebackup -D /data/replica creates=/data/replica/PG_VERSION" -b

# --- GATHER FACTS FROM ALL HOSTS ---
# Returns JSON: IP, OS, CPU, memory, disk, kernel version, etc.
ansible all -i inventory/prod -m setup

# Filter the output to the default IPv4 fact (less to read; all facts are still gathered)
ansible all -i inventory/prod -m setup -a "filter=ansible_default_ipv4"

# --- USER MANAGEMENT ---
# Ensure the deploy user exists and belongs to the docker group (idempotent)
ansible all -i inventory/prod \
  -m user -a "name=deploy shell=/bin/bash groups=docker append=yes state=present" -b

# --- LINEINFILE: ensure a config line exists ---
# Idempotent: replaces the line matching regexp, or appends it if nothing matches
ansible dbservers -i inventory/prod \
  -m lineinfile -a "path=/etc/postgresql/16/main/postgresql.conf \
  line='max_connections = 500' regexp='^max_connections' state=present" -b

Ansible module execution lifecycle: the module is transferred over SSH, checks current state, and only applies a change when the desired state differs — returning a JSON result with the changed flag.
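The lineinfile behaviour used in the listing above (replace the last line matching regexp, otherwise append) can be sketched on a plain list of strings. ensure_line is a hypothetical helper and a deliberate simplification: the real module has further options that this sketch ignores.

```python
import re


def ensure_line(lines, regexp, line):
    """Return (new_lines, changed) after enforcing one config line."""
    pattern = re.compile(regexp)
    matches = [i for i, existing in enumerate(lines) if pattern.search(existing)]
    if not matches:
        # Nothing matches: append the desired line.
        return lines + [line], True
    last = matches[-1]
    if lines[last] == line:
        # The last match is already correct: no change.
        return list(lines), False
    updated = list(lines)
    updated[last] = line
    return updated, True


config = ["port = 5432", "max_connections = 100"]
once, changed_first = ensure_line(config, r"^max_connections", "max_connections = 500")
twice, changed_second = ensure_line(once, r"^max_connections", "max_connections = 500")
print(once, changed_first, changed_second)
```

The second call finds the line already in place and reports no change, matching the lifecycle described in the caption.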

The setup Module — Your Fleet's Live Inventory

The setup module (also called facts gathering) deserves special attention. When Ansible runs any playbook it calls setup first (unless you set gather_facts: false) to build a rich JSON inventory of every managed node: kernel version, all IP addresses and interfaces, memory and CPU topology, disk mount points, package manager type, Python interpreter path, and dozens more. Ad-hoc facts queries are indispensable in production:

# List hosts whose kernel is not in the 6.x series (a first pass before a kernel-level CVE patch)
ansible all -i inventory/prod -m setup -a "filter=ansible_kernel" -o \
  | grep -Ev '"ansible_kernel": *"6\.'

# Check free memory across the db tier before a memory-heavy migration
ansible dbservers -i inventory/prod -m setup \
  -a "filter=ansible_memfree_mb" -o

# One-liner to list every host's primary IPv4 and OS family
ansible all -i inventory/prod -m setup \
  -a "filter=ansible_default_ipv4,ansible_os_family" -o
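Text filters such as grep cannot compare version numbers, so a precise "older than 6.1" check is easier in a short script. The sketch below assumes the one-line output has the shape "host | SUCCESS => {json}"; the sample rows and host names are invented for illustration, and hosts_older_than is a hypothetical helper.

```python
import json
import re


def kernel_tuple(release):
    """Extract (major, minor) from a release such as '5.15.0-105-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def hosts_older_than(oneline_output, minimum):
    """Return hosts whose kernel (major, minor) is below minimum."""
    stale = []
    for row in oneline_output.splitlines():
        host, sep, payload = row.partition(" | SUCCESS => ")
        if not sep:
            continue  # skip unreachable or failed hosts
        release = json.loads(payload)["ansible_facts"]["ansible_kernel"]
        if kernel_tuple(release) < minimum:
            stale.append(host)
    return stale


# Invented sample in the assumed one-line format.
sample = "\n".join([
    'web1 | SUCCESS => {"ansible_facts": {"ansible_kernel": "5.15.0-105-generic"}, "changed": false}',
    'web2 | SUCCESS => {"ansible_facts": {"ansible_kernel": "6.0.12"}, "changed": false}',
    'web3 | SUCCESS => {"ansible_facts": {"ansible_kernel": "6.8.0-31-generic"}, "changed": false}',
    'web4 | UNREACHABLE! => {"changed": false, "unreachable": true}',
])
print(hosts_older_than(sample, (6, 1)))  # ['web1', 'web2']
```

Comparing tuples rather than strings is what lets 6.0 count as older than 6.1 while 6.8 does not.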

Pro tip: fact caching for large fleets. Calling setup across 500 hosts at playbook start can add 20-60 seconds of SSH overhead to every run. Enable fact caching in ansible.cfg (fact_caching = redis or jsonfile, together with fact_caching_connection and fact_caching_timeout) to persist facts between runs. Large Ansible deployments commonly cache facts and force a fresh gather only for hosts that just joined the fleet or just had a kernel update.

Parallelism and the -f Flag

Ansible's default fork count is 5: it runs tasks on at most 5 hosts simultaneously. For a fleet of 200 nodes, that means a 30-second task takes about twenty minutes (40 batches of 5 hosts). Raise -f to what your control node's CPU, memory, and connection limits can sustain (typically 50–100 for a dedicated Ansible bastion); if you connect through a jump host, also check its sshd MaxStartups and MaxSessions settings:

# Restart the app service across 200 nodes, 50 at a time
ansible appservers -i inventory/prod -m service \
  -a "name=myapp state=restarted" -b -f 50

# Set a permanent default in ansible.cfg (committed to your ops repo)
# [defaults]
# forks = 50
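The effect of the fork count on wall-clock time can be estimated with simple arithmetic. The sketch below assumes every host takes the same time and that hosts are processed in full batches, which is a simplification of how Ansible schedules work; estimated_seconds is a hypothetical helper.

```python
import math


def estimated_seconds(hosts, forks, task_seconds):
    """Estimate wall-clock time for one task across a fleet."""
    if forks < 1:
        raise ValueError("forks must be at least 1")
    # Each batch of up to `forks` hosts runs in parallel.
    batches = math.ceil(hosts / forks)
    return batches * task_seconds


print(estimated_seconds(200, 5, 30))   # 1200
print(estimated_seconds(200, 50, 30))  # 120
```

With the default of 5 forks the 200-node example takes 1200 seconds; at 50 forks it drops to 120.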

Key idea: check mode is your safety net. Before running any destructive ad-hoc command against production, do a --check --diff run first. Most modules honour check mode and report what they would change without touching anything; shell and command tasks are generally skipped in check mode because Ansible cannot predict their effect. This is the ad-hoc equivalent of a Terraform plan, and skipping it is how engineers accidentally restart 200 production services in parallel at 2 AM.

Ad-hoc commands are the fastest learning loop in Ansible: run a module, read the JSON, adjust the arguments, run again. By building muscle memory with these commands you will write smarter playbooks — because a playbook is just a repeatable, version-controlled sequence of the same module invocations you mastered here.