Linux Fundamentals

Package Management

In production Linux environments, software is installed, updated, and removed through a package manager — not by downloading and compiling source code manually. Package managers resolve dependency trees, verify cryptographic signatures, track installed software, and let you reproduce the exact same software state across hundreds of servers. This is the foundation of reliable infrastructure.

Two ecosystems dominate the DevOps world: APT (Advanced Package Tool) on Debian-based distributions such as Ubuntu, and DNF/YUM on Red Hat-based distributions such as RHEL, CentOS Stream, Rocky Linux, and Amazon Linux 2023. Understanding both is non-negotiable for any senior engineer — cloud providers use both, and you will encounter both in production.
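Scripts that must run on both families usually start by working out which one they are on. The following is a minimal sketch, assuming the host ships a standard os-release file with `ID` and `ID_LIKE` fields; `pkg_family` is a hypothetical helper name, not part of any distribution's tooling.

```shell
# pkg_family [FILE]: print "apt", "dnf", or "unknown" for an os-release file.
# Runs in a subshell (note the parentheses) so that sourcing the file does
# not leak NAME, VERSION, etc. into the calling shell.
pkg_family() (
  os_release="${1:-/etc/os-release}"
  [ -r "$os_release" ] || exit 2
  ID="" ID_LIKE=""
  # os-release is a shell-compatible KEY=value file, so it can be sourced.
  . "$os_release"
  case " $ID $ID_LIKE " in
    *" debian "*|*" ubuntu "*)            echo apt ;;
    *" rhel "*|*" fedora "*|*" centos "*) echo dnf ;;
    *)                                    echo unknown; exit 1 ;;
  esac
)
```

Calling `pkg_family` with no argument reads `/etc/os-release`. The sketch reports "dnf" for the whole Red Hat family, including older hosts where the actual binary is `yum`, so a real script would still check which command exists.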

How Packages and Repositories Work

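A package is a compressed archive (`.deb` on Debian/Ubuntu, `.rpm` on RHEL-family distributions) that contains binaries, configuration files, and metadata describing its dependencies. A repository is a curated server that hosts thousands of packages alongside a signed index, so your package manager can search, resolve, and download exactly what is needed. When you run an install command, the package manager reads its local copy of the repository index, resolves the full dependency graph, downloads the required packages, verifies them (APT checks each download against the hashes in the GPG-signed index; DNF checks the GPG signature on each RPM), and then unpacks and configures them in dependency order. That last step is not truly atomic: an interrupted run can leave packages unpacked but unconfigured. On Debian/Ubuntu, `sudo dpkg --configure -a` finishes the pending work; on the RHEL family, `dnf history` shows what the last transaction attempted.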

[Figure: Package Manager Install Flow. CLI client (apt / dnf) → dependency resolver (builds the install plan) → repository (e.g. mirrors.ubuntu.com) → GPG signature check → install / unpack → filesystem (/usr, /etc, /var).]
How apt/dnf install a package: resolve dependencies, fetch from the repository, verify signatures, then unpack to the filesystem.
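To see the resolver's install plan without changing anything, APT offers a simulation mode. The sketch below assumes the `Inst <name> (<version> ...)` line format that `apt-get --simulate` prints; `planned_installs` is a hypothetical helper that extracts the package names and target versions from that output.

```shell
# planned_installs: read `apt-get install --simulate <pkg>` output on stdin
# and print "name version" for every package the resolver plans to install.
# For upgrades the line looks like "Inst name [old] (new ...)", so the
# bracketed old version is skipped.
planned_installs() {
  awk '$1 == "Inst" {
         v = ($3 ~ /^\[/) ? $4 : $3   # skip "[old-version]" when present
         gsub(/[()]/, "", v)          # strip the surrounding parentheses
         print $2, v
       }'
}

# Usage (no root needed for a simulation):
#   apt-get install --simulate nginx | planned_installs
```

Reviewing this list before a real install is a cheap way to notice when a one-package change is about to pull in a long dependency chain.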

APT — Debian & Ubuntu

APT is the package manager for Ubuntu, Debian, and their derivatives. The high-level command you use daily is `apt`; the lower-level plumbing is `dpkg`. Repository sources are stored in /etc/apt/sources.list and in files under /etc/apt/sources.list.d/ (on Ubuntu 24.04 the default entries live in /etc/apt/sources.list.d/ubuntu.sources, in the newer deb822 format).

    ## ── APT: everyday commands ────────────────────────────────────────

    # Sync the local package index with all configured repositories
    sudo apt update

    # Upgrade installed packages; never removes anything, so upgrades that
    # would require a removal are held back
    sudo apt upgrade -y

    # Full upgrade: may add or remove packages to satisfy new deps (preferred in CI)
    sudo apt full-upgrade -y

    # Install a package
    sudo apt install nginx -y

    # Install a specific version
    sudo apt install nginx=1.24.0-1ubuntu1 -y

    # Remove a package but keep its config files
    sudo apt remove nginx

    # Remove package AND purge all config files (clean state)
    sudo apt purge nginx -y
    sudo apt autoremove -y   # remove orphaned deps

    # Search available packages
    apt search "web server"

    # Show detailed package info: version, deps, installed size
    apt show nginx

    # List installed packages
    dpkg -l | grep nginx

    # Check which package owns a file
    dpkg -S /usr/sbin/nginx

Key principle: always run apt update before installing. The local index (cached under /var/lib/apt/lists/) can be stale. Installing from a stale index can pull an outdated version, fail with "Unable to locate package" even though the package exists, or hit a 404 because the indexed .deb has since been superseded on the mirror. In Dockerfiles, apt-get update && apt-get install -y ... is written as a single RUN layer so that the index is never cached separately from the install step.
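On long-lived hosts, a provisioning script can skip the refresh when the index is recent and run it only when it is stale. This is a minimal sketch, assuming GNU `stat` and assuming that the modification time of the path you pass in tracks the last successful `apt update` (which path does that reliably varies by setup); `index_is_stale` is a hypothetical helper.

```shell
# index_is_stale PATH [MAX_AGE_SECONDS]: succeed (exit 0) when PATH is
# missing or was last modified more than MAX_AGE_SECONDS ago.
index_is_stale() {
  local stamp="$1" max_age="${2:-3600}" now mtime
  [ -e "$stamp" ] || return 0               # no index at all counts as stale
  now=$(date +%s)
  mtime=$(stat -c %Y "$stamp") || return 0  # unreadable: refresh to be safe
  [ $(( now - mtime )) -gt "$max_age" ]
}

# Usage:
#   if index_is_stale /var/lib/apt/lists 3600; then sudo apt-get update; fi
```

Inside a Dockerfile this check is unnecessary, because every build of the RUN layer starts from an empty index.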

Adding Third-Party APT Repositories

Many production tools — Docker, Kubernetes, Node.js, PostgreSQL — publish their own APT repositories because they release faster than Ubuntu's main repo. The modern pattern uses signed-by keyring files, which scopes the GPG key to a specific repository (more secure than the legacy apt-key add approach).

    ## ── Add the official Docker APT repository (Ubuntu 24.04) ────────

    # 1. Install prerequisites
    sudo apt-get update
    sudo apt-get install -y ca-certificates curl

    # 2. Import Docker's GPG key into a scoped keyring
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
      -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc

    # 3. Add the repository source, referencing the key
    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
      https://download.docker.com/linux/ubuntu \
      $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    # 4. Update index so the new repo is visible, then install
    sudo apt-get update
    sudo apt-get install -y docker-ce docker-ce-cli containerd.io

Production pitfall: never use curl | sudo bash install scripts in production. Many vendor "quick-start" docs tell you to run curl https://get.example.com/install.sh | sudo bash. That executes arbitrary code from the internet as root with no integrity check. Always use the signed repository method: download the GPG key separately, verify its fingerprint against the one the vendor publishes, add the repo with signed-by, then install with apt-get. Your security team will flag the curl-pipe pattern in audits.
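The "verify it" step can be scripted by comparing the downloaded key's fingerprint with the value the vendor publishes. The sketch below assumes a GnuPG version that supports `--show-keys` and relies on the machine-readable `--with-colons` format, where `fpr` records carry the fingerprint in field 10. `primary_fingerprints` and `key_matches` are hypothetical helpers, and the expected fingerprint must come from the vendor's documentation, fetched over a separate channel.

```shell
# primary_fingerprints: read `gpg --with-colons` output on stdin and print
# the fingerprint of each primary key (the first "fpr" record after "pub").
primary_fingerprints() {
  awk -F: '$1 == "pub" { want = 1 }
           $1 == "fpr" && want { print $10; want = 0 }'
}

# key_matches KEYFILE EXPECTED: succeed only if EXPECTED (spaces allowed,
# any letter case) is the fingerprint of a primary key in KEYFILE.
key_matches() {
  local keyfile="$1" expected
  expected=$(printf '%s' "$2" | tr -d ' ' | tr 'a-f' 'A-F')
  [ -n "$expected" ] || return 2
  gpg --show-keys --with-colons "$keyfile" 2>/dev/null \
    | primary_fingerprints | grep -qxF "$expected"
}

# Usage (EXPECTED_FPR is a placeholder for the vendor-published value):
#   key_matches /etc/apt/keyrings/docker.asc "$EXPECTED_FPR" || exit 1
```

Failing the provisioning run when the fingerprint does not match turns a silent trust decision into an explicit, reviewable one.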

DNF & YUM — RHEL, Rocky, Amazon Linux

On Red Hat-based systems, DNF (Dandified YUM) is the modern package manager. yum still exists on older systems (RHEL 7, CentOS 7) and on Amazon Linux 2; on RHEL 8+ and Amazon Linux 2023, yum is a compatibility shim that calls DNF. The commands are nearly identical, so learning DNF covers both.

    ## ── DNF: everyday commands ────────────────────────────────────────

    # Check for updates (does NOT apply them; safe to run in prod).
    # Exits 100 when updates are available, 0 when there are none.
    sudo dnf check-update

    # Apply all available updates
    sudo dnf upgrade -y

    # Install a package
    sudo dnf install nginx -y

    # Install with an additional repository enabled for this one transaction
    # (EPEL must already be configured; use --repo=epel to install from EPEL only)
    sudo dnf install --enablerepo=epel nginx -y

    # Remove a package
    sudo dnf remove nginx -y

    # Search
    dnf search "web server"

    # Show info
    dnf info nginx

    # List installed
    dnf list installed | grep nginx

    # List all configured repositories
    dnf repolist -v

    # Add EPEL (Extra Packages for Enterprise Linux; essential on RHEL/Rocky).
    # Works as-is on Rocky and similar rebuilds; on RHEL itself the epel-release
    # RPM comes from the Fedora project, so follow the EPEL documentation there.
    sudo dnf install epel-release -y

    # Clean cached metadata (use when repos behave unexpectedly)
    sudo dnf clean all
    sudo dnf makecache
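One detail matters when `dnf check-update` runs inside a script: it exits with status 100 when updates are available, 0 when there are none, and 1 on error. A script running under `set -e` therefore treats "updates available" as a failure unless the status is handled explicitly. A small sketch, with `updates_status` as a hypothetical helper name:

```shell
# updates_status RC: translate a `dnf check-update` exit status into a label.
# Returns non-zero only for a genuine error.
updates_status() {
  case "$1" in
    0)   echo "up-to-date" ;;
    100) echo "updates-available" ;;
    *)   echo "error"; return 1 ;;
  esac
}

# Usage: capture the status without tripping `set -e`.
#   rc=0
#   dnf check-update --quiet >/dev/null || rc=$?
#   updates_status "$rc"
```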

Pinning, Holds, and Version Locking

In production you often need to prevent a package from being upgraded automatically — for example, locking a specific kernel version on a PCI-DSS system, or keeping a known-good version of a database server. On APT you use apt-mark hold; on DNF you use the versionlock plugin.

    ## ── APT: pin / hold a package version ───────────────────────────

    # Hold the current installed version (will not be upgraded by apt upgrade)
    sudo apt-mark hold nginx

    # Verify held packages
    apt-mark showhold

    # Release the hold when you are ready to upgrade
    sudo apt-mark unhold nginx

    ## ── DNF: lock a package to a specific version ────────────────────

    # Install the versionlock plugin
    sudo dnf install python3-dnf-plugin-versionlock -y

    # Lock the currently installed version of PostgreSQL
    sudo dnf versionlock add postgresql-server

    # List locked packages
    sudo dnf versionlock list

    # Remove a lock
    sudo dnf versionlock delete postgresql-server

Pro practice: document every hold/lock in your infrastructure code. If you pin nginx to version 1.24 in a Puppet manifest, an Ansible playbook, or a Dockerfile, add a comment explaining why (e.g., "pinned because v1.26 breaks our ModSecurity ruleset; re-test before upgrading"). Undocumented holds turn into mystery failures 18 months later when someone runs apt upgrade and wonders why one service did not update.
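One way to keep holds honest is to compare what the host actually holds against a checked-in list of documented holds. The sketch below assumes a plain-text file with one package name per line and `#` comments carrying the reason; both the file format and the `undocumented_holds` helper are illustrative assumptions, not an established convention.

```shell
# undocumented_holds FILE: read held package names on stdin (for example
# from `apt-mark showhold`) and print those that are NOT listed in FILE.
undocumented_holds() {
  local documented="$1" tmp
  tmp=$(mktemp) || return 2
  # Strip comments, trailing whitespace, and blank lines from the list.
  sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$documented" \
    | sort -u > "$tmp"
  # comm -23 prints lines unique to the first (sorted) input: stdin here.
  sort -u | comm -23 - "$tmp"
  rm -f "$tmp"
}

# Usage:
#   apt-mark showhold | undocumented_holds holds.txt
```

Any output means a hold exists on the host that nobody has written down, which is exactly the situation the paragraph above warns about.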

Package Management in Automation and CI/CD

At scale you never install packages by hand. Ansible, Puppet, Chef, and SaltStack all have native package modules. In containers, all installs happen at image-build time. The patterns below are what production Dockerfiles and Ansible roles look like.

    ## ── Dockerfile: production best practices for apt ────────────────
    FROM ubuntu:24.04

    # Single RUN layer: update + install + clean in one step.
    # Combining them prevents Docker caching a stale apt index.
    # --no-install-recommends reduces image size significantly.
    # Cleaning apt lists keeps the final image lean.
    RUN apt-get update \
     && apt-get install -y --no-install-recommends \
          nginx=1.24.* \
          curl \
          ca-certificates \
     && rm -rf /var/lib/apt/lists/*

    ## ── Ansible: idempotent package task ─────────────────────────────
    # tasks/main.yml
    - name: Install nginx at exact version
      ansible.builtin.apt:
        name: nginx=1.24.0-1ubuntu1
        state: present
        update_cache: true   # equivalent to apt update before install
      become: true

    - name: Hold nginx at current version
      ansible.builtin.dpkg_selections:
        name: nginx
        selection: hold
      become: true
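A pinned install is easy to verify after the fact, for example as a smoke test at the end of an image build or a playbook run. This is a minimal sketch that checks an installed version string against the same glob style used in the pin; `version_matches` is a hypothetical helper.

```shell
# version_matches VERSION GLOB: succeed when VERSION matches the shell
# glob GLOB (for example '1.24.*').
version_matches() {
  case "$1" in
    $2) return 0 ;;   # unquoted on purpose: $2 is used as a pattern
    *)  return 1 ;;
  esac
}

# Usage on a Debian/Ubuntu host or image:
#   version_matches "$(dpkg-query -W -f='${Version}' nginx)" '1.24.*' \
#     || { echo "nginx is not on the pinned 1.24 series" >&2; exit 1; }
```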

Auditing Installed Packages

Security compliance requires knowing exactly what is installed on every server. Common production audit commands:

  • dpkg-query -W -f='${Package} ${Version}\n': full installed package list on Debian/Ubuntu, suitable for diffing between servers.
  • rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n' | sort: the equivalent on RPM-based systems.
  • apt list --upgradable 2>/dev/null: shows packages with pending updates; run it from a scheduled job and alert when security updates stay unapplied beyond your patch window (for example, 7 days).
  • unattended-upgrades (APT) and dnf-automatic (DNF): timer-driven services that apply updates automatically. On Ubuntu, unattended-upgrades installs only security updates by default; dnf-automatic needs upgrade_type = security and apply_updates = yes in /etc/dnf/automatic.conf to do the same. Standard practice on production hosts that cannot wait for a maintenance window.
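The inventory commands above become useful once their output is compared across hosts. The sketch below assumes you have saved one "name version" listing per server to a file (the file names in the usage line are illustrative) and prints only the differences; `inventory_diff` is a hypothetical helper.

```shell
# inventory_diff A B: compare two "name version" package listings.
# Lines present only in A are prefixed "- ", lines only in B "+ ".
inventory_diff() {
  local left right
  left=$(mktemp) || return 2
  right=$(mktemp) || { rm -f "$left"; return 2; }
  sort -u "$1" > "$left"     # comm requires sorted input
  sort -u "$2" > "$right"
  comm -23 "$left" "$right" | sed 's/^/- /'
  comm -13 "$left" "$right" | sed 's/^/+ /'
  rm -f "$left" "$right"
}

# Usage:
#   inventory_diff web1-packages.txt web2-packages.txt
```

Empty output means the two hosts carry the same packages at the same versions, which is the property the audit is meant to establish.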

Summary: the mental model for package management at scale has five parts. Repositories are the single source of truth. GPG signatures are the trust boundary. Package managers are declarative tools (you declare the desired state, they figure out how to reach it). Automation (Ansible, Dockerfile) replaces manual apt install on anything beyond a single server. Version locks and holds are technical debt that must be documented and regularly reviewed.