Storage: Disks, Partitions & Filesystems
Every production server you will ever operate stores its data on block devices — raw sequences of fixed-size sectors exposed by the kernel as /dev/sda, /dev/nvme0n1, /dev/xvdb, and so on. Before a block device is usable it must be partitioned and formatted. Understanding this stack is non-negotiable for a DevOps engineer: botched partitioning is irreversible data loss, and a misunderstood /etc/fstab entry can prevent a server from booting at all.
Inspecting Block Devices with lsblk
lsblk (list block devices) is always your first tool. It reads the kernel's sysfs tree and shows every disk, its partitions, and where each is mounted — without requiring root.
Key fields to internalize: NAME is the kernel device name; FSTYPE is the filesystem type (empty means no filesystem was detected); MOUNTPOINTS shows where the device is accessible in the directory tree. A device with no FSTYPE, no child partitions, and no mount point is raw storage: disk space that has not been formatted yet.
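As a minimal sketch of that first look, the helper below filters lsblk output for entries with no filesystem and no mount point. The name find_raw_devices is hypothetical, and the sketch assumes lsblk's KEY="value" pairs format (the -P flag). A parent disk that only carries a partition table also has an empty FSTYPE, so treat the result as a list of candidates to check against the tree view, not as a verdict.

```shell
# Hypothetical helper: reads `lsblk -P` output (KEY="value" pairs) on stdin
# and prints the NAME of every entry whose FSTYPE and MOUNTPOINT are empty.
find_raw_devices() {
    grep 'FSTYPE=""' | grep 'MOUNTPOINT=""' | sed -n 's/^NAME="\([^"]*\)".*/\1/p'
}

# Typical use on a live system (no root needed):
#   lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINTS
#   lsblk -P -o NAME,TYPE,FSTYPE,MOUNTPOINT | find_raw_devices
```

Parsing the pairs format avoids guessing at column positions, which shift whenever a field is empty in the default table layout.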
Creating Partitions with fdisk
fdisk is the standard interactive partition editor for MBR and GPT disks. On any disk larger than 2 TB, or on any system using UEFI, always create a GPT partition table — MBR cannot address more than 2 TB and supports only four primary partitions.
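The 2 TB ceiling follows from MBR storing sector addresses in 32 bits. The arithmetic can be checked directly, assuming the traditional 512-byte logical sector size (the function name mbr_limit_tib is only illustrative):

```shell
# MBR stores sector addresses in 32 bits, so with 512-byte sectors
# the largest addressable size is 2^32 * 512 bytes.
mbr_limit_tib() {
    echo $(( (1 << 32) * 512 / (1 << 40) ))
}

mbr_limit_tib   # prints 2 (TiB)
```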
Check the target before you write to it: /dev/sda might be your root disk, and running fdisk against the wrong device is instant, unrecoverable data loss. Always confirm with lsblk and cross-check the size and current mount state first. A common pattern at Google and Amazon: give each data disk a unique filesystem label (for example, mkfs.ext4 -L data-vol-1) and reference it by label rather than by device path, because device names can shift on reboot.
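One way to make the mount-state check mechanical is a small guard that runs before fdisk. This is a sketch under stated assumptions: device_in_use is a hypothetical helper, it only looks at the mount table (so it will not notice swap or LVM use), and it matches by prefix so that /dev/nvme1n1 also catches /dev/nvme1n1p1.

```shell
# Hypothetical guard: succeeds (exit 0) if DEVICE, or anything whose name
# starts with DEVICE, appears as a mount source in the mounts table.
# The second argument defaults to /proc/mounts and exists to ease testing.
device_in_use() {
    device=$1
    mounts_file=${2:-/proc/mounts}
    awk -v dev="$device" 'index($1, dev) == 1 { found = 1 } END { exit !found }' "$mounts_file"
}

# Refuse to partition a disk that is still mounted somewhere:
#   if device_in_use /dev/nvme1n1; then
#       echo "refusing: /dev/nvme1n1 is mounted" >&2
#   else
#       sudo fdisk /dev/nvme1n1   # g = new GPT table, n = new partition, w = write
#   fi
```

The prefix match is deliberately conservative: a false "in use" costs a second look, while a false "free" can cost the disk.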
Formatting: mkfs
Once partitioned, the partition must be formatted with a filesystem. mkfs is the umbrella command; the actual tool is the mkfs.TYPE variant. Choose your filesystem by use case:
- ext4: default for most Linux workloads. Mature, journaled, well-understood fsck behavior. Preferred for root volumes and general-purpose data disks.
- xfs: preferred by RHEL/Amazon Linux for large-file and high-throughput workloads (databases, log aggregation). Scales better at multi-TB sizes; faster large-file writes.
- tmpfs: RAM-backed, used for ephemeral directories (/run, /tmp). Automatically sized to a fraction of RAM; contents lost on reboot.
- btrfs / ZFS: copy-on-write filesystems with built-in snapshots and checksums. Common in NAS and Kubernetes CSI drivers; operationally more complex.
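Because formatting is destructive, it can help to print the exact command and review it before running it. The sketch below does that for the two on-disk types most often chosen from the list above. mkfs_command is a hypothetical helper, and the label and device names are examples only.

```shell
# Hypothetical dry-run helper: prints the mkfs command for a filesystem type,
# label, and device instead of running it, so the target can be reviewed first.
mkfs_command() {
    fstype=$1 label=$2 device=$3
    case "$fstype" in
        ext4|xfs) echo "mkfs.$fstype -L $label $device" ;;
        *) echo "unsupported filesystem type: $fstype" >&2; return 1 ;;
    esac
}

mkfs_command ext4 data-vol-1 /dev/nvme1n1p1
mkfs_command xfs  data-vol-1 /dev/nvme1n1p1
```

Once the printed line names the right device, run it with sudo. tmpfs is absent on purpose: it is never formatted with mkfs, only mounted.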
Mounting: mount and /etc/fstab
A formatted filesystem is mounted to attach it to a directory path (the mountpoint). Temporary mounts survive until reboot; /etc/fstab makes mounts permanent across reboots.
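A sketch of both kinds of mount follows. The helper fstab_line is hypothetical, and the /data mountpoint and the device name are placeholders. The six fstab fields are source, mountpoint, type, options, dump flag, and fsck pass (2 is the usual value for non-root filesystems).

```shell
# Hypothetical helper: builds one /etc/fstab line from a filesystem UUID,
# mountpoint, type, and an optional option string (default: "defaults").
fstab_line() {
    uuid=$1 mountpoint=$2 fstype=$3 options=${4:-defaults}
    printf 'UUID=%s %s %s %s 0 2\n' "$uuid" "$mountpoint" "$fstype" "$options"
}

# Temporary mount (gone after reboot):
#   sudo mkdir -p /data && sudo mount /dev/nvme1n1p1 /data
# Persistent mount: look up the UUID, append the line, then validate it.
#   uuid=$(sudo blkid -s UUID -o value /dev/nvme1n1p1)
#   fstab_line "$uuid" /data ext4 | sudo tee -a /etc/fstab
#   sudo mount -a
```

Referencing the filesystem by UUID rather than by /dev path keeps the entry valid even if the kernel enumerates the disks in a different order on the next boot.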
Use nofail for non-root volumes on cloud VMs. Suppose you attach an EBS volume, add it to fstab, snapshot the instance, and restore to a new instance without re-attaching the volume: without nofail, the boot stalls at "A start job is running for..." while it waits for a device that will never appear. On-call engineers have lost hours to this. Add nofail to every secondary mount in fstab.
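The rule is easy to audit. The following sketch lists fstab entries that would block boot if their device went missing. missing_nofail is a hypothetical helper, and it applies a deliberately simple policy: everything except the root filesystem and swap should carry nofail.

```shell
# Hypothetical audit: prints the mountpoint of every fstab entry that is not
# the root filesystem, not swap, and does not list nofail among its options.
# The file argument defaults to /etc/fstab.
missing_nofail() {
    awk '
        /^[ \t]*#/ || NF < 4       { next }      # skip comments and blank or short lines
        $2 == "/" || $3 == "swap"  { next }      # root and swap are out of scope
        ("," $4 ",") !~ /,nofail,/ { print $2 }  # option list lacks nofail
    ' "${1:-/etc/fstab}"
}

# Usage on a live system:
#   missing_nofail            # audits /etc/fstab
#   missing_nofail ./fstab    # audits a copy before it is deployed
```

Wrapping the option string in commas before matching keeps the check from being fooled by an option that merely contains the text "nofail" as a substring.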
LVM: The Logical Volume Manager
Raw partitions have a major limitation: a partition can only grow into free space directly after it on the same disk, and resizing often requires unmounting (or rebooting). LVM solves this by inserting a flexible layer between physical storage and filesystems. It is the standard storage abstraction on enterprise Linux systems, Kubernetes persistent volume backends, and any workload that needs dynamic disk growth without downtime.
The three LVM abstractions:
- Physical Volume (PV): a raw disk or partition initialized with pvcreate. LVM writes its metadata at the start of the PV.
- Volume Group (VG): one or more PVs pooled together with vgcreate. Think of the VG as a single large storage pool.
- Logical Volume (LV): a named slice of the VG created with lvcreate. Appears as /dev/vg-name/lv-name. Format and mount it exactly like a partition.
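The three layers map onto a short command sequence. The sketch below prints the commands rather than executing them unless DRY_RUN=0 is set. The run wrapper, the vg-data and lv-app names, the device, and the sizes are all illustrative assumptions.

```shell
# Hypothetical wrapper: print each command instead of executing it
# unless DRY_RUN=0 is set, so the sequence can be reviewed safely.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

lvm_setup() {
    run sudo pvcreate /dev/nvme1n1p1                  # PV: initialize the partition for LVM
    run sudo vgcreate vg-data /dev/nvme1n1p1          # VG: pool one or more PVs
    run sudo lvcreate -L 20G -n lv-app vg-data        # LV: carve a 20G slice from the pool
    run sudo mkfs.ext4 /dev/vg-data/lv-app            # format the LV like a partition
    run sudo lvextend -r -L +10G /dev/vg-data/lv-app  # later: grow LV and filesystem together
}

lvm_setup
```

The last step is the payoff: lvextend with -r resizes the filesystem along with the logical volume, which is the online growth that a raw partition makes awkward.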
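To back up a live volume without stopping writes, take an LVM snapshot first: sudo lvcreate -L 10G -s -n lv-app-snap /dev/vg-data/lv-app. The 10G is the space reserved for blocks that change while the snapshot exists, not the size of the origin volume. Mount the snapshot read-only, run the backup against it, then remove it. Your live volume is never locked, and you have a consistent point-in-time copy. Managed database services (RDS, AlloyDB) expose the same point-in-time idea through their snapshot APIs, though not necessarily with LVM itself.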
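The full snapshot, backup, and cleanup cycle might look like the sketch below, again printed rather than executed by default. The lvcreate line is the one from the text. The run wrapper, the /mnt/snap mountpoint, the /backup path, and the choice of tar as the backup tool are assumptions made for illustration.

```shell
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

snapshot_backup() {
    run sudo lvcreate -L 10G -s -n lv-app-snap /dev/vg-data/lv-app  # point-in-time snapshot
    run sudo mkdir -p /mnt/snap
    run sudo mount -o ro /dev/vg-data/lv-app-snap /mnt/snap         # xfs also needs nouuid
    run sudo tar -czf /backup/lv-app.tar.gz -C /mnt/snap .          # back up the frozen view
    run sudo umount /mnt/snap
    run sudo lvremove -y /dev/vg-data/lv-app-snap                   # always remove explicitly
}

snapshot_backup
```

Removing the snapshot as the final step matters: while it exists, every write to the origin consumes snapshot space.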
Common Failure Modes in Production
Understanding failure modes separates engineers who can recover from incidents from those who escalate them. The most common storage failures you will encounter:
- Root volume fills up: Applications stop writing logs, databases crash, SSH may still work but almost nothing runs. Diagnose with df -hT. Immediate relief: journalctl --vacuum-size=200M (see Lesson 2), then find and delete or archive large files with du -sh /* 2>/dev/null | sort -rh | head.
- Wrong device in fstab: System cannot find the UUID on boot, drops to emergency shell. Fix from the recovery console by editing /etc/fstab and correcting the UUID. Always run sudo mount -a after editing fstab to catch errors before rebooting.
- Filesystem corruption: After a hard power loss, ext4 journal usually self-heals on next mount. If not, sudo fsck -n /dev/nvme1n1p1 (read-only check first), then sudo fsck -y /dev/nvme1n1p1 (repair). Never run fsck on a mounted filesystem.
- LVM metadata mismatch after snapshot abuse: Incomplete snapshot removal can corrupt VG metadata. Always lvremove snapshots explicitly; never let them fill up (a full snapshot becomes invalid).
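The first failure mode is the easiest to catch before it becomes an incident. A minimal sketch of a usage check follows. over_threshold is a hypothetical helper, it assumes the POSIX df -P column layout, and it does not handle mountpoints that contain spaces.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints every
# mountpoint whose usage is at or above the given percentage threshold.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5; sub(/%/, "", use)                 # "95%" -> "95"
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# Usage on a live system:
#   df -P | over_threshold 90
```

Run from cron or a monitoring agent, a check like this turns a full root volume from a surprise into an alert with time to act.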