Linux System Administration

Storage: Disks, Partitions & Filesystems

22 min Lesson 3 of 28

Every production server you will ever operate stores its data on block devices: raw sequences of fixed-size sectors exposed by the kernel as /dev/sda, /dev/nvme0n1, /dev/xvdb, and so on. Before a block device can hold files it is normally partitioned and then formatted with a filesystem. Understanding this stack is non-negotiable for a DevOps engineer: a botched partitioning step can cause irreversible data loss, and a wrong /etc/fstab entry can prevent a server from booting at all.

The storage stack in one sentence: A physical or virtual disk is divided into partitions, each partition is formatted with a filesystem, and the filesystem is mounted at a directory path so the OS can read and write files through it. LVM adds a flexible logical layer between partitions and filesystems.

Inspecting Block Devices with lsblk

lsblk (list block devices) is always your first tool. It reads the kernel's sysfs tree and shows every disk, its partitions, and where each is mounted — without requiring root.

# Show all block devices as a tree with filesystem type, label, UUID and mountpoint
lsblk -f

# Example output on a typical cloud VM:
# NAME        FSTYPE   LABEL UUID                                 MOUNTPOINTS
# nvme0n1
# ├─nvme0n1p1 vfat           XXXX-XXXX                            /boot/efi
# ├─nvme0n1p2 ext4           xxxxxxxx-xxxx-...                    /
# └─nvme0n1p3 swap           xxxxxxxx-xxxx-...                    [SWAP]
# nvme1n1                                                          (raw — unpartitioned)

# Also useful: choose the columns yourself, including SIZE (human-readable by default)
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT,UUID

# Detailed per-device info (geometry, type, sector size)
sudo fdisk -l /dev/nvme1n1

Key fields to internalize: NAME is the kernel device name; FSTYPE is the filesystem type (empty means unformatted); MOUNTPOINTS shows where it is accessible in the directory tree. A device with no FSTYPE and no mount is raw storage — unallocated disk space.
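
The rule in the last sentence can be turned into a quick filter. The sketch below is one possible approach, not a standard tool: find_raw_disks is a hypothetical helper name, and it assumes lsblk's key="value" output (-P) together with the PKNAME column, which names each partition's parent disk. A disk is reported only if it has no filesystem, no mountpoint, and no child partitions.

```shell
# find_raw_disks (hypothetical helper): read the output of
#   lsblk -P -o NAME,TYPE,FSTYPE,MOUNTPOINT,PKNAME
# on stdin and print disks that have no filesystem, no mountpoint and no partitions.
find_raw_disks() {
  awk '
    {
      line = $0
      split("", f)                          # clear the per-line field map
      while (match(line, /[A-Z:_-]+="[^"]*"/)) {
        kv = substr(line, RSTART, RLENGTH)  # one KEY="value" pair
        eq = index(kv, "=")
        f[substr(kv, 1, eq - 1)] = substr(kv, eq + 2, length(kv) - eq - 2)
        line = substr(line, RSTART + RLENGTH)
      }
      if (f["PKNAME"] != "") has_child[f["PKNAME"]] = 1
      if (f["TYPE"] == "disk" && f["FSTYPE"] == "" && f["MOUNTPOINT"] == "")
        cand[++n] = f["NAME"]
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(cand[i] in has_child)) print cand[i]
    }'
}

# Typical use:
#   lsblk -P -o NAME,TYPE,FSTYPE,MOUNTPOINT,PKNAME | find_raw_disks
```

Treat the result as a hint only, and still confirm the size and mount state by eye before doing anything destructive.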

Creating Partitions with fdisk

fdisk is the standard interactive partition editor for MBR and GPT disks. On any disk larger than 2 TiB, or on any system that boots via UEFI, create a GPT partition table: with 512-byte sectors, MBR's 32-bit sector addresses cannot reach past 2 TiB, and MBR supports only four primary partitions.
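
The 2 TB figure follows from simple arithmetic: MBR stores sector addresses in 32 bits, so the limit is 2^32 sectors times the sector size. A minimal sketch of that calculation (mbr_max_tib is a hypothetical helper name, and a shell with 64-bit arithmetic is assumed):

```shell
# mbr_max_tib SECTOR_BYTES: largest disk size, in TiB, that 32-bit sector
# addresses can cover for the given logical sector size.
mbr_max_tib() {
  echo $(( (1 << 32) * $1 / (1 << 40) ))
}

mbr_max_tib 512    # 2^32 sectors * 512 bytes -> prints 2
```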

# Partition a fresh data disk (non-interactive, using a here-document)
# /dev/nvme1n1 is the unpartitioned disk identified by lsblk above.
# WARNING: this wipes the disk. Verify the device name before running.

sudo fdisk /dev/nvme1n1 <<'EOF'
g
n
1


w
EOF

# Explanation of the keystrokes sent to fdisk:
#   g  — create a new GPT partition table (wipes existing data)
#   n  — new partition
#   1  — partition number 1
#   (two blank Enter lines) — accept default first and last sector (whole disk)
#   w  — write changes and exit

# Verify the result
lsblk /dev/nvme1n1
# NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
# nvme1n1     259:1    0 100G  0 disk
# └─nvme1n1p1 259:2    0 100G  0 part

Production rule: verify the device name before every destructive command. On a cloud VM, /dev/sda might be your root disk. Writing a new partition table to the wrong device cuts off access to its data, and recovery is difficult at best. Always confirm with lsblk and cross-check size and current mount state first. A common pattern is to give each data filesystem a unique label (mkfs.ext4 -L data-vol-1) and reference it by label or UUID, not by device path, which can shift on reboot.
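
One way to make the "cross-check mount state" step mechanical is a small guard that refuses to continue when the target disk, or any of its partitions, is mounted. This is a sketch under stated assumptions: device_is_mounted is a hypothetical helper, it reads /proc/self/mounts by default, and it does not detect swap, LVM, or device-mapper use of the disk.

```shell
# device_is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) if DEVICE itself, or a partition such as DEVICEp1 / DEVICE1,
# appears as a mount source in MOUNTS_FILE (default: /proc/self/mounts).
device_is_mounted() {
  awk -v dev="$1" '
    $1 == dev ||
    (index($1, dev) == 1 && substr($1, length(dev) + 1) ~ /^p?[0-9]+$/) { found = 1 }
    END { exit !found }
  ' "${2:-/proc/self/mounts}"
}

# Typical use before a destructive step:
#   if device_is_mounted /dev/nvme1n1; then
#     echo "refusing: /dev/nvme1n1 is in use" >&2
#     exit 1
#   fi
```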

Formatting: mkfs

Once partitioned, the partition must be formatted with a filesystem. mkfs is the umbrella command; the actual tool is the mkfs.TYPE variant. Choose your filesystem by use case:

ext4: the default for most Linux workloads. Mature, journaled, with well-understood fsck behavior. Preferred for root volumes and general-purpose data disks.
xfs: preferred by RHEL/Amazon Linux for large-file and high-throughput workloads (databases, log aggregation). Scales better at multi-TB sizes, with faster large-file writes. An XFS filesystem can be grown but not shrunk.
tmpfs: RAM-backed, used for ephemeral directories (/run, and /tmp on some distributions). It is not created with mkfs; it is mounted directly with mount -t tmpfs. Its size defaults to half of RAM unless size= is given, and its contents are lost on unmount or reboot.
btrfs / ZFS: copy-on-write filesystems with built-in snapshots and checksums. Common in NAS and Kubernetes CSI drivers; operationally more complex.

# Format the new partition as ext4, with a human-readable label
sudo mkfs.ext4 -L data-vol-1 /dev/nvme1n1p1

# For XFS (often preferred on RHEL/Amazon Linux for database disks):
# sudo mkfs.xfs -L db-vol-1 /dev/nvme1n1p1

# Check the result — FSTYPE should now be ext4
lsblk -f /dev/nvme1n1

Mounting: mount and /etc/fstab

A formatted filesystem is mounted to attach it to a directory path (the mountpoint). Temporary mounts survive until reboot; /etc/fstab makes mounts permanent across reboots.

# Create the mountpoint directory
sudo mkdir -p /data

# Temporary mount (lost on reboot)
sudo mount /dev/nvme1n1p1 /data

# Confirm it is mounted
mount | grep /data
# or
df -hT /data

# --- Make it permanent in /etc/fstab ---
# Get the UUID (never use device paths like /dev/nvme1n1p1 in fstab — they can change)
sudo blkid /dev/nvme1n1p1
# Output: /dev/nvme1n1p1: LABEL="data-vol-1" UUID="a1b2c3d4-..." TYPE="ext4"

# Add this line to /etc/fstab:
# UUID=a1b2c3d4-...   /data   ext4   defaults,nofail   0   2
#
# Field breakdown (space- or tab-separated):
#  1  UUID=...       — filesystem identifier (UUID is stable; device path is not)
#  2  /data          — mountpoint
#  3  ext4           — filesystem type
#  4  defaults,nofail — mount options:
#                       defaults = rw, suid, dev, exec, auto, nouser, async
#                       nofail   = do not halt boot if disk is absent (critical for cloud VMs)
#  5  0              — dump backup flag (always 0 in modern setups)
#  6  2              — fsck pass order (1=root, 2=other, 0=skip)

# Test the fstab entry without rebooting:
sudo umount /data    # drop the temporary mount so mount -a actually exercises the new entry
sudo mount -a        # mounts everything in fstab that is not yet mounted
df -hT /data         # verify /data is mounted again
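
Copying a UUID by hand is an easy place to make a typo. As a sketch, the fstab line can be generated from blkid's key=value output instead. fstab_entry is a hypothetical helper; it assumes the output of blkid -o export on stdin and defaults to the defaults,nofail options used above.

```shell
# fstab_entry MOUNTPOINT [OPTIONS]
# Read `blkid -o export DEVICE` output on stdin and print a matching
# /etc/fstab line keyed by UUID (dump flag 0, fsck pass 2).
fstab_entry() {
  awk -F= -v mnt="$1" -v opts="${2:-defaults,nofail}" '
    $1 == "UUID" { uuid = $2 }
    $1 == "TYPE" { type = $2 }
    END {
      if (uuid == "" || type == "") {
        print "fstab_entry: no UUID or TYPE in input" > "/dev/stderr"
        exit 1
      }
      printf "UUID=%s\t%s\t%s\t%s\t0\t2\n", uuid, mnt, type, opts
    }'
}

# Typical use (review the line before appending it to /etc/fstab):
#   sudo blkid -o export /dev/nvme1n1p1 | fstab_entry /data
```

Printing the line rather than appending it automatically keeps a human in the loop for a file that can stop the machine from booting.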

Always use nofail for non-root volumes on cloud VMs. If an instance's fstab references an EBS volume and you restore that instance's root snapshot to a new instance without attaching the volume, boot stalls at "A start job is running for..." until the device timeout expires (90 seconds by default under systemd) and then drops into emergency mode instead of bringing up SSH. On-call engineers have lost hours to this. Use nofail for every secondary mount in fstab.
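
Both fstab rules from this section (no kernel device paths, nofail on every secondary mount) are easy to check mechanically. The following is a minimal sketch, not a replacement for mount -a: fstab_lint is a hypothetical helper that reads fstab text on stdin and only knows the /dev/sd*, /dev/vd*, /dev/xvd* and /dev/nvme* naming schemes.

```shell
# fstab_lint: read /etc/fstab-style text on stdin and report entries that
# break the two rules above. Prints one "line N: ..." message per finding.
fstab_lint() {
  awk '
    $1 ~ /^#/ || NF == 0 { next }           # skip comments and blank lines
    NF < 4 || NF > 6 {
      printf "line %d: expected 4 to 6 fields, found %d\n", NR, NF
      next
    }
    $1 ~ /^\/dev\/(sd|vd|xvd|nvme)/ {
      printf "line %d: %s is a kernel device path; prefer UUID= or LABEL=\n", NR, $1
    }
    $2 != "/" && $2 ~ /^\// && index("," $4 ",", ",nofail,") == 0 {
      printf "line %d: %s has no nofail option\n", NR, $2
    }'
}

# Typical use:
#   fstab_lint < /etc/fstab
```

Swap entries are skipped by the nofail rule because their mountpoint field does not start with a slash.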

LVM: The Logical Volume Manager

Raw partitions have a major limitation: a partition can only grow into free space that directly follows it on the same disk, and it can never span disks, so resizing often means downtime. LVM solves this by inserting a flexible layer between physical storage and filesystems. It is the standard storage abstraction on enterprise Linux systems, appears in some Kubernetes persistent volume backends, and suits any workload that needs dynamic disk growth without downtime.

LVM storage stack: physical disks are combined into a Volume Group, then carved into flexible Logical Volumes that are formatted and mounted independently.

The three LVM abstractions:

Physical Volume (PV) — a raw disk or partition initialized with pvcreate. LVM writes its metadata at the start of the PV.
Volume Group (VG) — one or more PVs pooled together with vgcreate. Think of the VG as a single large storage pool.
Logical Volume (LV) — a named slice of the VG created with lvcreate. Appears as /dev/vg-name/lv-name. Format and mount it exactly like a partition.

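# Full LVM workflow: two disks -> one VG -> two LVs
# Assumes /dev/nvme1n1 and /dev/nvme2n1 are empty, unmounted disks
# (pvcreate refuses or prompts if it finds an existing partition table or filesystem).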

# 1. Initialize Physical Volumes
sudo pvcreate /dev/nvme1n1 /dev/nvme2n1
sudo pvs        # verify PVs

# 2. Create Volume Group named vg-data
sudo vgcreate vg-data /dev/nvme1n1 /dev/nvme2n1
sudo vgs        # verify VG (should show combined size)

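# 3. Create Logical Volumes
#    (sizes are illustrative: the VG needs at least 200 GiB free, plus 50 GiB more
#    for the lvextend step below; the VFree column of vgs shows what is available)
sudo lvcreate -L 50G  -n lv-app  vg-data
sudo lvcreate -L 150G -n lv-logs vg-data
sudo lvs        # verify LVs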

# 4. Format and mount (exactly like a regular partition)
sudo mkfs.ext4 /dev/vg-data/lv-app
sudo mkfs.xfs  /dev/vg-data/lv-logs
sudo mkdir -p /app /var/log/app
sudo mount /dev/vg-data/lv-app  /app
sudo mount /dev/vg-data/lv-logs /var/log/app

# 5. Add to /etc/fstab using device path (LVM paths are stable — no UUID needed,
#    but UUID is still fine and slightly more portable):
# /dev/vg-data/lv-app   /app            ext4  defaults,nofail  0  2
# /dev/vg-data/lv-logs  /var/log/app    xfs   defaults,nofail  0  2

# --- Online resize (no downtime for LVM volumes) ---
# Extend lv-logs by 50 GB (add free space to VG first if needed)
sudo lvextend -L +50G /dev/vg-data/lv-logs
# Grow the XFS filesystem to fill the new LV size
sudo xfs_growfs /var/log/app

# For ext4, use resize2fs instead:
# sudo resize2fs /dev/vg-data/lv-app
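
A VG is always slightly smaller than the sum of its disks, because LVM allocates space in whole extents and reserves room for metadata. The estimate below is a sketch that assumes the usual defaults of 4 MiB extents and 1 MiB reserved at the start of each PV; vg_usable_mib is a hypothetical helper, and vgs remains the authoritative source. If both disks were 100 GiB, like nvme1n1 in the earlier lsblk output, the 50G and 150G LVs above would not quite fit.

```shell
# vg_usable_mib PV_MIB...: estimate usable VG space in MiB.
# Assumptions: 4 MiB extents, 1 MiB reserved at the start of each PV.
vg_usable_mib() (
  total=0
  for pv_mib in "$@"; do
    total=$(( total + (pv_mib - 1) / 4 * 4 ))   # whole extents only
  done
  echo "$total"
)

vg_usable_mib 102400 102400   # two 100 GiB disks -> prints 204792 (200 GiB is 204800)
```

Sizing LVs as a share of the pool, for example lvcreate -l 100%FREE, avoids this kind of off-by-a-few-megabytes failure.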

LVM snapshot for zero-downtime backups. Before running a database backup or a risky schema migration, create a snapshot: sudo lvcreate -L 10G -s -n lv-app-snap /dev/vg-data/lv-app. Mount the snapshot read-only, run the backup against it, then remove it. The live volume stays online, and the snapshot is a crash-consistent point-in-time copy; for a database, flush or quiesce writes just before taking it. The 10G is copy-on-write space rather than a full copy, so size it for the writes expected while the snapshot exists. Managed database services such as RDS and AlloyDB expose the same point-in-time concept through their snapshot APIs.
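
The create, mount, back up, remove sequence can be sketched as a script. Everything here beyond the lvcreate command quoted above is an assumption made for illustration: the run wrapper, the snapshot_backup name, the /mnt scratch mountpoint, the /backup archive path, and the use of tar. By default the script only prints the commands; set DRY_RUN=0 to execute them as root.

```shell
# Print commands by default; set DRY_RUN=0 to really execute them (needs root).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# snapshot_backup VG LV ARCHIVE: back up LV through a temporary snapshot.
snapshot_backup() (
  vg=$1
  lv=$2
  archive=$3
  snap="${lv}-snap"
  mnt="/mnt/${snap}"                       # hypothetical scratch mountpoint

  run lvcreate -L 10G -s -n "$snap" "/dev/${vg}/${lv}" || exit 1
  run mkdir -p "$mnt"
  if run mount -o ro "/dev/${vg}/${snap}" "$mnt"; then
    run tar -czf "$archive" -C "$mnt" .
    status=$?
    run umount "$mnt"
  else
    status=1
  fi
  run lvremove -y "/dev/${vg}/${snap}"     # always remove the snapshot
  exit "$status"
)

snapshot_backup vg-data lv-app /backup/lv-app.tar.gz
```

The snapshot is removed even when the mount or the archive step fails, so a failed backup does not leave a snapshot behind to fill up.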

Common Failure Modes in Production

Understanding failure modes separates engineers who can recover from incidents from those who escalate them. The most common storage failures you will encounter:

Root volume fills up: Applications stop writing logs, databases crash, SSH may still work but almost nothing runs. Diagnose with df -hT. Immediate relief: journalctl --vacuum-size=200M (see Lesson 2), then find and delete or archive large files with du -sh /* 2>/dev/null | sort -rh | head.
Wrong device in fstab: System cannot find the UUID on boot, drops to emergency shell. Fix from the recovery console by editing /etc/fstab and correcting the UUID. Always run sudo mount -a after editing fstab to catch errors before rebooting.
Filesystem corruption: After a hard power loss, the ext4 journal is usually replayed automatically on the next mount. If that is not enough, unmount the filesystem and run sudo fsck -n /dev/nvme1n1p1 (read-only check first), then sudo fsck -y /dev/nvme1n1p1 (repair). XFS uses xfs_repair instead of fsck. Never run a repair on a mounted filesystem.
LVM metadata mismatch after snapshot abuse: Incomplete snapshot removal can corrupt VG metadata. Always lvremove snapshots explicitly; never let them fill up (a full snapshot becomes invalid).
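
Two of these failures, a filling root volume and a filling snapshot, give advance warning if something watches for them. The filters below are a sketch of that idea: both helper names are hypothetical, the first assumes df -P output on stdin, and the second assumes lvs --noheadings -o lv_name,data_percent output on stdin. Mountpoints that contain spaces are not handled.

```shell
# fs_over_threshold LIMIT: read `df -P` output on stdin and print every
# mountpoint whose usage is at or above LIMIT percent.
fs_over_threshold() {
  awk -v limit="$1" '
    NR > 1 {
      use = $5
      sub(/%$/, "", use)
      if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# snapshots_over_threshold LIMIT: read
#   lvs --noheadings -o lv_name,data_percent
# output on stdin and print LVs whose copy-on-write space is at or above LIMIT.
# LVs without a data_percent value (ordinary volumes) are skipped.
snapshots_over_threshold() {
  awk -v limit="$1" 'NF == 2 && $2 + 0 >= limit + 0 { print $1, $2 "%" }'
}

# Typical use, for example from a cron job:
#   df -P | fs_over_threshold 90
#   sudo lvs --noheadings -o lv_name,data_percent | snapshots_over_threshold 80
```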