Linux Fundamentals

Why Linux Runs the World

Before you run a single kubectl command, deploy a container, or write a CI pipeline, there is one piece of context you must internalize: Linux is the operating system of the internet. By commonly cited estimates, over 96% of the world's top one million web servers run Linux. Every major cloud provider (AWS, GCP, Azure) runs the bulk of its customer workloads on Linux VMs. Kubernetes nodes are overwhelmingly Linux. Docker containers share the host's Linux kernel. If you are serious about DevOps, Linux is not optional background knowledge; it is the substrate everything else runs on.

Why Not Windows Server?

Windows Server is a capable platform and many enterprises rely on it, but it loses to Linux at scale for a handful of compounding reasons:

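  • Licensing cost: Windows Server 2022 Datacenter is licensed per core, and licensing a single server can cost thousands of dollars. Linux itself is free, although commercial distributions such as RHEL charge for support subscriptions. At 10,000 VMs, the savings are enormous.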
  • Footprint: A minimal Alpine Linux container image is about 7 MB. A Windows Server Core image starts at 4+ GB. Smaller images mean faster pull times, lower egress costs, and a smaller attack surface.
  • Scripting and automation: Linux ships with a battle-tested shell, cron, pipes, and a massive ecosystem of CLI tools. Most DevOps tools, including Ansible, Terraform, and Helm, treat Linux as their primary platform.
  • Stability and predictability: Well-maintained Linux systems can run for years without a reboot, and production uptimes measured in hundreds of days are common.
  • Open source ecosystem: The entire cloud-native stack (Kubernetes, etcd, containerd, Prometheus, Grafana) is open source and Linux-native.

Key Insight: When you SSH into an EC2 instance, a GKE node, or a bare-metal server in a colo facility, you are almost certainly staring at a Linux shell. Everything in this tutorial series teaches you to operate that environment with confidence.

The Linux Kernel and the Userspace Split

People often say "Linux" when they mean the entire operating system, but technically Linux is only the kernel — the core piece of software that manages hardware resources. Everything else (the shell, the filesystem tools, the package manager, the init system) is userspace software that sits on top of the kernel.

Understanding this split is not pedantry — it directly affects how you debug production problems and how containers work:

  • The kernel handles CPU scheduling, memory management, device drivers, networking, and system calls.
  • Userspace processes make system calls (like read(), write(), fork(), socket()) to ask the kernel to do privileged work on their behalf.
  • A Docker container runs its own userspace but shares the host kernel. That is why a container is not a VM: the kernel is not virtualized, and each container only gets an isolated view of it.

[Figure: Linux Kernel and Userspace Stack. From bottom to top: Hardware (CPU, RAM, disk, network interface cards, GPU); kernel space (scheduler, memory manager, VFS / block I/O, TCP/IP stack); the system call interface (read, write, fork, exec, socket, mmap, ...); userspace (shell and core utils such as bash, ls, grep, awk, sed; the init system, systemd, which manages services; package managers apt, yum, dnf, apk; applications such as nginx / Apache, Docker / containerd, sshd / PostgreSQL, and your app). Containers share the kernel but isolate userspace (namespaces + cgroups).]
The Linux software stack: hardware at the bottom, the kernel in the middle, and all userspace software above the system call boundary.
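
The isolation half of the container story can be observed directly on any Linux host. The following is a minimal sketch, assuming a Linux system with /proc mounted and the readlink utility available; the function name show_namespaces is hypothetical. It lists the namespace IDs the kernel has assigned to a process, which is the kind of per-process view that container runtimes build on.

```shell
# List the namespaces a process belongs to (defaults to the current shell).
# Hypothetical helper for illustration; reads only from /proc.
show_namespaces() {
    pid="${1:-$$}"
    for ns in /proc/"$pid"/ns/*; do
        # Each entry is a symlink whose target looks like "pid:[4026531836]".
        printf '%-20s %s\n' "$(basename "$ns")" "$(readlink "$ns")"
    done
}

show_namespaces
```

Two processes in the same container report the same IDs, while a process in a different container typically reports different ones for namespaces such as pid, mnt, and net. The kernel underneath is the same in every case.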

Linux Distributions: Choosing the Right One

A Linux distribution (distro) bundles the kernel with a curated set of userspace tools, a package manager, default configuration, and a support lifecycle. Every distro builds on the same upstream kernel, although each ships its own kernel version and patch set; the more visible differences lie in the tooling and the target use case.

In production DevOps environments you will encounter primarily three families:

  • Debian/Ubuntu: apt package manager. Ubuntu LTS (22.04, 24.04) is a dominant choice for cloud VMs at most mid-to-large companies and is available as an official image on AWS, GCP, and Azure. Excellent community documentation.
  • RHEL/CentOS/Amazon Linux: yum / dnf package manager. RHEL (Red Hat Enterprise Linux) is the standard in regulated industries (finance, healthcare) because of its 10-year support lifecycle and FIPS-validated cryptographic modules. Amazon Linux 2023 uses dnf and draws on Fedora as its upstream rather than being a RHEL rebuild; it is the default for AWS-native workloads.
  • Alpine Linux: apk package manager. Chosen almost exclusively for Docker base images because of its tiny footprint (a base image of well under 10 MB). It uses musl libc instead of glibc, which occasionally causes subtle compatibility issues with pre-compiled binaries.

Production practice: Standardize your organization on one distro family. Mixing Ubuntu and RHEL means two package managers, two sets of package names and configuration paths, and roughly twice the runbook complexity. Many companies pick Ubuntu LTS for cloud workloads and stick to it.
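
Scripts that must run across more than one family usually start by detecting which one they are on. The following is a minimal sketch, assuming the host provides /etc/os-release; the function names pkg_manager_for and detect_distro_id are hypothetical, and the mapping covers only the three families listed above.

```shell
# Map an os-release ID to the package manager of its family.
# Hypothetical helper; extend the patterns for other distros as needed.
pkg_manager_for() {
    case "$1" in
        ubuntu|debian)     echo apt ;;
        rhel|centos|amzn)  echo dnf ;;   # older releases use yum instead
        alpine)            echo apk ;;
        *)                 echo unknown ;;
    esac
}

# Print the ID field from /etc/os-release, or "unknown" if it is missing.
detect_distro_id() {
    if [ -r /etc/os-release ]; then
        # Source the file in a subshell so its variables do not leak.
        ( . /etc/os-release && echo "${ID:-unknown}" )
    else
        echo unknown
    fi
}

distro_id="$(detect_distro_id)"
echo "distro: $distro_id, package manager: $(pkg_manager_for "$distro_id")"
```

Keeping the mapping in one function means a fleet that does end up mixed has a single place to update.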

Checking What You Are Running

On any new server, the first thing a DevOps engineer does is identify the OS, kernel version, and architecture. The commands below give you the full picture:

    # Identify the distro and version
    cat /etc/os-release

    # Kernel version and architecture
    uname -r   # e.g. 6.1.0-21-amd64
    uname -m   # x86_64 or aarch64 (ARM)

    # Full summary in one shot
    uname -a
    # Linux ip-10-0-1-42 6.1.0-21-amd64 #1 SMP Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux

    # How long the system has been running (uptime + load averages)
    uptime

The kernel version matters when a tool depends on specific kernel features or modules (for example, eBPF-based tools such as Cilium or Falco require a minimum kernel version), or when a CVE is kernel-specific and you need to know if your fleet is patched.
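
A version check like this is easy to automate. The following is a minimal sketch, assuming GNU coreutils (for sort -V); the function names version_ge and kernel_at_least are hypothetical, and the 5.10 threshold is an illustrative value rather than the requirement of any particular tool.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V: the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed when the running kernel is at least the given version.
kernel_at_least() {
    release="$(uname -r)"
    # uname -r looks like 6.1.0-21-amd64; keep only the leading numeric part.
    version_ge "${release%%[!0-9.]*}" "$1"
}

if kernel_at_least 5.10; then
    echo "kernel $(uname -r) meets the 5.10 minimum"
else
    echo "kernel $(uname -r) is older than 5.10"
fi
```

Note that sort -V orders 5.10 before 5.10.0, so pass thresholds in the shortest form you care about.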

How the Cloud Changed Linux Operations

In the pre-cloud era, "setting up a Linux server" meant walking to a rack, inserting a USB installer, and waiting. Today, in a mature DevOps organization, no one manually installs an OS. Instead:

  • Images: Cloud providers maintain curated, hardened machine images, called AMIs on AWS and images on GCP and Azure. You boot from a known-good image and never modify the base. If you need a change, you bake a new image with Packer.
  • Immutable infrastructure: Servers are treated as cattle, not pets. A broken VM is terminated and replaced from the same image, not SSH'd into and debugged.
  • User-data / cloud-init: Light bootstrapping (installing an agent, setting a hostname, adding SSH keys) happens at first boot via cloud-init, not via manual steps.

    # Example: launch an Ubuntu 24.04 EC2 instance with cloud-init
    # (AWS CLI v2, assuming credentials are configured)
    # Intended AMI: Ubuntu 24.04 LTS in us-east-1. AMI IDs are region-specific,
    # so verify the ID before use. Comments cannot follow a trailing backslash.
    aws ec2 run-instances \
      --image-id ami-0c02fb55956c7d316 \
      --instance-type t3.micro \
      --key-name my-keypair \
      --user-data file://bootstrap.sh \
      --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=web-01}]'

Production pitfall: Never use a "latest" AMI alias in production automation without pinning the version. Cloud providers update base images regularly; an unplanned kernel upgrade in your launch template can break a kernel module your application depends on. Always reference a specific AMI ID or image family with a pinned version tag in Terraform or Pulumi.
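
One way to enforce the pinning rule is a small guard in a CI step that rejects anything other than a literal AMI ID. The following is a minimal sketch, not an AWS-provided check: the function name is_pinned_ami is hypothetical, and it validates only the ID format (ami- followed by 8 or 17 hexadecimal characters), not that the image exists or is the one you intended.

```shell
# Succeed only when the argument looks like a literal AMI ID.
# Hypothetical helper: checks the format, not whether the AMI exists.
is_pinned_ami() {
    printf '%s\n' "$1" | grep -Eq '^ami-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Example: validate the image reference before calling run-instances.
image_id="ami-0c02fb55956c7d316"
if is_pinned_ami "$image_id"; then
    echo "OK: $image_id is a pinned AMI ID"
else
    echo "Refusing to launch: $image_id is not a pinned AMI ID" >&2
fi
```

A format check like this catches aliases and empty variables early; confirming that the ID refers to the expected image still requires a lookup against the provider.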

Why This Matters for Everything That Follows

Every topic in this course — shell scripting, process management, networking, containers, CI/CD — assumes you are operating on Linux. When a container crashes, you read Linux kernel logs. When a web server is slow, you inspect Linux network buffers. When a deployment fails, you trace it through systemd journal logs. The mental model you build in this tutorial series is the operating model for production systems at Google, Amazon, Meta, and every cloud-native startup following their example.

Start by knowing which Linux you are on, which kernel version is running, and whether the system is behaving as expected. That discipline — observe first, act second — is the foundation of reliable operations.