Runtime Security
Static controls — image scanning, Pod Security Standards, network policies — stop known-bad configurations before a workload starts. Runtime security is the discipline that answers the harder question: what is a container actually doing once it is running? A supply-chain attack, a zero-day exploit, or a malicious insider can bypass every pre-flight check and land a threat inside a legitimate container. Runtime security is your last line of detection before exfiltration.
The three pillars of cloud-native runtime security are syscall-level detection (via Falco or eBPF-based engines), seccomp syscall profiles (deny at the kernel level), and container drift detection (alert when a running container no longer matches its immutable image). This lesson covers all three, including the failure modes that bite production teams.
How Falco Works: Syscall Visibility from the Kernel
Falco (a CNCF graduated project) watches the boundary between the kernel and user space. It uses either a kernel module or an eBPF probe to capture the syscalls made by processes in every container on the node. Those raw events feed a rule engine that evaluates them against a set of declarative rules. When a rule fires, Falco emits a structured alert to stdout, syslog, or a webhook, from which it can be forwarded to a SIEM such as Splunk or Elastic.
Falco ships with a default ruleset covering the most critical categories: shell spawned inside a container, sensitive file reads (/etc/shadow, kubeconfig), unexpected network outbound connections, privilege escalation syscalls (setuid, ptrace), and modifications to container filesystems after startup. Production teams layer custom rules on top.
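The event-to-rule-to-alert flow described above can be illustrated with a toy evaluator. This is a minimal sketch, not Falco's implementation or rule syntax: the SyscallEvent fields, the two rules, and their priorities are hypothetical stand-ins chosen to mirror the "shell in container" and "sensitive file read" categories.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SyscallEvent:
    """Simplified syscall event. Field names are illustrative, not Falco's schema."""
    syscall: str
    process: str
    container_id: str  # empty string means the event came from the host
    path: str = ""


@dataclass
class Rule:
    name: str
    priority: str
    condition: Callable[[SyscallEvent], bool]


SHELLS = {"bash", "sh", "zsh"}

# Hypothetical rules that imitate two categories of the default ruleset.
RULES = [
    Rule(
        name="shell_in_container",
        priority="WARNING",
        condition=lambda e: e.syscall == "execve"
        and e.container_id != ""
        and e.process in SHELLS,
    ),
    Rule(
        name="sensitive_file_read",
        priority="CRITICAL",
        condition=lambda e: e.syscall in {"open", "openat"}
        and e.path == "/etc/shadow",
    ),
]


def evaluate(event, rules=RULES):
    """Return one alert dict for every rule whose condition matches the event."""
    return [
        {
            "rule": r.name,
            "priority": r.priority,
            "process": event.process,
            "container_id": event.container_id,
        }
        for r in rules
        if r.condition(event)
    ]


alerts = evaluate(SyscallEvent("execve", "bash", "abc123"))
for alert in alerts:
    print(alert)
```

Real Falco rules are written in YAML with a condition expression language; the point here is only that each event is checked against every rule and each match becomes a structured alert.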
Prefer an eBPF driver over the kernel module: either the classic eBPF probe (driver.kind=ebpf) or the modern CO-RE eBPF driver (driver.kind=modern_ebpf). The kernel module must be rebuilt for every kernel upgrade and can destabilize the node; eBPF is safer and is the path all major cloud vendors support. AWS EKS and GKE both support CO-RE eBPF without node customization.
Seccomp Profiles: Denying at the Kernel Boundary
Detection tells you when something bad happened. Seccomp (Secure Computing Mode) prevents it by restricting which syscalls a container is even allowed to make. A container running a Node.js API does not need ptrace, mount, or kexec_load. A seccomp profile is a JSON allowlist (or blocklist) enforced by the kernel — attempting a blocked syscall returns EPERM or kills the process.
Kubernetes supports two ways to apply seccomp: the built-in RuntimeDefault profile (the container runtime's own default, which blocks ~44 syscalls that should never be needed) and custom Localhost profiles stored on the node.
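As a rough illustration of the two options, the helper below builds the securityContext fragment a Pod spec would carry in each case. The seccompProfile, type, and localhostProfile field names come from the Kubernetes API; the function name and the profile path are hypothetical.

```python
import json


def seccomp_security_context(profile_path=None):
    """Build a securityContext fragment that selects a seccomp profile.

    With no argument the runtime's default profile is used. Otherwise
    profile_path names a profile file on the node, relative to the
    kubelet's seccomp directory.
    """
    if profile_path is None:
        return {"seccompProfile": {"type": "RuntimeDefault"}}
    return {
        "seccompProfile": {
            "type": "Localhost",
            "localhostProfile": profile_path,
        }
    }


# "profiles/node-api.json" is a made-up path used only for illustration.
print(json.dumps(seccomp_security_context(), indent=2))
print(json.dumps(seccomp_security_context("profiles/node-api.json"), indent=2))
```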
Generating a tight custom profile from scratch is tedious. The practical workflow at scale: run the workload with type: Unconfined and Falco logging syscalls, or use inspektor-gadget (ig advise seccomp-profile) to record the actual syscall profile of a running container and generate a JSON policy automatically. Then promote to Localhost in staging, validate, and ship to production.
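The record-then-allowlist step can be sketched as follows. This assumes you already have a list of syscall names observed for the workload (from whichever recorder you use); the function name and sample syscalls are illustrative. The output follows the JSON layout that container runtimes accept for seccomp profiles: a default action plus a list of per-syscall overrides.

```python
import json


def build_seccomp_profile(observed_syscalls, architectures=("SCMP_ARCH_X86_64",)):
    """Turn recorded syscall names into a deny-by-default allowlist profile."""
    return {
        # Anything not listed below fails with an error instead of running.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": list(architectures),
        "syscalls": [
            {
                # Deduplicate and sort so the profile is stable across recordings.
                "names": sorted(set(observed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Hypothetical recording; a real workload needs far more syscalls than this.
recorded = ["read", "write", "epoll_wait", "read", "futex", "openat"]
profile = build_seccomp_profile(recorded)
print(json.dumps(profile, indent=2))
```

A recording only captures the code paths that ran while it was active, so rarely used paths (error handling, shutdown, log rotation) may need syscalls the profile lacks. That is why the staging validation step matters before the profile reaches production.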
Enable --seccomp-default on the kubelet to apply RuntimeDefault to every Pod that does not specify a seccomp profile. This is a low-friction cluster-wide win that blocks dozens of dangerous syscalls without touching any workload manifest.
Container Drift Detection
An immutable container image is a security contract: what you scanned and signed at build time is exactly what runs in production. Container drift breaks that contract — a process installs a tool, downloads a binary, or modifies a config file inside the running container's writable layer. Even a legitimate developer doing kubectl exec ... apt-get install curl for debugging creates drift that could persist if the container is not restarted.
Drift detection approaches range from lightweight to comprehensive:
- Falco rules (layer 1): Detect writes to binary directories, package manager executions, or new binary downloads via syscall events — fires within milliseconds.
- Read-only root filesystems (layer 2): Set readOnlyRootFilesystem: true in the container's securityContext. Any write attempt returns EROFS. Combined with emptyDir or tmpfs mounts for paths that genuinely need writes (logs, caches), this eliminates an entire class of drift.
- Image digest pinning (layer 3): Reference images by SHA256 digest (myorg/api@sha256:abc123...) rather than a mutable tag. A drifted tag can silently pull a different image on Pod restart; a digest cannot.
- Admission-time attestation (layer 4): Tools like Sigstore/Cosign + Kyverno or OPA enforce that only images signed by your CI pipeline are admitted. Drift via image substitution is blocked entirely.
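Layers 2 and 3 are easy to check mechanically. The sketch below inspects a container entry from a Pod spec (as a plain dict) and reports missing hardening. The function name and finding messages are hypothetical, and a real admission policy would normally live in Kyverno or OPA rather than in a script.

```python
import re

# An image reference pinned by digest ends in @sha256: plus 64 hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def audit_container(container):
    """Return a list of drift-hardening findings for one container spec."""
    findings = []
    security_context = container.get("securityContext") or {}
    if security_context.get("readOnlyRootFilesystem") is not True:
        findings.append("root filesystem is writable")
    if not DIGEST_RE.search(container.get("image", "")):
        findings.append("image is not pinned by digest")
    return findings


# Illustrative specs; the digest is a placeholder, not a real image.
hardened = {
    "image": "myorg/api@sha256:" + "a" * 64,
    "securityContext": {"readOnlyRootFilesystem": True},
}
loose = {"image": "myorg/api:latest"}

print(audit_container(hardened))
print(audit_container(loose))
```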
kubectl exec audit gap: Most security teams monitor image deployments but forget that kubectl exec into a running Pod sidesteps image scanning and image admission policy, because it changes a container that has already been admitted. A developer who installs a package inside a container has introduced unscanned, unsigned code into production. Ensure your Kubernetes audit policy (--audit-policy-file) logs exec events at the Request level and that those logs reach your SIEM. Falco also has a default rule for this: Terminal shell in container.
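Once exec events are in the audit log, picking them out is straightforward. The sketch below assumes the log backend's JSON-lines output and filters on the objectRef fields of each audit event; the function name and sample events are made up for illustration. It matches on the exec subresource rather than the verb, since the verb can differ depending on how the client opens the session.

```python
import json


def exec_events(audit_lines):
    """Yield (user, namespace, pod) for each audit event that targets pods/exec."""
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") == "pods" and ref.get("subresource") == "exec":
            user = (event.get("user") or {}).get("username", "unknown")
            yield user, ref.get("namespace", ""), ref.get("name", "")


# Hypothetical, heavily trimmed audit events.
sample_log = [
    json.dumps({
        "verb": "create",
        "user": {"username": "alice@example.com"},
        "objectRef": {"resource": "pods", "subresource": "exec",
                      "namespace": "prod", "name": "api-7d9f"},
    }),
    json.dumps({
        "verb": "get",
        "user": {"username": "bob@example.com"},
        "objectRef": {"resource": "pods", "namespace": "prod", "name": "api-7d9f"},
    }),
]

for user, namespace, pod in exec_events(sample_log):
    print(f"{user} opened an exec session in {namespace}/{pod}")
```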
Tying It Together: Detection-to-Response in Production
Detection without automated response just produces alert fatigue. The production pattern at scale is a tiered response loop: Falco fires a webhook to Falcosidekick, which routes CRITICAL events to a response bot. The bot calls the Kubernetes API to cordon the node, isolate the Pod with a network policy, capture a forensic snapshot (container state, open file descriptors via /proc), and page the on-call team. Lower-severity events go to the SIEM for analyst review.
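The tiering decision itself is small. The sketch below maps an alert's priority to a list of action names; the action names are placeholders for whatever your response bot implements, and treating Emergency and Alert the same as Critical is an assumption, not something Falcosidekick prescribes.

```python
# Priorities treated as requiring automated containment (an assumption).
CONTAINMENT_PRIORITIES = {"emergency", "alert", "critical"}


def plan_response(alert):
    """Return the ordered list of response actions for one alert dict."""
    priority = str(alert.get("priority", "")).lower()
    if priority in CONTAINMENT_PRIORITIES:
        # Contain first, then preserve evidence, then involve a human.
        return ["cordon_node", "isolate_pod", "capture_forensics", "page_oncall"]
    return ["forward_to_siem"]


print(plan_response({"rule": "Terminal shell in container", "priority": "Critical"}))
print(plan_response({"rule": "Terminal shell in container", "priority": "Notice"}))
```

Keeping the decision in a pure function like this makes it easy to unit test the routing separately from the code that actually calls the Kubernetes API.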
Invest in tuning rule noise before you wire automated response. A rule with a 5% false-positive rate that kills production Pods will erode trust faster than the threat it was meant to stop. Start in alert-only mode, build a two-week baseline, suppress known-benign patterns with Falco's exceptions field, then flip to response automation.
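One simple way to use the baseline is to count alerts per rule and review the noisiest rules first. This is a minimal sketch that assumes the baseline alerts have been exported as dicts with a rule field; the threshold and sample data are arbitrary.

```python
from collections import Counter


def noisy_rules(alerts, threshold):
    """Return (rule, count) pairs for rules that fired at least threshold times."""
    counts = Counter(alert["rule"] for alert in alerts)
    return [(rule, n) for rule, n in counts.most_common() if n >= threshold]


# Hypothetical baseline export.
baseline = (
    [{"rule": "Terminal shell in container"}] * 3
    + [{"rule": "Read sensitive file untrusted"}]
)

for rule, count in noisy_rules(baseline, threshold=2):
    print(f"{rule}: {count} alerts during baseline")
```

Rules that dominate this list are candidates for exceptions if the matches turn out to be benign, and poor candidates for automated response until they are quieter.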