Multi-Cloud: Azure & GCP

GCP Compute & Networking

Google Cloud Platform was engineered from the inside out — built on the same global infrastructure that serves Google Search, Gmail, and YouTube. That origin story is not marketing; it directly determines how GCP computes and routes traffic, and why its architecture differs from both AWS and Azure in ways that matter enormously at scale. In this lesson you will learn GCE (the raw VM layer), GCP's uniquely global VPC model, and the load balancing tier that operates at Google's own network edge.

Google Compute Engine (GCE)

GCE is GCP's IaaS VM service. On the surface it resembles EC2 — you pick a machine type, image, disk, and network — but three design decisions set it apart.

Custom machine types. Rather than choosing from a fixed list of instance families, GCE lets you specify an exact vCPU-and-memory ratio. Need 6 vCPUs and 22 GB RAM for a legacy app that doesn't fit standard shapes? That is a first-class option. This matters operationally because right-sizing VMs is one of the highest-ROI cost optimisations, and GCE makes it precise.

Live migration. Google migrates running VMs transparently across physical hosts during maintenance events, with no forced reboots and no interruption windows to schedule around. AWS and Azure offer scheduled maintenance windows; GCE makes maintenance invisible by default. The exceptions are VMs with attached GPUs and Spot VMs, which are stopped rather than migrated. This changes how you architect for HA: you still need redundancy, but you are not fighting your cloud's maintenance schedule.

Preemptible and Spot VMs. GCE's equivalent of AWS Spot Instances is the Spot VM (the current name; older docs describe its predecessor, the "preemptible" VM, which also has a 24-hour maximum runtime). Spot VMs are up to 91% cheaper than on-demand capacity but can be reclaimed with 30 seconds' notice. They are ideal for stateless batch jobs, CI runners, and ML training jobs that checkpoint every few minutes.

# Create a Spot VM with a custom machine type and startup script
# custom-8-32768 = 8 vCPU, 32 GB RAM (memory is given in MB)
gcloud compute instances create batch-worker-1 \
  --zone=us-central1-a \
  --machine-type=custom-8-32768 \
  --provisioning-model=SPOT \
  --instance-termination-action=DELETE \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --boot-disk-size=50GB \
  --boot-disk-type=pd-ssd \
  --metadata=startup-script='#!/bin/bash
    echo "Worker started" >> /var/log/startup.log
    /opt/job/run.sh' \
  --scopes=storage-rw,logging-write

# List running instances and their status
gcloud compute instances list --filter="status=RUNNING" --format="table(name,zone.basename(),machineType.basename(),status)"
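Because a Spot VM gets only about 30 seconds of notice, the checkpoint upload is usually wired into a shutdown script. The following is a minimal sketch, not a definitive implementation: the checkpoint directory, the bucket name, and the function names are hypothetical, and it assumes the VM's service account can write to the bucket (the storage-rw scope above allows this).

```shell
# shutdown.sh: sketch of a GCE shutdown script for a Spot batch worker.
# CHECKPOINT_DIR and CHECKPOINT_BUCKET are hypothetical; adjust to your job.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/var/lib/job/checkpoints}"
CHECKPOINT_BUCKET="${CHECKPOINT_BUCKET:-gs://example-batch-checkpoints}"
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/preempted"

# Ask the metadata server whether this shutdown is a preemption.
# The endpoint returns the string TRUE or FALSE.
was_preempted() {
  [ "$(curl -s -H 'Metadata-Flavor: Google' "$METADATA_URL")" = "TRUE" ]
}

# Upload the latest checkpoints; keep this well inside the 30-second budget.
on_shutdown() {
  if was_preempted; then
    echo "Preempted: uploading checkpoints" >&2
  fi
  gcloud storage cp --recursive "$CHECKPOINT_DIR" "$CHECKPOINT_BUCKET/$(hostname)/"
}

# on_shutdown   # uncomment when installing this file as the shutdown script
```

To attach it at creation time, pass --metadata-from-file=shutdown-script=shutdown.sh to gcloud compute instances create. Large checkpoints will not upload in 30 seconds, which is why the job itself should also checkpoint periodically.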

GCE predefined machine types follow the naming convention {series}-{type}-{vCPUs}. General-purpose: n2-standard-4 (4 vCPU, 16 GB). Compute-optimised: c3-highcpu-8. Memory-optimised: m3-megamem-64. Custom types use custom-{vCPUs}-{memory in MB}, for example custom-6-22528 (6 vCPU, 22 GB); the memory value must be a multiple of 256 MB. Check gcloud compute machine-types list --zones us-central1-a for current availability.
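The custom naming rule is easy to get wrong by hand because memory is expressed in MB. A small helper can build the name for you; this is a sketch only, custom_machine_type is a hypothetical function name, and it assumes the custom-{vCPUs}-{memory in MB} form used in this lesson with memory in 256 MB steps.

```shell
# Build a custom machine type name from a vCPU count and memory in GB.
# Hypothetical helper; memory may be fractional, e.g. 22.5.
custom_machine_type() {
  local vcpus="$1" mem_gb="$2" mem_mb
  # Convert GB to MB (1 GB = 1024 MB); awk handles fractional input
  mem_mb=$(awk -v gb="$mem_gb" 'BEGIN { printf "%d", gb * 1024 }')
  if [ $(( mem_mb % 256 )) -ne 0 ]; then
    echo "memory must be a multiple of 256 MB, got ${mem_mb} MB" >&2
    return 1
  fi
  echo "custom-${vcpus}-${mem_mb}"
}

# Example: custom_machine_type 6 22   prints custom-6-22528
```

The result can be passed straight to --machine-type. The helper does not check the per-vCPU memory limits of a given machine series, so gcloud may still reject an extreme ratio.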

GCP's Global VPC Model

This is where GCP diverges most fundamentally from AWS. In AWS, a VPC is regional — you create separate VPCs per region and connect them with peering or Transit Gateway. In GCP, a single VPC spans all regions globally. You add subnets in specific regions, but they all belong to one logical network and can communicate using internal RFC-1918 addresses without any peering or gateway.

The practical implications are significant:

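A GKE cluster in us-central1 and a Cloud SQL instance in europe-west4 can speak to each other on private IPs without any cross-region peering or gateway, because they sit in the same VPC. Firewall rules must still allow the traffic, and Cloud SQL needs private services access configured on the VPC before it receives a private IP.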
Firewall rules are applied to the VPC as a whole, using network tags or service accounts as targets rather than security groups scoped to a VPC. A tag like allow-http can be attached to any VM anywhere in the VPC.
Subnets are regional, but their IP ranges must be unique across the whole VPC: no primary or secondary range may overlap another subnet's range in any region, and the same restriction applies across peered networks.

GCP's auto-mode VPC creates a subnet in every region automatically. It looks convenient but is a trap in production: you lose control of CIDR ranges, and the default 10.128.0.0/9 block conflicts with many on-premises networks. Always create custom-mode VPCs in production and define your own CIDRs deliberately.
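Before settling on CIDRs for a custom-mode VPC, it helps to check each planned range against blocks you already know about, such as the auto-mode 10.128.0.0/9 range or an on-premises network. The sketch below does that check in plain shell arithmetic; ip_to_int and cidr_overlaps are hypothetical helper names, and it handles IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local old_ifs="$IFS"
  IFS=.
  set -- $1            # split the address into its four octets
  IFS="$old_ifs"
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return success (exit 0) if two IPv4 CIDR blocks overlap.
# Two blocks overlap exactly when they agree on the shorter prefix.
cidr_overlaps() {
  local len1="${1#*/}" len2="${2#*/}" min mask n1 n2
  if [ "$len1" -lt "$len2" ]; then min="$len1"; else min="$len2"; fi
  if [ "$min" -eq 0 ]; then
    mask=0
  else
    mask=$(( (4294967295 << (32 - min)) & 4294967295 ))
  fi
  n1=$(ip_to_int "${1%/*}")
  n2=$(ip_to_int "${2%/*}")
  [ $(( n1 & mask )) -eq $(( n2 & mask )) ]
}

AUTO_MODE_RANGE="10.128.0.0/9"   # block used by auto-mode VPCs

# Example: warn if a planned subnet collides with the auto-mode block
# cidr_overlaps "10.200.0.0/16" "$AUTO_MODE_RANGE" && echo "conflict"
```

Running the same check for every pair of planned subnets catches overlaps before gcloud or Terraform rejects them.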

GCP's single global VPC spans all regions natively; AWS requires explicit peering or Transit Gateway for cross-region private traffic.

Shared VPC and VPC Peering

At enterprise scale, a single VPC is not enough. GCP offers two multi-network models: Shared VPC and VPC Network Peering. Shared VPC designates one project as the host and allows multiple service projects to attach their resources (VMs, GKE nodes) to the host's subnets. This centralises network administration — firewall rules and subnets are managed by the platform team while application teams deploy freely into those subnets. VPC Peering, by contrast, connects two separate VPC networks without making one subordinate to the other; useful for SaaS providers needing private connectivity to customer VPCs.

# Enable Shared VPC on the host project (requires the Shared VPC Admin role, roles/compute.xpnAdmin)
gcloud compute shared-vpc enable my-host-project

# Attach a service project to the host VPC
gcloud compute shared-vpc associated-projects add my-app-project \
  --host-project=my-host-project

# Create a subnet in the host project accessible to service projects
gcloud compute networks subnets create prod-app-subnet \
  --network=prod-vpc \
  --region=us-central1 \
  --range=10.20.0.0/24 \
  --project=my-host-project

# Grant the app project's service account Compute Network User on the subnet
gcloud compute networks subnets add-iam-policy-binding prod-app-subnet \
  --region=us-central1 \
  --member="serviceAccount:app-sa@my-app-project.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser" \
  --project=my-host-project
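After the setup commands above, it is worth confirming the wiring from the service project's side before an application team tries to deploy. A minimal sketch, reusing the project names from this lesson; verify_shared_vpc is a hypothetical wrapper name, and the commands assume the caller is authenticated as a member who should have access.

```shell
# Sketch: confirm the Shared VPC attachment and subnet permissions.
verify_shared_vpc() {
  # Prints the host project that this service project is attached to
  gcloud compute shared-vpc get-host-project my-app-project

  # Lists the subnets the current caller is allowed to use from this project
  gcloud compute networks subnets list-usable --project=my-app-project
}

# verify_shared_vpc   # run after the attach and IAM binding steps
```

If prod-app-subnet is missing from the second listing, the roles/compute.networkUser binding is the first thing to recheck.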

Cloud Load Balancing: The Google Difference

GCP's load balancing is not a fleet of virtual appliances deployed into your VPC — it is a globally distributed system running on Google's own edge infrastructure (the same infrastructure that absorbs multi-Tbps DDoS attacks). Traffic enters the Google network at the closest point of presence, and the load balancer makes routing decisions there before the packet ever reaches your VMs. This has three concrete consequences for DevOps engineers:

Anycast IPs. A single global IP routes to the nearest healthy backend pool worldwide. AWS ALB gives you a DNS name that resolves to regional IPs; GCP's global HTTP(S) LB gives you one IP that works in every region, so you do not need DNS-based steering such as Route 53 latency policies or Azure Traffic Manager.
Fast failover. Health checks run from Google's infrastructure, so an unhealthy backend is taken out of rotation without waiting for DNS TTLs to expire. Detection time is roughly the health check interval multiplied by the unhealthy threshold, so tune both when cross-region HA matters.
Built-in Cloud Armor. GCP's WAF (Cloud Armor) integrates directly with the HTTP(S) LB at the edge. DDoS mitigation and OWASP rule evaluation happen before traffic reaches your VPC, protecting capacity and reducing compute costs.

GCP offers several LB tiers. Know which to use:

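Global HTTP(S) LB (called the global external Application Load Balancer in current docs): Layer 7, anycast, multi-region backends, Cloud CDN integration. Use for public-facing web services.
Regional Internal HTTP(S) LB (regional internal Application Load Balancer): L7, private, intra-VPC. Use for service-mesh east-west traffic.
TCP/UDP Network LB (external passthrough Network Load Balancer): L4, regional, high-throughput. Use for non-HTTP protocols.
Internal TCP/UDP LB (internal passthrough Network Load Balancer): L4, private. Use for internal non-HTTP services.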
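The four options above reduce to two questions: which layer, and is the service exposed externally or only inside the VPC. As a memory aid, that decision can be written as a lookup; pick_lb is a hypothetical helper and only covers the four types listed in this lesson.

```shell
# Map (layer, exposure) to the load balancer types listed above.
# layer is 7 or 4; exposure is "external" or "internal".
pick_lb() {
  case "$1:$2" in
    7:external) echo "Global HTTP(S) LB" ;;
    7:internal) echo "Regional Internal HTTP(S) LB" ;;
    4:external) echo "TCP/UDP Network LB" ;;
    4:internal) echo "Internal TCP/UDP LB" ;;
    *) echo "unknown combination: $1 $2" >&2; return 1 ;;
  esac
}

# Example: pick_lb 7 external   prints Global HTTP(S) LB
```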

# --- Terraform: Global HTTP(S) Load Balancer for a GCE instance group ---
# Assumes google_compute_instance_template.app is defined elsewhere.

resource "google_compute_instance_group_manager" "app" {
  name               = "app-mig"
  base_instance_name = "app"
  zone               = "us-central1-a"
  target_size        = 3

  version {
    instance_template = google_compute_instance_template.app.id
  }

  named_port {
    name = "http"
    port = 8080
  }

  auto_healing_policies {
    health_check      = google_compute_health_check.app.id
    initial_delay_sec = 60
  }
}

resource "google_compute_health_check" "app" {
  name               = "app-health-check"
  check_interval_sec = 10
  timeout_sec        = 5

  http_health_check {
    port         = 8080
    request_path = "/healthz"
  }
}

resource "google_compute_backend_service" "app" {
  name                  = "app-backend"
  protocol              = "HTTP"
  port_name             = "http"
  load_balancing_scheme = "EXTERNAL_MANAGED"  # Global HTTP(S) LB
  timeout_sec           = 30

  backend {
    group           = google_compute_instance_group_manager.app.instance_group
    balancing_mode  = "UTILIZATION"
    max_utilization = 0.8
  }

  health_checks = [google_compute_health_check.app.id]

  log_config {
    enable      = true
    sample_rate = 1.0
  }
}

resource "google_compute_url_map" "app" {
  name            = "app-url-map"
  default_service = google_compute_backend_service.app.id
}

resource "google_compute_managed_ssl_certificate" "app" {
  name = "app-cert"
  managed {
    domains = ["api.example.com"]
  }
}

resource "google_compute_target_https_proxy" "app" {
  name             = "app-https-proxy"
  url_map          = google_compute_url_map.app.id
  ssl_certificates = [google_compute_managed_ssl_certificate.app.id]
}

resource "google_compute_global_forwarding_rule" "app" {
  name                  = "app-https-rule"
  target                = google_compute_target_https_proxy.app.id
  port_range            = "443"
  load_balancing_scheme = "EXTERNAL_MANAGED"
  ip_protocol           = "TCP"
}
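The Terraform above does not include a firewall rule for the load balancer's health checks. Google's health checkers and front ends for this load balancer type connect from 130.211.0.0/22 and 35.191.0.0/16, and without an ingress rule for those ranges every backend is reported unhealthy. A minimal sketch with gcloud follows; the rule name, the prod-vpc network, and the app network tag are assumptions for illustration, so match them to your instance template.

```shell
# Allow Google front ends and health checkers to reach the backends on 8080.
# Rule name, network, and target tag are hypothetical for this sketch.
allow_lb_health_checks() {
  gcloud compute firewall-rules create allow-lb-health-checks \
    --network=prod-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:8080 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=app
}

# allow_lb_health_checks   # run once per network
```

In a Terraform-managed stack the same rule would normally live alongside the other resources as a google_compute_firewall resource rather than being created by hand.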

Use Premium Tier networking for all production load balancers. GCP has two network service tiers: Premium carries traffic over Google's private backbone from the entry point to the destination (lowest latency, highest reliability); Standard hands traffic to the public internet and enters Google's network only at the destination region. Global load balancers require Premium, which is also the project default, and it is the right choice for anything customer-facing. Standard is acceptable for regional workloads such as batch egress where cost matters more than latency.

Cloud NAT and Private Google Access

VMs in a private subnet need egress without a public IP. GCP's Cloud NAT is a fully managed, highly available NAT service with no NAT gateway instances to patch or scale. It manages source-port allocation per VM and, once logging is enabled on the gateway, records translations and errors in Cloud Logging. Pair it with Private Google Access on your subnets: when enabled, VMs with only internal IPs can still reach Google APIs (Cloud Storage, Pub/Sub, BigQuery) over Google's private network without a NAT hop, saving egress cost and keeping that traffic off the public internet entirely.

A common production incident: a VM in a private subnet fails to pull container images from Artifact Registry. The root cause is almost always one of three things: Private Google Access is not enabled on the subnet, the Cloud NAT gateway does not cover the subnet's ranges, or the VPC has no default route (0.0.0.0/0) with the default internet gateway as its next hop. Verify NAT coverage with gcloud compute routers get-nat-mapping-info <router-name> --region=<region>.
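For reference, the egress setup described in this section can be sketched as three gcloud calls. The router and gateway names (nat-router, prod-nat) are hypothetical; the network, subnet, and region reuse names from earlier in the lesson, and the commands assume gcloud's active project is the one that owns the network.

```shell
# Sketch: private egress for prod-app-subnet in us-central1.
REGION=us-central1

setup_private_egress() {
  # Cloud Router that hosts the NAT configuration
  gcloud compute routers create nat-router \
    --network=prod-vpc --region="$REGION"

  # NAT gateway covering every subnet range in the region, with logging on
  gcloud compute routers nats create prod-nat \
    --router=nat-router --region="$REGION" \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges \
    --enable-logging

  # Let internal-only VMs reach Google APIs without a NAT hop
  gcloud compute networks subnets update prod-app-subnet \
    --region="$REGION" \
    --enable-private-ip-google-access
}

# setup_private_egress   # run once per region that needs private egress
```

Using --nat-all-subnet-ip-ranges avoids the "gateway does not cover the subnet" failure at the cost of NAT-enabling every subnet in the region; restrict it with a custom subnet list if that is too broad.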

Production Failure Modes to Know

These are the GCP compute and networking issues that surface most often in production on-call rotations:

Quota exhaustion. GCE has per-project, per-region quotas on CPUs, IPs, and persistent disk. When quota is hit, autoscaling fails quietly: the instance group simply stays below its target size and nothing user-facing errors out. Monitor serviceruntime.googleapis.com/quota/exceeded in Cloud Monitoring and request quota increases ahead of growth events.
Managed instance group (MIG) healing loops. If a health check path returns non-200 during application startup (before the process is ready), the MIG will destroy and recreate the instance repeatedly. Always implement a dedicated /healthz endpoint and set initial_delay_sec on the auto-healing policy to cover your worst-case cold-start time.
Firewall rule priority conflicts. GCP firewall rules are evaluated by priority (0–65535, lower number = higher priority), and at equal priority a deny rule beats an allow rule. The implied deny-all-ingress rule sits at priority 65535, so any matching allow rule overrides it, but an allow rule given a numerically large (low-priority) value such as 65534 will silently lose to a deny rule at 1000. Always verify effective rules with gcloud compute firewall-rules list --sort-by=priority.
Backend drain time misconfiguration. The LB backend service has a connection_draining_timeout_sec (300 s by default in the Terraform resource). During rolling updates, in-flight requests that run longer than this are forcibly terminated when their instance is removed. Tune this to match your p99 request latency plus a buffer.
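The firewall priority rule is the one on-call engineers most often misread, so here is a small sketch that works out which rule wins. It assumes you already have the priority and action of every rule that matches a given packet; effective_action is a hypothetical helper, not a gcloud feature.

```shell
# Given "priority action" pairs for rules that all match the same packet,
# print the action that takes effect. The lowest priority number wins;
# at equal priority, deny sorts ahead of allow.
effective_action() {
  printf '%s\n' "$@" | sort -k1,1n -k2,2r | head -n 1 | cut -d' ' -f2
}

# Example from the text: the allow at 65534 loses to the deny at 1000
# effective_action "65534 allow" "1000 deny"
```

The pairs can be pulled from gcloud compute firewall-rules list, but working out which rules actually match a packet (targets, sources, ports) is still up to you.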

GCP's compute and networking layer rewards engineers who take the time to understand its global-first design. The VPC model, the LB architecture, and live migration together enable a class of always-on, globally distributed systems that would require significantly more configuration complexity on other clouds.