Long Polling, SSE & WebSockets Compared

The web was built on a request/response model: the client asks, the server answers, the connection closes. That model works beautifully for loading pages and submitting forms, but it breaks down when you need the server to push an event to the client — a new chat message, a live stock tick, a build-pipeline status update, a friend's location on a map. Three architectural patterns fill that gap: Long Polling, Server-Sent Events (SSE), and WebSockets. Each makes a different trade-off between simplicity, scalability, and capability.

Why Standard HTTP Polling Fails

The naive approach is short polling: the client fires a request every N seconds and asks "any news?" If N = 1 s and you have 100 000 connected users, you generate 100 000 HTTP round-trips per second, most of them returning empty responses. An event can wait up to N seconds before the client sees it (N/2 on average), server CPU is wasted on empty responses, and CDNs cannot cache dynamic feeds. Short polling is rarely acceptable unless you have only a handful of users or can tolerate delays of 30 s or more.
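The arithmetic above can be sketched in a few lines. This is a rough model only: the function name is hypothetical, it ignores network round-trip time, and it assumes events arrive at uniformly random moments within the polling interval.

```python
def short_polling_cost(users: int, interval_s: float) -> dict:
    """Estimate server load and delivery delay for short polling.

    Assumes every user polls exactly once per interval and that events
    occur at uniformly random times between two polls.
    """
    return {
        "requests_per_second": users / interval_s,
        "avg_delay_s": interval_s / 2,    # an event waits half an interval on average
        "worst_delay_s": interval_s,      # an event that fires right after a poll
    }


cost = short_polling_cost(users=100_000, interval_s=1.0)
print(cost)
```

Raising the interval cuts the request rate but raises the delay by the same factor, which is the trade-off that makes short polling unattractive for real-time features.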

Long Polling

Long polling is a clever workaround within plain HTTP: the client sends a request, and the server holds it open until an event occurs (or a timeout fires, typically 20–30 s). When the server has something to say it responds immediately. The client processes the response and immediately issues the next pending request, recreating the "always-waiting" effect.

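Latency: low. The server responds as soon as the event fires, so delivery takes roughly one network trip. Events that fire between a response and the client's next request wait until the client has re-polled.
Overhead: each message costs a full HTTP request/response cycle with complete headers (plus a new TCP/TLS handshake if the connection is not kept alive). At high event rates this becomes expensive.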
Scaling: each waiting request holds a thread or file descriptor on the server. With 10 000 concurrent users you need 10 000 open connections simultaneously — manageable with async I/O but painful with thread-per-connection servers.
Infrastructure: works everywhere — load balancers, proxies, CDNs, firewalls all understand HTTP. No special configuration needed.
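The "hold the request until an event or a timeout" behaviour can be sketched with asyncio. This is a minimal illustration, not a web server: long_poll is a hypothetical handler, the asyncio.Queue stands in for whatever event source the server has, and the 200/204 status codes are an assumed convention for "event delivered" and "timed out, poll again".

```python
import asyncio


async def long_poll(events: asyncio.Queue, timeout_s: float = 25.0):
    """Hold one request open until an event arrives or the timeout fires."""
    try:
        event = await asyncio.wait_for(events.get(), timeout=timeout_s)
        return 200, event          # respond immediately with the event
    except asyncio.TimeoutError:
        return 204, None           # nothing happened; the client re-polls


async def demo():
    events = asyncio.Queue()

    # Case 1: no event occurs, so the held request times out.
    empty = await long_poll(events, timeout_s=0.05)

    # Case 2: an event is published while the request is waiting.
    pending = asyncio.create_task(long_poll(events, timeout_s=1.0))
    await asyncio.sleep(0.01)
    await events.put("build finished")
    delivered = await pending
    return empty, delivered


empty, delivered = asyncio.run(demo())
print(empty, delivered)
```

Because each waiting request is only a suspended coroutine here, a single thread can hold many of them, which is why async I/O copes with long polling far better than thread-per-connection servers.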

When to pick Long Polling: when you cannot upgrade the infrastructure (legacy proxies, strict firewalls), when event frequency is low (a few per minute), or when you need to support very old browsers or REST-only API gateways. Twilio and early Slack used long polling as the fallback transport.

Long Polling: the server holds each request until an event fires or a timeout expires, then the client immediately re-queues the next request.

Server-Sent Events (SSE)

SSE is a standardized protocol (originally a W3C Recommendation, now part of the WHATWG HTML Living Standard) built on a plain HTTP response with Content-Type: text/event-stream. The client opens one persistent HTTP connection; the server streams UTF-8 text events down it indefinitely, each event terminated by a blank line. The browser's built-in EventSource API handles reconnection automatically.
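To make the wire format concrete, here is a small sketch of how a server might serialize one event for a text/event-stream response. The helper name format_sse is hypothetical; the field names (id, event, data) and the blank-line terminator come from the SSE specification.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload needs one "data:" line per line of text.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


print(format_sse("build passed", event="status", event_id="42"), end="")
```

The server writes each such string to the open response and flushes it; the browser's EventSource parses the fields and dispatches the event.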

Direction: strictly server → client (unidirectional). The client cannot send data back over the same channel — it uses separate XHR/fetch for that.
Latency: same as WebSockets — events delivered as soon as written to the stream, typically sub-10 ms on LAN, 30–100 ms over the internet.
Protocol overhead: very low after the initial handshake. Each event is just plain text (a few tens of bytes). HTTP/2 multiplexing means SSE streams do not block other requests on the same connection.
Reconnection & IDs: the server sends an id: field; the browser stores it and sends a Last-Event-ID header on reconnect, so the server can replay missed events and resume without gaps, provided it has retained them.
Browser support: all modern browsers. Not supported in IE11 (irrelevant for most new systems).
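The resumption step can be sketched as a server-side lookup. The assumptions here are illustrative and not part of the SSE standard: event IDs are increasing integers, the server keeps a replay buffer of recent events, and a client that connects without Last-Event-ID gets no replay.

```python
from typing import List, Optional, Tuple


def events_to_replay(buffer: List[Tuple[int, str]],
                     last_event_id: Optional[str]) -> List[Tuple[int, str]]:
    """Return buffered events the client missed while disconnected.

    buffer holds (id, data) pairs in ascending id order.
    last_event_id is the raw Last-Event-ID header value, or None.
    """
    if last_event_id is None:
        return []                  # fresh connection: live events only
    last_seen = int(last_event_id)
    return [(eid, data) for eid, data in buffer if eid > last_seen]


buffer = [(1, "tick 100.1"), (2, "tick 100.4"), (3, "tick 100.2")]
missed = events_to_replay(buffer, "1")
print(missed)
```

If the buffer no longer contains the requested ID, the server cannot resume without gaps and has to fall back to something like a full state snapshot.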

HTTP/2 and SSE: under HTTP/1.1 browsers limit connections per origin to ~6, and that limit is shared across all tabs, so a seventh SSE stream to the same origin stalls until another connection closes. HTTP/2 multiplexes streams over a single TCP connection, replacing this cap with a negotiated stream limit that is typically around 100. Always ensure your SSE endpoint is served over HTTP/2 in production.

WebSockets

WebSockets establish a full-duplex, persistent TCP channel. The connection starts as an HTTP/1.1 Upgrade request (Connection: Upgrade, Upgrade: websocket). Once the server responds with 101 Switching Protocols, the HTTP layer is discarded and both sides exchange lightweight frames carrying text or binary payloads, with no headers re-sent per message.
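One detail of the upgrade handshake that the description above skips: per RFC 6455, the client sends a random Sec-WebSocket-Key header and the server must answer with a Sec-WebSocket-Accept value derived from it. The sketch below shows that derivation; the function name is hypothetical, while the GUID constant and the sample key are taken from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
accept = websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")
print(accept)
```

The check proves that the server actually understood the WebSocket request rather than echoing a cached or unrelated HTTP response, which is one reason intermediaries that do not understand the upgrade tend to break the handshake.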

Direction: bidirectional. Client and server both push at any time.
Overhead per message: 2–14 bytes of framing header (vs. hundreds of bytes for HTTP headers). At 1 000 messages/s this difference is significant.
Latency: as low as the network allows once the connection is open, since only a small frame header accompanies each message. Typically 1–5 ms on a LAN and 30–80 ms over the internet.
State: the connection is stateful — the server must track which process/thread holds each socket. This complicates horizontal scaling.
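Infrastructure friction: some older load balancers, proxies, and corporate firewalls do not support WebSocket upgrades. You may need explicit configuration: in nginx, proxy_http_version 1.1, forwarding of the Upgrade and Connection headers, and a longer proxy_read_timeout so idle sockets are not closed; in ALBs, sticky sessions or WebSocket-aware routing.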
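The 2–14 byte figure follows from the frame layout in RFC 6455: 2 fixed bytes, an extended length field of 0, 2, or 8 bytes depending on payload size, and a 4-byte masking key on client-to-server frames. The sketch below computes it; the function name and the 500-byte HTTP header size used for comparison are assumptions for illustration.

```python
def frame_header_size(payload_len: int, masked: bool) -> int:
    """Return the WebSocket frame header size in bytes (RFC 6455 layout)."""
    size = 2                       # FIN/opcode byte + mask bit/7-bit length byte
    if payload_len >= 65536:
        size += 8                  # 64-bit extended payload length
    elif payload_len >= 126:
        size += 2                  # 16-bit extended payload length
    if masked:
        size += 4                  # masking key, required on client-to-server frames
    return size


# Rough comparison at 1 000 small client-to-server messages per second.
ASSUMED_HTTP_HEADER_BYTES = 500    # assumption: "hundreds of bytes" per request
ws_overhead = 1_000 * frame_header_size(100, masked=True)
http_overhead = 1_000 * ASSUMED_HTTP_HEADER_BYTES
print(ws_overhead, http_overhead)
```

Under these assumptions the framing overhead differs by roughly two orders of magnitude, which is why the gap matters at high message rates and is negligible at a few messages per minute.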

SSE (left) provides a one-way server-to-client stream over standard HTTP; WebSockets (right) provide full-duplex framing over a single upgraded TCP connection.

Side-by-Side Comparison

Dimension	Long Polling	SSE	WebSockets
Direction	Server → Client (simulated)	Server → Client (native)	Bidirectional
Protocol	HTTP/1.1+	HTTP/1.1+ (H2 recommended)	ws:// / wss://
Per-message overhead	High (full HTTP headers)	Low (text lines only)	Minimal (2–14 B frames)
Reconnection	Manual (client re-polls)	Automatic (browser EventSource)	Manual (app library)
Proxy/firewall compat.	Excellent	Good (plain HTTP)	Moderate (needs upgrade support)
Horizontal scaling	Easy (stateless between polls)	Moderate (long-lived conn.)	Hard (stateful, sticky sessions)
Best for	Low-frequency updates, legacy infra	Live feeds, dashboards, notifications	Chat, games, collaborative editing

Scaling Considerations

Each of these transports creates a persistent open connection (long polling approximates one), which means your server holds state for each user. At 500 000 concurrent users:

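A naive thread-per-connection server would need 500 000 threads, which is impractical given per-thread stack memory and context-switching cost. Use event-driven I/O (Node.js, Go, Nginx, Netty, or async Python), which can handle hundreds of thousands of open sockets in a single process using OS-level epoll/kqueue.
For WebSockets, each socket lives on the node that accepted it, so a message produced on one node must reach sockets held on others. Either route each user to a predictable node with sticky sessions (consistent hashing at the load balancer), or put a pub/sub bus (Redis Pub/Sub, Kafka) behind the nodes so any server node can fan out a message to sockets held on other nodes.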
SSE avoids the sticky-session problem if you route events through a shared queue — any node can pick up the event and write it to its local SSE clients.
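The pub/sub fan-out pattern from the list above can be sketched in memory. Everything here is a stand-in: Bus plays the role of Redis Pub/Sub or Kafka, Node is one backend process, and each "socket" is just a list that collects delivered messages. The class and channel names are hypothetical.

```python
from collections import defaultdict


class Bus:
    """In-memory stand-in for a pub/sub bus such as Redis Pub/Sub."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in self._subscribers[channel]:
            callback(message)


class Node:
    """One backend process holding its own set of client connections."""

    def __init__(self, name, bus):
        self.name = name
        self.sockets = {}          # client_id -> list standing in for a socket
        bus.subscribe("events", self._deliver_local)

    def connect(self, client_id):
        self.sockets[client_id] = []

    def _deliver_local(self, message):
        # Each node writes the message only to the sockets it holds itself.
        for inbox in self.sockets.values():
            inbox.append(message)


bus = Bus()
node_a, node_b = Node("a", bus), Node("b", bus)
node_a.connect("alice")
node_b.connect("bob")

# The publisher does not need to know which node holds which client.
bus.publish("events", "price=101.5")
print(node_a.sockets, node_b.sockets)
```

The same shape works for SSE and WebSockets alike: the bus decouples "who produced the event" from "which process holds the connection".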

The "thundering herd" on reconnect: if 200 000 SSE or WebSocket clients disconnect simultaneously (server restart, network blip) and all reconnect at once, the spike can overwhelm your service. Add jittered exponential backoff in the client reconnection logic (e.g., wait 1–16 s + random jitter). The browser's native EventSource does NOT add backoff: it retries after a fixed delay, which the server can adjust with the retry: field. If you need backoff for SSE, you must implement it yourself in a wrapper around EventSource.
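A jittered exponential backoff along those lines might look like the sketch below. The function name, the 1 s base, the 16 s cap, and the choice of adding up to 50% random jitter on top of the exponential delay are all assumptions; other jitter schemes are equally reasonable.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 16.0,
                    jitter_fraction: float = 0.5, rng=random) -> float:
    """Seconds to wait before reconnect number `attempt` (0-based).

    The delay doubles on each attempt up to cap_s, then a random extra of up
    to jitter_fraction of that delay is added so clients spread out.
    """
    delay = min(cap_s, base_s * (2 ** attempt))
    return delay * (1 + rng.uniform(0, jitter_fraction))


for attempt in range(6):
    print(attempt, round(reconnect_delay(attempt), 2))
```

The client would reset attempt to 0 after a connection has stayed healthy for a while, so a brief blip does not leave it on the longest delay.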

Real-World Usage at Scale

GitHub Actions live logs — SSE. Simple one-way stream; plain HTTP, works through corporate proxies.
Slack — started with long polling, migrated to WebSockets for real-time messaging. Now uses a custom protocol on top of WebSockets with heartbeats and message acknowledgements.
Figma / Google Docs collaborative editing — WebSockets with Operational Transform (OT) or CRDT over the duplex channel.
Financial dashboards (Bloomberg, Robinhood) — SSE for market data feeds (server-initiated, high-frequency, no need for client→server on the data channel).
Uber driver location — WebSockets (bidirectional: driver sends GPS, rider receives updates).

WebSocket horizontal scaling: a pub/sub bus (Redis or Kafka) lets any backend node fan out a message to sockets held on different WS nodes, removing the need for every message to hit the same process.

Decision Framework

Ask yourself two questions to pick the right transport:

Does the client need to send data in real-time over the same channel? If yes, you need WebSockets. If no, SSE or long polling can work.
How frequently do events arrive? Low-frequency (a few per minute) and legacy infrastructure → long polling. Medium-to-high frequency, browser clients, server-to-client only → SSE. High-frequency, bidirectional, latency-critical → WebSockets.
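The two questions can be encoded as a small function. This is only a rough rendering of the framework above: the function name is hypothetical, and the threshold of 5 events per minute for "low-frequency" is an assumed reading of "a few per minute".

```python
LOW_FREQUENCY_PER_MIN = 5  # assumption: what "a few per minute" means


def choose_transport(client_sends_realtime: bool, events_per_minute: float,
                     legacy_infra: bool = False) -> str:
    """Pick a transport using the two questions from the decision framework."""
    # Question 1: does the client need to send over the same channel?
    if client_sends_realtime:
        return "websockets"
    # Question 2: how often do events arrive, and what does the infra allow?
    if legacy_infra and events_per_minute <= LOW_FREQUENCY_PER_MIN:
        return "long polling"
    return "sse"


print(choose_transport(True, 600))                     # chat-like workload
print(choose_transport(False, 2, legacy_infra=True))   # rare updates, old proxies
print(choose_transport(False, 120))                    # live dashboard
```

Real decisions involve more inputs (client platforms, existing gateways, team experience), so treat the function as a summary of the framework, not a replacement for it.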

Hybrid approach: many production systems use SSE for receiving real-time data (lower infrastructure cost, works everywhere) and plain fetch() for client-to-server commands. This is simpler to scale and debug than WebSockets for most use cases. Reach for WebSockets only when you genuinely need bidirectionality at low latency.