Long Polling, SSE & WebSockets Compared
The web was built on a request/response model: the client asks, the server answers, the connection closes. That model works beautifully for loading pages and submitting forms, but it breaks down when you need the server to push an event to the client — a new chat message, a live stock tick, a build-pipeline status update, a friend's location on a map. Three architectural patterns fill that gap: Long Polling, Server-Sent Events (SSE), and WebSockets. Each makes a different trade-off between simplicity, scalability, and capability.
Why Standard HTTP Polling Fails
The naive approach is short polling: the client fires a request every N seconds and asks "any news?" If N = 1 s and you have 100 000 connected users, you generate 100 000 HTTP round-trips per second, most of them returning empty responses. An event can wait up to N seconds before the next poll notices it (N/2 on average), server CPU is wasted, and CDNs cannot cache dynamic feeds. Short polling is rarely acceptable unless you have only a handful of users or can tolerate delays of 30 s or more.
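The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes every user polls at exactly the same fixed interval and that events occur at uniformly random times.

```python
def short_polling_load(users: int, interval_s: float) -> float:
    """Requests per second when every user polls once per interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return users / interval_s


def mean_delivery_delay(interval_s: float) -> float:
    """Average wait before the next poll notices an event (assumes events
    occur at uniformly random times within the polling interval)."""
    return interval_s / 2


if __name__ == "__main__":
    for interval in (1.0, 5.0, 30.0):
        load = short_polling_load(100_000, interval)
        delay = mean_delivery_delay(interval)
        print(f"N={interval:>4} s  load={load:>9.0f} req/s  mean delay={delay} s")
```

Stretching the interval cuts the request rate proportionally, but the delivery delay grows by the same factor, which is the trade-off the other three patterns try to escape.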
Long Polling
Long polling is a clever workaround within plain HTTP: the client sends a request, and the server holds it open until an event occurs (or a timeout fires, typically 20–30 s). When the server has something to say it responds immediately. The client processes the response and immediately issues the next pending request, recreating the "always-waiting" effect.
- Latency: near-zero — the server responds as soon as the event fires.
- Overhead: each message costs a full HTTP request/response cycle (request and response headers, plus a new TCP/TLS handshake whenever the connection is not kept alive). At high event rates this becomes expensive.
- Scaling: each waiting request holds a thread or file descriptor on the server. With 10 000 concurrent users you need 10 000 open connections simultaneously — manageable with async I/O but painful with thread-per-connection servers.
- Infrastructure: works everywhere — load balancers, proxies, CDNs, firewalls all understand HTTP. No special configuration needed.
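The hold-until-event-or-timeout behaviour described above can be sketched with asyncio. This is a minimal illustration rather than a production server: `LongPollChannel` and its methods are hypothetical names, there is no HTTP layer, and a single in-memory queue stands in for whatever event source a real handler would wait on.

```python
import asyncio
from typing import Optional


class LongPollChannel:
    """Holds a poll open until an event arrives or the timeout fires."""

    def __init__(self) -> None:
        # Create this inside a running event loop (see demo below).
        self._queue = asyncio.Queue()

    def publish(self, event: str) -> None:
        self._queue.put_nowait(event)

    async def poll(self, timeout_s: float = 25.0) -> Optional[str]:
        try:
            # The "held" request: wait here without blocking a thread.
            return await asyncio.wait_for(self._queue.get(), timeout=timeout_s)
        except asyncio.TimeoutError:
            # A real server would send an empty response; the client re-polls.
            return None


async def demo():
    channel = LongPollChannel()
    # No event pending: the poll is held, then times out.
    empty = await channel.poll(timeout_s=0.05)
    # An event arrives while a poll is pending: it is answered right away.
    pending = asyncio.create_task(channel.poll(timeout_s=5.0))
    await asyncio.sleep(0.01)
    channel.publish("build finished")
    delivered = await pending
    return empty, delivered


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the waiting happens in a coroutine rather than a thread, thousands of held requests cost mostly memory and file descriptors, which is why async I/O makes long polling manageable.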
Server-Sent Events (SSE)
SSE is a web standard (originally published by the W3C, now maintained as part of the WHATWG HTML Living Standard) built on top of a plain HTTP response with Content-Type: text/event-stream. The client opens one persistent HTTP connection; the server streams text events down it indefinitely, each made of field: value lines and terminated by a blank line. The browser's built-in EventSource API handles reconnection automatically.
- Direction: strictly server → client (unidirectional). The client cannot send data back over the same channel — it uses separate XHR/fetch for that.
- Latency: same as WebSockets — events delivered as soon as written to the stream, typically sub-10 ms on LAN, 30–100 ms over the internet.
- Protocol overhead: very low after the initial handshake. Each event is just plain text (a few tens of bytes). HTTP/2 multiplexing means SSE streams do not block other requests on the same connection.
- Reconnection & IDs: the server sends an id: field; the browser stores it and sends Last-Event-ID on reconnect, enabling gapless resumption.
- Browser support: all modern browsers. Not supported in IE11 (irrelevant for most new systems).
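To make the wire format and the Last-Event-ID mechanism concrete, here is a small sketch of how a server might serialize events and replay missed ones. The helper names (`format_sse`, `replay_after`) are illustrative, and the use of increasing integer IDs kept in an in-memory log is an assumption; the protocol itself treats IDs as opaque strings.

```python
from typing import List, Optional, Tuple


def format_sse(data: str, event_id: Optional[str] = None,
               event: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload is sent as one "data:" line per line of text.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def replay_after(log: List[Tuple[int, str]],
                 last_event_id: Optional[str]) -> str:
    """Re-send events newer than the Last-Event-ID header value.

    Assumes IDs are increasing integers; returns "" for a fresh client.
    """
    if last_event_id is None:
        return ""
    cutoff = int(last_event_id)
    return "".join(format_sse(data, event_id=str(i))
                   for i, data in log if i > cutoff)


if __name__ == "__main__":
    log = [(1, "tick 101.2"), (2, "tick 101.4"), (3, "tick 101.1")]
    print(replay_after(log, "1"), end="")
```

Gapless resumption only works if the server keeps enough history to replay; the browser's part is limited to remembering the last ID and sending it back.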
WebSockets
WebSockets establish a full-duplex, persistent TCP channel. The connection starts as an HTTP/1.1 Upgrade request (Connection: Upgrade, Upgrade: websocket). Once the server responds with 101 Switching Protocols, the HTTP layer is discarded and both sides communicate with lightweight binary frames — no headers re-sent per message.
- Direction: bidirectional. Client and server both push at any time.
- Overhead per message: 2–14 bytes of framing header (vs. hundreds of bytes for HTTP headers). At 1 000 messages/s this difference is significant.
- Latency: as low as the network allows; typically 1–5 ms on a LAN and 30–80 ms over the internet, depending mainly on the distance between client and server.
- State: the connection is stateful — the server must track which process/thread holds each socket. This complicates horizontal scaling.
- Infrastructure friction: some older load balancers, proxies, and corporate firewalls do not support WebSocket upgrades. You may need explicit configuration (proxy_read_timeout in nginx, sticky sessions or WebSocket-aware routing in ALBs).
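Two details of the protocol can be illustrated in a few lines: the Sec-WebSocket-Accept value the server returns alongside 101 Switching Protocols, and where the 2–14 byte framing figure comes from. The sketch below follows RFC 6455 (the accept header is not mentioned above but is part of the same handshake); the function names are illustrative, and the code only computes values, it does not open a socket.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def frame_header_size(payload_len: int, masked: bool) -> int:
    """Bytes of framing header for one frame with the given payload length."""
    size = 2                  # FIN/opcode byte + mask bit/7-bit length byte
    if payload_len > 65535:
        size += 8             # 64-bit extended payload length
    elif payload_len > 125:
        size += 2             # 16-bit extended payload length
    if masked:
        size += 4             # masking key (client-to-server frames)
    return size


if __name__ == "__main__":
    print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # sample key from RFC 6455
    for length, masked in [(10, False), (200, True), (70_000, True)]:
        print(length, masked, frame_header_size(length, masked))
```

The smallest case (short unmasked server frame) is 2 bytes and the largest (masked frame with a 64-bit length) is 14 bytes, which matches the range quoted above.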
Side-by-Side Comparison
| Dimension | Long Polling | SSE | WebSockets |
|---|---|---|---|
| Direction | Server → Client (simulated) | Server → Client (native) | Bidirectional |
| Protocol | HTTP/1.1+ | HTTP/1.1+ (H2 recommended) | ws:// / wss:// |
| Per-message overhead | High (full HTTP headers) | Low (text lines only) | Minimal (2–14 B frames) |
| Reconnection | Manual (client re-polls) | Automatic (browser EventSource) | Manual (app library) |
| Proxy/firewall compat. | Excellent | Good (plain HTTP) | Moderate (needs upgrade support) |
| Horizontal scaling | Easy (stateless between polls) | Moderate (long-lived conn.) | Hard (stateful, sticky sessions) |
| Best for | Low-frequency updates, legacy infra | Live feeds, dashboards, notifications | Chat, games, collaborative editing |
Scaling Considerations
Each of these transports creates a persistent open connection (long polling approximates one), which means your server holds state for each user. At 500 000 concurrent users:
- A naive thread-per-connection server would need 500 000 threads, which is impossible in practice. Use event-driven I/O (Node.js, Go, Nginx, Netty, or async Python), which can handle hundreds of thousands of open sockets on a single process using OS-level epoll/kqueue.
- For WebSockets, you need sticky sessions (consistent hashing at the load balancer) or a pub/sub bus (Redis Pub/Sub, Kafka) so any server node can fan out a message to sockets held on other nodes.
- SSE avoids the sticky-session problem if you route events through a shared queue — any node can pick up the event and write it to its local SSE clients.
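The pub/sub fan-out idea from the list above can be sketched without any external service. In this illustration `InMemoryBus` is a stand-in for Redis Pub/Sub or Kafka, `Node` represents one server process, and each client's connection is modelled as a plain list of delivered messages; all of these names and simplifications are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBus:
    """Stand-in for a shared pub/sub bus such as Redis Pub/Sub or Kafka."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> None:
        for handler in list(self._subscribers[channel]):
            handler(message)


class Node:
    """One server process holding its own set of client connections."""

    def __init__(self, name: str, bus: InMemoryBus, channel: str) -> None:
        self.name = name
        # user id -> messages written to that user's socket or SSE stream
        self.outboxes: Dict[str, List[str]] = {}
        bus.subscribe(channel, self._fan_out)

    def connect(self, user_id: str) -> None:
        self.outboxes[user_id] = []

    def _fan_out(self, message: str) -> None:
        # Each node writes only to the connections it holds locally.
        for outbox in self.outboxes.values():
            outbox.append(message)


if __name__ == "__main__":
    bus = InMemoryBus()
    node_a, node_b = Node("a", bus, "room-1"), Node("b", bus, "room-1")
    node_a.connect("alice")
    node_b.connect("bob")
    bus.publish("room-1", "hello")  # any node or backend job can publish
    print(node_a.outboxes, node_b.outboxes)
```

The publisher never needs to know which node holds which connection, so the load balancer is free to place clients anywhere.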
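Note that EventSource reconnects after a roughly fixed delay (the server can adjust it with a retry: field) and is not guaranteed to apply exponential backoff, so after an outage every client may reconnect at about the same moment. If you wrap SSE in your own client, you must implement backoff yourself.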
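One common way to add backoff is exponential growth with random jitter. The sketch below shows only the delay calculation; the function name, the 1 s base, and the 30 s cap are illustrative assumptions, not values taken from any specification.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Delay before reconnect attempt number `attempt` (0-based).

    Exponential backoff with "full jitter": pick a random delay between 0
    and min(cap, base * 2^attempt) so clients do not reconnect in lockstep.
    """
    exponent = min(attempt, 32)  # keep 2**exponent small enough for floats
    ceiling = min(cap_s, base_s * (2 ** exponent))
    return random.uniform(0.0, ceiling)


if __name__ == "__main__":
    for attempt in range(6):
        print(attempt, round(reconnect_delay(attempt), 2))
```

A client loop would sleep for `reconnect_delay(attempt)` after each failed connection and reset `attempt` to zero once a connection succeeds.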
Real-World Usage at Scale
- GitHub Actions live logs — SSE. Simple one-way stream; plain HTTP, works through corporate proxies.
- Slack — started with long polling, migrated to WebSockets for real-time messaging. Now uses a custom protocol on top of WebSockets with heartbeats and message acknowledgements.
- Figma / Google Docs collaborative editing — WebSockets with Operational Transform (OT) or CRDT over the duplex channel.
- Financial dashboards (Bloomberg, Robinhood) — SSE for market data feeds (server-initiated, high-frequency, no need for client→server on the data channel).
- Uber driver location — WebSockets (bidirectional: driver sends GPS, rider receives updates).
Decision Framework
Ask yourself two questions to pick the right transport:
- Does the client need to send data in real-time over the same channel? If yes, you need WebSockets. If no, SSE or long polling can work.
- How frequently do events arrive? Low-frequency (a few per minute) and legacy infrastructure → long polling. Medium-to-high frequency, browser clients, server-to-client only → SSE. High-frequency, bidirectional, latency-critical → WebSockets.
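The two questions can be written down as a small decision function. This is a simplified reading of the framework, not a rule to apply mechanically: the threshold of five events per minute for "low frequency" is an arbitrary assumption, and the function name and parameters are hypothetical.

```python
# Assumed cut-off for "a few per minute"; tune for your own system.
LOW_FREQUENCY_PER_MINUTE = 5


def choose_transport(needs_client_push: bool,
                     events_per_minute: float,
                     legacy_infra: bool = False) -> str:
    """Pick a transport from the two questions in the decision framework."""
    # Question 1: does the client send real-time data over the same channel?
    if needs_client_push:
        return "websockets"
    # Question 2: how often do events arrive, and what does the infra allow?
    if legacy_infra and events_per_minute <= LOW_FREQUENCY_PER_MINUTE:
        return "long polling"
    return "sse"


if __name__ == "__main__":
    print(choose_transport(True, 600))                     # chat or game
    print(choose_transport(False, 2, legacy_infra=True))   # rare updates
    print(choose_transport(False, 120))                    # live dashboard
```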
A sensible default is SSE for server-to-client events plus fetch() for client-to-server commands. This is simpler to scale and debug than WebSockets for most use cases. Reach for WebSockets only when you genuinely need bidirectionality at low latency.