Published: August 30, 2021
Someone recently asked me about scaling web applications which use websockets and I realized I wasn’t sure how that could be architected. I recognized the challenges, and the word “Redis” popped into my brain, but I didn’t have a solution at hand. So this post is an exploration of why websockets need a bit of extra consideration before attempting to scale horizontally (adding more application instances).
Often, to scale a web application across multiple instances, it may be sufficient to deploy another application instance, configure a load balancer to point to each of the instances, and away you go. (I’ve done this a bunch of times.) But, because websocket connections are more “persistent” than “normal” HTTP sessions, a few considerations need to be made on both the load balancer (or proxy) and the application itself.
For this example I am using the following architecture, orchestrated with docker-compose: an nginx proxy acting as the load balancer, two application instances (webapp-1 and webapp-2), and a Redis container for messaging between the instances.
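As a rough sketch, that layout in docker-compose might look like the following (the image names, ports, and build paths here are assumptions; the demo repo linked at the end contains the real configuration):

version: "3"
services:
  nginx:
    image: nginx
    ports:
      - "80:80"
    volumes:
      # Mount the load balancer config described below
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
  webapp-1:
    build: ./webapp
  webapp-2:
    build: ./webapp
  redis:
    image: redis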
A simple load balancing setup with nginx can be configured as such:
upstream websockets-webapp {
    server webapp-1;
    server webapp-2;
}
In this case, each HTTP request will be round-robin’d across the configured servers. But, per the docs:
Please note that with round-robin or least-connected load balancing, each subsequent client’s request can be potentially distributed to a different server. There is no guarantee that the same client will be always directed to the same server.
This is fine for many applications, but won’t work for websockets.
Because websockets have a more persistent connection, and are full-duplex, a client request may end up on a server where no established session is in place, and the request will be rejected as invalid.
So, we need to make the sessions persistent, or “sticky”, using the ip_hash mechanism in nginx.
With ip-hash, the client’s IP address is used as a hashing key to determine what server in a server group should be selected for the client’s requests. This method ensures that the requests from the same client will always be directed to the same server except when this server is unavailable.
An alternative is to use the hash $remote_addr mechanism.
This post describes the merits of each approach.
I am using hash $remote_addr here to better spread the “clients” across the backend for testing and demonstration purposes.
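As a sketch, the earlier upstream block can be updated with the hash directive (swap in ip_hash; for the simpler IP-based variant):

upstream websockets-webapp {
    # Hash on the full client address so each client sticks to one backend
    hash $remote_addr;
    server webapp-1;
    server webapp-2;
}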
So, now the proxy is load balancing NEW clients across all instances.
In order for nginx to process websockets, we need to update the configuration with the following lines:
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
proxy_set_header Host $host;
proxy_http_version 1.1;
: Ensures HTTP/1.1 is used for the proxy connection (vs. the default of 1.0).
proxy_set_header Upgrade $http_upgrade;
: Adds an Upgrade header, with its value taken from the client’s own Upgrade header. Per the nginx documentation, “Upgrade” is a hop-by-hop header; it is not passed from a client to proxied server.
proxy_set_header Connection "Upgrade";
: Per the documentation: hop-by-hop headers including “Upgrade” and “Connection” are not passed from a client to proxied server, therefore in order for the proxied server to know about the client’s intention to switch a protocol to WebSocket, these headers have to be passed explicitly.
proxy_set_header Host $host;
: Passes the client’s original Host header through to the backend, rather than the upstream name.
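Putting it all together, a minimal nginx server block might look like this (the listen port and location path are assumptions; check the demo repo for the exact config):

server {
    listen 80;

    location / {
        proxy_pass http://websockets-webapp;

        # Required so websocket upgrade handshakes survive the proxy hop
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        proxy_set_header Host $host;
    }
}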
If you check out the demo code, you can view the /headers endpoint to see the headers coming across for the requests.
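For illustration, such an endpoint can be a few lines of Flask; this is a hypothetical sketch, not necessarily the demo’s exact implementation:

from flask import Flask, request

app = Flask(__name__)

@app.route("/headers")
def headers():
    # Echo the incoming request headers back as JSON so you can inspect
    # exactly what nginx forwarded (Upgrade, Connection, Host, etc.)
    return dict(request.headers)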
In order to cross-communicate in a pub/sub construct between application instances, we can use an in-memory key-value store like Redis, or a message streaming tool like Kafka. This is outlined in the Flask-SocketIO docs. In this example, we can use the Redis container to act as a message queue. This doc from socket.io has a great diagram visualizing how the external queue is used to communicate between instances, and how it enables additional external services to communicate with the websockets.
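Following the Flask-SocketIO docs, wiring each instance to the shared queue is essentially a one-argument change. A minimal sketch, assuming the Redis container is reachable at the hostname redis (as it would be under docker-compose service naming):

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)

# Every instance points at the same Redis-backed message queue, so an emit
# on one instance is relayed to clients connected to any other instance.
socketio = SocketIO(app, message_queue="redis://redis:6379/0")

@socketio.on("message")
def handle_message(data):
    # Calling emit on the SocketIO object (rather than the request-scoped
    # emit) broadcasts to all connected clients, across all instances.
    socketio.emit("message", data)

if __name__ == "__main__":
    socketio.run(app)

Per the same docs, an external process (a background worker, for example) can also emit to connected clients by constructing SocketIO(message_queue=...) with no app at all.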
A bit out of scope for this post, but worth mentioning, is application instance failure. In the case of an instance failure, the load balancer will re-route the connection to a new backend instance. But this backend instance will not have the OTHER instance’s session information, such as room joins. So any room joins that occurred previously are now gone. I haven’t quite cracked the nut on this yet, but with the pub/sub capabilities of Redis, I am cautiously optimistic that I can persist join information in such a way that re-connections automatically rejoin the rooms. OPPORTUNITY FOR FUTURE POST!
Websockets are awesome, but more complex than old-school HTTP or even AJAX-style polling. The complexity adds a bit of fragility that needs some extra attention and consideration. Luckily, websockets have been around long enough now that packages and patterns exist for the popular languages and platforms. With a bit of investigation and thoughtfulness, it appears to be fairly straightforward to architect a scalable solution to support applications with websockets.
Check out https://github.com/briangreunke/load-balanced-websockets for a working code example of the details I discussed previously!