In nature, cells can regenerate themselves, and ideally you would like your servers to do the same thing.
At Ahoy.io, we used the kops tool to set up a Kubernetes cluster. It is really easy, but if you have an aversion to the command line, you can set up a cluster with a few clicks on Google Container Engine.
The cluster consists of:
- 3 masters
- N number of nodes
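A cluster like this can be created with a single kops command. This is only a sketch: the cluster name, S3 state bucket and node count below are placeholders, not Ahoy's real configuration.

```shell
# Create a cluster with 3 masters spread across availability zones
# and N worker nodes (here: 3). Requires AWS credentials and an S3
# bucket for kops state -- both values below are placeholders.
kops create cluster \
  --name=cluster.example.com \
  --state=s3://example-kops-state \
  --master-zones=eu-west-1a,eu-west-1b,eu-west-1c \
  --zones=eu-west-1a,eu-west-1b,eu-west-1c \
  --node-count=3 \
  --yes
```

Passing three `--master-zones` is what gives you one master per availability zone.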
The masters' responsibility is to watch over the nodes and ensure that the state of the cluster reflects our definition: which containers should be running, which should be killed and removed.
Each master is deployed in a separate availability zone (a, b, c). In our case it's eu-west (Ireland), as this is the cheapest region.
In the case of a master failure, the AWS AutoScaling group will recognise it and automatically recreate the master.
In the meantime, the remaining masters keep operating and keeping our servers tidy.
This way the cluster can auto-heal its own supervisors.
A node is a server whose only responsibility is running containers with your code.
If you tell Kubernetes that you want one set of containers for your main API, replicated twice, then the Kubernetes masters will make sure this is the case and that the containers are scheduled on random nodes.
Let’s say we have a set of containers (Kubernetes calls it a Pod) for our backend API and frontend web app.
Containers for API:
- python-django-app with all the code & logic
- nginx for serving static files (images, stylesheets, fonts)
- memcache: fast & efficient cache
Containers for Frontend:
- nginx for serving pre-built frontend web app
Obviously, we want to have this replicated, so we tell Kubernetes that this should have two copies each.
What Kubernetes needs to do now is deploy 8 containers (3 for the API, 1 for the frontend; times two) to random nodes.
It also needs to constantly make sure that the containers are still running and responsive.
That’s why we have masters, which do not run any containers and just supervise.
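Declaring the API Pod with its two replicas looks roughly like this. A minimal sketch: the image names are assumptions, not Ahoy's actual manifests.

```yaml
# Sketch of the API Deployment described above: one Pod template with
# three containers, replicated twice. Image names are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2            # two copies, as described above
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: django-app       # Python/Django code & logic
          image: example/api:latest
        - name: nginx            # serves static files
          image: nginx:stable
        - name: memcache         # fast & efficient cache
          image: memcached:latest
```

The frontend gets an analogous Deployment with a single nginx container and `replicas: 2`.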
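"Running and responsive" is declared per container via a liveness probe: the kubelet on each node keeps probing the container, and Kubernetes restarts it when the probe keeps failing. A sketch only: the `/healthz` endpoint and port are assumptions.

```yaml
# Added to a container spec: probe an HTTP endpoint every 15 seconds;
# repeated failures cause the container to be restarted.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 15
```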
Because the masters are constantly watching over the cluster, when a node fails (a server going down), the AWS AutoScaling group recreates a replacement node in our datacenter, and Kubernetes sees that some containers died and reschedules them.
That means servers can die like flies and, with enough replicas, users won’t even notice. That’s why we needed at least 2 replicas: if a node fails, the containers on that node are not accessible.
You also need enough servers (nodes) to handle full user traffic even when part of the nodes are dead.
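That sizing is simple arithmetic. A sketch with made-up numbers (the traffic and per-node capacity figures are assumptions, not Ahoy's measurements):

```python
import math

def nodes_needed(peak_rps: int, node_capacity_rps: int, tolerated_failures: int) -> int:
    """Nodes required so peak traffic still fits while some nodes are dead."""
    serving = math.ceil(peak_rps / node_capacity_rps)  # nodes needed at peak
    return serving + tolerated_failures                # headroom for dead nodes

# Hypothetical figures: 900 req/s at peak, 200 req/s per node,
# and we want to survive 2 nodes dying at once.
print(nodes_needed(900, 200, 2))  # 5 serving nodes + 2 spare = 7
```

The point is that the failure headroom is capacity you pay for all the time, not something the autoscaler conjures up fast enough during an outage.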
But wait, how can you just do replicas?
Ok, so I haven’t mentioned an important thing about our configuration.
All our containers are stateless.
What does stateless mean? It means that inside the servers and containers we do not keep any important data.
The definition of important data: if you can remove it and no one notices, it was not important.
That’s why we can safely host the cache there, as it is a nonpersistent helper for our operations.
Like the tenth scratch pad you carry with you because you lost the previous nine: we can lose another one and just buy a new one.
Where is the state then?
Having self-healing & scalable state is hard.
Like really hard.
We are lazy, i.e. efficient, so we don’t want to spend time fixing database errors, making backups and making sure there is no data corruption.
What do you do when you don’t know how to do stuff? You hire an expert!
All of Ahoy’s state is kept in PostgreSQL databases managed by Compose.com (who host it in the same datacenter as our cluster).
For keeping files around (uploaded avatars and generated invoices), we use B2 Cloud from Backblaze, although AWS S3 is a better option if you are already hosted on AWS (we will be moving there someday).
Is this really self-healing?
If any of our nodes die:
- users are not affected and do not even notice
- AWS will recreate the node
- Kubernetes will reschedule the missing Pods to another node
Since our data lives in another castle, we can create new nodes and kill old ones as we wish, giving us the power to scale both vertically (faster servers) and horizontally (more servers).
Nature is not perfect
A scratch or a small cut will heal itself; a lost limb won’t regrow.
In nature, if there is too much damage, then no healing can happen. The same situation is possible here:
- all masters can die, so there is no one to supervise the nodes
- if all nodes hosting the same replica (our API) die at the same time, the API is unavailable until new nodes are created and the containers rescheduled
- too many dying nodes can slow down our services, as there are fewer resources to handle traffic
Being anti-fragile is not about being invincible. It’s about adjusting to constant errors happening around us.
If you accept that every 12 hours a server dies, and every hour a random container is removed, then you start thinking about your system differently.
At Ahoy, each separate API assumes that the others can be dead. If the B2 Cloud file store does not respond, you can still search for flights, but your invoice won’t have a PDF. Once it’s back, the PDF will be regenerated.
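That "assume the others are dead" pattern can be sketched like this. The file-store client and response fields below are hypothetical, not Ahoy's real API:

```python
# Sketch of graceful degradation: invoice data is always returned,
# while the PDF from the (hypothetical) file store is optional.
from typing import Optional

class FileStoreDown(Exception):
    """Raised when the file store (e.g. B2 Cloud) is unreachable."""

def fetch_pdf(invoice_id: str, store) -> Optional[bytes]:
    try:
        return store.get(f"invoices/{invoice_id}.pdf")
    except FileStoreDown:
        return None  # degrade: no PDF, but the request still succeeds

def invoice_response(invoice_id: str, store) -> dict:
    pdf = fetch_pdf(invoice_id, store)
    return {"id": invoice_id, "pdf_ready": pdf is not None}

class BrokenStore:
    def get(self, key):  # simulates B2 being down
        raise FileStoreDown()

print(invoice_response("42", BrokenStore()))  # {'id': '42', 'pdf_ready': False}
```

A background job can later notice `pdf_ready` is false and regenerate the PDF once the store is back.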
This can be achieved only when you always assume the worst shit happens.