Minimum Requirements to run a Docker Swarm Cluster
The requirements for creating a Docker Swarm cluster are minimal indeed. In fact, it is feasible (though perhaps not best practice) to run the Swarm daemon on an existing Docker host, which makes it possible to implement without adding any more hardware or virtual resources. In addition, when using the file or nodes based discovery mechanism, no other infrastructure (besides Docker, of course) is required to run a basic Docker Swarm cluster.
I personally believe that spinning up another machine to run the Swarm master itself is a good idea. The machine does not have to be heavy in resources, but it does need to have a lot of file descriptors to handle all of the TCP connections coming and going. In the examples, I use dockerswarm01 as a dedicated Swarm master.
There are a variety of configuration settings in Swarm that are sane by default, but give a lot of flexibility when it comes to running the daemon and its supporting infrastructure. Listed below are the different categories of config options and how each can be configured.
Discovery is the mechanism Swarm uses in order to maintain the status of the cluster. It can operate with a variety of backends, but it’s all pretty much the same concept:
- The backend maintains a list of Docker nodes that should be part of the cluster.
- Using the list of nodes, Swarm healthchecks each one and keeps track of the nodes that are in and out of the cluster.
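The two steps above can be sketched in a few lines of Python. This is only an illustration of the concept, not how Swarm itself is implemented: the `tcp_probe` and `split_by_health` names are hypothetical, the addresses are made up, and treating "a TCP connection succeeds" as "healthy" is an assumption.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def tcp_probe(address: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds (assumed health signal)."""
    host, port = address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


def split_by_health(
    nodes: Iterable[str],
    probe: Callable[[str], bool] = tcp_probe,
) -> Tuple[List[str], List[str]]:
    """Split the backend's node list into (healthy, unhealthy) using the probe."""
    healthy: List[str] = []
    unhealthy: List[str] = []
    for node in nodes:
        (healthy if probe(node) else unhealthy).append(node)
    return healthy, unhealthy


# Usage with a fake probe so the example does not touch the network.
known_up = {"10.0.0.11:2375"}  # hypothetical address
healthy, unhealthy = split_by_health(
    ["10.0.0.11:2375", "10.0.0.12:2375"],
    probe=lambda address: address in known_up,
)
print(healthy, unhealthy)
```

Passing the probe in as a parameter keeps the bookkeeping separate from the check itself, which is roughly the split between a discovery backend and the healthchecking described above.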
Node discovery requires that everything be passed in on the command line. This is the most basic type of discovery mechanism as it requires no maintenance of config files or anything like that. An example startup command for the Swarm daemon using node discovery would look like:
File discovery utilizes a configuration file placed on the filesystem (for example, /etc/swarm/cluster_config) with the format of <IP>:<Port> to list the Docker hosts in the cluster. Even though the list is static, healthchecking is used to determine which nodes are healthy and which are unhealthy, and to filter out requests going to the unhealthy nodes. An example of a file based discovery startup line and configuration file would be:
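As a rough illustration of the file format, the sketch below parses a cluster file into host and port pairs. It assumes one `<IP>:<Port>` entry per line, which the text above does not state explicitly; the `parse_cluster_config` helper and the sample addresses are hypothetical.

```python
from typing import List, Tuple


def parse_cluster_config(text: str) -> List[Tuple[str, int]]:
    """Parse '<IP>:<Port>' entries, assuming one per line; blank lines are skipped."""
    nodes: List[Tuple[str, int]] = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        host, sep, port = line.rpartition(":")
        if not sep or not host or not port.isdecimal():
            raise ValueError(f"line {lineno}: expected <IP>:<Port>, got {line!r}")
        nodes.append((host, int(port)))
    return nodes


# Hypothetical contents of a file such as /etc/swarm/cluster_config.
SAMPLE = """\
10.0.0.11:2375
10.0.0.12:2375
"""

print(parse_cluster_config(SAMPLE))
```

Failing loudly on a malformed line is a design choice for the sketch; a static list is only useful if every entry in it is something the healthchecker can actually dial.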
Consul discovery is also supported out of the box by Docker Swarm. It works by utilizing Consul’s key/value store to keep its list of <IP>:<Port> entries used to form the cluster. In this configuration mode, each Docker host runs a Swarm daemon in join mode that is pointed at the Consul cluster’s HTTP interface. This adds a little overhead to the configuration, runtime, and security of a Docker host, but not a significant amount. The Swarm client would be fired up as such:
The Swarm master then reads its host list from Consul. It would be run with a startup line of:
These key/value based configuration modes raise the question of how healthchecks within Swarm work in combination with the Swarm client in join mode. Since the list in the key/value store is itself dynamic, is it required to run the internal Swarm healthchecks too? I’m not familiar with that area of functionality and so can’t speak to it, but it’s worth noting.
etcd discovery works in much the same way as Consul discovery. Each Docker host in the cluster runs a Swarm daemon in join mode pointed at an etcd endpoint. This provides a heartbeat to etcd to maintain a list of active servers in the cluster. A Docker host running the standard Docker daemon would concurrently run a Swarm client with a configuration similar to:
The Swarm master connects to etcd, looks at the path provided, and generates its list of nodes. It would be started with the following command:
ZooKeeper discovery follows the same pattern as the other key/value store based configuration modes. A ZK ensemble is created to hold the host list information, and a client runs alongside Docker in order to heartbeat in to the k/v store, maintaining the list in near real-time. The Swarm master is also connected to the ensemble and uses the information under /swarm to maintain its list of hosts (which it then healthchecks).
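The pattern shared by the key/value backends (join-mode clients heartbeat in, the master reads the list back out) can be sketched with an in-memory stand-in for the store. Everything here is hypothetical: the `HeartbeatRegistry` class, the addresses, and in particular the assumption that stale entries expire after a fixed TTL, which is my guess at how "active" is decided rather than something confirmed above.

```python
from typing import Dict, List


class HeartbeatRegistry:
    """In-memory stand-in for a key/value store holding node heartbeats."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds  # assumed expiry window
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, address: str, now: float) -> None:
        """What a join-mode client would do periodically for its Docker host."""
        self._last_seen[address] = now

    def active_nodes(self, now: float) -> List[str]:
        """What the master would read: entries whose heartbeat is still fresh."""
        return sorted(
            address
            for address, seen in self._last_seen.items()
            if now - seen <= self.ttl_seconds
        )


registry = HeartbeatRegistry(ttl_seconds=30)
registry.heartbeat("10.0.0.11:2375", now=0)
registry.heartbeat("10.0.0.12:2375", now=0)
registry.heartbeat("10.0.0.11:2375", now=25)  # only this host keeps beating
print(registry.active_nodes(now=40))  # prints ['10.0.0.11:2375']
```

Time is passed in explicitly so the behaviour is easy to reason about; a real client would use the wall clock and the store's own expiry mechanism.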
Swarm Client (alongside Docker):
Hosted Token Based Discovery (default)
I have not used this functionality and at this point have very little reason to.
Scheduling is the mechanism for choosing where a container should be created and started. It is made up of a combination of a packing algorithm and filters (or tags). Each Docker daemon is started with a set of tags like this:
Then, when a Docker container is started, Swarm will choose a group of machines based on the filters and then distribute each run command according to its scheduler. Filters tell Swarm where a container can and cannot run, while the scheduler places it amongst the available hosts. There are a few filtering mechanisms:
- Constraint: This utilizes the tags that a Docker daemon was started with. Currently it supports only ‘=’, but at some point in the future it may support ‘!=’. A node must match all of the constraints provided by a container in order to fit into scheduling. Starting a container with a few constraints would look like:
- Affinity: Affinity can work in two ways: affinity to containers or affinity to images. In order to start two containers on the same host the following command would be run:
Since Swarm does not handle image management, it is also possible to set affinity for an image. This means a container will only be started on a node that already contains the image. This negates the need to wait for an image to be pulled in the background before starting a container. An example:
- Port: The port filter will not allow any two containers with the same static port mapping to be started on the same host. This makes a lot of sense, as you cannot duplicate a port mapping on a Docker host. For example, two containers started with -p 80:80 will not be allowed to run on the same Docker host.
- Healthy: This prevents the scheduling of containers on unhealthy nodes.
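To make the filtering step concrete, here is a minimal sketch of the four filters applied to a list of nodes. The `Node` and `Request` types, the label names, and the node names are all hypothetical, and this models the behaviour described above rather than Swarm's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)  # tags the daemon was started with
    healthy: bool = True
    containers: Set[str] = field(default_factory=set)     # names of running containers
    images: Set[str] = field(default_factory=set)         # images already pulled
    used_ports: Set[int] = field(default_factory=set)     # static host ports in use


@dataclass
class Request:
    constraints: Dict[str, str] = field(default_factory=dict)
    affinity_container: Optional[str] = None
    affinity_image: Optional[str] = None
    host_ports: Set[int] = field(default_factory=set)


def eligible(node: Node, req: Request) -> bool:
    """Return True only if the node passes every filter."""
    if not node.healthy:                                   # Healthy filter
        return False
    if any(node.labels.get(k) != v for k, v in req.constraints.items()):
        return False                                       # Constraint filter ('=' only)
    if req.affinity_container and req.affinity_container not in node.containers:
        return False                                       # Affinity to a container
    if req.affinity_image and req.affinity_image not in node.images:
        return False                                       # Affinity to an image
    if req.host_ports & node.used_ports:                   # Port filter
        return False
    return True


def filter_nodes(nodes: List[Node], req: Request) -> List[Node]:
    return [node for node in nodes if eligible(node, req)]


nodes = [
    Node("node1", labels={"storage": "ssd"}, used_ports={80}),
    Node("node2", labels={"storage": "ssd"}),
    Node("node3", labels={"storage": "disk"}),
    Node("node4", labels={"storage": "ssd"}, healthy=False),
]
request = Request(constraints={"storage": "ssd"}, host_ports={80})
print([node.name for node in filter_nodes(nodes, request)])  # prints ['node2']
```

Here node1 is rejected by the port filter, node3 by the constraint, and node4 by the health check, leaving node2 for the scheduler to consider.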
Once Swarm has narrowed the host list down to a set that matches the above filters, it then schedules the container on one of the nodes. Currently the following schedulers are built in:
- Random: Randomly distribute containers across available backends.
- Binpacking: Fill up a node with containers and then move to the next. This mode has the increased complexity of having to assign static resource amounts to each container at runtime. This means setting a limit on a container’s memory and cpu which may or may not seem OK. I personally like letting the containers fight amongst themselves to see who gets the resources.
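The difference between the two schedulers can be sketched as follows. This is a simplified model, not Swarm's implementation: it assumes the filters have already produced the candidate list, it tracks memory only (in MB) and ignores CPU, and the function and node names are hypothetical.

```python
import random
from typing import Dict, List, Optional


def schedule_random(nodes: List[str], rng: Optional[random.Random] = None) -> str:
    """Random: pick any of the candidate nodes."""
    rng = rng or random.Random()
    return rng.choice(nodes)


def schedule_binpack(free_memory: Dict[str, int], needed: int) -> str:
    """Binpacking: pick the fullest node that still has room for the container."""
    fitting = {name: free for name, free in free_memory.items() if free >= needed}
    if not fitting:
        raise RuntimeError("no node has enough free memory")
    # Least free memory first; the name breaks ties deterministically.
    return min(fitting, key=lambda name: (fitting[name], name))


free = {"node1": 512, "node2": 2048, "node3": 256}  # free memory per node, in MB

first = schedule_binpack(free, needed=300)   # node1: fullest node that still fits
free[first] -= 300
second = schedule_binpack(free, needed=300)  # node1 is now full, so node2
print(first, second)  # prints: node1 node2

print(schedule_random(list(free), rng=random.Random(0)))
```

The binpacking function only works because every container declares how much memory it needs, which is exactly the extra bookkeeping described in the bullet above.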
I am happy to say that Swarm works with TLS enabled. This secures the traffic between the client and the Swarm daemon as well as between the Swarm daemon and the Docker daemons. This is good because my security guy says that there are no more borders in networks. Yey.
It does require a full PKI including CA, but I have this solved in another post already :) This is how to generate the required TLS certs for Docker and Swarm.
Once the certificates have been generated and installed as per my other blog post, the Docker and Swarm daemons can be fired up like this:
Then the client must know to connect via TLS. This is done with the following environment variables:
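For reference, the standard Docker client reads `DOCKER_HOST`, `DOCKER_TLS_VERIFY`, and `DOCKER_CERT_PATH`; whether these are exactly the variables used in the original listing is an assumption on my part. The sketch below builds such an environment in Python. The `docker_tls_env` helper, the port, and the certificate directory are hypothetical; only the `dockerswarm01` host name comes from the examples above.

```python
import os
from typing import Dict


def docker_tls_env(host: str, port: int, cert_dir: str) -> Dict[str, str]:
    """Return a copy of the current environment with Docker's TLS client settings."""
    env = dict(os.environ)
    env["DOCKER_HOST"] = f"tcp://{host}:{port}"  # where the Swarm master listens
    env["DOCKER_TLS_VERIFY"] = "1"               # verify the server certificate
    env["DOCKER_CERT_PATH"] = cert_dir           # directory with ca.pem, cert.pem, key.pem
    return env


# Port and certificate directory are placeholders, not values from the post.
env = docker_tls_env("dockerswarm01", 2376, "/etc/docker/certs")
for key in ("DOCKER_HOST", "DOCKER_TLS_VERIFY", "DOCKER_CERT_PATH"):
    print(f"{key}={env[key]}")

# The environment could then be handed to the client, for example:
#   subprocess.run(["docker", "info"], env=env, check=True)
```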
You are now set up for TLS. WCGW?
More to come!
Well there is a lot to talk about when it comes to configuration of complex clustered software, but I feel this is a good enough overview to get you up and running and thinking about how to configure your Swarm cluster. In the next episode I’ll lay out some example architectures for your Swarm cluster. Stay tuned and please feel free to comment below!
All of the research behind these blog posts was made possible due to the awesome company I work for: Rally Software in Boulder, CO. We get at least 1 hack week per quarter and it enables us to hack on awesome things like Docker Swarm. If you would like to cut to the chase and directly start playing with a Vagrant example, here is the repo that is the output of my Q1 2014 hack week efforts: