In a previous article we described why we think Docker is a great choice for building Continuous Delivery pipelines. We’ve found it to simplify deployments, speed them up and generally improve the quality and robustness of our application delivery.
This has been particularly true in our MicroService based environment. MicroService architectures bring a lot of advantages, but the DevOps and deployment challenges surrounding that architecture are significant. We’ve found that Docker has been just the tool we needed to meet this challenge.
As promised, in this second part we will drill down into more of the technical detail as to how we have delivered a Docker Based Continuous Deployment pipeline for one of our clients.
We architected the system in question as a set of interacting MicroServices.
As the application is a web application, the idea was that any given page would be rendered from multiple backend collaborating MicroServices. We also wanted to get to the stage where each MicroService could be released independently, allowing different work streams to move forward at different paces without compromising system stability.
The system is designed in a Cloud environment on Amazon EC2. In a Cloud environment, horizontal deployment is important for both scalability and resilience.
This combination of factors gives us an interesting problem. Say we end up having ~25 services, with each being deployed to 2 boxes in 3 EC2 availability zones. This gives us a massive 150 processes to build, run, deploy and monitor. MicroServices quickly explode in number and you need significant DevOps skills and resource to tame this problem.
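The arithmetic behind that figure is worth spelling out, because it shows how quickly the estate grows as any one of the three factors changes. A minimal sketch follows; the function and parameter names are purely illustrative.

```python
def process_count(services: int, boxes_per_zone: int, zones: int) -> int:
    """Total number of service processes running across the whole estate."""
    return services * boxes_per_zone * zones


# The figures quoted above: ~25 services, 2 boxes each, 3 availability zones.
print(process_count(services=25, boxes_per_zone=2, zones=3))  # 150
```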
This was one of the primary reasons that we looked at Docker. By adding in the container abstraction layer, all 25 services could essentially be treated the same within our build, deployment and runtime processes. What is in those containers, and whether they are pushed to 1 or 1000 machines, is pretty much immaterial – it’s all just about moving containers around.
From this starting point, we also wanted to get to the point of having a really elastic pool of Docker containers that could scale up and down in response to load. The idea was that new containers could be added to a pool, dynamically registering and deregistering with Elastic Load Balancers as they come online or offline. New EC2 instances could also come up and join that same pool when more capacity is needed.
How Our Builds Are Packaged
Each of our MicroServices exists in its own GitHub repository.
Each repository has an associated Dockerfile that describes how the corresponding Docker image for that project is built. That Dockerfile might extend from some base Docker image, add in application code, property files and other dependencies.
We generally put one MicroService inside one Docker image. In many Docker use cases it would definitely make sense to put a group of services or infrastructure such as a message broker or database server into the same image as the application, but we like to put all of our components into separate images for deployment flexibility and horizontal scaling.
The Dockerfile in each of our repositories inherits from a base image which brings in elements common to all of our services. For instance, we want all of our Docker images to expose SSHD, to be running SupervisorD and to have certain base frameworks in place such as Node.js and Java.
This base layer adds consistency to our environments, makes them easier to maintain for future upgrades, and further speeds up builds. Though we only have one layer in our tree of images, it would definitely make sense to make this tree deeper, building up layers of abstraction within your Docker image definitions to aid maintenance.
We initially evaluated ThoughtWorks GO as our CI server, but found that in a MicroService environment, we didn’t need the complex release orchestration or pipelines that GO exposes and visualises – everything was a one step build. We also missed some of the plugins in the Jenkins ecosystem, so we moved back to Jenkins even though our impression of ThoughtWorks GO was generally positive.
Every time we check in to GitHub, Jenkins is called via a post commit hook and builds our code, then runs the unit tests and integration tests against it, as is typical in a continuous integration setup. If that passes, we then proceed to building a Docker image with a simple ‘docker build’ command that uses the bundled Dockerfile to build the image.
When the Docker image is built, the Continuous Integration server pushes the Docker image into a private Docker registry hosted on EC2. The Docker registry is just an image that was pulled down from index.docker.io and run on a big private EC2 instance. We hardened that registry by putting NGINX in front of it and mapping it onto local storage so that images are retained between restarts of the registry.
As images are pushed into the Docker registry they are versioned using the Jenkins build number. This gives us an audit trail as to which images correspond with which builds. Versions also form the basis of our rollback.
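To illustrate the versioning scheme, here is a minimal Python sketch of how an image reference could be derived from the Jenkins build number. It is an illustration under assumptions rather than our actual build code: the registry host `registry.example.internal` and the function names are hypothetical. Jenkins does expose the build number to jobs as the `BUILD_NUMBER` environment variable.

```python
import os


def image_reference(registry: str, service: str, build_number: str) -> str:
    """Return an image reference tagged with the CI build number."""
    return f"{registry}/{service}:{build_number}"


def build_and_push_commands(registry: str, service: str, build_number: str) -> list:
    """Return the docker commands (as argument lists) a CI job might run."""
    ref = image_reference(registry, service, build_number)
    return [
        ["docker", "build", "-t", ref, "."],  # build from the bundled Dockerfile
        ["docker", "push", ref],              # publish to the private registry
    ]


# Fall back to "0" when run outside Jenkins; the registry host is hypothetical.
build = os.environ.get("BUILD_NUMBER", "0")
for command in build_and_push_commands("registry.example.internal", "shopping-cart-service", build):
    print(" ".join(command))
```

Because the tag is just the build number, rolling back amounts to running the image with an earlier tag.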
We use Makefiles to wrap up some of the complexity of this. Our Jenkins simply runs a ‘make docker-deploy’ which wraps the command for pushing the image to the server. Make is obviously a bit long in the tooth but does the job for us without any ceremony.
For configuration management we use Ansible. All of our infrastructure, SSH keys, users, Linux server configuration etc. are managed through source controlled Ansible playbooks. We preferred Ansible over Puppet or Chef due to its lightweight, agentless model, though we do also have a lot of love for Puppet & Chef in the right context.
We also make use of Ansible for pushing and pulling our Docker images around. Though we have 25 MicroServices and 5 or 6 different classes of machine, the code to do this is very short and concise – generally just one line of code to pull the image onto one machine of interest. Ansible does provide a specific Docker module, but we generally just use the SSH module as we find the code is simpler to understand that way.
The flow here is that after Jenkins has built an image and pushed it into the registry successfully, it then triggers an Ansible playbook for distributing and running the image on the correct set of servers. Jenkins manages the build of the image, but Ansible manages the deployment and orchestration.
A particular win for us was to use Ansible’s dynamic inventory features. Rather than hard coding machine names and categorising them in inventory files, we build our list of servers to target dynamically from Amazon EC2 tags. Ansible supports this via the Boto Python library. The Ansible documentation on dynamic inventory covers how to set this up.
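For readers unfamiliar with dynamic inventory, the sketch below shows the shape of the idea in Python: group hosts by an EC2 tag and emit the JSON structure that Ansible expects from an inventory script (groups with a `hosts` list, plus `_meta.hostvars`). It deliberately avoids calling AWS; the instance dictionaries, the `service` tag key and the function name are assumptions made for illustration, and a real setup would obtain the instance list through Boto.

```python
import json
from collections import defaultdict


def build_inventory(instances: list, tag_key: str = "service") -> dict:
    """Group instances by the value of one tag, in Ansible inventory JSON form."""
    groups = defaultdict(list)
    hostvars = {}
    for instance in instances:
        group = instance.get("tags", {}).get(tag_key)
        if group is None:
            continue  # untagged machines are not deployment targets
        host = instance["private_ip"]
        groups[group].append(host)
        hostvars[host] = {"availability_zone": instance["zone"]}
    inventory = {name: {"hosts": sorted(hosts)} for name, hosts in groups.items()}
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


# Hypothetical instance records standing in for an EC2 API response.
sample = [
    {"private_ip": "10.0.1.5", "zone": "eu-west-1a", "tags": {"service": "shopping-cart"}},
    {"private_ip": "10.0.2.7", "zone": "eu-west-1b", "tags": {"service": "shopping-cart"}},
    {"private_ip": "10.0.1.9", "zone": "eu-west-1a", "tags": {}},
]
print(json.dumps(build_inventory(sample), indent=2))
```

A playbook can then target `hosts: shopping-cart` without any machine name being written down anywhere.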
One piece where we had to do some work was around versioning. The Docker registry doesn’t inherently support the concept of versioning, so we have to manually add it using the Jenkins version numbers.
We also wanted to get to the point of maintaining, say, 3 Docker images for each service on the relevant boxes to facilitate rollback. This involved a little shell scripting to ensure that only 3 were kept in place on each deployment, with the oldest one being removed when appropriate. Cleaning up the corresponding images was also important here to avoid eating up space.
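The retention rule itself is simple enough to sketch. The version below is written in Python for readability and is not the shell script mentioned above; it assumes image tags are integer Jenkins build numbers, and the function name is hypothetical.

```python
def split_for_retention(build_numbers: list, keep: int = 3) -> tuple:
    """Return (builds to keep, builds to remove), newest first."""
    ordered = sorted(set(build_numbers), reverse=True)
    return ordered[:keep], ordered[keep:]


kept, removed = split_for_retention([101, 99, 100, 98, 102])
print(kept)     # [102, 101, 100]
print(removed)  # [99, 98]
```

Each build number in the second list would then be passed to whatever removes the stopped container and its image from the box.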
We have a number of environments in our Continuous Delivery pipeline where different types of tests occur. Docker is obviously a huge win in this situation as we can ensure that the same set of Docker.
Our images are physically promoted through environments through the Jenkins build promotion process. In some instances we automatically promote builds, in others we have a manual process as we. The promotion simply calls the same Ansible playbook as above but with a different environment passed in as a parameter.
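As a rough sketch of what "the same playbook with a different environment" can look like, the Python snippet below assembles an `ansible-playbook` invocation with the environment passed as an extra variable via `-e`. The playbook name, the inventory path and the variable names (`target_env`, `service`, `version`) are all hypothetical.

```python
def deploy_command(playbook: str, environment: str, service: str, version: str,
                   inventory: str = "inventory/ec2.py") -> list:
    """Build the ansible-playbook argument list for one service and environment."""
    extra_vars = f"target_env={environment} service={service} version={version}"
    return ["ansible-playbook", playbook, "-i", inventory, "-e", extra_vars]


# Promoting build 42 from one environment to the next changes a single argument.
print(deploy_command("deploy.yml", "staging", "shopping-cart-service", "42"))
print(deploy_command("deploy.yml", "prod", "shopping-cart-service", "42"))
```

The returned list can be handed directly to `subprocess.run` by whichever job performs the promotion.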
Service Discovery comes up a lot as a challenge in a containerised environment, but for the most part, this hasn’t been an issue for us.
Most of our services communicate via REST over HTTP, so we find it easiest to use Elastic Load Balancers for our services to discover one another. For instance, our services can point at shopping-cart-service.continocloud.com and as long as something is behind there, they will get their request serviced.
We do have a simple system to extract URLs such as the above into a central configuration file to avoid the situation where endpoint URLs become spread through the codebase. This also works well as a form of simple service discovery where we need it. For instance, when requesting a URL we can also specify an availability zone as a filter if we want to be sure we will retrieve an endpoint in a given availability zone.
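A lookup of this kind can be sketched in a few lines of Python. This is an illustration of the idea rather than our actual configuration system: the table layout, the zone-specific URL on `example.com` and the `lookup` function are assumptions.

```python
from typing import Optional

# Central table of endpoints; zone None means "the load-balanced default".
ENDPOINTS = {
    "shopping-cart": [
        {"url": "http://shopping-cart-service.continocloud.com", "zone": None},
        {"url": "http://shopping-cart-eu-west-1a.example.com", "zone": "eu-west-1a"},  # hypothetical
    ],
}


def lookup(service: str, zone: Optional[str] = None, endpoints: dict = ENDPOINTS) -> str:
    """Return the URL for a service, optionally restricted to one availability zone."""
    candidates = endpoints[service]
    if zone is not None:
        candidates = [entry for entry in candidates if entry["zone"] == zone]
    if not candidates:
        raise LookupError(f"no endpoint for {service!r} in zone {zone!r}")
    return candidates[0]["url"]


print(lookup("shopping-cart"))
print(lookup("shopping-cart", zone="eu-west-1a"))
```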
At an appropriate point we will introduce ZooKeeper or ETCD when we feel that we need additional features in this space, but it is not proving a problem for us yet.
Environment Specific Configuration
This also comes up a lot as a question around Docker in production. We obviously have the same Docker image but
The way we have done this is to ship configuration for all environments within the Docker image. Then, when we start the Docker image and turn it into a container, we pass in an environment variable which the application uses to know its environment.
This sounds messy and against one of the principles of good Continuous Delivery, but is actually fairly idiomatic – for instance how Rails is started with RAILS_ENV=prod or Node is started with NODE_ENV=test.
This is one area where a solution such as ZooKeeper or ETCD would add sophistication, but it’s not something that is causing us any pains as yet.
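In code, the pattern amounts to a table of per-environment settings baked into the image plus a single lookup at startup. The Python sketch below assumes a variable named `APP_ENV` and invented setting names and values; none of these come from our services.

```python
import os
from typing import Mapping

# Settings for every environment ship inside the image (values are illustrative).
CONFIG_BY_ENVIRONMENT = {
    "test": {"log_level": "DEBUG", "cart_url": "http://shopping-cart-service.test.example.com"},
    "prod": {"log_level": "WARNING", "cart_url": "http://shopping-cart-service.continocloud.com"},
}


def load_config(environ: Mapping[str, str] = os.environ, variable: str = "APP_ENV") -> dict:
    """Pick the settings block named by the environment variable, or fail loudly."""
    name = environ.get(variable)
    if name not in CONFIG_BY_ENVIRONMENT:
        raise ValueError(f"{variable} must be one of {sorted(CONFIG_BY_ENVIRONMENT)}, got {name!r}")
    return CONFIG_BY_ENVIRONMENT[name]


print(load_config({"APP_ENV": "test"})["log_level"])  # DEBUG
```

The variable would be supplied when the container is started, for example with the `-e` option of `docker run`.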
A big decision with Docker deployment is how you manage ports.
We took the decision to defer the decision to runtime. Essentially, all of our services, whether a web server, a database server, a messaging endpoint or an application, listen on port 5000 as their primary port. It is only when we bring them up on a host that we specify a port.
Ports are further abstracted away by sitting behind Elastic Load Balancers. For instance, shopping-cart-service.continocloud.com listens on port 80, which might map to port 5060 on the host, which maps to 5000 inside the Docker container.
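The host-to-container half of that chain corresponds to the `-p host:container` option of `docker run`. A minimal Python sketch that assembles such a command is shown below; the image reference is hypothetical, and 5060 is simply the example host port used above.

```python
CONTAINER_PORT = 5000  # every service listens here inside its container


def docker_run_command(image: str, host_port: int, container_port: int = CONTAINER_PORT) -> list:
    """Build a detached `docker run` that maps a host port onto the container port."""
    return ["docker", "run", "-d", "-p", f"{host_port}:{container_port}", image]


# Load balancer :80 -> host :5060 -> container :5000
print(" ".join(docker_run_command("registry.example.internal/shopping-cart-service:42", 5060)))
```

Since the container side never changes, two services can share a host simply by being given different host ports.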
The main tool in our arsenal is a small tool we wrote that pings REST endpoints and URLs to ensure they are up and available, displaying any failed responses on a simple dashboard. With so many endpoints, this dashboard is proving invaluable in quickly checking the health of the estate.
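The core of such a checker is small. The Python sketch below is an assumption-laden illustration, not the tool described above: it uses only the standard library, treats any non-2xx status or connection failure as a failure, and leaves the dashboard out entirely. The function names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status for a GET request, or None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:  # server answered with 4xx/5xx
        return err.code
    except (urllib.error.URLError, OSError):  # DNS failure, refused, timeout
        return None


def failed_endpoints(urls: Iterable[str],
                     fetch: Callable[[str], Optional[int]] = http_status) -> dict:
    """Map each unhealthy URL to its status (None means no response at all)."""
    failures = {}
    for url in urls:
        status = fetch(url)
        if status is None or not 200 <= status < 300:
            failures[url] = status
    return failures
```

Passing the fetch function in as a parameter keeps the failure logic testable without touching the network; a scheduler or dashboard would call `failed_endpoints` with the real list of service URLs.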
As well as this, we use a combination of AWS monitoring, CollectD, StatsD and Graphite to monitor key metrics about the boxes and the application.
We do still have work to do on monitoring, but we feel that the combination above covers the main areas.
So those are the technical details and reasoning behind our choice to use Docker as core to our Continuous Delivery efforts. We firmly believe that it’s a massive enabler for Continuous Delivery, especially in a SOA or MicroService based environment such as ours. Please reach out with any questions, comments or feedback.
Are you interested in building a similar Continuous Delivery pipeline or leveraging Docker in another way as part of your build, deploy, or operational processes? If so, please get in touch with us for an informal chat.