Automating the Modern Datacenter with Terraform and Consul
At CraftConf 2015, Mitchell Hashimoto argued that current provisioning and configuration tooling is not adequate for orchestrating the ‘modern datacenter’. The modern datacenter is agile and elastic, and the ‘services’ required for deploying applications, such as compute resources, DNS and CDNs, may be spread across potentially disparate vendor platforms. Hashimoto introduced two Hashicorp tools, Terraform and Consul, which may be used to provide automation in these challenging environments.
Hashimoto, founder of Hashicorp and project lead of Vagrant and Packer, began the talk by providing a historical overview of datacenter technology. From an organisation’s perspective, utilisation of a typical datacenter has evolved from a single physical server, through to several bare metal servers, and ultimately multiple virtualised instances. The latest trend in this evolution is the move towards containerisation. The complexity of provisioning, deployment and maintenance has increased along this evolution, and the need for automation has become paramount. Tooling such as CFEngine, Chef, Puppet and Ansible has emerged to meet these initial requirements.
Hashimoto observed that, with the prevalence of public and private cloud technology, we are now operating within a ‘modern datacenter’ that brings new challenges. Technologies that were once integrated with the core infrastructure stack are now shifting to service-based offerings, for example, DNS, CDNs and databases. Organisations are also increasingly building their infrastructure platforms using multiple disparate vendors. These two changes add a layer of complexity over traditional provisioning, and Hashimoto argued that this requirement is not met by current tooling:
You can have as much Chef and Puppet automation as you want to set up your application, but if you don’t have an automated way to set up all of the services it needs as well, then what’s the point? Your applications won’t work anyway...
The core activities when working with a datacenter often consist of acquisition, provision, update and destruction of resources, such as servers, data stores and load balancers. Historically these processes were slow and the results were relatively static, but now these activities are fast and the output is elastically scalable. An example of this can be seen when provisioning compute resources. Within a traditional datacenter a server had to be purchased, physically racked and provisioned, and was deployed as a fixed unit. However, within a modern datacenter a compute instance is acquired via an API call, configuration is specified at boot time, and the instance can often be scaled vertically in-situ, or more instances can easily be added for horizontal scaling.
Hashimoto argued that a human cannot take advantage of the speed and elasticity offered within a modern datacenter - we must automate. The requirements for automating the modern datacenter were presented as:
- Zero to deployment in one command
- Resiliency through distributed systems
- Autoscaling, auto-healing
- Better teamwork through codified knowledge
Hashimoto introduced Hashicorp’s Terraform tool, which attempts to meet the identified requirements for building, combining and launching infrastructure efficiently across datacenters and disparate vendors. For example, Terraform can launch an Amazon Web Services (AWS) EC2 compute instance, launch a DigitalOcean Droplet compute instance, and then configure access to these via a Dyn DNS service. Terraform allows the declarative definition of infrastructure in a human-friendly text format, and Terraform modules can be created that are responsible for conducting specific lower-level configuration actions.
Terraform can be activated by a single command, ‘terraform apply’. However, a preview of the activities that will be conducted by an apply can be obtained by using the ‘terraform plan’ command. The output of a planning run is an ordered list of changes to be undertaken that will converge the current infrastructural state towards the required declarative definition. A planning run also indicates whether the changes can be performed in-situ, or will be destructive in nature (for example, re-starting a server). This information can be used to determine whether an operation is appropriate for the specific time, for example, if a maintenance window is open.
The output of the plan command can also be persisted to a file to allow deterministic execution of infrastructure changes at a later date. Hashimoto argued that the ability to preview infrastructure changes is one of the most important features of Terraform. The combination of changes to the infrastructure code and the resulting provisioning plan can be used within current development workflows, such as creating pull-requests, reviewing diffs and accepting changes.
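As a rough illustration of what a planning run computes, the following Python sketch compares a current state with a desired declarative definition and produces a list of changes, flagging those that are destructive. It is not Terraform's implementation: the resource names, the attributes and the `FORCES_REPLACEMENT` set are hypothetical, and changes are ordered alphabetically here rather than by the dependencies between resources.

```python
from dataclasses import dataclass

# Hypothetical attributes that cannot be changed in place; changing one of
# them forces the resource to be destroyed and recreated.
FORCES_REPLACEMENT = {"image", "region"}


@dataclass(frozen=True)
class Change:
    action: str        # "create", "update", "replace" or "destroy"
    resource: str
    destructive: bool


def plan(current: dict, desired: dict) -> list[Change]:
    """Compute the changes that would move `current` towards `desired`."""
    changes = []
    for name in sorted(desired):
        if name not in current:
            changes.append(Change("create", name, False))
            continue
        # Collect every attribute whose value differs between the two states.
        keys = desired[name].keys() | current[name].keys()
        changed = {k for k in keys
                   if desired[name].get(k) != current[name].get(k)}
        if not changed:
            continue  # already converged, nothing to do
        if changed & FORCES_REPLACEMENT:
            changes.append(Change("replace", name, True))
        else:
            changes.append(Change("update", name, False))
    # Anything that exists but is no longer declared is scheduled for removal.
    for name in sorted(current):
        if name not in desired:
            changes.append(Change("destroy", name, True))
    return changes


current = {
    "web": {"image": "ubuntu-14.04", "size": "small"},
    "db": {"image": "ubuntu-14.04", "size": "large"},
    "legacy": {"image": "ubuntu-12.04", "size": "small"},
}
desired = {
    "web": {"image": "ubuntu-14.04", "size": "medium"},
    "db": {"image": "ubuntu-15.04", "size": "large"},
    "cache": {"image": "ubuntu-14.04", "size": "small"},
}

for change in plan(current, desired):
    marker = " (destructive)" if change.destructive else ""
    print(f"{change.action:8} {change.resource}{marker}")
```

Because the plan is plain data, it can be reviewed, stored and applied later, which is the property that makes the pull-request style workflow possible.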
Hashimoto suggested that prior to Terraform, there could be an incredible responsibility placed on an operations team when managing a production stack, as they had to deeply understand the current cloud platform(s), determine the current infrastructure state, and calculate the resulting state transitions. Hashimoto argued that some operators or DevOps engineers may want to ‘move up the stack’ and leverage tools such as Terraform to achieve their goals, in much the same way as many developers have moved from assembly language to third-generation programming languages.
This is where I make the distinction between core operators and application operators. In every company there are the operators who understand how to make a highly-available database cluster, and there are those who don't, but want a highly-available database cluster. You can either educate them to do this, or give them the abstraction [that Terraform provides].
The second part of the presentation introduced Hashicorp’s Consul tool, which provides service discovery, config and orchestration in a highly-available manner across datacenters. Hashimoto stated that Consul can be used to answer questions within an organisation’s infrastructure such as ‘where is service X’, ‘what is the health of service Y instances’, ‘what is currently running’, ‘what is the config of service Z’ and ‘is anyone else performing operation A within my platform?’.
Consul provides service discovery via DNS or an HTTP API, and can enable the discovery of internal or external services across datacenters. Health checks are implemented as shell scripts, which allows the creation of custom service verification protocols. A highly-available key-value store is also offered by Consul, which provides the ability to expose consistent values that can be used to ‘tune’ configuration parameters that would not necessarily warrant the execution of configuration management tools. Examples of these tunable actions include specifying the location of services, indicating that the system is in maintenance mode, or setting service QoS parameters.
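The HTTP interface described above can be sketched in Python as follows. The sketch assumes a local Consul agent listening on the default HTTP port 8500 and uses the `/v1/catalog/service/<name>` and `/v1/kv/<key>` endpoints; the key `config/maintenance` and the convention of storing the string `true` in it are hypothetical choices made for this example. The parsing functions are kept separate from the network call so they can be exercised without a running agent.

```python
import base64
import json
from urllib.request import urlopen

# Assumption: a local Consul agent on the default HTTP port.
CONSUL = "http://127.0.0.1:8500"


def parse_service_nodes(body: str) -> list[tuple[str, int]]:
    """Extract (address, port) pairs from a catalog service response."""
    nodes = json.loads(body)
    # Prefer the service-specific address when present, else the node address.
    return [(n.get("ServiceAddress") or n["Address"], n["ServicePort"])
            for n in nodes]


def parse_kv_value(body: str) -> str:
    """Decode the first entry of a KV response; values are base64-encoded."""
    entry = json.loads(body)[0]
    if entry["Value"] is None:
        return ""
    return base64.b64decode(entry["Value"]).decode("utf-8")


def http_get(path: str) -> str:
    """Fetch a path from the agent's HTTP API and return the body as text."""
    with urlopen(CONSUL + path, timeout=5) as resp:
        return resp.read().decode("utf-8")


def where_is(service: str) -> list[tuple[str, int]]:
    """Answer 'where is service X' using the catalog."""
    return parse_service_nodes(http_get(f"/v1/catalog/service/{service}"))


def in_maintenance_mode() -> bool:
    """Read a hypothetical maintenance flag from the key-value store."""
    return parse_kv_value(http_get("/v1/kv/config/maintenance")) == "true"
```

With an agent running, `where_is("web")` would return the addresses and ports registered for a service named `web`, and an application could check `in_maintenance_mode()` at runtime rather than waiting for a configuration management run.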
Hashimoto stated that Consul also provides a set of orchestration primitives: asynchronous ‘events’ can be broadcast via UDP across the datacenter, synchronous ‘exec’ instructions can be issued via TCP to specific machines, and ‘watches’ can be specified that implement long-polling to react to events or execs.
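The long-polling behaviour behind a watch can be sketched as a loop that remembers the last index it saw and only reacts when that index changes. In Consul's HTTP API this corresponds to a blocking query, where the client passes `index` and `wait` query parameters and reads the new index from the `X-Consul-Index` response header. The `watch` function and the fake fetcher below are hypothetical and exist only to show the control flow without requiring a running agent.

```python
from typing import Callable, Iterable, Iterator, Tuple

# A fetcher receives the last index seen and returns (new_index, payload).
# Against a real agent it would issue a blocking GET such as
# /v1/kv/<key>?index=<last_index>&wait=30s and read X-Consul-Index.
Fetcher = Callable[[int], Tuple[int, str]]


def watch(fetch: Fetcher, max_polls: int) -> Iterator[str]:
    """Yield the payload each time the index reported by `fetch` changes."""
    index = 0
    for _ in range(max_polls):
        new_index, payload = fetch(index)
        if new_index != index:
            # The watched data changed since the last poll, so react to it.
            index = new_index
            yield payload
        # Otherwise the request timed out with no change; poll again.


def fake_fetcher(responses: Iterable[Tuple[int, str]]) -> Fetcher:
    """Build a fetcher that replays canned responses (for illustration)."""
    it = iter(responses)
    return lambda last_index: next(it)


responses = [
    (5, "maintenance=false"),
    (5, "maintenance=false"),   # timeout, nothing changed
    (9, "maintenance=true"),
]
for payload in watch(fake_fetcher(responses), max_polls=3):
    print("changed:", payload)
```

The handler runs twice for the three polls above, because the second response carries the same index as the first and is therefore treated as a timeout rather than a change.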
Additional details of Mitchell Hashimoto’s ‘Automating the Modern Datacenter, Development to Production’ CraftConf talk, including the video recording, can be found on the conference website. Terraform v0.5 is available for download at the Terraform.io website, and Consul v0.5 is available for download at the Consul.io website.