
Introduction to Puppet


Every IT professional has suffered the frustration of code that breaks in production. Experienced developers pour hours, days and weeks into creating applications, only to have to patch them repeatedly after release. Quality assurance engineers are certain they’ve hit targets for high performance and low risk…on their test systems. And ops follows every deployment checklist to the letter, only to find themselves working late night after night, trying to keep these applications running (or limping along) in production.

Meanwhile, executives wring their hands and fret about all the money being spent, with such mediocre results. “Why does it take so long for us to release features, and even bug fixes?” Customers are defecting. Competitors’ technology is way ahead, and Wall Street is taking notice.

IT organizations in situations like the above are often strictly siloed. Dev, ops and test are managed separately, have different metrics and goals, may work in different buildings, and sometimes have never even met each other. These teams are likely doing their work on different technology stacks with distinct configurations. The application code may stay consistent, but nothing else does. What works on a dev’s laptop or in the QA environment often doesn’t work when deployed to production. Worst of all, no one understands the root causes of their problems.

Our founder, Luke Kanies, was one of those ops folks stuck working late nights in the data center. His dissatisfaction with the status quo led him to write the software that became Puppet.

But wait — we were just talking about organizational problems. How can software solve cultural issues and enforce collaboration? The answer is, it can’t — at least, not by itself. Puppet is a great infrastructure management platform that any system administrator can use to get work done more efficiently, even from within a siloed ops team. However, for an organization that’s ready to lift collaboration to the next level, Puppet supplies the powerful glue of a shared codebase that unifies different teams. Bear with me for a bit as we walk through how Puppet works, and discuss how it helps teams at all stages of enhancing collaboration around software development and release — an evolution that’s often referred to as DevOps.

What is Puppet?

“Puppet” really refers to two different things: the language in which code is written, and the platform that manages infrastructure.

Puppet: the language

Puppet is a simple modeling language used to write code that automates management of infrastructure. Puppet allows you to simply describe the end state you want systems (we call them “nodes”) to be in. Contrast that with procedural scripts: to write one, you need to know what it will take to get a specific system to a specific state, and be able to write those steps out correctly. With Puppet, you don’t need to know or specify the steps required to get to the end state, and you aren’t at risk of getting a bad result because you got the order wrong, or made a slight scripting error.
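A minimal sketch of declaring end state rather than steps (the file path and contents here are illustrative):

```puppet
# Declare the end state; Puppet determines how to reach it.
# Running this repeatedly is safe: if the file already matches,
# Puppet makes no changes.
file { '/etc/motd':
  ensure  => file,
  content => "Managed by Puppet\n",
  mode    => '0644',
}
```

Notice there are no steps here, and no ordering to get wrong: only a description of what the file should look like when Puppet is done.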

Also unlike procedural scripts, Puppet’s language works across different platforms. By abstracting state away from implementation, Puppet allows you to focus on the parts of the system you care about, leaving implementation details like command names, arguments, and file formats to Puppet itself. For example, you can use Puppet to manage all your users the same way, whether a user is stored in NetInfo or /etc/passwd.

This concept of abstraction is key to Puppet’s utility. It allows anyone who’s comfortable with any kind of code to manage systems at a level appropriate for their role. That means teams can collaborate better, and people can manage resources that would normally be outside their ken, promoting shared responsibility amongst teams.

Another advantage of the modeling language: Puppet is repeatable, or idempotent. Unlike scripts, which change the system every time they run, Puppet can be run over and over again; if the system is already in its desired state, Puppet will leave it in that state.

Resources

The foundation of the Puppet language is its declaration of resources. Each resource describes a component of a system, such as a service that must be running, or a package that must be installed. Some other examples of resources:

  • A user account
  • A specific file
  • A directory of files
  • Any software package
  • Any running service

It’s helpful to think of resources as building blocks that can be combined to model the desired state of the systems you manage.
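The bullet items above map directly onto resource declarations. A sketch, with illustrative names:

```puppet
# A user account
user { 'deploy':
  ensure => present,
  shell  => '/bin/bash',
}

# A software package
package { 'httpd':
  ensure => installed,
}

# A running service
service { 'httpd':
  ensure => running,
  enable => true,
}
```

Each block declares one building block of the system’s desired state; combined, they begin to model a working web server.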

This leads naturally to Puppet’s higher-level constructs, which let you combine resources in an economical way; economy is one of Puppet’s key attributes.

Types and providers

Puppet groups similar kinds of resources into types — for example, users fall into one type, files into another, and services into another. Once you have correctly described a resource type, you simply declare the desired state for that resource: Instead of saying, “run this command that starts XYZ service,” you simply say “ensure XYZ is running.”

Providers implement resource types on a specific kind of system, using the system’s own tools. The division between types and providers allows a single resource type (such as “package”) to manage packages on many different systems. For example, your “package” resources could be managed through yum on Red Hat systems, dpkg and apt on Debian-based systems, and ports on BSD systems.
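A single package declaration stays the same regardless of which provider Puppet selects; you only name a provider to override the default. A sketch (package names are illustrative):

```puppet
# Same declaration works whether the node uses yum, apt, or ports;
# Puppet picks the appropriate provider automatically.
package { 'openssl':
  ensure => installed,
}

# Overriding the default provider, which is rarely needed:
package { 'rails':
  ensure   => installed,
  provider => 'gem',
}
```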

Providers are less commonly declared by the admin, and only if she wants to change the system defaults. Providers are written into Puppet precisely so you don’t have to know how to manage each operating system or platform running on your infrastructure. Again, it’s Puppet abstracting away details you shouldn’t have to worry about. If you do need to write a provider, these are often simple Ruby wrappers around shell commands, so they are usually short and easy to create.

Types and providers enable Puppet to function across all major platforms, and allow Puppet to grow and evolve to support additional platforms beyond compute servers, such as networking and storage devices.

The example below demonstrates the simplicity of the Puppet language by showing how a new user and group are added with a shell script, contrasted with the identical action in Puppet. In the Puppet example, “user” and “group” are types, and Puppet automatically discovers the appropriate provider for your platform. The platform-specific, procedural scripts are much harder both to write and to understand.

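A sketch of the comparison, in case the original image does not render here (the user and group names are illustrative):

```puppet
# Shell equivalent (Linux only, and order-sensitive):
#   groupadd admins
#   useradd -g admins -m alice
#
# The identical action in Puppet, portable across platforms:
group { 'admins':
  ensure => present,
}

user { 'alice':
  ensure     => present,
  gid        => 'admins',
  managehome => true,
}
```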

Classes, manifests and modules

Every other part of the Puppet language exists to add flexibility and convenience to how resources are declared. Classes are Puppet’s way of separating out chunks of code, combining resources into larger units of configuration. A class could include all the Puppet code needed to install and configure NTP, for example. Classes can be created in one place and invoked in another.
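A class bundling the resources needed for NTP might look like the following sketch (package, file, and service names vary by platform; this assumes a Debian-style system):

```puppet
class ntp {
  package { 'ntp':
    ensure => installed,
  }

  file { '/etc/ntp.conf':
    ensure  => file,
    source  => 'puppet:///modules/ntp/ntp.conf',
    require => Package['ntp'],
  }

  service { 'ntp':
    ensure    => running,
    enable    => true,
    subscribe => File['/etc/ntp.conf'],
  }
}
```

Created in one place, the class is then invoked elsewhere with a one-line `include ntp`.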

Different sets of classes are applied to nodes that serve different roles. We call this “node classification” and it’s a powerful capability that allows you to manage your nodes based on their capabilities, rather than based on their names. It’s the “cattle not pets” approach to managing machines that is favored in fast-moving organizations.

Puppet language files are called manifests. The simplest Puppet deployment is a lone manifest file with a few resources. If we were to give the basic Puppet code in the above example the filename “user-present.pp,” that would make it a manifest.

Modules are a collection of classes, resource types, files, and templates, organized around a particular purpose and arranged in a specific, predictable structure. There are modules available for all kinds of purposes, from completely configuring an Apache instance to setting up a Rails application, and many, many more. Including the implementation of sophisticated features in modules allows admins to have much smaller, more readable manifests that simply call modules.
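For a hypothetical `ntp` module, that predictable structure looks roughly like this:

```
ntp/
├── manifests/
│   └── init.pp       # defines class ntp
├── files/
│   └── ntp.conf      # static files served to nodes
├── templates/        # templates rendered per-node
└── metadata.json     # module name, version, dependencies
```

Because every module follows the same layout, Puppet (and other people) can find a module’s classes, files, and templates without any extra configuration.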

One huge benefit of Puppet modules is that they are reusable. You can use modules written by other people, and Puppet has a large, active community of people who freely share modules they’ve written. That’s in addition to the modules written by Puppet Labs employees. Altogether, you’ll find more than 3,000 modules available for free download on the Puppet Forge. Many of these were created for some of the most common tasks sysadmins are responsible for, so they’ll save you a lot of time. For example, you can manage everything from simple server building blocks (NTP, SSH) to sophisticated solutions (SQL Server, F5).

Classes, manifests and modules are all just code. They can — and should, as we’ll discuss later — be checked into version control, just like any other code your organization needs.

Puppet: the platform

The language alone is not the full Puppet solution. People need to deploy Puppet code across infrastructure, periodically update code with configuration changes, remediate unintended changes, and introspect their systems to ensure everything is working as intended. To meet these needs, most customers run the Puppet solution in a master-agent structure composed of several components. Customers run one or more Puppet masters, depending on their needs. An agent is installed on each node, which then establishes a secure, signed connection with the master.

The master-agent structure is used to deploy Puppet code to nodes and to maintain the configuration of those nodes over time. Before configuring a node, Puppet compiles manifests into a catalog. Catalogs are static documents that define resources and the relationships between them. A given catalog applies to a single node, according to its job and the context in which it will do its job. A catalog defines how a node will function, and is used by Puppet to check whether a node is correctly configured, and apply a new configuration if needed.

Each node’s agent checks in periodically with a master server; during each of these regular Puppet runs, Puppet can do any of the following:

  • remediate any configurations that have drifted from what they should be
  • report on the state of nodes without making any changes
  • apply any desired configuration changes, using Puppet’s orchestration tooling
  • collect data from nodes and events, and store it for retrieval
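From the command line, these behaviors correspond to flags on the agent. A sketch (exact flags can vary by version):

```shell
# Trigger a run immediately, with verbose output
puppet agent --test

# Report what would change, without changing anything (simulation mode)
puppet agent --test --noop
```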

Puppet Labs’ commercial solution, Puppet Enterprise, adds customer support and a variety of advanced, mission-critical capabilities:

  • sophisticated node management capabilities
  • role-based access control
  • operational metrics and a reporting console

Putting it all together

Now you have a basic understanding of how Puppet works, but you may still be wondering how it can help your organization fix its deeper problems and enable people to collaborate more easily.

It all boils down to this: When you use Puppet, you are modeling your infrastructure as code. You can treat Puppet — and, by extension, your infrastructure’s configuration — just like any other code. Puppet code is easily stored and re-used. It can be shared with others on the ops team, and with people on other teams who need to manage machines. Dev and ops can use the same manifests to manage systems from the laptop dev environment all the way to production, so there are fewer nasty surprises when code is released into production. We’ve seen this yield big improvements in deployment quality.

Treating configuration as code also makes it possible for sysadmins to give devs the ability to turn on their own testing environments, so devs don’t see sysadmins as standing in their way anymore. You can even hand Puppet code to auditors, many of whom accept Puppet manifests as proof of compliance. All of this improves efficiencies, and people’s tempers, too.

Perhaps most important of all, you can check Puppet code into a shared version control tool. This gives you a controlled, historical record of your infrastructure. You can adopt the same peer review practices in ops that software developers use, so ops teams can continually improve configuration code, updating and testing until you are confident enough to commit configurations to production.

Because Puppet can run in simulation or “no-op” mode, you can also review the impact of changes before you make them. This makes deployments much less stressful, and you can roll back if needed.

By using Puppet with version control and the practices outlined above, many of our customers achieve the holy grail of continuous delivery, delivering code more frequently into production, with fewer errors. When you deploy applications in smaller increments, you get early and frequent customer feedback to tell you whether you are headed down the right road — or not. This saves you from delivering a big wad of code after six to 12 months of development, only to discover it doesn’t fit user needs, or simply doesn’t please them.

Our customers evolve the configuration of dev, test and production environments in step with application code from developers. This allows devs to work in an extremely realistic environment, often identical to production. Applications no longer break in production due to unknown configuration differences between dev and test. Devs and QA get to deploy more good software; ops no longer burns the midnight oil; and executives are finally…well, if not happy, at least they are satisfied enough to shift their focus to concerns other than IT efficiency!

Taking the first step

Admittedly, most organizations we see are pretty far from an advanced level of continuous collaboration, let alone continuous delivery. The nice thing about Puppet is that it grows and scales as your team and infrastructure grow and scale. You may not be ready yet to roll out company-wide DevOps practices — and that’s okay. Many customers use Puppet successfully as a configuration management tool in conservative, compliance-oriented industries such as banking and government. These organizations may have little need to adopt continuous delivery, but nonetheless, storing and versioning infrastructure as code vastly improves their change control and security practices.

We recommend you start by automating one thing that will make your job easier. For instance, many admins start by automating management of NTP, DNS, SSH, firewalls, or users and groups — all things that are completely routine, and that suck up a lot of time.
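Automating NTP, for example, can start with a community module from the Forge rather than code written from scratch (the module name shown is the Puppet Labs one):

```shell
# Install the module on the master
puppet module install puppetlabs-ntp

# Then classify nodes with it in a manifest:
#   include ntp
```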

After gaining experience with Puppet, many people move up the stack, writing more complex modules to manage services like Tomcat monitoring or their JBoss application servers. Others adopt and adapt Forge modules. When you’re ready to dive in further, you can make sure all the machines in the data center — and in the cloud, too — are equipped to do the jobs they're supposed to do, that they're actually doing those jobs, and that the overall system is functioning properly to run the applications that serve your business.

It’s important to remember that you don't have to wade into infrastructure as code all by yourself. Others have solved these problems before you, so make good use of their work! We already mentioned the thousands of modules available on the Puppet Forge. You can also rely on the Puppet community, which numbers in the tens of thousands. Subscribe to the Puppet user group on Google, check out ask.puppetlabs.com, and get to know the engaged and responsive people there. Attend a Puppet Camp or Puppet User Group in your area to meet people in person. You can use Puppet Labs learning resources, both free and paid, and there’s always our YouTube channel and our official documentation, too.

This is just a taste of what you can find when you enter the Puppet ecosystem. We look forward to seeing you and helping you learn how Puppet can make your infrastructure, your business and your work life run so much better.

About the Author

Susannah Axelrod joined Puppet Labs in 2013 from Huron Consulting, where she was Director of Product Management. Prior to Huron, Susannah held product leadership roles at Thomson Reuters, Sage Software, Intuit and Intel. She loves figuring out what customers need and working to solve their problems. Susannah received her BA from the University of Chicago and her MBA from the Wharton School at the University of Pennsylvania.
