Key Takeaways
- With DevOps at centre stage in the industry, teams can expand the “as code” approach beyond traditional infrastructure and configuration
- Pipelines as Code changes the way developers perform automated tasks in their workflows, moving away from tool lock-in and towards a platform-independent model, all contained within a code repository
- Developers can now benefit from all the features previously reserved for code files, as well as advanced features like templates and loops, extending pipelines to tasks well beyond the classic build and release
- This approach is set to become the new standard in software engineering, increasing productivity and adding a new level of control for developers
- While it might take some work to migrate legacy definitions to the new model, developers will be able to use tools and approaches they already know, as well as take advantage of more powerful automation orchestrators
Looking at the state of software engineering, it’s clear that the industry has undergone a chameleon-like transformation: what used to be mainstream is now almost extinct, replaced by completely different tools and technologies.
If I look at what I used to do ten years ago, I remember working heavily with centralised version control systems, being bound to the operating system a workload ran on, and in general feeling a strong sense of demarcation between being “a developer” and working “in infrastructure”.
Things have obviously changed; however, the single biggest disruptor in this field remains Git. Git changed everything – it democratised good engineering practices like source control, and it allowed a wealth of tools to be built upon its foundation. DevOps played a major part in this too, acting as the glue binding together a number of new tools, approaches and technologies. In short, this bottom-up proliferation and the broad adoption of DevOps practices led the industry to move organically to an “as code” approach.
That’s how Terraform and similar tools emerged, pushed by the tooling ecosystem and by DevOps becoming broadly adopted and mainstream for most companies. Infrastructure as Code is now ubiquitous, and every cloud provider offers infrastructure deployment capabilities via code files and APIs – which should be the default choice for any application that is not a Hello World sample.
Infrastructure as Code was just the beginning. Configuration as Code followed shortly after – again becoming extremely commonplace and enabling organisations to multiply their engineering capacity. Pipelines as Code is the natural next step in continuously increasing the value development teams generate.
Why should I bring build and release definitions into code?
Pipelines as Code is the natural evolution of a key artefact engineering teams use every day. Think about it: you have Infrastructure as Code, Configuration as Code… why not Pipelines as Code?
The concept is simple - rather than thinking about a pipeline just in terms of a CI/CD engine, you can expand it to being an orchestrator for your development platform, with all its artefacts stored in code files.
That will provide you with versioning, team working capabilities, etc., while at the same time giving you the power to automate all of your processes. And the more you automate, the more your quality, speed and resiliency improve. It’s a game changer for any development team.
Look at my blog publishing system: it’s all hosted on GitHub, and whenever I post something, two pipelines (or workflows in GitHub’s jargon) run, one for publishing and one for other activities under certain conditions. You might wonder why there are two, and why the CI workflow exists alongside the pages-build-deployment workflow. The first one is custom, the second one comes out of the box for publishing. Let’s take a look at the custom one:
name: CI

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
  push:
    branches: [ master ]

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    if: "!contains(github.event.head_commit.message, 'NO CI')"
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v2
      - name: Send Tweet Action
        uses: ethomson/send-tweet-action@v1.0.0
        with:
          # The status ("tweet") to post to twitter.
          status: "Blogged! ${{ github.event.head_commit.message }} - https://mattvsts.github.io #mvpbuzz #AzureDevOps @AzureDevOps @GitHub"
          # Consumer API key, available in the "Keys and tokens" section of your application in the Twitter Developer site.
          consumer-key: ${{ secrets.TWITTER_CONSUMER_API_KEY }}
          consumer-secret: ${{ secrets.TWITTER_CONSUMER_API_SECRET }}
          access-token: ${{ secrets.TWITTER_ACCESS_TOKEN }}
          access-token-secret: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }}
This workflow automatically tweets on my behalf. It runs on every push to master, unless the commit message contains NO CI. It’s a code file, and it is stored in my repository; should I ever move it from my account to another repository, it will keep working without issues.
All CI/CD orchestrators are going in this direction: Azure Pipelines, GitHub Actions, Jenkins, etc. The UI is no longer the focus, and Pipelines as Code brings some very specific advantages for a developer.
The benefits of Pipelines as Code
Being just code means that your pipelines will benefit from all the tools already used in any engineering organisation. This includes version control, branching, pull requests, etc. Developers know how to deal with code, so pipelines become just another artefact stored in Git.
This also helps in situations where you must maintain traceability and auditability: everything is stored with full history and access controls, while retaining ease of access, familiarity and repeatability.
Finally, portability. Yes, there will be different dialects of Pipelines as Code depending on the target platform; however, the concept remains the same across the board. Take GitHub Actions and Azure Pipelines, for example - both are based on YAML, with different syntax and some peculiarities. It takes a day at most for a developer to get up to speed, and a week tops to be comfortable with the differences. The productivity boost is substantial, given there is no longer a distinction between a build pipeline and a release pipeline: everything is just a set of orchestrated tasks performed by the same engine.
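To illustrate how close the dialects are, here is a minimal sketch of the same two-step job - check out the code, run a build script - in both syntaxes. The trigger, pool and ./build.sh script are placeholders, not part of my blog setup:

# GitHub Actions: .github/workflows/build.yml
name: Build
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository, then run a placeholder build script
      - uses: actions/checkout@v2
      - name: Build
        run: ./build.sh

# Azure Pipelines: azure-pipelines.yml
trigger:
- master
pool:
  vmImage: 'ubuntu-latest'
steps:
  # Check out the repository, then run the same placeholder build script
  - checkout: self
  - script: ./build.sh
    displayName: Build

The concepts - trigger, agent, steps - map one to one; only the surrounding keywords differ.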
There are some advanced features in each modern orchestrator. Templates are really common now, and a true life-saver: you define a pipeline once and re-use it across multiple automations and projects with minimal changes. Your template contains all the logic and the possible configurations, which you invoke from your main pipeline. Let’s take a look.
This would be a template, named template.yml in your repository:
parameters:
  input: []

steps:
- ${{ each p in parameters.input }}:
  - script: 'echo ${{ p }}'
This template accepts an input array and echoes each of its items using a command-line task. The logic is very simple, yet it shows that within a pipeline you can already use constructs like loops (via the each keyword) to dynamically generate one task per item in the input array.
Now, to invoke it from another pipeline, all you have to do is this:
steps:
- template: template.yml
  parameters:
    input: ["Pipelines", "Azure DevOps", "Demo", "Pipelines as Code"]
The output of this main pipeline is four command-line tasks generated on demand, each printing one of the values - all orchestrated on the fly.
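Because the ${{ each }} expression is expanded when the pipeline is compiled, the resulting job is equivalent to writing the four steps by hand:

steps:
- script: 'echo Pipelines'
- script: 'echo Azure DevOps'
- script: 'echo Demo'
- script: 'echo Pipelines as Code'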
Another feature I really like is the matrix strategy in Azure Pipelines, for example:
strategy:
  matrix:
    'Ubuntu 18.04':
      imageName: 'ubuntu-18.04'
    'Ubuntu latest version':
      imageName: 'ubuntu-latest'
    'MacOS X 10.15':
      imageName: 'macos-10.15'
    'MacOS X latest version':
      imageName: 'macos-latest'
    'Windows Server 2016':
      imageName: 'vs2017-win2016'
    'Windows Server 2019':
      imageName: 'windows-2019'

pool:
  vmImage: $(imageName)

steps:
…
This snippet will run the tasks specified in the steps section across six jobs, each running on an agent with a different operating system image. This is all it takes.
Needless to say, it’s not all plain sailing – there is a learning curve. Unsurprisingly, the biggest hurdle to overcome is the lack of a UI. For at least a decade our build orchestrators relied on UIs to make the process simpler to digest, as developers lacked full control over their process, and as an industry we settled on the expectation that a UI had to be there to make things easier.
Then the “as code” movement came along and started breaking through. Infrastructure as Code was the first foray; everything else followed. Fast forward ten years, and UIs no longer carry the most features and options - they have become just a gateway to the build orchestrator, a place to learn the main functionality before moving to the as-code implementation.
The other important change is that everything now runs in a pipeline, with potentially no distinction between build and release. It’s up to the developer to define these boundaries, and migrating can require some work as there will be no 1:1 mapping for everything. It is, however, a fairly lightweight job, so not that big an obstacle.
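To make the unified model concrete, here is a minimal sketch of how a former build/release pair could live in a single Azure Pipelines definition as two stages; the scripts, paths and artifact name are placeholders:

# azure-pipelines.yml - build and release expressed as stages of one pipeline
trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

stages:
- stage: Build
  jobs:
  - job: BuildJob
    steps:
    # Placeholder build script - replace with your actual build
    - script: ./build.sh
      displayName: Build the application
    # Publish the build output as a pipeline artifact
    - publish: $(System.DefaultWorkingDirectory)/out
      artifact: drop

- stage: Release
  dependsOn: Build
  jobs:
  - job: DeployJob
    steps:
    # Download the artifact produced by the Build stage
    - download: current
      artifact: drop
    # Placeholder deployment script - replace with your actual release steps
    - script: ./deploy.sh
      displayName: Deploy the application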
What did I learn?
After working with many platforms you realise there are patterns and reusable approaches; the main lesson, however, is to get into the habit of implementing Pipelines as Code as early as possible. Creating the build definition should be the first thing an engineer does, because it will evolve with the application code and provide a seamless experience once used with the DevOps platform.
A typical example is this: having pipeline definitions embedded in your code repositories makes each repository fully granular and independent, as it contains not just the source code for the application, but also the build definition required to compile and deploy it - a movable artefact across a DevOps platform. Microservices development becomes way easier. Testing gets simpler. Automating mundane tasks yields so much additional value to the team, given any engineer can focus on solving actual problems rather than repeating the same steps all the time. Pipelines as Code does wonders.
Conclusion
Moving to Pipelines as Code doesn’t happen overnight, but it can open many doors and paths for your development team. If you are just getting started, do one thing - pick one of your build and release processes and start replicating it in code files. It’s as simple as that. The more you automate these processes, the more they become the default option, and you will save a huge amount of time otherwise wasted on repetitive tasks.
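If it helps to see a starting point, a first replica can be as small as this sketch - assuming a Node.js project built with GitHub Actions; the npm commands are placeholders for whatever your current process runs:

# .github/workflows/ci.yml - a minimal first replica of an existing build process
name: CI
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the job can access the code
      - uses: actions/checkout@v2
      # Placeholder build and test commands - replace with your own process
      - name: Install dependencies
        run: npm install
      - name: Build
        run: npm run build
      - name: Test
        run: npm test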
Doing so will naturally guide you towards automating the steps currently holding you back, all with the benefit of the development experience engineers are already used to. Changes become easier to track, merging is simple, and, coupled with a peer review process, every pipeline becomes accessible to every developer.