How Database Administration Fits into DevOps
The Agile Consortium International and Unicom are organizing the DevOps Summit Brussels 2016 on February 4 in Brussels, Belgium. InfoQ will cover this event.
Dan North will talk about how database administration fits into the world of DevOps at the DevOps Summit Brussels. InfoQ interviewed him about the activities that are performed by database administrators and how they are related to those done by developers and by operations, how database administration is usually organized, how the database fits into DevOps or Continuous Delivery, and what he expects that the future will bring for database administration when organizations adopt DevOps.
InfoQ: What are the activities that are typically performed by database administrators?
North: The DBA role typically spans multiple production environments, development teams, technologies and stakeholders. They may be tuning a database one minute, applying a security patch the next, responding to a production issue or answering developers’ questions. They need to ensure backups and replication are configured correctly, the appropriate systems and users have access to the right databases (and no-one else!), and they need to be on hand to troubleshoot unusual system behaviour.
Their real value lies in understanding the mechanics and details of the database itself, its runtime characteristics and configuration, so they can bridge the gap between developers writing queries and operations staff running jobs. A skilled DBA can identify ways to speed up slow-running queries, either by changing the query logic, altering the database schema or editing database runtime parameters. For instance, changing the order of joins, introducing an index (or sometimes removing an index!), adding hints to the database’s Query Execution Planner, or updating database heuristics, can all have a dramatic impact on performance.
InfoQ: How do their activities relate to those done by developers and by operations?
North: Sadly, too few developers really understand what goes on in a relational database. Mapping layers like Hibernate or Microsoft’s Entity Framework hide the internals from regular enterprise developers, whose bread and butter is C# or Java programming, and of course this is a double-edged sword. On one hand it makes it easier to develop basic applications where the database schema maps onto equivalent OO data structures, but things quickly become complex when your desired domain model diverges from the database schema, or when there are performance, availability or scaling considerations. At this point having a helpful DBA as part of the development team can be invaluable.
On the operations side, the DBA is often responsible for implementing a business’s replication or availability strategy. It is Operations’ role to monitor the systems, diagnose issues and "keep the lights on", but the DBAs will be closely involved in monitoring and diagnosing database-related issues. They also define the database management and maintenance processes the Operations team carry out.
InfoQ: Can you give some examples of how database administration is usually organized? What are the benefits and pitfalls of organizing it in such a way?
North: Traditionally the DBA role is another technology silo, turning requests or tickets into work. This means they can be lacking context about the broader business or technology needs and constraints, and are doing the best they can in an information vacuum. In my experience DBAs are often in an on-call support role so they are the ones being paged in the middle of the night when a developer’s query exceeds some usage threshold. Because of this they tend to be conservative, if not outright sceptical, about database changes coming through from developers.
Sometimes there is a mix of "Production DBAs" and "Development DBAs". The former tend to sit together, doing all the production maintenance work I described above. The latter are helping development teams interact with the database correctly, both in terms of schema design and querying. This model can work really well, especially where the Production and Development DBAs have an existing relationship of trust. The Production DBAs know the Development DBAs will ensure a level of quality and sanity in the schema design and database queries, and the Development DBAs trust the Production DBAs to configure and maintain the various database instances appropriately.
InfoQ: How does the database fit into DevOps or Continuous Delivery?
North: In a lot of organisations it simply doesn’t. Any database changes go through a separate out-of-band process independent of application code changes, and shepherding database-and-application changes through to production can be fraught.
Some organisations talk about "database-as-code", and have some kind of automated process that runs changes into the database as managed change scripts. These scripts live under source control along with the application code, which makes it easier to track and analyse changes. These change scripts, known as migration scripts or simply migrations, are the closest we have got to treating the database as code, but they still contain lots of unnecessary complexity.
InfoQ: What do you expect that the future will bring for database administration when organizations adopt DevOps? How can database administrators prepare themselves?
North: The ideal model is for DBAs to be an integral part of both Development and Operations teams. DevOps is about integrating the technical advances of agile development such as continuous integration, automation and version control into an Operations context, at the same time as retaining the rigour and discipline of the lights-on mentality.
In the future I would like database changes to be as simple as code changes. I don’t want to write migration scripts by hand or keep an audit of which scripts have been run in and which ones haven’t. I should be able to make whatever changes I choose to a development database and then "check it in" like I would with code. I don’t hand-roll the diffs that go into my version control system, I just change the software and the VCS figures out what changed.
The database tooling should figure out what has changed since the last "version" of the database and create the appropriate migrations. Some vendors such as Red Gate are making inroads in this space, but it feels like we still have a long way to go. Most of the current "agile" database tooling is around creating, applying and managing migrations rather than genuinely treating the database as code.
Think "function" not "diff"
The VCS analogy sounds plausible at first, but is actually quite misguided.
In a VCS repository you manage both structure (directories and file names) and data (file contents).
With database migrations the focus is slightly different as you evolve multiple instances of a database, each containing different sets of data. The migration effectively acts as a function that you apply to an existing database instance to always transform it reliably in the same way. This transformation could be changing structure (DDL), evolve reference data or transforming user data.
And let's not kid ourselves. Our data WILL outlive our code. So do what's right and adopt a contract-first mindset. Evolve your schema (a.k.a. the contract between your application and the DBMS) carefully and reliably through migrations. And think of those migrations as functions you apply, not diffs you generate.
Re: Think "function" not "diff"
It's exactly the same in a database: you manage both structure (tables, indexes, views, key and value constraints) and data (stored procedures, triggers, reference or static data).
It's a difference in paradigm. I was involved in the early days of dbdeploy, one of the earliest migration-style tools, and that was the choice we made too. I've only recently been convincing myself about database-as-code rather than database-as-diffs (or database-as-transformation-functions if you prefer, it's the same thing). I was reluctant to consider the idea at first, and of course flyway is a migration-based tool so you made the same call as we did. Neither is right or wrong, they are different ways of looking at the same problem.
I find migrations are an artificial and brittle, or at least clumsy, way to model database changes, and fundamentally at odds with how we manage source code (both structurally and content-wise). I don't hand-roll diffs for DNS changes, network config changes, server changes, etc. I describe the desired target state and the tooling takes care of the transformation.
The same model is catching on in web development with virtual DOMs (react, etc.), so you define the desired DOM state rather than applying the deltas yourself. It leads to simpler applications and fewer edge cases.
For what it's worth I think flywaydb is a great implementation of the migration-based model. I'm just interested in how far we could push the declarative approach.
The Dev-DBA gap is way bigger than the Dev-Ops gap
Databases need to be a 1st class citizen in the DevOps narrative
My company, Datical, is working hard with some very progressive DevOps groups to address this difficulty and make Mr. North's desired future state a reality. GE Transportation presented a summary of our work together at a webinar on DevOps.com if anyone is interested to learn more. webinars.devops.com/devops-at-ge-transportation
Yousef Awad May 16, 2016
Jason McGee of IBM Talks about Open Source Projects and the Interactions at the Collaboration Summit
Jason McGee May 15, 2016
Srini Penchikala May 15, 2016