Bindings, Platforms, and Innovation
This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.
Tracking change and innovation in the enterprise software development community
Posted by Mike Bria on Feb 18, 2008 05:01 AM
After a post asserting that one must, as a rule, always version their database work, Scott Allen detailed an approach to making the best of versioning databases. Allen presented a comprehensive, practical approach to creating the baseline, using change scripts to manage schematic revisions, controlling programmatic database objects (like views, stored procedures, functions, and triggers), and leveraging branching and merging.Like many conveniences in software development, a shared database is a tar pit waiting to fossilize a project. Developers overwrite each other's changes. The changes I make on the server break the code on your development machine. Remote development is slow and difficult. Avoid using a shared database at all costs, as they ultimately waste time and help produce bugs.2. Always Have a Single, Authoritative Source For Your Schema
Everyone should know where the official schema resides, and have a frictionless experience in getting a fresh database setup. I should be able to walk up to a computer, get the latest from source control, build, and run a simple tool to setup the database (in many scenarios, the build process can even setup a database if none exists, so the process is one step shorter).3. Always Version Your Database
the common goal is to propagate changes from development, to test, and ultimately to production in a controlled and consistent manner. A second goal is to have the ability to recreate a database at any point in time. This second goal is particularly important if you are shipping software to clients. If someone finds a bug in build 20070612.1 of your application, you must be able to recreate the application as it appeared in that build - database and all.Allen states the goal of versioning databases as being able to push out changes in a consistent, controlled, testable, and reproducible manner. Many coaches would agree that achieving this goal is paramount for an agile team to be effective.
I like to keep all of the SQL needed to create tables, constraints, defaults, and primary indexes in a single file. Any views, stored procedures, and functions are scripted one per file.Allen stresses the suggestion to include in this baseline a table that will record any future changes to the schema, then procedes in his next three posts to describe the how these changes should occur.
If you go the multiple file approach, make sure to write a batch file, shell script, application, or some other form of automation that can automatically locate and run all of the script files required to install the database. Human intervention in this process is a step backwards.
The simple reason is to find problems as early as possible. If someone commits a schema change and the change removes a column used by a view, you'll find out there is an error early – hopefully before the build escapes for testing. Likewise, if someone checks in a view but forgets to publish a schema change the view needs, someone else is going to show up at their desk a few minutes later asking why they are breaking the software.Allen strongly stresses the importance of using an automated tool to make optimal use of the strategies listed above:
A second reason is to avoid some rare errors I've seen. Some databases have a tendency to screw up execution plans when the schema changes underneath a view. Dropping everything and starting over avoids this problem, which is really hard to track down.
The magic happens when a developer, tester, or installer updates from source control and runs a tool that updates their local database. The tool uses a three step process:As for the benefits of using this strategy, particularly the automated tool, Allen gives these examples:
1. The tool applies new schema changes by comparing the available schema change files to the SchemaChangeLog records in the database.
2. The tool will DROP all stored procedures, views, and functions in the database.
3. The tool will run all of the scripts needed to add the views, stored procedures, and functions back into the database.
Since the schema change scripts are in source control, you can recreate your database as it looked at any point in time. Is a customer reporting a bug on build 3.1.5.6723? Pull the source code tagged or labeled with that version number and run the baseline, then all schema change scripts included in the tag. You now have the same database and have a much better chance to recreate the bug. Also, changes move from development to test, and ultimately into production in a consistent, orderly, and reproducible manner.Allen completes his series with a discussion about how the approach he describes handles branching and merging, as both are likely realities of any application that lives past its first version. Allen suggests Branch For Release as his preferred branching strategy, and explains that he re-baselines the database with each new release. He presents an example describing this, and adds a scenario in which a defect is found in an older, branched release that requires a schema change. On the branched release, a new script for the change is created with no problem. The question then posed is how to reflect the change on the current release, now residing on the mainline:
To get this fix into the mainline, there are two options. Well, actually there are an infinite number of options to consider depending on how you apply your updates, but here are two options:For more on this and related subjects, see Scott Ambler's take on agile database development.
1. Merge the schema change script into the mainline as 01.00.0046, and fix the 2.0 baseline script to incorporate this change.
2. Write a new schema change script, 02.00.0003, that has the same changes as change 46 in the branch.
With option #1 you have to be careful because any database that updated to v2.0 will not take the 46th change script from the branch (unless you write your tools differently than I do). You have to force people to run this script manually, or you go around destroying any existing v2.0 databases (which at this point. should only be on development and test machines anyway). This is not a great option, but if you are not deep into 2.0 it is sometimes viable.
Option #2 is a bit friendlier. The v1.0 databases will pick up the fix from 01.00.0046. The v2.0 databases will pick up the fix from 02.00.0003. You have to be careful though, to write the 02.00.0003 change script so that it won't try to reapply changes if the 01.00.0046 script ran.
In other words, databases installed from the v2.0 baseline script need to apply the 02.00.0003 script, but production type databases that have been around since 1.0 will use the 01.00.0046 script, and you don't want 02.00.0003 to create an error by making changes that are already in place when the database eventually updated to v2.0.
5 Ways to Ensure Application Performance
Effective Management of Static Analysis Vulnerabilities and Defects
Ensuring Code Quality in Multi-threaded Applications
Most of this is implemented for Java and Grails with this project: http://code.google.com/p/dbmigrate Sam
Hi Mike, Thanks for the great job summarizing these posts. As you point out, Scott stresses the need for automated tooling for database "builds" and deployment. You'll probably also want to consider also using tool like ControlTier to tie together both your code deployments and your data deployments. Having one button to push to get both done is about the only way you can ensure your developers and QA staff get it right each and every time. ControlTier: http://open.controltier.com ControlTier's automation library for migrating and deploying database updates (similar to dbmigrate for Rails): http://moduleforge.controltier.com/lib_rdb.html
Sam, I am wondering how complete do you consider the project. I have quickly checked it a couple of weeks back, and all I could find was some code with no documentation, no clear examples, etc. So, I had to move along and check the LiquiBase. ./alex -- .w( the_mindstorm )p.
Tools like ActiveRecord:Migration do the versioning pretty well. Here though, we're a .NET shop, and while there's some projects on their way to replicate it, most are in beta, or didn't do what we need... Just whipped out one day of work and made our own based on SQL Server Management Objects (we use SQL Server exclusively) with a bit of Reflection and Attribute magics, and we were going strong. Now we can get our database to any point in time easily. Working on local, unshared database instances for development is also definately a must. The only catch is having a comprehensive set of test data that follows a given schema.
This presentation focuses on the Internet and separating myth from fact, history from the future, and the mundane from the imaginative. Bob Frankston presents a vision of what could and should be.
This article explores the use of JBoss and jBPM to implement design solutions that effectively address the issue of orchestrating long running activities.
This presentation covers the use of graph databases as an optimal solution for data that is difficult to fit in static tables, rapidly evolving data or data that has a lot of optional attributes.
This session introduces Real Options and shows how it can help in running your project. Real Options is a decision-making process that can be used to manage risk.
This article discusses the use of bindings on services and references (including the instance of non-configured bindings) as the means to implement SCA communications in a Web and SOA environment.
After a short introduction to DSLs, Scott Davis plays with the keyboard showing how to approach the creation of a DSL by typing working snippets of Groovy code that get executed.
IBM Rational and InfoQ present, Scaling Agile with C/ALM, an eBook showing organizations how to become “finely tuned software delivery machines” by enabling team integration and scaling.
Amanda Laucher presents a real life enterprise application written in F#. She shows actual code snippets, explaining design decisions and suggesting how to use some of the F# constructs.
4 comments
Watch Thread Reply