
Building a Data Maturity Model for Data Governance

In five blog entries spanning five days, the Data Governance Blog provides a quick-start guide for developing a Data Maturity Model. What distinguishes this approach from earlier approaches to building maturity models for data governance is that it advocates a proprietary model customized for a given organization instead of applying standard models that attempt to be universal. The first entry of the five-part series focuses on defining scope and establishing a baseline. When building a maturity model for your data, it makes sense to target a subset of your enterprise data first. Once the scope of the data is defined, a baseline needs to be established. From the article:

What is the lowest maturity level of data in your dataset? Your answer could be something along the lines of, "Unreviewed, unmodeled, no metadata, have no idea what it is", "It is in our corporate data model but no information other than field name, not in metadata", or "It's in our model, we have some old definition that we can no longer consider reliable".

This may take some time to do, but it is important to establish this baseline. The main things to capture are:

1. Is it in your datamodel?
2. Do you have metadata for it?
3. Do you trust the information in the metadata, if it is in there?
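The three baseline questions above can be recorded per field and tallied to see where the in-scope data currently sits. The following is only a minimal sketch: the `FieldStatus` class, the `baseline_state` and `tally_baseline` functions, the state labels, and the sample field names are all hypothetical illustrations, not something the blog series prescribes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldStatus:
    """Answers to the three baseline questions for one data field (hypothetical structure)."""
    name: str
    in_data_model: bool      # 1. Is it in your data model?
    has_metadata: bool       # 2. Do you have metadata for it?
    metadata_trusted: bool   # 3. Do you trust that metadata?


def baseline_state(field: FieldStatus) -> str:
    """Map the three answers to a short description of the field's current state."""
    if not field.in_data_model:
        return "not in data model"
    if not field.has_metadata:
        return "in data model, no metadata"
    if not field.metadata_trusted:
        return "in data model, metadata not trusted"
    return "in data model, metadata trusted"


def tally_baseline(fields):
    """Count how many fields fall into each observed state."""
    return Counter(baseline_state(f) for f in fields)


# Made-up example fields, purely for illustration.
sample = [
    FieldStatus("cust_id", True, True, True),
    FieldStatus("legacy_flag", False, False, False),
    FieldStatus("region_cd", True, True, False),
    FieldStatus("acct_type", True, False, False),
]

for state, count in tally_baseline(sample).items():
    print(f"{count} field(s): {state}")
```

The least mature state that actually appears in the tally is a candidate for the lowest level of the model.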

On the second day, the concept of a natural progression in data maturity is introduced:

What we are looking for here is if there are any natural progressions you can see in the data as it stands today. Starting from your lowest level, what is the next step-up in maturity that you already see? If your first level was "Unmodeled, No metadata, no idea what it is", the next step you see in your data could be, "It's in our datamodel but we have no supporting information on it"... Rather than creating a data maturity model and forcing your data to fit into it, we are letting the various stages the data is already in define the maturity path.

The third blog entry flips the focus from the lowest maturity level to the highest, and then attempts to bridge the gap between them by using consistent terminology:

In essence, I want you to take what you did on Day 1 and write down the complete opposite. This should help you identify the highest maturity level. So if your lowest level is "Unmodeled, Unreviewed, No metadata" then the highest optimum level would be "In Datamodel, Reviewed and Governed by the Data Governance Council, Metadata verified and up to date". What this does is keep your maturity model framed around the same items. If you talk about your data model in your lowest level, you should talk about it in every other level, including the highest.
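One way to keep every level "framed around the same items" is to write each level as a mapping from item to state and check that no level leaves an item out. This is a rough sketch under that assumption: the item names (`data model`, `review`, `metadata`) are read off the quoted example, while the dictionary layout and the `levels_missing_dimensions` helper are hypothetical.

```python
from typing import Dict, List

# A level maps each item (dimension) to the state the data is in at that level.
Level = Dict[str, str]

lowest: Level = {
    "data model": "Unmodeled",
    "review": "Unreviewed",
    "metadata": "No metadata",
}

# The "complete opposite" of the lowest level, item by item.
highest: Level = {
    "data model": "In Datamodel",
    "review": "Reviewed and Governed by the Data Governance Council",
    "metadata": "Metadata verified and up to date",
}


def levels_missing_dimensions(levels: Dict[str, Level]) -> Dict[str, List[str]]:
    """Return, per level name, the items that other levels mention but this one omits."""
    all_dims = set()
    for level in levels.values():
        all_dims.update(level)
    gaps = {}
    for name, level in levels.items():
        missing = sorted(all_dims - set(level))
        if missing:
            gaps[name] = missing
    return gaps


model = {"lowest": lowest, "highest": highest}
print(levels_missing_dimensions(model))  # prints {}
```

An intermediate level added later that says nothing about, say, review would show up in the returned mapping, which is a prompt to describe that item at that level too.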

In the fourth blog entry, working templates for a maturity model are provided, and both entry 4 and entry 5 walk you through customizing the maturity model for your organization.

First things first, get the following out: Your data governance maturity model template (filled in) and the in-scope data for your program. What you are going to do is take a sample of your data and make sure that you can easily find the position of that data on the maturity model. If I were doing this again, I’d randomly pull out about 40 fields and go one-by-one through them. I’d look at the field, check the model, check if there is metadata, etc., and see if it falls into a level. You need to make sure that all of the data fields fit somewhere on the maturity model… if it is questionable you may not have defined your levels clearly enough. If it falls right between two levels, you may need to define a new level to account for the difference, or incorporate the characteristics into one of your existing levels.
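The sampling check described in the quote could be sketched roughly as follows. Everything here is an assumption made for illustration: the field attributes, the three example levels expressed as predicates, and the `sample_fields` and `check_fit` helpers are hypothetical, and a real model would use whatever levels the organization actually defined.

```python
import random
from typing import Callable, Dict, List

Field = Dict[str, object]
LevelTest = Callable[[Field], bool]


def sample_fields(fields: List[Field], k: int = 40, seed=None) -> List[Field]:
    """Randomly pull out about k fields (fewer if the dataset is smaller)."""
    rng = random.Random(seed)
    return rng.sample(fields, min(k, len(fields)))


def check_fit(sample: List[Field], levels: Dict[str, LevelTest]) -> Dict[str, List[str]]:
    """Sort sampled field names by whether they match exactly one, no, or several levels."""
    result = {"fits": [], "no_level": [], "ambiguous": []}
    for field in sample:
        matches = [name for name, test in levels.items() if test(field)]
        if len(matches) == 1:
            result["fits"].append(field["name"])
        elif not matches:
            result["no_level"].append(field["name"])
        else:
            result["ambiguous"].append(field["name"])
    return result


# Hypothetical levels; fields that are modeled with unverified metadata
# deliberately match none of them, to show how a gap surfaces.
levels = {
    "Level 1": lambda f: not f["in_model"],
    "Level 2": lambda f: f["in_model"] and not f["has_metadata"],
    "Level 3": lambda f: f["in_model"] and f["has_metadata"] and f["metadata_verified"],
}

fields = [
    {"name": "cust_id", "in_model": True, "has_metadata": True, "metadata_verified": True},
    {"name": "legacy_flag", "in_model": False, "has_metadata": False, "metadata_verified": False},
    {"name": "region_cd", "in_model": True, "has_metadata": True, "metadata_verified": False},
]

report = check_fit(sample_fields(fields, k=40, seed=1), levels)
print(report["no_level"])  # prints ['region_cd']
```

Fields listed under `no_level` suggest a missing level, and fields under `ambiguous` suggest that two level definitions overlap and need to be tightened.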


Community comments

  • Coping with "Model Natural Progression" using the technology of the day.

    by Ernest Rider,


    In writing an article about the difference between SOA and distributed computing, I used the analogy of SOA being the front man for possibly higher-grained activities.

    Modern technologies are great. A lot of adoption builds parts that can be used independently of the suggested framework.

    The "Natural Progression" that you speak of might be akin to how systems morphed RPC-style activities into synchronous/asynchronous message-oriented activities.

    In a B2B sense WSDL is way too heavy to manage when the same infrastructure provides XML Schema marshallers that themselves can carry the interface definitions.

    In effect you are transmitting a reference into a HashMap, where there is no requirement for the client to change any explicit endpoint reference.

    Additionally the message oriented version of a function call can relay its own system dependent hash index, as a performance option as well.

    It is far more manageable to work at the XML Schema level than the WSDL level. All sorts of Enterprise problems disappear approaching the problem with XML Schema itself.

    In fact this model is perfect for REST. And implementable for Web Services.

    Still, building arrays of schemas to transmit and receive is not as fast as binary mechanisms doing exactly the same activities. So mapping this methodology between XML Schema and existing binary IDLs would be a good gap to fill.

  • Using ORM to guide data maturity models.

    by Ernest Rider,


    Seeing the right patterns in data is not easy in an Entity Relationship sense, nor sometimes in UML. ORM, I feel, does a nicer job of this.
