
Data-Driven Decision Making



Zoe Vance and Denise Yu describe how to design the indicators to build an understanding of a product, how to monitor those metrics over time, and how to build feedback loops.


Zoe Vance is Senior Product Manager, Pivotal. Denise Yu is Software Engineer, Pivotal.

About the conference

Pivotal Training offers a series of hands-on, technical courses prior to SpringOne Platform. Classes are scheduled two full days before the conference and provide you and your team an opportunity to receive in-depth, lab-based training across some of the latest Pivotal technologies.


Vance: We are talking about data-driven decision making. I am Zoe Vance. I'm the Product Manager for RabbitMQ for PCF. I've managed many products throughout my career. The motivation for this talk was, there are so many tools and frameworks that PMs use for products that are actually really helpful when thinking about other problems that the team faces. This talk was a little bit about how can we take some of the tools and frameworks that PMs use and expose them so that others can use them in their day-to-day work?

Yu: My name is Denise. I'm actually a software engineer. A few months ago, I was asked to fill in for a product manager who had recently left Pivotal Cloud Foundry, and I was like, "Sure. I can do it for a couple of weeks." That turned into six months. I accidentally found myself product managing for longer than I expected. Along the way, I found it really overwhelming to learn the entire toolset that product managers use and to figure out how to apply it to the specific and very technical domain that I was in charge of. My motivation for collaborating with Zoe on this talk is to hopefully unearth some of the things that we both learned along the way and give you some more tools in your own toolset to make better, more data-driven decisions.

Product Team Velocity

The first thing that we want to talk about is why this topic matters. As product people, hopefully, as entire teams, everybody on a product team, we spend a lot of time thinking about velocity. We spend a lot of time thinking about how to go faster. Thinking about how to measure velocity. Is it the number of stories you've completed? Is it the number of features that you've delivered? Whatever it is. At Pivotal, we like to think of velocity as a vector, which means that it's a function of both speed and direction. It doesn't really matter how fast you run, if you're running towards the wrong goal, if you're working towards something that's not actually going to add value to your team or to your customers. Throughout, we're going to try to give you some tools to assess whether the direction that you're moving in is the right one for your team and for your customers.

Data in Service of Decision Making

Vance: This talk is called data-driven decision making. What we really mean by that is data in service of decision making, with the emphasis on the decisions that you're enabled to make. Data tends to be glorified sometimes, but it doesn't matter if it's not helping you make better decisions and take better actions as a team. Given that the end goal is decision making, we're going to be talking about a framework that lets you take a problem, break it down, quantify it, understand it, and then make the decision that you need to improve your teams and your work.

When we talk about quantitative versus qualitative data, which we will during the talk, by quantitative we mean things that are to do with quantity. It's where the word comes from. Numbers, percentages of what is happening, so sales increased 10%. Whereas on the qualitative side, it's more about the qualities and the characteristics. It's important because the what is as important as the why. The why really lets you understand what's happening, why does it matter?

Known Knowns

Yu: The way that we're going to frame this is we're going to tell you three different stories based on our own experiences working in product management. We think that when you're trying to choose the best tools to use, it's really important to understand the level of context that you're working in. How much do you really know about the world? How much does your team know about the world? I'm going to begin by telling you a story where we had many known knowns. What I mean by that is, we were in this situation on the team where everybody on the team knew and agreed that there was a problem, but we lacked consensus on how to tackle that problem.


Who has seen Concourse before? Concourse is an automation tool that many of our teams use to reduce toil. Concourse will do things like run our unit tests, run our integration tests, build our code, upload the build artifact to our remote storage, our remote buckets, that kind of thing, so that engineers are freed up to do more interesting types of problem solving. One problem that I've experienced on many of the teams I've worked on both as an engineer and a product manager is that the length of time it took for a pipeline to run could be very unpredictable, and it could take a really long time. On one of the teams that I worked on, everyone on the team knew that our pipelines were taking way too long. Everyone knew that this was painful. It came up time and again in weekly retrospectives. We weren't really equipped to act on it because we didn't know just how bad it was. We didn't have anything to point to, to evaluate, should we address this now or should we do more feature work?

What did we do? We decided to start applying data to this problem. We treated our nebulous pain on the team as if it was a valid product decision, a valid product problem that we could apply our existing toolset to. We took the last few dozen times that the pipeline ran, we exported all the data from Concourse, we plugged it into a spreadsheet that we then visualized using some other tools. We started measuring both the overall time that each pipeline took as well as the amount of time that discrete tasks took. By doing this, we were able to, first of all, put a number to this nebulous pain that we were all feeling. Second of all, identify where the bottlenecks were, and where the most unreliable tasks were.
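The analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (run id, task name, duration in seconds), the task names, and all the numbers are hypothetical, since the talk does not show the actual Concourse export or the team's figures.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical export: one row per task execution (run id, task name, seconds).
RUNS = [
    (1, "unit-tests", 300), (1, "integration-tests", 1500), (1, "upload-artifact", 120),
    (2, "unit-tests", 320), (2, "integration-tests", 3600), (2, "upload-artifact", 110),
    (3, "unit-tests", 310), (3, "integration-tests", 900), (3, "upload-artifact", 130),
]


def pipeline_totals(rows):
    """Total duration of each pipeline run, in seconds."""
    totals = defaultdict(int)
    for run_id, _task, seconds in rows:
        totals[run_id] += seconds
    return dict(totals)


def task_stats(rows):
    """Mean and spread (population standard deviation) of each task's duration."""
    by_task = defaultdict(list)
    for _run_id, task, seconds in rows:
        by_task[task].append(seconds)
    return {
        task: {"mean": mean(durations), "stdev": pstdev(durations)}
        for task, durations in by_task.items()
    }


def slowest_task(stats):
    """The task with the highest average duration: a candidate bottleneck."""
    return max(stats, key=lambda task: stats[task]["mean"])


def least_predictable_task(stats):
    """The task whose duration varies the most from run to run."""
    return max(stats, key=lambda task: stats[task]["stdev"])


if __name__ == "__main__":
    stats = task_stats(RUNS)
    print("Run totals (s):", pipeline_totals(RUNS))
    print("Slowest task:", slowest_task(stats))
    print("Least predictable task:", least_predictable_task(stats))
```

Separating the average from the spread mirrors the two questions in the story: where the bottleneck is, and which task is the least reliable.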

Team Alignment

By being more metric driven about it, by being more data driven about it, we built alignment on the whole team. We were able to answer the questions: how bad a pain is this? How much time is it costing us? What is the cost of this delay compared to other things that the team could be doing? Having built this alignment, we sat down as a team and created an entire track of work for this. It wasn't considered chore work or valueless work. It was prioritized the same way as we would prioritize feature work, because this was something that was actively costing the team. When we had these numbers, and we had these concrete pieces of work laid out, the entire team was able to engage in creative problem solving. By elevating this problem to the same level as a customer problem, we were able to have the whole team focus on moving that number.

The key takeaways for our team were that applying data to this nebulous problem enabled a decision to be made. It's really hard to reason about problems when all you have is these nebulous feelings and sadness in your heart when you have retros. When you have a number, that gives you a target, and that gives you a goal to focus on and to rally around.

Known Unknowns

Vance: The second situation we're going to talk about is a situation where you have known unknowns. What we mean by that is that maybe there's some vague awareness of a problem lying under the surface, but there's no real consensus around it or agreement on what exactly it is. I'm actually going to speak to when I was Product Manager at a company called Kimono. It was a startup that created a tool for developers and allowed anybody to create an API from any web page. Being a hip Silicon Valley startup, we read "The Lean Startup," we read "Lean Analytics," and we started tracking the numbers that were recommended. We started looking at active users over given periods of time: daily, weekly, monthly. We defined active as users who were making a call to an API created with Kimono in that given period.
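As an illustration of the active-user definition above, here is a minimal Python sketch. The call log, the user ids, and the 1/7/30-day windows are assumptions made for the example; the talk does not describe how Kimono stored or queried this data.

```python
from datetime import date, timedelta

# Hypothetical API call log: (user id, day on which the call was made).
CALLS = [
    ("ana", date(2015, 3, 2)),
    ("ana", date(2015, 3, 3)),
    ("bo", date(2015, 3, 3)),
    ("cy", date(2015, 2, 10)),
]


def active_users(calls, end_day, window_days):
    """Distinct users with at least one API call in the window ending on end_day (inclusive)."""
    start_day = end_day - timedelta(days=window_days - 1)
    return {user for user, day in calls if start_day <= day <= end_day}


if __name__ == "__main__":
    today = date(2015, 3, 3)
    for label, days in (("daily", 1), ("weekly", 7), ("monthly", 30)):
        print(label, sorted(active_users(CALLS, today, days)))
```

Note that a single count like this treats every caller the same, regardless of how or why they call.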

What we started to realize is that we had no idea what that meant. We would have these goals, sometimes we would hit them, sometimes we would miss them. We started to have conversations about, what does it mean that we're tracking active users? What is an active user? We broke down the problem. We looked at the quantitative user patterns, at who made up those active users, and we realized that many different types of activity, and many different behavioral patterns, were all being counted as active. We then followed that up with qualitative interviews to understand, why do you have the pattern that you have? We realized that we basically had two very distinctive user groups in our active user number. There were the users who were calling APIs from many different websites and creating amazing things like visualizations, new apps, analyses. Then we had another group of users who were pulling large amounts of data infrequently. They were essentially just stealing other people's information.
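The quantitative half of that breakdown could look something like the following Python sketch. The per-user fields, the segment names, and every threshold are hypothetical choices for illustration; the talk only describes the two groups in words. Such a split can show what the patterns are, but it cannot explain why they exist, which is what the interviews were for.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-user summary for one reporting period."""
    user: str
    distinct_sites: int      # websites the user's APIs draw from
    active_days: int         # days with at least one API call
    records_per_call: float  # average amount of data returned per call


def segment(usage, min_sites=3, min_days=10, bulk_records=10_000):
    """Assign a user to a behavioral segment. All thresholds are illustrative assumptions."""
    if usage.distinct_sites >= min_sites and usage.active_days >= min_days:
        return "frequent-multi-site"
    if usage.records_per_call >= bulk_records and usage.active_days < min_days:
        return "infrequent-bulk"
    return "other"


if __name__ == "__main__":
    users = [
        Usage("ana", distinct_sites=8, active_days=25, records_per_call=40),
        Usage("bo", distinct_sites=1, active_days=2, records_per_call=50_000),
        Usage("cy", distinct_sites=1, active_days=4, records_per_call=15),
    ]
    for u in users:
        print(u.user, segment(u))
```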

This raised a bunch of existential questions for the company. What do we want to be? Who are we building for? Why are we building it? It allowed us to then actually prioritize features, prioritize bugs, and prioritize user stories based on the question, who is the user we're actually trying to solve for? The takeaway from this situation, where we thought we had it under control because we had an active user number, was that once we probed deeper, once we broke it down even more and applied the quantitative and the qualitative framing, it helped us really understand who we were as a company, where we were going, and why we were going there.

Unknown Unknowns

The final situation where we're going to talk about this framework of breaking a problem down, quantifying it, understanding it, and making a decision is unknown unknowns. Here we really just mean something very nebulous, where nobody really knows that this thing even exists. Maybe nobody has talked about it. The example we're going to use is the Spotify team health check. I highly recommend the exercise. We do it as a team every couple of months. The idea is you take a couple of categories that are critical to teamwork, such as easy to release, fun, and delivery. You have a conversation with all the team members about how they feel the team is performing. This then gives you amazing insights into things that you maybe wouldn't have taken the time to talk about or think about. One thing that we realized off the back of it is that we had these amazing conversations, but then we didn't do anything. You raise these great points, but they're not clean things you can just tick off. We started instituting a monthly experiment off the back of the Spotify health check that would help us actually improve on those metrics.


The experiment I'm going to talk about is learning. When the team discussed learning, the feeling was that our product surface area was so large, and we were always so busy that it was hard to stay on top of all the tools and technologies we needed to be successful. The reason why we chose learning and the reason we chose the health check for this story is because those are fundamentally unquantifiable things. It's not about the data, it's about the decisions. We took learning, how do you quantify something like learning? The first thing we did is we wrote down all the tools and technologies that we needed to understand to be effective at our jobs. This is a very abbreviated list, just the things that I could be bothered to put on a slide.

The second thing we did to quantify it was to ask, how important is the tool to our day-to-day success? Finally, where are we now as a team? Quantifying it in these smaller, broken-down parts helped us choose the most important thing that we wanted to learn. From there, we actually stepped back. I don't want to tell team members how they should be learning. It became, here's the goal. Here's what we're trying to achieve. Take a day, a week, and do whatever you need to feel like you are improving your learning and your knowledge on this topic. Unsurprisingly, because this data is fuzzy, the number went up. The point wasn't the number, the point was, what works for us? What ritual should we be building into our day-to-day? How do we involve more learning in our processes? The takeaway here is that we can become a stronger team by taking even nebulous, qualitative things, breaking them down, thinking about a way to measure them to gain alignment, and then making a better decision as a group.
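One way to turn those two ratings into a choice is sketched below in Python. The 1-to-5 scales, the scoring formula (importance multiplied by the gap to the top level), and the ratings themselves are all assumptions for illustration; the talk says only that the team rated importance and current level, not how the two were combined. The topic names are placeholders borrowed from tools mentioned elsewhere in the talk, not the list from the slide.

```python
# Hypothetical ratings on a 1-5 scale: topic -> (importance, current team level).
TOPICS = {
    "RabbitMQ": (5, 3),
    "Cloud Foundry": (4, 1),
    "Concourse": (3, 4),
}


def learning_priority(importance, current_level, max_level=5):
    """Score is higher when a topic matters a lot and the team knows little about it."""
    return importance * (max_level - current_level)


def rank_topics(topics):
    """Topics ordered from highest to lowest learning priority."""
    return sorted(
        topics,
        key=lambda name: learning_priority(*topics[name]),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_topics(TOPICS):
        print(name, learning_priority(*TOPICS[name]))
```

As in the story, the score is only a conversation starter: a fuzzy number that helps the team agree on what to try first.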

A Framework for Being More Data Driven

Yu: We're going to close out the anecdotal section of this talk by explaining to you a framework that you can use to bring some of these lessons pulled out of these stories into your day-to-day work. We're going to give you a framework for how you can start to be more data driven. This also applies no matter what level of context you're working in. Although if you have already high context and high alignment, you might be able to skip the first few steps.

The first thing to do is to identify the problems that you want to solve. At Pivotal, within the product practice, we have an exercise called stinky fish, in which the facilitator asks every member of the team to identify some problem area. The way that you frame it is, what problem, if we continue to ignore it, will give us the greatest amount of grief down the line? You can set timelines: in two weeks, in two months, in six months, in one year. You can customize this to whatever works for your team. The reason why we call it stinky fish is that if you leave a fish in the back of your refrigerator for two weeks, the longer that time goes on, the less you're going to want to open your refrigerator and deal with it. This is the process of moving things from either the complete unknowns or the poorly organized known unknowns into the area of known knowns, where you can begin to address them and where you can begin to build alignment on why we should be tackling this problem.

Once you have a shortlist of problems to solve, the next step is to choose probably the biggest problem. You can choose the problem that the greatest number of people on the team identified, or you can run some sorting exercise like a two by two or affinity mapping. The process of breaking down and framing a problem is to turn something that's big and nebulous and inaccessible into something that's small, accessible, and addressable.
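The simplest of those options, picking the problem that the most people identified, amounts to a tally. A minimal Python sketch follows; the participant labels and problem names are hypothetical and do not come from the talk.

```python
from collections import Counter

# Hypothetical stinky fish results: participant -> problems they raised.
RAISED = {
    "person-1": ["slow pipelines", "flaky tests"],
    "person-2": ["slow pipelines", "unclear roadmap"],
    "person-3": ["slow pipelines", "flaky tests"],
}


def tally(raised):
    """Count how many distinct people raised each problem."""
    counts = Counter()
    for problems in raised.values():
        counts.update(set(problems))  # a person counts once per problem
    return counts


def most_raised(raised):
    """The problem identified by the greatest number of people."""
    problem, _count = tally(raised).most_common(1)[0]
    return problem


if __name__ == "__main__":
    print(tally(RAISED))
    print("Start with:", most_raised(RAISED))
```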

Once you've decided on the problem to solve, the next step is to choose what you're going to measure. As we've talked about throughout the three stories, I hope it's clear by now that neither of us are experts at this, neither of us are all-knowing. The thing that you choose to measure today is probably not going to be the same thing that you will measure on day 30 or day 100. That's a good thing. The point that we want to make here is that whatever you choose to measure doesn't have to be perfect. You don't need to read every product management handbook out there to figure out what to measure. What you do choose should be good enough that it provokes the team to have conversations, and it provokes the team to learn. You're not comparing yourself against the platonic ideal of a measurement, you're comparing this against the complete absence of any data. Something is always better than nothing.

As you learn more about the solution space, you'll discover whether you need that measurement to have greater fidelity, or whether a finger in the air, just a rough sense, is enough to make a decision and move on. Of course, whatever you choose to measure, you should introspect on that. You should always question whether that's still the right thing to measure. In Zoe's story, they completely changed the way that they were analyzing customer engagement. I think that that's a healthy thing. That's actively a good thing. Hopefully, we've all seen build-measure-learn before. We should be iterating when it comes to product vision and product roadmaps. We think that when it comes to the process of measuring, and the data that we collect, we should also be iterative.

Lean Product Development

What we've been talking about this entire time is actually just the core principles of lean product development. Dan North said a few months ago that when we take the ideas of lean and apply them to lean manufacturing or lean supply, they look very different in practice than when we apply them to product development. In lean product development, we always want to be optimizing for learning. That means making mistakes, because mistakes will surface more information. They will empower you to act differently in the future. The data that you collect when you're starting to become more data driven about the day-to-day things that you do should also be in service of the goal of discovering more.


Data driven practices from the world of product management can and should be applied more broadly to every operation on a product team, or perhaps on a consulting team, not necessarily a product team. The framework that we gave you was to figure out your hardest problems, figure out the things that are causing you the most pain, frame it, break it down, and then measure it, and understand it. Of course, always be iterating and learning.




Recorded at:

Feb 12, 2019