Big Data's Role in Etsy's Product Development


1. My name is Harry Brumleve and we’re here at QCon 2012 in San Francisco with Nell Thomas. Could you tell us a little bit about yourself and what you do?

Yes, my name is Nell Thomas and I am a Data Analyst at Etsy. I’m part of a great group of engineers and analysts who help use data to make things better.


2. What drew you to the analyst path?

Sure. Thinking about data, and thinking about behavior in terms of data, was ingrained in me pretty early on. I studied Psychology in college, where I had the opportunity to work in labs and run experiments, including some of my own, and I think it just became part of how I approach decision making. After college I worked at a startup, a financial services firm that used data to understand the performance of publicly traded stocks. That was an opportunity for me to learn the skills of pulling data, data analysis and modeling, but also of communicating about data. The deliverables were words, not numbers, so having to articulate the meaning of an analysis became an important part of my job. After that I went to a startup that was building a web product, where I had the opportunity to learn about building a product. At Etsy I get to use data to help build products, which is the best of both worlds, and it’s really an awesome place to be.


3. Where does your data come from at Etsy?

We have two main data sources. The first is what we would call transactional data, such as someone making a purchase or signing up for a member account. All that data is in regular relational databases; pretty straightforward. The second set of data, which is messier and bigger and a little bit more exciting, is the behavior of someone on the site: a visitor who comes and clicks on the homepage, types some words into the search box, maybe favorites an item. All of that data we collect using an event logging system.

We store it in Hadoop and process it there, and it requires a little bit more technical sophistication to extract that data. We are moving to a place where we’re going to be using a data warehouse solution called Vertica to store some of both sets of data, which will allow people with SQL skills to extract and analyze both. In general we are trying to move to a place where data is more and more accessible, so that you don’t need programming skills in order to extract it. You will always need analytical skills to understand it, but lowering the barrier in terms of getting to the data is important to us.


4. And so what typically do you end up using this data for?

There’s all sorts of data. I should also say that obviously we have lots of operational metrics that we look at on a daily basis to understand how the site is doing. In terms of product development, we are using it at every point of the process: from trying to identify a product to build, to understanding how we should build it, to iterating on it once it’s built, even to decisions around maybe deprecating a product or killing it, where we want to make sure that we understand the impact. We run a lot of experiments on the site as well, and there obviously we are using data and statistical analysis to understand the effects of the changes we are making. We really try to inject data into that decision making, both in the product organization and throughout the company, from marketing and PR to senior management and support.


5. If this data is highly available at Etsy, how does it affect day to day operations?

In general, as I said earlier, we have operational metrics that we look at day to day just to understand how the website is doing. But in terms of planning or optimization, we think less in terms of the unit of the day than we do about the long-term impact of changes. With real-time analytics, my concern is that it leads to optimizing around short-term changes or small effects, and we really want to be thinking about things over a longer time horizon than that. So on a day-to-day level we are running experiments and doing ad hoc analysis, and product managers or marketing managers might be looking at data to help plan, but it may not be as much about responding to yesterday as it is about planning for a month or two or three months from now, or a year from now.


6. You are really incorporating data into the structure of Etsy?

We are trying very much to do so. We have a data team that works on the infrastructure that allows us to collect accurate data and deliver it to the company. We also work on the relationships with the individual teams: each analyst takes a few teams and acts as the liaison who interprets the data, provides additional analysis and helps translate it into actionable items. For example, I personally work with our seller tools team and our search team, and I work a lot with our marketing team. The idea is that people on those teams are empowered to get their own data and use it, but they also have a resource to help them understand it and use it.


7. Traditionally, developing products has been something of a collaboration between sales, marketing and business analysts. Is this broken, do we need to fix it, and is data better than user requirements for this type of effort?

At Etsy I would say it’s not sales, marketing or business analysts driving product decision making; it’s really the engineers, product managers and product designers that make up our product development team, which is very well named. In terms of user requirements, I don’t think it’s about replacing user requirements with data; it’s really about using data to inform those user requirements. What that generally means is a kind of data culture where everyone at the company is empowered to think about data and use data, and where we can demystify data and make it transparent. In aspiring to do that, and it’s something we’re working on (I think we do a pretty good job, though we can always do better), we’re trying to make sure that when we plan for the future and look back at past decisions, we can do that with objective data.


8. Data is used to inform decisions, and it’s also how you determine whether a new feature or a test is successful. Do you foresee data maybe becoming a crutch for making decisions in the future at Etsy?

Anytime you make a decision, you want to have multiple inputs, and data is just one of those. It’s a really strong one, a really good one in my mind. Take experiments: that’s a place where we can quantify the difference between two possibilities. Sometimes that’s two different treatments of a design. We might have one that has a bigger button and one that has a smaller button, and seeing how those perform in an A/B test allows us to have a better sense than we would have if we launched blindly. That said, when we evaluate which tests we should run or how to interpret the results of an experiment, the data has to be used as part of a larger understanding of the product. So a crutch could be a good thing or a bad thing. I like to think of data as something that rounds out the toolkit and makes it a little bit stronger, a little bit more robust, but it should not be the only thing that’s being considered.
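
To make the bigger-button versus smaller-button comparison concrete, here is a minimal sketch of one common way to quantify such a difference: a two-proportion z-test on conversion counts. The interview does not say which statistical method Etsy uses, so the choice of test, the function name and the counts below are all assumptions for illustration only.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for the control treatment.
    conv_b / n_b: conversions and visitors for the variant treatment.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, not Etsy data: small button (A) vs. big button (B)
z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=250, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to be chance alone, but as the answer above notes, that number is one input to the decision rather than the decision itself.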


9. Analytics at Etsy are used to improve the experience for its customers and its sellers. How does this focus on analytics and data carry through to the day-to-day experience for a user?

Etsy is a marketplace made up of buyers and sellers. The sellers on Etsy sell handmade goods, vintage goods or supplies, and some of them are running their entire small businesses through the site. There we also want to inspire this idea of transparency and demystification. We have a seller-facing analytics tool called “Shop Stats” where a seller can see referral sources into their shop, which of their listings are viewed most, or what keywords people were searching for when they found the shop. Just as Etsy as a company tries to use data to make decisions, we want that for our sellers as well.


10. Not all software shops have the benefit of a data culture within their organization. How can this be applied, or even started, in an organization that is new to data culture?

I think the thing is that data can sometimes seem scary or specialized, not accessible to everyone, and the first part of a data culture is just fostering a safe environment for evaluating things with numbers. I would start not with the data itself but with the questions you want to answer. What are your key metrics for the product or the business? How do you want to change them? How could you evaluate the changes? What do you need to get to that evaluation? You can have all the data in the world, but if you don’t have those questions articulated, and answers to them in your mind, you’re probably going to be shooting in the dark or overwhelmed by numbers without any sense of direction. So I think a lot of it starts with a way of thinking, a way of structuring how you are going to approach the data.

The other part of a data culture that I think is incredibly important is the ability to look back on past decisions and evaluate them using data. We’ve been talking a lot about using data in a forward-thinking way, in planning, iterating or optimizing, but it’s also about looking back at things you’ve decided on in the past and seeing how those decisions performed.

I think that’s really about a kind of honesty: assessing how something did, using those numbers and speaking directly about them. It’s not that numbers should be the only answer, but that, again, you can speak honestly and transparently about the role of data in your decision making.


11. Where do you see product development driven by analytics heading?

I think you’ve really seen the idea of analytics and big data getting a lot of traction recently and becoming a really popular touchstone. I think that will continue to be the case, and I think that’s a good thing. At Etsy and elsewhere, I do believe we’ll start to see data become more distributed across companies, not just in terms of the actual CSVs and tables and Excel files but in terms of the skill set. There will always be a place for a specialized analyst team, but having individual team members empowered to think about data and use data, with the skills to do so, will I think be a possible direction for the future.

Harry: Nell, thank you for your time, and I hope you enjoy QCon.

Thank you for interviewing me.

Mar 04, 2013