Bio Nellwyn Thomas is an analyst on the Data team at Etsy, where she works closely with product, marketing, and engineering to scout, build, instrument and improve Etsy’s product portfolio. Before Etsy, she worked on analytics and product teams at two smaller NYC start-ups. She is a graduate of Harvard College, and holds a master’s degree from the University of Pennsylvania.
Software is changing the world; QCon aims to empower software development by facilitating the spread of knowledge and innovation in the enterprise software development community; to achieve this, QCon is organized as a practitioner-driven conference designed for people influencing innovation in their teams: team leads, architects, project managers, engineering directors.
Yes, my name is Nell Thomas and I am a Data Analyst at Etsy. I’m part of a great group of engineers and analysts who help use data to make things better.
Sure. Thinking about data and thinking about behavior in terms of data was ingrained in me pretty early on. I studied Psychology in college, and there I had the opportunity to work in labs and run experiments, including some of my own, and I think it just became part of how I approached decision making. After college, I worked at a startup, a financial services firm that used data to understand the performance of publicly traded stocks, and that was an opportunity for me to learn the skills of pulling data, data analysis and modeling, but also of communicating about data. The deliverables were words, not numbers, so having to articulate the meaning of an analysis became an important part of my job. After that I went to a startup that was building a web product, where I had the opportunity to learn about building a product. At Etsy I get to use data to help build products, which is the best of both worlds, and it’s really an awesome place to be.
We have two main data sources. The first is what we would call transactional data: someone makes a purchase or signs up for a member account, and all that data lives in regular relational databases; pretty straightforward. The second set of data, which is messier and bigger and a little bit more exciting, is the behavior of someone on the site: a visitor who comes and clicks on the homepage, types some words into the search box, maybe favorites an item. All of that data we collect using an event logging system.
We store it in Hadoop and process it there, and it requires a little bit more technical sophistication to extract that data. We are moving to a place where we’re going to be using a data warehouse solution called Vertica to store some of both sets of data, which will allow people with SQL skills to extract and analyze both. In general we are trying to move to a place where data is more and more accessible, so that you don’t need programming skills in order to extract it. You will always need analytical skills to understand it, but lowering the barrier in terms of getting to the data is important to us.
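As a rough illustration of what having both data sets behind SQL could make possible, here is a minimal sketch that joins purchase records with logged site events. Everything in it is an assumption made for the example: the table names (`purchases`, `events`), the columns, the sample rows, and the helper functions are hypothetical, and an in-memory SQLite database stands in for the warehouse. It does not reflect Etsy's actual schema or tooling.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Transactional data: one row per purchase (made-up sample rows).
        CREATE TABLE purchases (user_id INTEGER, amount REAL);
        INSERT INTO purchases VALUES (1, 20.0), (1, 5.0), (2, 12.5);

        -- Behavioral data: one row per logged event (made-up sample rows).
        CREATE TABLE events (user_id INTEGER, event_type TEXT);
        INSERT INTO events VALUES
            (1, 'search'), (1, 'favorite'), (1, 'search'),
            (2, 'search'), (3, 'search');
        """
    )
    return conn


def searches_and_spend(conn):
    """Return (user_id, search_count, total_spent) for every user who searched."""
    query = """
        SELECT e.user_id, e.searches, COALESCE(p.total_spent, 0.0)
        FROM (
            SELECT user_id, COUNT(*) AS searches
            FROM events
            WHERE event_type = 'search'
            GROUP BY user_id
        ) AS e
        LEFT JOIN (
            SELECT user_id, SUM(amount) AS total_spent
            FROM purchases
            GROUP BY user_id
        ) AS p ON p.user_id = e.user_id
        ORDER BY e.user_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in searches_and_spend(build_demo_db()):
        print(row)
```

Each source is aggregated per user before the join so that a user with several searches and several purchases is not double counted.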
There’s all sorts of data. I should also say that, obviously, we have lots of operational metrics that we look at on a daily basis to understand how the site is doing. In terms of product development, we are using data at every point of the process: from trying to identify a product to build, to understanding how we should build it, to iterating on it once it’s built, even to decisions around maybe deprecating a product or killing it, where we want to make sure that we understand the impact. We run a lot of experiments on the site as well, and there, obviously, we are using data and statistical analysis to understand the effects of the changes we are making. We really try to inject data into that decision making, both in the product organization and throughout the company: marketing, PR, senior management, support.
I think in general, as I said earlier, we have operational metrics that we look at day to day just to understand how the website is doing, but in terms of planning or optimization, we think less in terms of the unit of a day than in terms of the long-term impact of changes. With real-time analytics, my concern is that it leads to optimizing around short-term changes or small effects, and we really want to be thinking about things over a longer time horizon than that. So on a day-to-day level we are running experiments and doing ad hoc analysis, and product managers or marketing managers might be looking at data to help plan, but it may not be as much about responding to yesterday as it is about planning for a month or two or three months from now, or a year from now.
We are trying very much to do so. We have a data team that works on the infrastructure that allows us to collect accurate data and deliver it to the company. We also work on the relationships with the individual teams: each analyst takes a few teams and acts as the liaison, interpreting the data, providing additional analysis, and helping translate it into actionable items. For example, I personally work with our seller tools team and our search team, and I work a lot with our marketing team. The idea is that people on those teams are empowered to get their own data and use it, but they also have a resource to help them understand it and use it.
7. Traditionally the world of developing products has been something of a combination between sales, marketing and business analysts. Is this broken, do we need to fix it, and is data better than user requirements for this type of effort?
8. Data can be used to inform decisions, and it’s also kind of how you determine whether a new feature or a test is successful. Do you foresee data maybe becoming a crutch for making decisions in the future at Etsy?
So anytime you make a decision, you want to have multiple inputs, and data is just one of those. It’s a really strong one, a really good one in my mind. Take the example of experiments: that’s a place where we can quantify the difference between two possibilities. Sometimes that’s two different treatments of a design. We might have one that has a bigger button and one that has a smaller button, and seeing how those perform in an A/B test allows us to have a better sense than we would have if we launched blindly. That said, when we evaluate which tests we should run or how to interpret the results of an experiment, the data has to be used as part of a larger understanding of the product. So a crutch could be a good thing or a bad thing. I like to think of data as something that rounds out the tool kit and makes it a little bit stronger, a little bit more robust, but it should not be the only thing that’s being considered.
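One common way to quantify the difference between two treatments in an A/B test is a two-proportion z-test on conversion rates. The sketch below is only an illustration of that general technique; the interview does not say which statistical method Etsy uses, and the function name, parameters, and the sample counts in the usage section are all hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    conv_a, conv_b: number of converting visits in each variant.
    n_a, n_b: total number of visits in each variant.
    Returns (z, p_value), where p_value is two-sided.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the rates cannot be compared")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: small button (A) versus big button (B).
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A small p-value says only that the observed difference would be unlikely if the two treatments performed the same. Deciding which test to run and what to do with the result still takes the broader product judgment described above.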
Etsy is a marketplace made up of buyers and sellers. The sellers on Etsy sell handmade goods, vintage goods or supplies, and some of them are running their entire small businesses through the site, so there we also want to inspire this idea of transparency and demystification. We have a seller-facing analytics tool called “Shop Stats” where a seller can see referral sources into their shop, what their most viewed listings are, or what keywords people were searching for when they found the shop. Just as Etsy, as a company, tries to use data to make decisions, we want that for our sellers as well.
I think the thing is that data can sometimes seem scary or specialized, not accessible to everyone, and I think the first part of a data culture is just fostering a safe environment for evaluating things with numbers. I would start not with the data itself but with the questions you want to answer: what are your key metrics for the product or the business, how do you want to change them, how could you evaluate the changes, and what do you need to get to that evaluation? You can have all the data in the world, but if you don’t have those questions articulated, and answers to them in your mind, you’re probably going to be shooting in the dark or overwhelmed by numbers without any sense of direction. So I think a lot of it starts with a way of thinking, a way of structuring how you are going to approach the data. The other part of a data culture that I think is incredibly important is the ability to look back on past decisions and evaluate them using data. We’ve been talking a lot about using data in a forward-thinking way, in planning or iterating or optimizing, but it’s also about looking back at things you’ve decided on in the past and seeing how those decisions performed.
I think that’s really about a kind of honesty: assessing how something did, using those numbers and speaking directly about them. It’s not that numbers should be the only answer but that, again, you can speak honestly and transparently about the role of data in your decision making.
So I think that you’ve really seen the idea of analytics and big data getting a lot of traction recently and becoming a really popular touchstone. I think that will continue to be the case, and I think that’s a good thing. At Etsy and elsewhere, I do believe we’ll start to see data become more distributed across companies, not just in terms of the actual CSVs and tables and Excel files but in terms of the skill set. So I think there will always be a place for a specialized analyst team, but having individual team members empowered to think about data and use data, with the skills to do so, will I think be a possible direction for the future.
Harry: Nell, thank you for your time, and I hope you enjoy QCon.
Thank you for interviewing me.