
Camille Fournier on the Software and Data Science Behind Rent the Runway


1. We are here at CraftConf 2015 in Budapest and I’m sitting here with Camille Fournier. So Camille, who are you?

Who am I? I am the CTO at Rent The Runway. I’m also a ZooKeeper person of interest: I’m a ZooKeeper PMC member and committer, and I’m also involved with the Dropwizard Java web framework project.

Werner: I don’t even have to wash it?

That’s right, you don’t have to do anything, just wear it, look beautiful, take pictures, have a good time and send it back to us.


2. Ok, so what are the main challenges in doing that, are there any IT challenges or are there physical challenges?

The bigger challenge has really been scaling the operation. I have a big tech team, about 65 people, and that includes the people who support the website and the product development for the website and the mobile app, but a big part of what we do is also all of the software for our warehouse; it’s all bespoke software. The interesting thing about the evolution of Rent The Runway's operational side, the warehousing and shipping part of things, is that it really shows the impact of scaling on an organization. With computer code nowadays you have to get to really, really large volumes before you have to think about scaling challenges and big O notation.

When you are talking about things humans can do, it happens in the same way but at much smaller volumes. When Rent The Runway was first founded they had, I forget how many, 400 or 1000 items of inventory, and when you have that few you don’t need to label them or barcode them, everybody just kind of knows what they are. They actually had flash cards that they would give to the people working in the operation to know what dress was what. We have names for our dresses, like the “Fifth Avenue Show Stopper” dress, so they would have these flash cards, and when they needed to go pick a dress to send to a customer, they just knew that they were looking for the “Fifth Avenue Show Stopper”, it’s short and sequined and gold and it’s going to be over there. Of course that scales only so far; human scaling only goes so far. But you can actually go a long way with Google Docs. That’s one of the things that I think is really interesting, and that engineers should think about when they are working for companies that have physical components: not every problem immediately needs a bespoke technical solution. You can do a lot of amazing things with Google Docs and collaborating. For example, once our warehouse had moved out to New Jersey, it and our customer service team had to do a lot of collaboration around things like: this customer wants to change what's in her order that’s going out today, so what’s available in the warehouse right now, put this item in her order. A lot of that communication was done via Google Docs for a long time. Of course that really doesn’t scale once you get to 100,000+ items in your warehouse, really tight turnaround times, and very, very large customer service and operations teams.
So a big part of what my team has built, a big part of the technical challenge of Rent The Runway, has been scaling the operation and building the software to scale the operation.

One of the big projects that we did when I first started at the company, about three and a half years ago, was remodeling the way that we do reservations. We are reserving items, but it’s different than reserving a car, where you reserve a compact or a mid-size or whatever, and half the time you go to the car rental place and they are like: “We don’t have that so we are going to actually give you something slightly different”, and for the most part you don’t care because you just really need a car, it needs to fit you and your passengers and your luggage. When you rent a dress, you want the dress that you ordered; they're pets, not cattle, to use a popular term from computing these days. These things are very special: even though we may have a hundred or more of a particular type of dress, each one of those hundred is a pet, and we need to care for and monitor that particular garment because we want to understand its behavior, we want to understand where it is at all times in its life cycle. So we remodeled our reservations to be very, very accurate, to take into account how long it takes to ship things and the likelihood someone is going to return something late. There is also a deactivation curve: eventually clothing will fall apart, and you obviously don’t want to take rentals on something that is not going to be available for someone, so we model a deactivation curve and we use a lot of data to guide our reservation-taking system to make it very, very accurate.
So that was an early challenge, and once we got that done we could really start to think, in a very smart way, about how to manage our operation and how to allocate inventory. Again, in traditional e-commerce or a traditional warehouse, you sell something, you decrement a counter somewhere, you print a pick list, and somebody goes and picks that thing, puts it in a box and ships it out to you, and it’s fine, it’s all outbound; there is some inbound, of course, but it’s mostly all about outbound processing.

We have to do what we call reverse logistics, full reverse logistics: almost everything that goes out the door comes back in, and we have a very tight timeline for that because we want to maximize the value, the ROI, of our inventory. So we take reservations where, if things are late, we may have only one day or even less to get that item, inspect it, clean it, possibly repair it, sew a sequin back on or whatever, make sure it’s beautiful and perfect, and ship it back out the door to the next customer who needs it for her event. So we have very complex allocation systems that take into account the locations and predicted availability of all our inventory and tell the warehouse exactly what they should be doing at any given time.
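
As a rough illustration of the reservation modeling described in this answer, here is a minimal Python sketch of an availability check for one specific unit of inventory. It is not Rent The Runway's implementation: the names (Reservation, blocked_until, can_reserve) and all the constants are hypothetical, and a real system would presumably derive the buffers and the deactivation curve from historical data rather than fixed numbers.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Reservation:
    ship_date: date    # day the unit leaves the warehouse
    return_due: date   # day the customer is expected to send it back


# Hypothetical values, for illustration only.
TRANSIT_DAYS = 2       # assumed return shipping time
LATE_BUFFER_DAYS = 1   # assumed padding for late returns
TURNAROUND_DAYS = 1    # assumed time to inspect, clean and repair
MAX_RENTALS = 30       # assumed cut-off standing in for a deactivation curve


def blocked_until(res: Reservation) -> date:
    """First day the unit could ship again after this reservation."""
    return res.return_due + timedelta(
        days=TRANSIT_DAYS + LATE_BUFFER_DAYS + TURNAROUND_DAYS
    )


def can_reserve(existing, rentals_so_far, new):
    """Return True if this one unit can take the new reservation."""
    # Refuse bookings on a unit expected to be worn out by then.
    if rentals_so_far + len(existing) >= MAX_RENTALS:
        return False
    for res in existing:
        # Two reservations conflict when their blocked windows overlap.
        if new.ship_date < blocked_until(res) and res.ship_date < blocked_until(new):
            return False
    return True
```

The point of the sketch is that a rental blocks a unit for longer than the rental itself: shipping time, possible lateness and cleaning all extend the window, which is why a simple stock counter is not enough.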


3. These allocation models, so you model basically everything, your customers, the age of your clothes or the behavior of your clothes. So can you tell us more about that, so how do you model that, how do you get the data of when the “Fifth Avenue” dress is going to fall apart?

Sure, so we have a pretty large data science and analytics team. I think there is a lot of hype right now in the industry about the Internet of Things, and our dresses aren’t quite Internet of Things, they don’t have chips on them that are sending out data all the time, but we do barcode all of those dresses, and every single time one of those units of inventory is touched, we record what happened to it: it went to a customer, it was cleaned, it was repaired, it was processed through the system, it was hung on the rack, it was taken off the rack, it was sent to one of our showrooms, because we actually have retail stores. So we are tracking a ton of data about every single unit of inventory, and we can very accurately learn what the behavior of that inventory is. We can see that this particular unit has gone out the door 30 times, 40 times, and it’s no longer wearable, and so we can predict that: OK, for this particular type of unit we probably don’t want to take reservations on it beyond a certain threshold, because we don’t believe that it is going to be able to satisfy our customers and we are going to give them a bad experience. So we are constantly collecting a lot of fairly fine-grained data about where our products are at all times in order to understand them, do analysis on them, and make better models and better decisions about them in the future.
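
The barcode tracking described above can be pictured as an append-only log of scan events that is later aggregated per unit. The following Python sketch is only an assumption about how such a log might be summarized: the ScanEvent shape, the action names and the threshold of 30 are all hypothetical, and a real model would likely learn a threshold per dress style instead of using one fixed number.

```python
from collections import Counter, namedtuple

# Hypothetical event shape: which unit was scanned and what happened to it.
ScanEvent = namedtuple("ScanEvent", ["barcode", "action"])

RETIRE_AFTER = 30  # hypothetical wear threshold, not a real figure


def rental_counts(events):
    """Count how many times each unit has gone out to a customer."""
    counts = Counter()
    for event in events:
        if event.action == "shipped_to_customer":
            counts[event.barcode] += 1
    return counts


def units_to_stop_reserving(events, limit=RETIRE_AFTER):
    """Barcodes of units that have reached the wear threshold."""
    counts = rental_counts(events)
    return sorted(code for code, n in counts.items() if n >= limit)
```

For example, calling units_to_stop_reserving(log, limit=3) on a small test log would list every unit that has shipped at least three times, while scans such as cleaning or repair are ignored by this particular count.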


4. You mentioned that your warehouse has a lot of bespoke software, can you talk about that?

Yes, absolutely. So as I mentioned earlier we do this reverse logistics, and we actually have a just-in-time system that powers that. What happens, especially in our busy seasons, is that on Wednesdays, our peak days, everything is coming back to the warehouse and a huge percentage of the volume of dresses that are coming back have to go back out that day, so you might have 20,000, 30,000, 40,000 items coming back that need to be shipped out the same day, which is huge, right? If you were to come to our warehouse at the very beginning of the day, UPS pulls a giant trailer up and you have these bins and bins and bins of identical UPS packages, they all just have mailing envelopes that say Rent The Runway and you have no idea what’s in them, and you need some percentage of those, maybe 60%, but you don’t need all of them.

You can also imagine that when you are talking about a warehouse operation you have bottlenecks, a big bottleneck being cleaning. We have probably the world’s largest dry cleaner, certainly the largest dry cleaner in the US. A funny story about that is we can actually shut down manufacturing in plants in Italy that make dry cleaning machines for like a whole season, building dry cleaning equipment for us, which is kind of crazy to think about. So we have a huge dry cleaning operation, but dry cleaning takes time, it’s not immediate: you have to look at the item, you have to put it in the right machine, it goes for a while, and we have a sharp UPS cut-off time of 8 PM that same night. So all of this inventory is coming in and has to be cleaned and shipped back out, and you don’t want to do extra work because extra work is a waste; when you’ve got things that are very, very time sensitive, you just want to clean and operate on what is absolutely necessary to get the job done. So every time a package comes into our warehouse it's scanned and our system calculates the priority of the items in that package. It may say: “This is a red priority and absolutely needs to be done because it has to go out today”. It may be an orange priority, meaning we would like to ship this now if we can; it can wait another day if we can’t get through it, but it'd save us money to ship it out a little earlier. Or it’s green: we don’t need it, we may not need it any time soon, don’t process this right now, process this later when you are less busy, when the bulk of the work has been finished for the day. So every single time any dress goes to a station in our warehouse, that dress is scanned, we calculate the priority of that dress, and we tell the people at the station whether this is a fast-track dress or not.
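
The red, orange and green triage described above can be sketched in a few lines of Python. This is an illustrative guess at the rule, not the actual system: the function names are hypothetical, and the one-day window used for orange is an assumption based on the description that an orange item "can wait another day".

```python
from datetime import date, timedelta
from enum import Enum


class Priority(Enum):
    RED = "red"        # has to go out today
    ORANGE = "orange"  # worth shipping now, but can wait a day
    GREEN = "green"    # not needed soon; process when less busy


# Most urgent first; used to pick the priority of a whole package.
_URGENCY = [Priority.RED, Priority.ORANGE, Priority.GREEN]


def item_priority(next_ship_date, today):
    """Priority of one item given its next reservation's ship date (or None)."""
    if next_ship_date is None:
        return Priority.GREEN
    if next_ship_date <= today:
        return Priority.RED
    if next_ship_date <= today + timedelta(days=1):  # assumed orange window
        return Priority.ORANGE
    return Priority.GREEN


def package_priority(next_ship_dates, today):
    """A package is as urgent as its most urgent item."""
    priorities = [item_priority(d, today) for d in next_ship_dates]
    return min(priorities, key=_URGENCY.index, default=Priority.GREEN)
```

Because the rule depends only on the scan and the upcoming reservations, it can be re-evaluated at every station, which matches the description of recalculating priority each time a dress is scanned.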

So that’s a big part of our bespoke system, and that’s part of the reason why we don’t use off-the-shelf software to run our warehouse: it needs that kind of understanding of what rentals are coming up. Again, it’s not about an order where you are just decrementing a counter, it’s about a rental; somebody has rented a thing that has to be shipped, there is a lot of information about what you need to do with the actual order in the rental model, and that doesn’t really translate so well to your classic inventory management models. The other thing we are doing a lot now and will be doing more in the future is around shipping. We built our own shipping service called “Shipmunk”, and it is in charge of printing labels and validating that everything is in the package and ready to go. What we expect to be doing with that in the near future is using it to optimize the shipping rates we get. Shipping is obviously a huge cost for any kind of logistics-driven business, and we use UPS, UPS is great, but there are so many regional carriers, and more and more you are seeing companies like Amazon that, instead of just using one carrier for all of their shipping, figure out who has the best rates right now, or who has the best rates to that particular location for the timeframe they need, and use that carrier. We want to be able to be flexible, and to do that we’ve built our own systems to understand that flexibility.

Werner: So you can actually optimize for carriers.

Exactly, we can optimize for carriers and we can also optimize for return shipping. If we know that we are going to need that dress back next week because it has another reservation on it immediately, we can say we want you to ship this back even faster than we would normally have it shipped back, so print an overnight label instead of a 2-day label, for example. So we can put a lot of logic into all the parts of our system to optimize where the inventory is, because again, when you’ve got these peaks, we’ve got this Wednesday peak, you want to smooth the peak as much as possible; any time you can smooth peaky behavior in your systems you are going to create better business outcomes for yourself.
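
The two shipping decisions discussed here, picking a carrier and picking a return label, could look something like the Python sketch below. Everything in it is hypothetical: it is not the Shipmunk API, the Quote fields and function names are invented for illustration, and the one-day turnaround used in return_label is an assumed value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    carrier: str
    transit_days: int
    cost: float


def cheapest_in_time(quotes, max_days):
    """Cheapest quote that still arrives within max_days, or None."""
    eligible = [q for q in quotes if q.transit_days <= max_days]
    if not eligible:
        return None
    return min(eligible, key=lambda q: q.cost)


def return_label(days_until_next_ship, turnaround_days=1):
    """Use an overnight label when a 2-day return would arrive too late.

    turnaround_days is an assumed allowance for cleaning and inspection.
    """
    if days_until_next_ship >= 2 + turnaround_days:
        return "2-day"
    return "overnight"
```

With quotes from a national carrier and a cheaper but slower regional one, cheapest_in_time picks the regional carrier only when the delivery window allows it, which is the kind of flexibility the answer describes.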


5. What kind of software do you use internally, what kind of languages do you use for that?

We do microservices. I would not have called them microservices until very recently, but I have finally given in; I started doing microservices three and a half years ago, I just called them services. We use the Dropwizard web framework, which is a very lightweight web framework that we as a company now support. It was written by Coda Hale originally out of Yammer, and I really liked it when we started using it because it’s very simple, it’s very straightforward to use. I came from a world where I'd done a lot of Spring and Tomcat and all that, and these are not bad solutions, but you have to do a lot of XML configuration and yadda yadda to get them going, and we didn’t need all that. We needed something lightweight: these are going to be RESTful endpoints doing some business logic, talking to databases, talking to messaging queues. We use a lot of messaging queues; RabbitMQ is very heavily used, especially in our warehouse. So we have this microservices architecture that powers both our warehouse and our website, and then our website is fronted by a very thin Ruby client that is really used as glue: it calls the APIs of the various microservices and manages the JavaScript assets and all that to show our customers our experience.


6. And so on the modeling side, the data science side, what do you use?

For our data pipelines we use a lot of Python; Python is a great language for data pipelines. I really believe all data pipelines are spit and duct tape; if you look at all of them, anyone, at big companies or small, will say this. It’s very hard to do a good job with data pipelines because data changes all the time. You want something that you can change fairly easily, you don’t want to slow the process of building your product down, but you also don’t want to lose data, so it’s constantly a game of catch-up. We use Vertica as our data warehouse and that’s most of it. We are not doing anything fancy or special with Storm or anything like that - yet, although we have been actively researching that for a while and I think we'll get there soon.

Werner: I think you’ve given us a great overview of the challenges of this modern world of, maybe what we are facing with IoT, and thank you Camille!

May 21, 2015