Max Sklar on Machine Learning at Foursquare


1. I'm here at the QCon conference in New York and I'm talking to Max Sklar. Max, would you introduce yourself?

Hi, I'm Max Sklar. I am an engineer at Foursquare and I work on the Data Team, which means that I work on the Explore algorithm, which is a recommendation system, and I work on applying machine learning to our data.


2. A recommendation system, meaning?

Pretty much. Foursquare is a location-based social application, which means we have an enormous database of all sorts of venues in any city you might go to. If you're using it a lot, hopefully you have lots of friends on there or people that you're following; they'll be leaving tips, and you'll be going places and leaving information. But even if not, there's a whole group of strangers out there who are also going places and leaving little pieces of information and photos everywhere. So as you go places and check in on Foursquare, or just look around on Foursquare, you'll get that data. But the recommendation system does something a little more interesting than that: when you're in a particular area or neighborhood, it looks at all the venues within a given radius, which depends on where you are (for example, whether you're in the middle of the city), and then tries to pick out things that would be interesting to you.


3. Now humans do that very well. If I have a friend and I know the friend very well, I can say, "Oh, you know, you really should see such and such," or, "You've got to try this restaurant," and so on, "and I know you'll like it." So that's your problem. And you're here giving a talk at the QCon conference about something having to do with that, something new and exciting. Tell us about it.

I agree that your friend who knows you has a distinct advantage over a computer, but our advantage, I think, is in our data. We have all the opinions of all your friends and everyone else who's ever been there. We have 20 million tips in the database and 2 billion check-ins (a check-in is when you announce your location), so we can build off all that and show you stuff that maybe one particular friend doesn't know about.


4. Well, part of the idea of mobility is your friends aren't there immediately and so you want this information culled from the millions of people who've visited that area before. So, how do you go about solving this enormous problem of having a machine make judgments about recommendations?

It's not easy in this case. I'll tell you what problem it's different from first, then I'll tell you how we go about solving this problem. Another problem that we have is the venue search problem, where people are trying to use the app and they're saying, "Hey, I'm here right now, I want to check in." Then we see whether they checked in or not: if they did, we got it right; if they didn't, we got it wrong. So we get that immediate feedback on whether they were there or not. That search is mostly based on distance, but because GPS doesn't work perfectly, we have to include some other heuristics as well.

Also, if you're building an ad-based system, then it's easy: you know whether someone clicked the ad and you earned a few cents or whatever. In this case, it's not clear whether people liked the recommendation or not. You can look at what people are clicking on, and you can look at where people eventually go, but it has to be a lot more intuitive, because unless you're asking a person, "Was this recommendation helpful to you? Was that helpful to you?"...


5. And indeed, that does get asked on certain sites.

Max: Yes, on certain sites, but you can't be asking it all the time, and really we have to rely on indirect signals. I think that the major portion of it is our justification system: there's a particular piece of information about this venue that's interesting, and that's what we're going to show.

In other words, if you go to Explore on Foursquare, you'll see the recommendation. Maybe it says, "Ten of your friends have been here," or, "This venue is far busier than it usually is" (it has a hundred people where there are usually about ten, so it's busy right now). Or maybe it says something like, "A lot of people like this place," or "A lot of people leave tips here and say good things about this place," or "A lot of people talk about the burgers here." We can pull out that stuff.

So I think that stuff is what makes it interesting, and we have a bunch of different lists that we pull from. Like I said, there's the trending list (what are people doing right this second), and there's what tends to be popular around this time of day. Then there's figuring out what to do if you type in a query like, "I want burgers." Originally we were just searching the tips to see if any of the tips people left on the venue mentioned "burgers", but it turns out that we really need a number on that. We really need to say, "Okay, if a venue has a thousand tips and only one mentions 'burgers', it could just be a fluke." So you really have to think about when this is interesting and when it's not. A good example of that was the Apple Store, I think it was in San Francisco, where someone said, "Oh, these Mac Geniuses have these really bad haircuts," and every time someone typed in "haircut", that thing came up first. You don't want that to happen. Or the place in New York where someone wrote, "This place used to be a Chinese restaurant; well, it's not anymore" ...

Barry: That is a problem.

Max: Yes, exactly.


6. And you use Bayesian statistics? And what role does that area play in the whole scheme of things here?

Right. So in the large view, Bayesian statistics is a way of looking at the world and deciding what's true and what's not.

So if you think about it this way: let's suppose that you have two hypotheses, two different models of the way the world works, and you have a prior, a general sense of which one is more likely than the other. Maybe you think they're equally likely; maybe you think this one has a far greater likelihood than that one. Then you collect data, and you use Bayes' rule, which is actually a pretty simple equation, to update those probabilities as you go.
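
As a rough illustration of the update described above (an editorial sketch, not code from Foursquare), the following applies Bayes' rule to two competing hypotheses. The function name, the two hypothetical like rates (80% and 30%) and the sequence of observations are all assumptions made up for the example.

```python
def bayes_update(prior_a, likelihood_a, likelihood_b):
    """Return the posterior probability of hypothesis A versus hypothesis B.

    prior_a: belief in A before seeing the data (B gets 1 - prior_a).
    likelihood_a, likelihood_b: probability of the observed data under each hypothesis.
    """
    evidence = prior_a * likelihood_a + (1.0 - prior_a) * likelihood_b
    return prior_a * likelihood_a / evidence


# Hypothetical setup: hypothesis A says 80% of visitors like a venue,
# hypothesis B says only 30% do. Start with both equally likely.
P_LIKE_A, P_LIKE_B = 0.8, 0.3
belief_a = 0.5

# Illustrative observations: True = like, False = dislike.
for liked in [True, True, False, True]:
    if liked:
        belief_a = bayes_update(belief_a, P_LIKE_A, P_LIKE_B)
    else:
        belief_a = bayes_update(belief_a, 1 - P_LIKE_A, 1 - P_LIKE_B)
    print(f"observed {'like' if liked else 'dislike'}: P(A) = {belief_a:.3f}")
```

Each like shifts belief toward hypothesis A and each dislike shifts it back toward B; no single observation settles the question on its own.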

And if you choose good priors, you have a wide variety of hypotheses to choose from, and you apply data to it, then oftentimes a good result will come out of it. So let's take the problem that I just gave you, someone talking about the haircuts of the people who work in the Apple Store. If there's a single tip on the venue and the tip mentions haircuts, you could say 100% of the tips on this venue mention haircuts. But another point of view is, "Well, it could be equally likely that this one example was some kind of fluke, and maybe people don't tend to mention haircuts at this place." With just one data sample, it's not going to affect the probabilities that much.

So the example I used is people liking and disliking a place, a new feature that we added to Foursquare about two weeks ago. If we have one person liking a place or one person disliking a place, you don't want to give it a hundred percent or zero percent. A lot of these sites do that (Rotten Tomatoes does it, and YouTube does it), but it's not really fair.

So in my talk, I talked about an easy way to fix it, which is to add a little bit of prior data to each one. Let's suppose that you start out with ten likes and ten dislikes. So you start out at 50%, and if you add a single like, it's going to be a little bit more than 50%. When you get tons of data, you let the data speak for itself and the stuff you added at the beginning gets washed out. I also have a Python script that estimates what the optimal place to start is. So that's a Dirichlet prior.
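
The pseudo-count idea above can be sketched in a few lines of Python. This is an editorial illustration, not Max's script: the function name is hypothetical, the defaults of ten likes and ten dislikes come from his example, and the sketch does not attempt to estimate the optimal starting point he mentions. With only two outcomes, the Dirichlet prior reduces to a Beta prior, and the smoothed rate below is its posterior mean.

```python
def smoothed_like_rate(likes, dislikes, prior_likes=10.0, prior_dislikes=10.0):
    """Estimate the like rate after adding prior pseudo-counts to the observed counts."""
    return (likes + prior_likes) / (likes + dislikes + prior_likes + prior_dislikes)


# Compare the raw ratio with the smoothed estimate for a few illustrative venues.
for likes, dislikes in [(0, 0), (1, 0), (0, 1), (900, 100)]:
    total = likes + dislikes
    raw = likes / total if total else float("nan")
    smoothed = smoothed_like_rate(likes, dislikes)
    print(f"{likes} likes, {dislikes} dislikes: raw={raw:.3f} smoothed={smoothed:.3f}")
```

A venue with no data sits at 0.5, a single like moves it to 11/21 (about 0.524) instead of 100%, and with 900 likes against 100 dislikes the estimate is close to the raw 90% because the prior counts are washed out.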


7. Out of curiosity, if I have a new book and it's on Amazon, should I jump to make sure that the first person who reviews it gives me a good review? Or do most websites that show ratings and reviews use the kind of fairness criteria that you've just suggested?

I don't know how Amazon or Netflix or any of these other sites do it. On Amazon you can sort by date, ascending or descending, so I would think it's pretty important that the first ones be good, just for the fact that people can sort by date and they will always come up first.


8. Like you don't want to cheat and have your friends do it.

I think on Amazon or on Yelp it might affect the overall rating of your book if you have just one reviewer. Ultimately, like on Foursquare, for example, the majority of the reviews are very positive because it's not a critic site.


9. It's "why I like this".

Yes, people are going out, enjoying themselves, and people like to tell their friends they're having a good time. Sometimes people like to warn their friends ("Don't do this; don't do that"), but oftentimes if I have a place with very little data, I'd be better off assuming, on Foursquare at least, that most people are going to like it rather than dislike it. Even if the first few people dislike it, I'm still going to think that the majority of people will like it; it's going to take quite a few before I say, "Okay, this is really bad."
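
One way to read the point about assuming most people will like a place is as a prior that starts out tilted toward likes. The sketch below is an editorial illustration under that assumption: the function name and the prior of eight likes to two dislikes are hypothetical, not Foursquare's actual values. It counts how many dislikes in a row it takes before the estimated like rate drops below one half.

```python
def dislikes_needed(prior_likes, prior_dislikes, threshold=0.5):
    """Count consecutive dislikes (with no likes observed) needed before the
    smoothed like rate falls below the threshold."""
    dislikes = 0
    while prior_likes / (prior_likes + prior_dislikes + dislikes) >= threshold:
        dislikes += 1
    return dislikes


# Hypothetical positively skewed prior versus a weak, even prior.
print(dislikes_needed(8, 2))
print(dislikes_needed(1, 1))
```

Under the skewed prior it takes seven straight dislikes before the estimate falls below 50%, while the weak even prior gives way after one.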


10. Now, you have a term you use here, "explore query". Is that a particular kind of query?

Right, that's when somebody... no, that's any type of query; that's when somebody is looking for recommendations. We have a few types of these. One, which is new in our update released a couple of weeks ago, is when somebody just goes to Explore and says, "Just show me interesting things nearby." In that case, we have nothing to go on; we have no idea what the person is looking for. It's just, "Show me some interesting stuff." For that, we pull from a lot of different data sources: we pull from what's going on right now, and we pull from specific categories that people tend to like: "Here's some coffee; here's some nightlife; here's some food."


11. Do you have a closeness criterion? If they like coffee, then chances are they like tea, or something like that?

Max: We did it on the venue level, so you'll see reasons ...

Barry: If they like this movie, then they'll like that?

Max: Yes, except on the venue level. So we have that. Coming back to the "explore query" question, there are a couple of other types of queries you can do. One is a "category query": if you click on a topic from the list of categories, like food, shopping or outdoors, you can get results really fast, and then it just does that.

And then another one is if you want to type in something specific that's not on that list. I feel like the majority of the time you're just going to look for things with the very generic list, but some people type in actual text to search for a specific thing, and that's a pretty important part of it too. And the fourth thing is trending, which is, "This is stuff that's going on right now; this is what people are doing around you, right this second."


12. So in your vocabulary, trending doesn't mean "what's happened over the last month"? It means "what's been going on in the last three hours"?

Max: Right, and a couple of weeks ago, we changed our trending algorithm a little bit.

Barry: Tell us about that.

Max: Before, it was just an ordered list: here's where the most people are, and then here's who's in second place, third place, fourth place, all the way down. I think there's a lot of value in that list, but one of the problems is that it always looked the same: Grand Central Station, Penn Station, LaGuardia airport, JFK airport, and then maybe Central Park or Bryant Park or something like that. Now, every once in a while, you'd see something on that list that was ... but we changed the measure a little bit to include what's happening right now that's not normally happening.


13. Like higher priority on the most recent things.

Max: Not more recent, all these things are recent; there are a lot of people at Penn Station right now.

Barry: ... but higher priority to things that don't usually happen on a Wednesday afternoon.

Max: If there are 60 people at Penn Station right now, but there are usually 60 people at Penn Station on a Wednesday afternoon... I don't know if that's the exact number, but it's something like that.

Barry: ... background noise

Max: Background noise, exactly. So we have this kind of trending measure which says, "This thing has a lot more people: this is how much we predict, and this is how much higher the actual is than the prediction." And sometimes we get these venues that we call "super trending", which are so unique that we'll show them to you even if they're a little outside your radius; we give you a big ten-mile radius. That originally happened when we were testing this thing: the Giants had won the Super Bowl and they were having their parade, and it was down the street and we could see from our office window, "That's where they're having it."

But I plotted our radius on the map and it's like, "Oh, it's just outside the radius," literally yards outside our radius. And this was a new venue, so it started with a very low baseline, and hundreds of people were there. So we decided, "Okay, if we get that big a signal, then we're going to show you this venue." In fact, we're working on ways to aggregate these signals from around the world to find interesting events going on around the world. The other day, we looked and there were 3,000 people checked in at a venue in Istanbul; it turned out to be a Madonna concert, and we saw the pictures coming in. It was like watching the concert in real time.
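
The trending measure described above, actual activity compared with the predicted baseline plus a wider radius for "super trending" venues, could be sketched as follows. This is an editorial illustration, not Foursquare's algorithm: the class, the function names, the damped ratio used as the score, the one-mile normal radius, the score threshold and the sample venues are all assumptions. Only the ten-mile wide radius comes from the interview.

```python
from dataclasses import dataclass


@dataclass
class VenueActivity:
    name: str
    current: int           # people checked in right now
    expected: float        # typical count for this hour and weekday (assumed known)
    distance_miles: float  # distance from the user


def trending_score(venue, smoothing=5.0):
    """Ratio of actual to expected check-ins, damped so tiny baselines do not explode."""
    return (venue.current + smoothing) / (venue.expected + smoothing)


def pick_trending(venues, radius_miles=1.0, super_radius_miles=10.0, super_score=20.0):
    """Return names of venues inside the normal radius, plus 'super trending'
    venues inside the wider radius, ordered from highest score to lowest."""
    picked = []
    for venue in venues:
        score = trending_score(venue)
        in_radius = venue.distance_miles <= radius_miles
        is_super = score >= super_score and venue.distance_miles <= super_radius_miles
        if in_radius or is_super:
            picked.append((score, venue.name))
    return [name for _, name in sorted(picked, reverse=True)]


# Made-up sample data for illustration only.
venues = [
    VenueActivity("Penn Station", current=60, expected=60, distance_miles=0.4),
    VenueActivity("Victory parade", current=400, expected=1, distance_miles=1.1),
    VenueActivity("Coffee shop", current=12, expected=4, distance_miles=0.2),
    VenueActivity("Distant stadium", current=90, expected=80, distance_miles=6.0),
]
print(pick_trending(venues))
```

With this sample data the parade ranks first even though it sits just outside the normal radius, the coffee shop comes second, and Penn Station, busy but no busier than usual, comes last; the distant stadium is left out because its activity is close to its baseline.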


15. I realize also some of that information in question might be proprietary to your organization?

Max: Yes, nothing that I can think of in that sense; I'm sure there is more interesting stuff than I know about. But one of the fun ones is this app where, if you check into a restaurant that gets a bad review, a bad health reading, it will tell you...

Barry: "Don't swallow that"...

Max: I think it was called... it wasn't called that, although it's something like that; I think it's called "Don't eat here". But I think there have been a lot of innovative uses of that data, and we get requests from researchers sometimes, and we're still working on that.


16. You mentioned changes in the trending algorithm; are there also changes in the Explore algorithm?

Yes, huge changes. The first part that I mentioned, the part of the Explore algorithm where you haven't typed in any query and it's just "find things that are interesting around me right now", that's brand new, and you can see from the justifications that it pulls from all different sources. It says, "Okay, these are places where friends have been; these are places that are popular with the general public; these are places you used to go to but stopped going to," so maybe it's time to go back.


17. There are often reasons and I realize this: I simply forgot about the place, it just went off my mental radar.

Exactly, and one of the things that's important, that we're working to incorporate more (at least I'd like to incorporate more), is "let's show you something that you don't know". If you already know about a place, it's not interesting for us to show it, because you already know it's there. I want you to be able to open the app and go, "Oh, here are some ideas that I didn't necessarily think of." It could be somewhere that you've been to but haven't been in a while; it could be something that maybe your friends know about; anything like that.


18. That would be great; and so what's in the future for these algorithms? What do you foresee in the next five years? What will I be doing with my handset that I'm not quite able to do now?

Max: Well, five years is a long time. Five years ago, I was working on a website called Sticky map (I still go there). It does something like this, where you post little icons all over a map and add little pieces of data. Too bad I didn't have any idea about starting a business, but at least it prepared me for what I'm doing now. I think this is really one of the first apps where you, your past self, your friends and everyone else in the world are there to guide you as you're walking around and going about your daily life.

And it's a little clunky right now; I think we'll admit that. You have to take out the app, you have to find the place to check in, and you may have to click around to get tips or information. I would really like it to be a lot more seamless in five years, to almost happen automatically. I don't know what that means: whether you have glasses or a watch, or even if you just have a phone.

Barry: When you and your company make a fortune, I'll call you.

Max: But in terms of the algorithm, this means that the algorithm has to be pretty seamless as well, so it's got to be fast. I think we really have to think about what the user knows and what we can give them right now that gives them the most value, and I think we have a long way to go in terms of optimizing that.


19. And certainly changes in the hardware; changes just in the ergonomics of the way people use devices will push a lot of the development in the next several years.

And changes in data models as well. This whole like/dislike thing that we have now is brand new. We have it because before, we could only infer whether someone likes a place or not, and that's been a little controversial. If you go there all the time, do you like it? Some people say yes, but there are these exceptions: people check into work all the time, and not everyone likes where they work; or if a place is a monopoly, maybe they have to go there.

I know sometimes people go somewhere just because that's what they do, but they don't necessarily like it. And from tips, we can infer whether people like a place or not, but a tip is kind of heavyweight; people would rather just click.

So it gives us a signal that we could only get at very weakly before; it gives it to us a little bit better. And I think it's very difficult to take all these different types of data (the likes and the dislikes, the check-ins, the tips, the places people put on lists, especially the natural language) and try to come up with optimal recommendations from that. My hope is that in five years, we'll be really good at that.

Aug 09, 2012