
Fairness, Transparency, and Privacy in AI @LinkedIn


Summary

Krishnaram Kenthapadi talks about privacy breaches, algorithmic bias/discrimination issues observed in the Internet industry, regulations & laws, and techniques for achieving privacy and fairness in data-driven systems. He focuses on the application of privacy-preserving data mining and fairness-aware ML techniques in practice, by presenting case studies spanning different LinkedIn applications.

Bio

Krishnaram Kenthapadi is part of the AI team at LinkedIn, where he leads the transparency and privacy modeling efforts across different applications. He is LinkedIn's representative in Microsoft's AI and Ethics in Engineering & Research Committee. He shaped the technical roadmap for the LinkedIn Salary product, and served as the relevance lead for the LinkedIn Careers & Talent Solutions Relevance team.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Kenthapadi: Thank you all for coming on the last day of the main conference. I'm going to start with a few stories. Let's go back almost two decades. This was when the insurance commission of Massachusetts decided to make anonymized medical records available for various research purposes. Back then, there was a grad student at MIT named Latanya Sweeney. What she did was get hold of the voter rolls for Cambridge for about $20 and join the two data sets, the anonymized health records and the voter rolls, not on the attributes that we would expect, like name or Social Security Number, because these are not present in the anonymized data, but on attributes such as zip code, date of birth, and gender. With this, she was able to identify the health records of the Governor of Massachusetts.

It turns out that this was not an isolated incident. Later research has shown that around two-thirds of people in the United States are uniquely identifiable based on the combination of these attributes: zip code, date of birth, and gender. This illustrates that when we make any data available, we have to think not only in terms of what is in that data set, but also in terms of what other external information someone might join with it.
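To make the mechanics of such a linking attack concrete, here is a minimal sketch in Python; the data, names, and column names are entirely invented for illustration and are not from the talk:

```python
import pandas as pd

# Hypothetical "anonymized" medical records: names and SSNs removed, but the
# quasi-identifiers (zip code, date of birth, gender) are still present.
health = pd.DataFrame({
    "zip":       ["02139", "02139", "94043"],
    "dob":       ["1945-07-31", "1962-01-15", "1980-03-02"],
    "gender":    ["M", "F", "M"],
    "diagnosis": ["hypertension", "diabetes", "asthma"],
})

# Hypothetical public voter roll: names alongside the same quasi-identifiers.
voters = pd.DataFrame({
    "name":   ["Alice Example", "Bob Example"],
    "zip":    ["02139", "02139"],
    "dob":    ["1962-01-15", "1945-07-31"],
    "gender": ["F", "M"],
})

# The linking attack is simply an inner join on the quasi-identifiers.
reidentified = health.merge(voters, on=["zip", "dob", "gender"])
print(reidentified[["name", "diagnosis"]])
# -> Bob Example / hypertension, Alice Example / diabetes
```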

Let's go forward nearly 10 years. Then came the Netflix challenge; many of you might remember this. This was in 2006, when Netflix decided to make available the anonymized movie ratings of around 500,000 Netflix users, about 10% of their users, with roughly 200 ratings per user. This was well-intentioned: the goal was to see whether the algorithms for predicting which movies users might like could be improved. But two researchers from the University of Texas joined the Netflix Prize data set with the public data set from IMDb, where people publicly rate movies. By joining these two data sets, they were able to associate identities with some of the anonymized movie ratings.

This, again, worked in a similar fashion. On the left side, we have the anonymized Netflix data. On the right side, we have the public but incomplete IMDb data. The difference is that in the public data set, if you take Alice, she's likely to acknowledge only the movies that she's comfortable admitting to in public. But in the Netflix data, she might have also rated some movies that are sensitive or that she may not want others to know about. The intuition behind the join is that the space of all possible movies is intrinsically high-dimensional, so each person's movie record, especially when combined with the ratings, becomes almost like a signature for that person. That was the information used for this attack.

As a result of the attack, the researchers were able to associate identities with many of the records. Just to give an example: if there's a hypothetical user who we know has watched certain movies, that can reveal a lot about their political, religious, or other beliefs. So I hope, from these two examples, I have convinced you that it's very important to think of privacy, and to think of the ways in which attackers might use the data, ways you may not have imagined when you initially released the data set.

Let's go forward another decade. This was roughly a year back: a very interesting piece of work called Gender Shades, by two researchers who wanted to see how good face recognition software is for different groups of the population. Specifically, they wanted to understand whether the accuracy of face recognition is the same for men and women, and whether it is the same across skin tones. What they observed was that commercially available face recognition software had higher accuracy for men and for light-skinned individuals. In particular, when they took the combination, they observed a significant difference in accuracy between light-skinned men and dark-skinned women.

For example, they observed error rates for dark-skinned women as high as 20% to 34%. This, again, highlights that it's not enough for us as Machine Learning practitioners and builders of data-driven systems to just develop models; we also need to think about how accurate these models are for different populations.

Algorithmic Bias

If you take this message broadly, there have been lots of studies showing that when it comes to Machine Learning and AI systems, there are a lot of ethical challenges. This is partly because we know that there are inherent biases in different parts of society, and these biases can get reflected in the training data. More often than not, models are trained on data derived from human judgments. These can be explicit human judgments, or implicit judgments in the form of user feedback. The reality is that there are a lot of biases when it comes to such human judgments.

One of the challenges with AI and Machine Learning models is that they might reflect these biases and sometimes even amplify them. There is so much work and awareness around these issues of late that there are even popular books written on this topic. I would encourage all of you to take a look at some of these books, like "Weapons of Math Destruction," which go into such challenges across almost all applications that we can think of.

So if there is one message that I would love you to take away from this talk, it is that we should think of privacy, transparency, and fairness by design when we build products. Instead of leaving them as an afterthought, we should think of them upfront when we design AI and Machine Learning models. For the rest of the talk, I'm going to give you a very brief overview of what we do at LinkedIn and then describe a few case studies that highlight how we are approaching privacy and fairness at LinkedIn.

AI at LinkedIn

As many of you might know, LinkedIn has the vision of creating economic opportunity for every member of the global workforce. Our mission is to connect the world's professionals to make them more productive and successful. Towards this mission and vision, we have built what we call the Economic Graph. This is a graph which contains over half a billion LinkedIn members, around 26 million companies, 15 million jobs, 50,000 skills, and so forth, and all the connections and relationships between these different entities.

If you look at almost all applications on LinkedIn, Machine Learning or AI is one of the key underpinnings, because the power of LinkedIn, the essence of all its applications, is that they're data-driven, so it's no wonder that we are heavy users of AI within LinkedIn. For example, we have several petabytes of data being processed offline or nearline every day. Across all the different Machine Learning models, we have several billions of parameters which need to be learned periodically from the data. At any point in time, we have about 200 Machine Learning A/B experiments running, and so forth.

Given the message I was trying to convey earlier and the scale at which we operate, it is natural that we need to take dimensions like privacy and fairness seriously. In fact, my role at LinkedIn is to lead our efforts on fairness, privacy, and related aspects of transparency and explainability across all our products. I'm going to describe two case studies with respect to privacy.

Analytics & Reporting Products at LinkedIn

The first one is around how we address privacy when it comes to analytics at LinkedIn. The second one is about privacy in the context of a large crowdsourced platform that we built called LinkedIn Salary. Let's start with the first one: the analytics platform at LinkedIn. Whenever there is a product at LinkedIn, there is usually an associated analytics dimension. For example, for LinkedIn's members, we provide analytics on profile views. Similarly, when they post content, articles, or posts on LinkedIn, we provide analytics in terms of who viewed that content and what the demographic distribution of the viewers is in terms of titles, companies, and so forth. Similarly, when advertisers advertise on LinkedIn, we provide them analytics on which LinkedIn members viewed the ads, or clicked on the ads, and so forth.

For all of these, the analytics provide demographics of the members who engage with the product. One of the interesting aspects of analytics applications is that, unlike arbitrary database or SQL queries, the kinds of analytics applications I described admit queries of a very specific type. Quite often, this involves querying for the number of member actions in some setting, such as ads, for a specified time period, along with demographic breakdowns in terms of titles, companies, company sizes, industries, and so forth.

So we can abstract these types of queries into the following template: we want to select the number of actions from a table, for some statistic type and some entity, for a given time range, and for certain choices of demographic attributes. For example, we can think of the table as corresponding to the clicks on a given ad, and the attribute value could be that we want all the senior directors who clicked on this ad.
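As a rough illustration of this query template (this is not LinkedIn's actual schema or query engine; the table, identifiers, and values below are hypothetical):

```python
import pandas as pd

# Hypothetical event-level table of ad clicks.
clicks = pd.DataFrame({
    "ad_id": [42, 42, 42, 7],
    "title": ["Senior Director", "Senior Director", "Software Engineer", "Senior Director"],
    "date":  pd.to_datetime(["2018-09-03", "2018-09-10", "2018-09-11", "2018-10-02"]),
})

# Template: count of <action> for <entity>, over <time range>, sliced by a
# chosen demographic attribute value.
mask = (
    (clicks["ad_id"] == 42)
    & (clicks["date"].between("2018-09-01", "2018-09-30"))
    & (clicks["title"] == "Senior Director")
)
true_count = int(mask.sum())  # clicks on ad 42 by senior directors in September
print(true_count)             # -> 2
```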

You might wonder, since this is all provided in aggregate, what are the privacy concerns? We want to ensure that an attacker cannot infer whether a member performed some action from the reported analytics. For example, due to various regulations and our member-first policy, we don't want an advertiser to know which members clicked on an ad, and we don't want anyone to figure out which member clicked on an article.

For this, we assume that attackers might have a lot of auxiliary knowledge. For instance, they can have knowledge of various attributes associated with the target member, obtained from the public LinkedIn profile: the title of the member, which schools they went to, which companies they worked at, and so forth. We also make a strong assumption that the attacker might know about others who performed similar actions. This can be achieved in several ways. One way is for the attacker to create a small number of fake accounts that resemble the target member; the attacker has control over how these fake accounts respond to, say, the ads.

Possible Privacy Attacks

Let me give you an example of a potential privacy attack in this setting. Let's say that an advertiser wants to target all the senior directors in the U.S. who happen to have studied at Cornell. This criterion by itself matches several thousand LinkedIn members, and hence it will be over any minimum targeting threshold that we have. But the moment we consider the demographic breakdown by, say, company, that is, we look at all the members who meet this criterion but who are from a given company, this constraint may end up matching just one person.

Even though we targeted all the senior directors in the U.S. who went to Cornell, the moment we consider any one company, the number of such people might be just one. By revealing the statistics at the company level, it becomes possible to know whether that person clicked on the ad or not. So our initial thought was, what if we require minimum reporting thresholds, such that we don't report unless, say, at least 15 people clicked on the ad for any dimension?

But then the problem is that, as I mentioned before, the attacker can create fake profiles. Let's say the threshold is 10: the attacker can create 9 fake profiles and have all of them click on the ad. Now, if we reveal the statistics, the attacker knows that the one real member indeed clicked on the ad. In principle, this attack works even if the threshold is much larger; a larger threshold just makes it harder to execute.

Similarly, other mechanisms, such as rounding and reporting in increments of 10, don't work either. We can easily construct corner cases there as well: by observing the reported statistics over time, the attacker may be able to figure out who clicked. All this suggests that we need rigorous techniques to preserve member privacy, and as part of these techniques, we may not want to reveal the exact counts.

Key Product Desiderata

That is as far as privacy is concerned. At the same time, we should remember that the very goal of this product is to provide sufficient utility and coverage. Of course, we could achieve privacy by not showing any statistics, but that's not going to help the analytics product. So we want to provide analytics for as many demographic combinations and as many different types of actions as possible.

So that's one of the requirements from the product perspective. We also have requirements around the consistency of what we show. If we're computing and reporting the true counts, we don't have to worry about this, because if the same query is repeated over time, we will get the same answer: we're asking, what is the number of clicks on this ad for the month of September? That's not going to change, whether I issue the query now or in the future.

But the moment we decide not to reveal the true answer and instead obfuscate it or add noise, these kinds of requirements may not hold anymore. There are several other types of consistency that we might want as well, which I won't go into in the interest of time. To summarize, we can think of the problem as: how do we compute robust and reliable analytics in a manner that preserves member privacy and at the same time provides sufficient coverage, utility, and consistency for the product?

Differential Privacy

Here is where I'm going to take a segue into this notion called differential privacy. This is a notion that has been developed over roughly the last 10 to 15 years. Here, the model is that, let's say, there is a database shown on the left side in red. On the right side is an analyst who is interested in issuing some queries and learning the answers to those queries. In between the database and the analyst is a curator who decides how to respond to the questions of the analyst.

Let's, as a thought experiment, consider two worlds. One world is the original data. The other world is the same as the original data, except that it does not contain one person's data. Let's say that the blue world here contains my data, and the red world does not; everything else is the same. If we can guarantee that, from what we give to the analyst, the analyst cannot determine which of the two worlds we are in, that gives a very strong intuitive guarantee on my privacy. It means that irrespective of which world we are actually using, the analyst or the attacker cannot determine which one was used. This, in particular, means that the attacker gains very little information about my private data, which gives us, intuitively, a strong privacy guarantee.

There is a way to formalize this, and it is called differential privacy. Intuitively, it requires that if you take two databases which differ in just one individual or one record, the distribution of the curator's output should be about the same with respect to both. I'm not going to go deep into the mathematical definition, but you can think of it as a definition with a parameter, epsilon, which quantifies the information leakage. The smaller the epsilon, the better it is for the purposes of privacy.
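For reference, the standard definition being alluded to is: a randomized mechanism $M$ is $\epsilon$-differentially private if, for all pairs of databases $D$ and $D'$ differing in a single record, and for every set $S$ of possible outputs,

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\epsilon} \cdot \Pr[\,M(D') \in S\,]
```

The smaller the epsilon, the closer the two output distributions must be, and hence the less any single individual's data can influence what the analyst sees.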

This might all sound abstract because I'm talking about this thought experiment, which does not exist in reality. In reality, either my data is there in the database or it's not there in the database. But, again, remember that this is just for the purpose of understanding this definition. In practice, this definition can be achieved by adding a small amount of random noise. Formally, you can add noise from what is known as the Laplace distribution, such that the more sensitive the query is, the more noise you add. By sensitivity, we mean the extent to which the function changes if one person's record is added or removed from the data set.

Intuitively, if you think of something like counting, we know that if you remove just one person, the counts are going to change by at most one. It has very low sensitivity. Whereas for some other queries, such as aggregating all people's salaries, the sensitivity might be quite large, because one person earning a disproportionately large amount can affect the output of the function.
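As a minimal sketch of the Laplace mechanism for a count query (this is the generic textbook mechanism, not LinkedIn's implementation; the epsilon value below is arbitrary):

```python
from typing import Optional
import numpy as np

def laplace_count(true_count: int, epsilon: float, sensitivity: float = 1.0,
                  rng: Optional[np.random.Generator] = None) -> float:
    """Return a differentially private version of a count.

    For a counting query, adding or removing one person changes the result
    by at most 1, so sensitivity = 1. The noise scale is sensitivity/epsilon:
    more sensitive queries, or smaller epsilon, mean more noise.
    """
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

print(laplace_count(true_count=128, epsilon=0.5))  # e.g. 130.7 -- varies per call
```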

Again, the takeaway here is that there are ways to instantiate this definition in practice. Over the years, these techniques have been developed further and have evolved from the research domain into practice. In fact, it so happens that all of us, in some way or another, have used differential privacy. Differential privacy today is deployed as part of Google's Chrome browser. It's deployed as part of iOS from Apple. It's deployed as part of ad analytics at LinkedIn. We have all used differential privacy, perhaps without knowing that it was being applied.

PriPeARL: a Framework for Privacy-Preserving Analytics

Going back to our setting, the way we make use of differential privacy is as follows. First, as I said earlier, we want consistency: in particular, if the same query is issued again and again, we want to ensure that the results don't change. The way we do this is by generating pseudo-random noise. We don't generate fresh noise every time the query is asked. Instead, we determine the noise based on the query parameters. Specifically, we take all the parameters of the query, shown on the left side, compute a cryptographic hash of these parameters, normalize that into, say, a zero-one range, and then map it to the appropriate distribution.

In this manner, we can generate noise which looks random but is computed deterministically. Then we take the true count for the analytics and add this random-looking noise. The resulting noisy count is what we reveal to advertisers. As I mentioned, this satisfies the property that if you repeat the same query, you get the same result, which is good for consistency. It also prevents an averaging attack: you cannot keep issuing the same query again and again, average out the noise, and that way get a good sense of the true answer.
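Here is a minimal sketch of that idea, with hypothetical parameter names and a made-up secret key; the actual PriPeARL mechanism described in the paper differs in its details:

```python
import hashlib
import math

SECRET_KEY = b"hypothetical per-deployment secret"  # keeps the hash unpredictable to attackers

def pseudorandom_laplace_noise(query_params: dict, epsilon: float,
                               sensitivity: float = 1.0) -> float:
    """Random-looking but deterministic Laplace noise derived from the query.

    Because the same query parameters always yield the same noise, repeated
    queries return identical answers (consistency), and re-issuing the query
    to average the noise away does not work.
    """
    # 1. Canonicalize the query parameters and hash them with the secret key.
    canonical = "|".join(f"{k}={query_params[k]}" for k in sorted(query_params))
    digest = hashlib.sha256(SECRET_KEY + canonical.encode("utf-8")).digest()

    # 2. Map the hash to a uniform value strictly inside (0, 1).
    u = (int.from_bytes(digest[:8], "big") + 0.5) / 2**64

    # 3. Inverse-CDF transform: uniform (0, 1) -> Laplace(0, sensitivity/epsilon).
    scale = sensitivity / epsilon
    return -scale * math.copysign(1.0, u - 0.5) * math.log(1.0 - 2.0 * abs(u - 0.5))

def noisy_count(true_count: int, query_params: dict, epsilon: float) -> float:
    return true_count + pseudorandom_laplace_noise(query_params, epsilon)

params = {"entity": "ad:42", "action": "click",
          "time_range": "2018-09", "breakdown": "title=Senior Director"}
print(noisy_count(128, params, epsilon=0.5))  # identical output on every call
```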

There are also various other types of consistency that we have to handle, such as consistency for top queries, or consistency across time ranges, and so forth. I will refer you to our paper for more details, or I will be happy to talk offline after the talk. This is implemented as part of our analytics pipeline. In this architecture, the parts to the left are what we can think of as offline or nearline components. The various tracking events, such as impressions, clicks, and so forth, are directed to an offline analytics computation pipeline, which operates on, say, a daily basis, as well as an online or nearline analytics pipeline that covers the last few hours. The output of these is fed to Pinot, which is our scalable online analytics platform.

On the right side is what happens online. Let's say that, either through our analytics web interface or through the associated APIs, we get a query for the statistics about some ad campaign. As part of this query, we retrieve the true answer from the Pinot store. At the same time, we compute the noise in the manner I described earlier. Then the privacy mechanism takes both of these into account, applies various consistency checks, and returns the noisy count back to the advertiser. In this manner, we ensure that member privacy is preserved, and at the same time we can balance various conflicting requirements such as coverage and consistency of the data.

Lessons Learned from Deployment

This has been deployed within LinkedIn for more than one year, and we learned a lot of lessons during this process. For example, we learned about the need to balance semantic consistency against simply adding unbiased, unbounded noise. If you draw noise from such a distribution, it can sometimes be positive and sometimes negative. Let's say the true count is five clicks, and our noise happens to be negative seven. We don't want to report five plus negative seven, that is, negative two, as the answer. That doesn't look intuitive, and people might think there is a bug in the system. So we had to balance adding unbiased noise against ensuring that the results look consistent and meaningful from the product perspective.

There are a few other lessons we learned as well. For example, even though for the purposes of privacy it's enough to reveal such noisy counts, we still suppress counts that are very small. Because we're adding noise, small counts may not be meaningful anymore, and in the first place we don't want anyone to make inferences based on a very small number of clicks; those are not statistically meaningful. So once we add such noise, we prefer not to show such small counts. This is something we added to the product.
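A small sketch of this kind of post-processing, with a hypothetical suppression threshold; post-processing a differentially private output in this way cannot weaken the privacy guarantee:

```python
from typing import Optional

MIN_DISPLAY_COUNT = 10  # hypothetical suppression threshold

def to_display_count(noisy_count: float) -> Optional[int]:
    """Turn a raw noisy count into something presentable in the product."""
    # Never show a negative count: clamp at zero, then round to an integer.
    display = max(0, round(noisy_count))
    # Suppress very small counts -- they are dominated by noise, and we do not
    # want inferences drawn from a handful of clicks in the first place.
    if display < MIN_DISPLAY_COUNT:
        return None  # rendered as "not enough data" in the UI
    return display

print(to_display_count(-2.0))   # None
print(to_display_count(127.6))  # 128
```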

The final, most important lesson for us was that in scaling this to various applications, we had to not only make the tools available, but also abstract out the relevant components and recognize that they are not specific to, say, ad analytics, but applicable across various other analytics applications as well. We not only built the tools, but also wrote how-to tutorials and so forth, which increases the chances of such tools being adopted. I think most of you can relate to this: if you build something initially for one vertical application and then realize that it's horizontally applicable, it's important to not only have the tools, but also have sufficient documentation, make the case for why this matters, and make the tools easy to adopt, so that they can scale to various applications.

In summary, in this framework we address various challenges such as privacy of users, product coverage, data consistency, and so forth. There are also a lot of interesting open challenges here, such as: instead of adding the same amount of noise to, say, impressions and clicks, can we trade off in a different manner? Can we think of this as an optimization problem, where we might want to add more noise to impressions and less noise to clicks? So there are a lot of interesting challenges in this setting. Let me acknowledge that this is a joint effort with lots of people from several different teams.

Let me give a quick flavor of the second application, which is the LinkedIn Salary tool. This is a tool where we provide insights into how salary is distributed across different occupations and regions. For example, from this product we can learn about the salaries of, say, user experience designers in the Bay Area: the median salary, the 10th and 90th percentiles, the various types of compensation, the distribution as a histogram of what we have collected, how the salary varies by title, by region, and so forth. This is all based on massive crowdsourcing, where we collect salaries from LinkedIn members with their consent.

This is how we collect salary data from LinkedIn members. For example, Charlotte here has the option to provide her base salary as well as various other types of compensation. Once a member submits their salary, and once we have enough data, we get back to them with the insights for their corresponding profession. Today, this product has launched in several countries, and it makes use of salary data points obtained from several million LinkedIn members.

Data Privacy Challenges

What I want to highlight today, though, is that there were a lot of privacy challenges in how we designed this product. Salary is one of the most sensitive attributes we have on the LinkedIn platform, and we had two related goals when designing this product. We wanted to minimize the risk of inferring any one individual's compensation from these aggregated insights. The second, related goal was to protect against worst-case scenarios, such as a data breach; we wanted to ensure that there is no single point of failure in our system.

These goals were achieved by a combination of several techniques such as encryption, access control, de-identification, aggregation, and so forth. I would refer you to the paper and the associated talk slides for more details, but let me give you an example to illustrate how this is done at a very high level. Let's say that Charlotte provided her salary as a user experience designer at Google. We then associate the key relevant attributes of the member with the salary she provided: her title, region, company, years of experience, and so forth, along with the salary, which is, let's say, 100K. Then we form cohorts from this original data. These cohorts can be thought of as, say, the salaries of all user experience designers in the San Francisco Bay Area, or of UX designers in the internet industry, or of UX designers who work at Google in the Bay Area with a certain number of years of experience, and so forth.

At first, it might seem that it's enough to do this and make the data available for offline analysis, but it turns out that this may not be the case. If you take queries such as [inaudible 00:35:28] at Slack, we know that at any point in time there is just one person with that title at the company. From such de-identified data, we might still be able to infer that person's identity and their salary.

Because of that, we require that there be at least a minimum number of data points in each cohort before that data is even available to us as Machine Learning researchers to process offline. This is the data we use to perform the modeling and compute the statistical insights, which are then displayed back in the product. Again, there is a very detailed architecture and system design involving different encryption and decryption steps, ensuring that there is no single point of failure, and so forth. In the interest of time, I'm not going to discuss the specifics.
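A minimal sketch of the cohort-level thresholding just described, with hypothetical attribute names and an invented minimum cohort size; the actual system described in the paper is considerably more elaborate:

```python
from collections import defaultdict

K_MIN = 10  # hypothetical minimum number of submissions per cohort

def releasable_cohorts(submissions):
    """Group de-identified salary submissions into overlapping cohorts and
    keep only cohorts with at least K_MIN data points for offline modeling."""
    cohorts = defaultdict(list)
    for s in submissions:
        # Several cohorts per submission, from more specific to broader.
        for key in [
            ("title-region-company", s["title"], s["region"], s["company"]),
            ("title-region-industry", s["title"], s["region"], s["industry"]),
            ("title-region", s["title"], s["region"]),
        ]:
            cohorts[key].append(s["salary"])
    return {key: salaries for key, salaries in cohorts.items()
            if len(salaries) >= K_MIN}

# Example submission shape (de-identified: no member id attached):
# {"title": "UX Designer", "region": "SF Bay Area", "company": "Google",
#  "industry": "Internet", "salary": 100_000}
```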

Fairness in AI at LinkedIn

Let me jump to the last part of the talk, which is about how we look at fairness in Machine Learning systems. Here, our guiding principle has been what we call "Diversity by Design." By this, we mean that we are not creating a separate product for diversity; we are integrating diversity into our existing products. You might ask, "Why diversity?" A lot of research has shown that diversity is tied to a company's culture as well as its financial performance. In LinkedIn's own studies, we have found that around 80% of LinkedIn customers want diversity to improve their company culture, and around 60% think it's correlated with financial performance. There are also studies in academia as well as by [inaudible 00:37:37] which show this correlation between diversity and financial performance.

At LinkedIn, the notion of diversity comes up in various settings. For example, if you look at the talent products we have at LinkedIn, there are several different ways in which people might use them, and we have integrated diversity into three stages. The first is planning. This is when a company wants to decide which skills to look for, which regions have people with those skills, how diversity looks across the industry, how diversity looks at the current company in different disciplines, and so forth. The second stage is the actual sourcing and hiring of candidates. The third one is, once we have employees, how do we ensure that we train them to avoid unconscious bias, and so forth.

LinkedIn recently announced products across all three stages of this talent pipeline. I'm going to mostly focus on the second part, but let me give you a feel for the first part. This is the planning stage, and LinkedIn recently launched a product called LinkedIn Talent Insights. With this product, we can not only learn the skill distribution across different regions or professions, but we have also integrated diversity insights. Specifically, LinkedIn customers can understand the gender representation of their workforce, how it compares with peer companies in the industry, and which sub-teams have the biggest room for improvement. These insights can help a company set realistic goals with respect to diversity and decide how to prioritize various diversity efforts.

Let's say that, through this tool, a company has determined that, for certain skills or titles, this is the available diversity in the talent supply pool. The natural next question is, how can the recruiters or hiring managers at the company reach out to this talent? That is where our LinkedIn Recruiter tool comes in. This is a tool that allows a recruiter to search for different skills, titles, locations, and so forth, and get a list of candidates who are good matches for the search query. Our goal is to maximize the mutual interest between the recruiter and the candidate.

What we have done here is ensure that these results are representative of the underlying talent pool. By this, we mean that for a query such as UX designers, product designers, or interaction designers, let's say the underlying talent pool is about 40% women and 60% men. We are ensuring that even the top results, the top page of results, reflect the distribution present in the underlying talent pool. You might ask, "What is the intuition for such representativeness?" The intuition is that, ideally, we would like a similar distribution of, say, gender, age, or other attributes across both the top-ranked results and all the candidates that qualify for that search request.

This, in particular, means that we would like the same proportion of members with any given attribute value across both these sets. This is grounded in a notion called equal opportunity, and there are very interesting papers around it, both from computer science and from the legal profession. That is our underlying intuition; how do we convert it into an algorithm?

This is done in two steps. First, as part of our algorithm, which re-ranks the results to ensure representativeness, we determine the desired proportions with respect to an attribute of interest, such as gender, for a given search request. Second, we compute a fairness-aware ranking of a given size.

Let's go into the first part. It is based on the intuition I mentioned earlier. For a given search request, we retrieve all the members that meet the criteria; we call that the set of members that qualify for the search request. From these members, we determine the distribution with respect to the attribute of interest, in this case gender. That's how we get the desired proportions for the attribute of interest. Let me just remark that there may be other ways of getting such desired distributions; for example, they could be based on certain legal mandates or even voluntary commitments by companies.

Once we have the desired distribution, the algorithm works as follows. We first get the set of potential candidates that match the query and partition them into different buckets. In the case of gender, for example, we partition them into a bucket for candidates whose gender has been inferred as male, one for female, and one for the cases where we were not able to infer the gender. Within each bucket, we rank the candidates based on the score from the underlying Machine Learning model. Then we merge these ranked lists in a way that ensures the representativeness requirements.

When we perform the merging, we ensure that at any position, whether it's the top 10, top 25, top 50, and so forth, the distribution of candidates represents the distribution in the underlying talent pool, while at the same time ensuring, as much as possible, that the highest-scored candidates are selected. There is a detailed architecture and design for how we achieve this as part of our recruiter search pipeline; in the interest of time, I'm going to skip it and refer you to the engineering blog post we published a few weeks back.
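A simplified sketch of the two steps just described: computing the desired proportions from the set of qualified candidates, then greedily merging per-bucket rankings so that every prefix of the result roughly tracks those proportions. The real algorithm in the engineering blog post is more involved; the attribute and field names here are hypothetical, and candidates are assumed to be dicts with "gender" and "score" keys.

```python
from collections import Counter
import math

def desired_proportions(qualified):
    """Step 1: attribute distribution over everyone who matches the query."""
    counts = Counter(c["gender"] for c in qualified)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def fairness_aware_rerank(candidates, proportions, k):
    """Step 2: merge per-attribute ranked lists so each prefix of the result
    approximately matches the desired proportions, preferring higher-scored
    candidates whenever the representativeness constraint allows it."""
    buckets = {}
    for c in sorted(candidates, key=lambda c: c["score"], reverse=True):
        buckets.setdefault(c["gender"], []).append(c)  # each bucket stays sorted by score

    result, shown = [], Counter()
    while len(result) < k and any(buckets.values()):
        position = len(result) + 1
        # How far behind its target is each bucket at this position?
        def deficit(g):
            return math.floor(proportions.get(g, 0.0) * position) - shown[g]
        # Prefer the bucket most behind its target; break ties by candidate score.
        g = max((g for g in buckets if buckets[g]),
                key=lambda g: (deficit(g), buckets[g][0]["score"]))
        result.append(buckets[g].pop(0))
        shown[g] += 1
    return result
```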

As a result of this, we ensure that over 95% of all searches on LinkedIn Recruiter are representative of the underlying talent pool. We also performed A/B tests to see whether there is any impact on various business metrics. The interesting aspect is that we were able to achieve this representativeness without any impact on the business metrics: we observed no significant change in metrics such as whether candidates respond to messages or emails from recruiters. This approach has now been ramped to all users of the LinkedIn Recruiter tool worldwide.

Reflections

Let me conclude the talk with two takeaways. The first, as I mentioned earlier, is that we need to think of privacy and fairness from the beginning; we need to think of them as part of our design, rather than as an afterthought. I hope that, from the few case studies I described, I have conveyed a sense of the challenges associated with ensuring this.

In particular, one of the challenges is that privacy and fairness are not just technical problems. These are not problems that, say, software engineers or computer scientists can go and solve on their own. They have a socio-technical dimension, which means that we have to collaborate and reach consensus with all the stakeholders. This includes the product teams, the legal teams, PR, engineering, and the AI teams, as well as, in our case, reaching out to LinkedIn members and customers and building consensus among all the stakeholders. That is very, very important when we consider dimensions like fairness, transparency, and privacy. With that, let me conclude. Here are some references with more details on this topic.

 


 

Recorded at:

Mar 31, 2019
