
Fighting Financial Fraud with Machine Learning at Airbnb

Airbnb, the online marketplace that matches people who rent out their homes with people who are looking for a place to stay, uses machine learning (ML) techniques to fight financial fraud. They use "targeted friction" to combat chargebacks while minimizing negative consequences for genuine guests using their online reservation system.

Fraud detection is critical for the Airbnb team because nearly two million people stay in Airbnb-listed properties in 191 countries around the world on any given night. This means the rapid growth of their global community is predicated on trust. Their approach to fighting fraud consists of both proactive measures and reactive support. Proactive measures are applied before the transaction, often in the background, leveraging machine learning, experimentation, and analytics to prevent fraudsters from using stolen credit cards on the website.

David Press, trust data scientist at Airbnb, wrote about how they leverage machine learning techniques to identify and block fraudsters while minimizing impact on good users.

Chargebacks are the main focus of their fraud detection program. Chargebacks, which are common in online businesses, are transactions that unauthorized users charge to stolen credit cards. When the actual cardholder realizes their card has been stolen and notices unauthorized charges on their bill, the credit card company refunds the cardholder and the merchant (Airbnb) must return the money. Unlike some other companies, Airbnb absorbs the full cost of these chargebacks and doesn't pass the financial liability on to hosts, so they actively work to block stolen credit cards from being used in the first place in order to better protect the user community and reduce their own exposure to chargeback costs.

Transactions are sometimes blocked outright, but in most situations Airbnb gives the user the opportunity to satisfy an additional verification called a "friction". A friction is something that blocks an unauthorized user, but is easy for a good user to satisfy. The different types of frictions include micro-authorization (placing two small authorizations on the credit card, which the cardholder must identify by logging into their online banking statement), 3-D Secure (which allows credit card companies to directly authenticate cardholders via a password or SMS challenge), and billing-statement verification (requiring the cardholder to upload a copy of the billing statement associated with the card).

Press discussed how they use machine-learning models to trigger frictions targeted at blocking fraudsters. He also outlined how they choose the ML model's threshold by minimizing a loss function for three different scenarios: false positives, false negatives, and true positives.

They detect fraudulent transactions using machine-learning models trained on past examples of confirmed good and confirmed fraudulent behavior. As with any classification model, they have to account for the different possible outcomes: false positives, false negatives, and true positives (see the sketch after the list below).

  • False positives are "good" events that the model classifies as fraudulent (scored above the threshold).
  • False negatives are fraudulent events that the model scores below the threshold.
  • True positives are fraudulent events that the model correctly scores above the threshold.
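As an illustration, here is a minimal sketch (not Airbnb's actual code) of how a scored booking falls into one of these buckets, assuming a hypothetical fraud score between 0 and 1 and a chosen threshold:

```python
def classify_outcome(fraud_score: float, is_fraud: bool, threshold: float) -> str:
    """Bucket a scored booking into TP/FP/FN/TN relative to a model threshold.

    fraud_score: hypothetical model output in [0, 1]; higher means more likely fraud.
    is_fraud:    ground-truth label (e.g. later confirmed via a chargeback).
    threshold:   score at or above which a friction (or block) is triggered.
    """
    flagged = fraud_score >= threshold
    if flagged and is_fraud:
        return "true_positive"    # fraudster correctly challenged
    if flagged and not is_fraud:
        return "false_positive"   # good user incorrectly challenged
    if not flagged and is_fraud:
        return "false_negative"   # fraud slips through, likely ending in a chargeback
    return "true_negative"        # good user left alone
```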

Press also wrote about the cost of each of these scenarios. If they incorrectly apply a friction to a good booking (a false positive), they incur a cost because there is a chance the good user will not complete the friction, abandon the transaction, and then stop using Airbnb.

For false negatives, the total loss is calculated by multiplying the number of false negatives by the cost of each fraud event: FN * C. Because Airbnb absorbs all costs associated with chargebacks, that cost is the full amount of the payment made by the fraudster, plus an overhead factor associated with processor fees and increased card decline rates.
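The post does not spell out the exact loss function, but a simplified sketch of how these per-scenario costs might be tallied could look like the following, where `chargeback_cost` stands in for C (the fraudulent payment plus overhead) and `friction_attrition_cost` is an assumed per-event estimate of the revenue lost when a good user abandons a friction:

```python
def total_loss(num_false_negatives: int,
               num_false_positives: int,
               chargeback_cost: float,
               friction_attrition_cost: float) -> float:
    """Simplified loss: FN * C (absorbed chargebacks) plus an assumed
    attrition cost for each good user who hits a friction (FP)."""
    fraud_loss = num_false_negatives * chargeback_cost              # FN * C from the article
    friction_loss = num_false_positives * friction_attrition_cost   # assumed false-positive cost model
    return fraud_loss + friction_loss
```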

Finally, for true positive transactions, applying the friction achieves the intended goal of preventing that specific unauthorized user from using Airbnb.

The machine learning model threshold is optimized by training the chargeback model on positive (fraud) and negative (non-fraud) examples from past bookings. Because fraud is extremely rare, this is an imbalanced classification problem with scarce positive labels. They characterize their model's performance at identifying fraudulent versus good bookings at various thresholds in terms of the true-positive rate and false-positive rate, then evaluate the total cost associated with each threshold using a loss function that depends on those rates.
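As a rough sketch of this kind of threshold search (not Airbnb's actual pipeline), one could sweep the operating points returned by scikit-learn's roc_curve and evaluate an assumed cost at each candidate threshold; the cost constants here are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_curve


def pick_threshold(y_true, y_score,
                   chargeback_cost: float,
                   friction_attrition_cost: float) -> float:
    """Sweep candidate thresholds and return the one with the lowest assumed total cost.

    y_true:  1 for confirmed fraud, 0 for confirmed good bookings (heavily imbalanced).
    y_score: the model's fraud scores for the same bookings.
    """
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    n_fraud = int(np.sum(y_true))        # positive examples (rare)
    n_good = len(y_true) - n_fraud       # negative examples (plentiful)

    # Expected counts at each operating point: missed fraud (FN) and challenged good users (FP).
    false_negatives = (1 - tpr) * n_fraud
    false_positives = fpr * n_good
    cost = false_negatives * chargeback_cost + false_positives * friction_attrition_cost

    return float(thresholds[np.argmin(cost)])
```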

Airbnb also runs an A/B test using their Experiment Reporting Framework in order to measure the impact of each friction on good users. They assign users with low model scores (who are very unlikely to be fraudsters) to the experiment at the same stage in the funnel where the friction would be applied to suspected fraudsters.
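The post doesn't detail the assignment logic, but a simplified sketch of how low-score users might be bucketed into such an experiment at that point in the funnel could look like this (the cutoff value and function names are hypothetical):

```python
import random

LOW_RISK_CUTOFF = 0.05  # assumed score below which users are considered very unlikely to be fraudsters


def assign_friction_experiment(fraud_score: float, rng: random.Random) -> str:
    """Assign a low-risk user to treatment (friction shown) or control (no friction)
    at the same funnel stage where suspected fraudsters would be challenged."""
    if fraud_score >= LOW_RISK_CUTOFF:
        return "not_in_experiment"   # riskier users go through the real fraud flow, not the test
    return "treatment" if rng.random() < 0.5 else "control"
```

Comparing booking completion between the treatment and control groups then estimates how much a given friction deters good users.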

In the post, Press also explained a numerical example comparing the optimization of blocking transactions versus applying a friction.
