QCon New York: Evaluating Machine Learning Models - A Case Study in Real Estate

| by Srini Penchikala on Jul 04, 2017. Estimated reading time: 1 minute |

Opendoor, a real estate company that helps customers buy and sell homes, uses machine learning to drive its pricing models. Nelson Ray, data scientist at Opendoor, spoke at the QCon New York 2017 conference about how the team developed a simulation-based framework for reasoning about machine learning models to assess the risk in reselling homes. Because Opendoor takes on the risk of reselling the homes, the team needs to understand the effectiveness of different hazard-based liquidity models.

Ray started the discussion with real estate industry statistics about homeowners moving to a new home. American homes represent $25 trillion in assets, with $1.4 trillion in annual transaction volume and $100 billion in fees. With 5.5 million Americans buying and selling homes each year, moving is the number one consumer expenditure, with an average of $17,798 per year.

He talked about the problems with traditional unguided A/B testing and how simulation-based inference can help in real estate use cases. A typical A/B test of a real estate liquidity model can run for several months before resale outcomes are fully realized. Simulation-based testing is a multi-step approach to evaluating models against critical business metrics. It offers a shorter time to assessment results (seconds versus months) and low cost (the only cost is computational).

They generate 3D models to assess the liquidity of homes based on house economics metrics like conversion, profit and fees. The model uses data such as historical home sale transactions and house listings on the market, and simulates the buying process to estimate the costs and observe the actual outcome for a house.

He also talked about the pyramid of causal inference, which includes the following elements:

  • Observational Analysis
  • Simulation-Based Inference
  • Quasi-experiments
  • A/B Test

Ray discussed the recipe for guided testing to generate simulated home offers. It includes the data generation process (past house sale transaction data) and the user model, which simulates their home buying process: P(sell | cost). He suggested that everyone run simulations before A/B testing to make the testing effort efficient and effective.
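
The recipe can be illustrated with a small sketch. This is not Opendoor's framework: the logistic form of P(sell | cost), the `midpoint` and `steepness` parameters, the synthetic price distribution and the function names are all assumptions made for illustration. It only shows the general idea of replaying candidate fee levels against a user model in seconds rather than waiting months for an A/B test.

```python
import math
import random


def p_sell(cost_fraction, midpoint=0.07, steepness=80.0):
    """Hypothetical user model P(sell | cost): a logistic curve that
    decreases as the fee (a fraction of the home price) increases."""
    return 1.0 / (1.0 + math.exp(steepness * (cost_fraction - midpoint)))


def simulate_policy(home_prices, fee_fraction, rng):
    """Simulate offers at one fee level over a set of homes.

    Returns (conversion rate, mean fee revenue per offer made).
    """
    accepted = 0
    revenue = 0.0
    for price in home_prices:
        # Draw whether this simulated seller accepts the offer.
        if rng.random() < p_sell(fee_fraction):
            accepted += 1
            revenue += fee_fraction * price
    n = len(home_prices)
    return accepted / n, revenue / n


# Synthetic stand-in for the data generation process; a real run would
# use past house sale transaction data instead.
price_rng = random.Random(0)
home_prices = [price_rng.lognormvariate(12.4, 0.3) for _ in range(10_000)]

for fee in (0.05, 0.07, 0.09):
    conversion, revenue = simulate_policy(home_prices, fee, random.Random(1))
    print(f"fee={fee:.0%} conversion={conversion:.2f} revenue per offer={revenue:,.0f}")
```

Comparing conversion and revenue across the simulated fee levels gives a quick way to discard weak candidates, so that only the promising ones go on to a real A/B test.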

