1
DATA SCIENCE AT ZILLOW
Imri Sofer, Senior Data Scientist
2
Zillow Group’s mission
TO BUILD THE WORLD’S LARGEST, MOST TRUSTED AND
VIBRANT HOME-RELATED MARKETPLACE
3
Zillow’s marketplace
PROPERTY
MANAGERS
& LANDLORDS
BUYERS &
SELLERS
RENTERS
HOMEOWNERS
REAL ESTATE AGENTS
MORTGAGE PROVIDERS
4
For buyers
5
For buyers
6
For renters
7
Zillow Group’s two-step business model:
1. Make amazing products to
attract users
2. Professionals pay to showcase
themselves.
8
For buyers
9
The largest portfolio of real estate brands
CONSUMER BRANDS
BUSINESS BRANDS
10
Zillow Group’s audience continues to grow
MONTHLY UNIQUE USERS
Quarterly average (Millions)
[Chart: quarterly average of monthly unique users; y-axis from 0 to 180 million]
Seasonal peak of 171M unique visitors in May 2016
11
Why is data science
important to Zillow?
Because Zillow is data
12
Zillow is data
- Our product is driven by data
  - The largest, most comprehensive housing data (breadth and depth).
  - Over 65 million homes have been updated by users.
- Our product generates data
  - 2MM reviews of agents.
  - More than 300,000 lender reviews.
  - 1TB of user activity every day.
- Data is our product
  - Users come to Zillow because they trust our housing data.
  - Users want to find a trusted agent and a lender that provide great rates and
    services.
  - We provide data for free to academic and institutional researchers.
  - Zillow.com/data – free consumer data (the Zillow Home Value Index is available at
    a monthly frequency, from the nation through states down to neighborhoods).
13
Data Science and Engineering at Zillow
Clam Bake Beach Day, Aug 2016, at Golden Gardens Park in Seattle, WA
14
Machine Learning at Zillow
Home Valuation
• Zestimate
• Zestimate Forecast
• Zillow Home Value Index
• Rent Zestimate
• Zillow Rent Index
• Pricing Tool
• Best Time to List
B2B
• Ad Campaigns
• Agent segmentation
• Search Engine Marketing (SEM)
Computer Vision
• Videos
• Photos
User Profiles
• Persona Predictions
• Journey location prediction
• Lender Recommendations
Recommendations
• Home recommendation
• Similar homes
• New regions to explore
• Explain recommendations
15
Machine Learning at Zillow
• Example page
Home Valuation
• Zestimate
• Zestimate Forecast
• Rent Zestimate
• Pricing Tool
• Best Time to List
• Zillow Home Value Index
• Zillow Rent Index
example page
16
Zestimate
Goals:
• High Accuracy
• Low Bias
• Independent
• Stable over time.
• Robust to outliers.
• High coverage (Over 100
million homes currently)
• Able to respond to user fact
changes
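The first two goals can be made measurable. A minimal sketch, assuming accuracy is tracked as the median absolute percentage error and bias as the median signed percentage error against actual sale prices; the slide does not say which metrics are used, and the function names and prices below are hypothetical.

```python
from statistics import median

def percent_errors(estimates, sale_prices):
    """Signed error of each estimate, as a fraction of the actual sale price."""
    return [(est - price) / price for est, price in zip(estimates, sale_prices)]

def median_abs_pct_error(estimates, sale_prices):
    """Accuracy: the typical size of the error, ignoring its direction."""
    return median(abs(e) for e in percent_errors(estimates, sale_prices))

def median_pct_bias(estimates, sale_prices):
    """Bias: positive when estimates tend to be too high, negative when too low."""
    return median(percent_errors(estimates, sale_prices))

# Hypothetical sale prices and model estimates, for illustration only.
sale_prices = [300_000, 450_000, 200_000, 800_000]
estimates = [310_000, 440_000, 230_000, 790_000]

print(f"median absolute error: {median_abs_pct_error(estimates, sale_prices):.1%}")
print(f"median bias: {median_pct_bias(estimates, sale_prices):+.1%}")
```

Medians are used here rather than means so that a single badly mispriced home does not dominate either number, which fits the "robust to outliers" goal.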
17
Challenges with the Zestimate
• Some listings are missing features: How do we deal with missing data?
• Some listings have corrupted features (e.g. 28 bathrooms): How do we
identify those?
• Some sale prices do not reflect the value of the home (e.g. a parent
sells to their child): How do we deal with outliers?
• Feature engineering: How can we translate previous sales to
meaningful features?
• How do we identify the places where the model needs to be improved?
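The first three challenges can be illustrated with a simple cleaning pass. This is a sketch under stated assumptions, not the Zestimate pipeline: the bathroom cap, the price-per-square-foot bounds, the median imputation, and the sample records are all hypothetical choices made for the example.

```python
from statistics import median

MAX_PLAUSIBLE_BATHROOMS = 10     # assumed cap; larger values are treated as corrupted
PRICE_RATIO_BOUNDS = (0.2, 5.0)  # assumed bounds relative to the median price per sqft

# Hypothetical sale records, for illustration only.
homes = [
    {"id": 1, "sqft": 1500, "bathrooms": 2, "sale_price": 400_000},
    {"id": 2, "sqft": None, "bathrooms": 1, "sale_price": 250_000},  # missing feature
    {"id": 3, "sqft": 2000, "bathrooms": 28, "sale_price": 520_000}, # corrupted feature
    {"id": 4, "sqft": 1800, "bathrooms": 2, "sale_price": 1},        # non-market sale
    {"id": 5, "sqft": 1700, "bathrooms": 3, "sale_price": 450_000},
]

def clean_sales(homes):
    rows = [dict(h) for h in homes]  # work on copies, leave the input untouched

    # Corrupted features: treat implausible values as missing.
    for row in rows:
        if row["bathrooms"] is not None and row["bathrooms"] > MAX_PLAUSIBLE_BATHROOMS:
            row["bathrooms"] = None

    # Missing data: fill each gap with the median of the observed values.
    for field in ("sqft", "bathrooms"):
        fill = median(r[field] for r in rows if r[field] is not None)
        for row in rows:
            if row[field] is None:
                row[field] = fill

    # Outliers: drop sales whose price per sqft is far from the typical value.
    price_per_sqft = [r["sale_price"] / r["sqft"] for r in rows]
    typical = median(price_per_sqft)
    low, high = PRICE_RATIO_BOUNDS
    return [r for r, p in zip(rows, price_per_sqft)
            if low * typical <= p <= high * typical]

cleaned = clean_sales(homes)
print([r["id"] for r in cleaned])
```

In practice the thresholds would likely vary by region and property type; fixed global constants are used here only to keep the sketch short.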
18
Machine Learning at Zillow
Home Valuation
• Zestimate
• Zestimate Forecast
• Zillow Home Value Index
• Rent Zestimate
• Zillow Rent Index
• Pricing Tool
• Best Time to List
Computer Vision
• Videos
• Photos
19
Computer Vision at Zillow
• Images and videos play a big role in helping people buy/rent
homes
• Recent deep-learning advancements for CV
20
Let Zillow See
• As of now, our Zestimates are mainly based on the
location and size of the properties; they do not
consider quality.
• Tax assessments might carry house quality
information to some extent, but that’s not
enough.
• For example, an interior upgrade would not change the
tax assessment in most cases, if not all.
21
• We train a deep convolutional neural network (CNN) to estimate
quality.
Deep Convolutional Neural
Network
Zestimate
22
Image quality scores (prediction)
[0-3] [3-7] [7-10]
23
Machine Learning at Zillow
Home Valuation
• Zestimate
• Zestimate Forecast
• Zillow Home Value Index
• Rent Zestimate
• Zillow Rent Index
• Pricing Tool
• Best Time to List
Computer Vision
• Videos
• Photos
Recommendations
• Home recommendation
• Similar homes
• New regions to explore
• Explain recommendations
24
Recommending movies
25
Home Recommendations
• Our goal is to show users the homes that are relevant to them.
Email
When viewing a home
Ranking search results
26
Email Recommendation
• Goal: Take past user activity and generate relevant recommendations
for new and existing listings.
• Challenges:
• How do we transform user activity into a vector of features?
• What do we want to optimize for? Clicks? Dwell time? Saves?
• What should we do when users don’t have a browsing history (cold start)?
• How can we scale the model to rank 2.5MM homes for 50M buyers? Most
recommendation algorithms are not built for this problem (Netflix has 5000
movies in its catalog)
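To see why scale is a challenge: scoring every home for every buyer would take on the order of 2.5 million × 50 million = 1.25 × 10^14 model evaluations. One common way to reduce this, sketched below as an assumption rather than as Zillow's method, is to generate a small candidate set per user (here, listings in regions the user has browsed) and rank only those. The `region` and `score` fields and all values are hypothetical.

```python
import heapq
from collections import defaultdict

# Hypothetical listings with a precomputed relevance score, for illustration only.
listings = {
    1: {"region": "A", "score": 0.20},
    2: {"region": "A", "score": 0.90},
    3: {"region": "B", "score": 0.99},
    4: {"region": "A", "score": 0.50},
}

def index_by_region(listings):
    """Build the lookup once so each user only touches nearby listings."""
    by_region = defaultdict(list)
    for listing_id, info in listings.items():
        by_region[info["region"]].append(listing_id)
    return by_region

def recommend(user_regions, by_region, score, k=2):
    """Rank only the candidates from the user's regions and keep the top k."""
    candidates = [l for region in user_regions for l in by_region.get(region, [])]
    return heapq.nlargest(k, candidates, key=score)

by_region = index_by_region(listings)
top = recommend(["A"], by_region, score=lambda l: listings[l]["score"])
print(top)
```

Listing 3 has the highest score but is never evaluated for this user because it lies outside the browsed region; that trade-off between coverage and cost is the design choice being illustrated.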
27
user_id  listing_id  like
12       5           1
12       34          0
12       567         1
144      5           0
144      34          0
1550     567         1
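The rows above are the raw form of a user-item matrix. A minimal sketch of building that matrix in sparse form, storing only observed pairs, follows; the dictionary-of-dictionaries layout and the helper names are assumptions made for the example.

```python
from collections import defaultdict

# (user_id, listing_id, like) triples copied from the slide.
interactions = [
    (12, 5, 1), (12, 34, 0), (12, 567, 1),
    (144, 5, 0), (144, 34, 0),
    (1550, 567, 1),
]

def build_user_item(interactions):
    """Map each user to {listing_id: like}; unobserved pairs are simply absent."""
    matrix = defaultdict(dict)
    for user_id, listing_id, like in interactions:
        matrix[user_id][listing_id] = like
    return matrix

def density(matrix):
    """Fraction of all user-listing cells that have an observed label."""
    items = {listing for row in matrix.values() for listing in row}
    observed = sum(len(row) for row in matrix.values())
    return observed / (len(matrix) * len(items))

matrix = build_user_item(interactions)
print(dict(matrix))
print(f"density: {density(matrix):.2f}")
```

With millions of users and listings the density would be a tiny fraction, which is why a sparse representation is the natural choice.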
28
Traditional User-Item matrix
Users
Traditional
Items
29
Zillow’s User-Item matrix
Users
Zillow
Items
30
How can we generate meaningful features?
Date        user_id  listing_id  f1    f2   ...  f50  like
2017-01-02  12       5           0.89  0.3  ...  0.6  0
2017-01-09  12       34          0.90  0.1  ...  0.1  0
2017-01-29  12       567         0.82  0.8  ...  0.1  1
2017-01-02  144      5           0.19  0.9  ...  0.9  0
2017-02-20  144      34          0.40  0.3  ...  0.8  0
2017-02-03  1550     567         0.99  0.9  ...  0.8  1
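The slide does not define f1 through f50. As an illustration of what such features could look like, the sketch below derives two hypothetical ones for a (user, listing) pair from the user's viewing history: how the listing's price compares with the homes the user has viewed, and how often the user has browsed the listing's ZIP code. The listing attributes, view logs, and feature names are all assumptions.

```python
from statistics import median

# Hypothetical listing attributes and per-user view history, for illustration only.
listings = {
    5: {"price": 400_000, "zip": "98101"},
    34: {"price": 900_000, "zip": "98004"},
    567: {"price": 420_000, "zip": "98101"},
}
views = {12: [5, 567], 144: [34]}

def pair_features(user_id, listing_id, views, listings):
    """Turn a user's history into numeric features for one candidate listing."""
    # Exclude the candidate itself so the features do not leak the label.
    history = [listings[l] for l in views.get(user_id, []) if l != listing_id]
    candidate = listings[listing_id]
    if not history:
        # Cold start: no usable history, so the features are undefined.
        return {"price_ratio": None, "zip_share": None}
    typical_price = median(h["price"] for h in history)
    same_zip = sum(h["zip"] == candidate["zip"] for h in history)
    return {
        "price_ratio": candidate["price"] / typical_price,
        "zip_share": same_zip / len(history),
    }

print(pair_features(12, 567, views, listings))
print(pair_features(12, 34, views, listings))
print(pair_features(999, 5, views, listings))
```

A downstream model would need a policy for the `None` values, for example falling back to region-level popularity for users without history.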
31
Machine Learning at Zillow
Home Valuation
• Zestimate
• Zestimate Forecast
• Zillow Home Value Index
• Rent Zestimate
• Zillow Rent Index
• Pricing Tool
• Best Time to List
B2B
• Ad Campaigns
• Agent segmentation
• Search Engine Marketing (SEM)
Computer Vision
• Videos
• Photos
User Profiles
• Persona Predictions
• Journey location prediction
• Lender Recommendations
Recommendations
• Home recommendation
• Similar homes
• New regions to explore
• Explain recommendations
32
Tools
• Spark (Scala and Python)
• R
• Python (numpy, scipy, sklearn, pandas)
• Random forest
• Linear, logistic, quantile regressions.
• Deep neural nets.
• Matrix Factorization
• Etc.
• AWS
33
Zillow Core Values
• Own it.
• Turn on the Lights.
• ZG is a Team Sport.
• Move Fast. Think Big.
• Winning is Fun.
• Act With Integrity
34
We’re hiring!
• Data Scientist, Computer Vision and Deep learning
• Software Engineer, Machine Learning
• Data Scientist, Machine Learning
• Internship opportunities across Analytics
- Glassdoor reviews: Top 10 in Seattle Business Magazine
100 Best Companies (#3)
- Glassdoor’s Employees’ Choice Best Places to Work;
Glassdoor’s Best Benefits and Perks;
www.zillow.com/jobs
www.zillow.com/data-science


Overview of Data Science at Zillow


Editor's Notes

  • #2: Roadmap for today: an overview of the company, its data, and its culture; an introduction to the Data Science and Engineering team and the problems we try to solve; time at the end for general Q&A.
  • #3: Zillow was founded ten years ago with a simple but incredibly ambitious mission: to build the world’s largest, most trusted and most vibrant home-related marketplace. This means that we’re a company that creates a marketplace, and a marketplace has consumers and practitioners. We’re not a brokerage, not an agent, not an MLS; we are creating a marketplace, a place where consumers and producers congregate to conduct commerce with one another.
  • #5: For buyers: we help them understand the state of the marketplace and what they can afford; provide information about each and every listing; recommend homes and alert them when a relevant new listing comes to market; help them price a listing; and help them choose an agent based on rating and number of sales. For sellers: we help them price their home, see how many people view it online, and connect with an agent to help them sell, or sell by themselves.
  • #6: For agents and lenders: we provide a way to connect with new clients and to demonstrate their success.
  • #7: A few years ago Zillow went into rentals and today it’s the leading site in this category in the US.
  • #9: Here on the bottom right we can see where agents have an opportunity to connect with buyers.
  • #10: Ten years ago, we were just Zillow, but our brand portfolio grew over time and reflects our mission. Each brand strives to empower consumers through transparency. Zillow, Trulia and Hotpads focus on homes and rentals nationwide. StreetEasy and Naked Apartments focus on NYC. Business brands: mortgage quotes/rates (Mortech), transaction platform (dotloop).
  • #11: Huge user base. 30MM rental shoppers per month. First in the real estate category, double our largest competitor (Realtor.com). 78% market share of all mobile-exclusive visitors to the real estate category. In July, half a billion homes were viewed on Zillow Mobile (270/second) (?????). Mortgages: 35 million requests in the last year.
  • #12: Steven
  • #14: There are 21 people in the picture. We are actually 48 people now, and have 12 open positions. Our mission: We attack Zillow’s DS challenges. Today I’ll talk about the
  • #16: Start with a demo. The Zestimate is what made Zillow so famous. It started on day 1, and it is what differentiates us from our competitors. <go over list> The Zillow Home Value Index is an economic index derived from the Zestimate. Today it is used by large financial institutions, organizations and municipalities to understand the real estate market and to support decision making. This means that the Zestimate not only helps individuals value homes, it also helps decision makers understand the housing market.
  • #17: This is a supervised learning problem. Each home in our dataset has a set of features associated with it, along with its sale price. Our goal is to predict the sale price from the features.
  • #20: David
  • #25: The Netflix page is very personalized and tailored to the user’s interests. Each row gives a different way to organize movies. The first and created by the same model, which gets a collection of movies with a single attribute and ranks them according to the user’s viewing habits. The second row is from a completely different model that ranks similarity between movies. All these rows are ordered by a third model. We would like to simplify the home buying experience and make it as easy as choosing a movie on Netflix.
  • #26: Each type of recommendation answers a different need. Email: we would like to alert users when their dream home comes on the market, or show them homes that they might not otherwise consider. The challenge is how not to spam. When viewing a home: we show other similar homes that the user might like. When ranking search results: we need to choose the most relevant homes to go to the top of the list.
  • #28: In recommendation, what we usually have is a set of user-item pairs and a corresponding label. The idea is that if we can predict whether a user would like a listing, we can make good recommendations. This seems to be a supervised learning problem. In real life it’s much more complicated. How do we know if a user likes an item? Most users don’t explicitly tell us. For example, most users don’t rate movies or like videos on YouTube. Even when a user tells us, it does not necessarily mean what we want it to mean. For instance, a user might not like a listing even though it was very relevant to her, because at this stage she is just exploring the market and would like to understand what she can afford. Listings for homes we will never buy still help us understand our options. The challenge with recommendation is that we never solve the problem that we would like to solve. We only solve a surrogate problem. So part of our work is to find the best surrogate problem to solve.
  • #30: We have a very large catalog. The number of users is on the same order as the number of items. There are no popular items. The matrix is block diagonal.
  • #31: To complicate things, we have features associated with the listings, and we have user activity. How can we translate these into features that are predictive of the outcome?
  • #34: Shown mission/brands/data – how do we get there Zillow culture - people Share people you like David – ZG is a team sport, turn on the lights (anonymous questions, wikis, open discussion) Steven –Winning Is Fun – competition, Move Fast Think Big (hackweeks)
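
The surrogate-problem idea in note #28 can be made concrete. Since most users never state a preference explicitly, a label has to be derived from implicit signals such as the saves and dwell time mentioned on the Email Recommendation slide. The sketch below is one hypothetical labeling rule; the event fields and the 30-second threshold are assumptions, not values from the talk.

```python
def surrogate_label(event, min_dwell_seconds=30):
    """Return 1 if the event suggests the listing was relevant, else 0.

    A save counts as a positive signal on its own; otherwise a long enough
    view is taken as a weaker sign of interest. Both rules are assumptions.
    """
    if event.get("saved", False):
        return 1
    return 1 if event.get("dwell_seconds", 0) >= min_dwell_seconds else 0

# Hypothetical activity events, for illustration only.
events = [
    {"user_id": 12, "listing_id": 5, "dwell_seconds": 4},
    {"user_id": 12, "listing_id": 567, "dwell_seconds": 95},
    {"user_id": 1550, "listing_id": 567, "dwell_seconds": 10, "saved": True},
]

labels = [surrogate_label(e) for e in events]
print(labels)
```

Changing the rule changes the problem the model is trained to solve, which is the point the note makes: the choice of surrogate is itself part of the modeling work.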