Deep Learning World | Munich 2019
Machine Learning at WeWork:
Creating Community Through Graph Embeddings
PHYSICAL + DIGITAL DATA
Physical Space:
Use data to optimize how we construct, maintain and design our spaces:
-reduce extraneous costs from construction and building logistics
-encourage physical connections through intelligent interior design
-maximize utilization
Digital Space:
Focus on helping people make connections:
-promote their business
-request help from others
-simply make new friends and connections
In order to better facilitate connections, WeWork started an
ML team to focus on rec systems
-personalized newsfeed
-text classifiers
-entity extraction
-onboarding skills suggestion
-conference rooms recommender
-experiment platform
Graph Embeddings with node2vec
Or: How to recommend members for fun and profit
Graph Embeddings with node2vec
1) Business Use Case
2) Data
3) Model
Use Case: Member Needs
Our members have broadly expressed the desire to:
-meet other members in the community
-make social and professional connections
-feel a sense of “belonging”
Use Case: Member Needs
We are thus interested in helping members meet each other
Use Case: Member Needs
How do we create an intelligent member-to-member
recommendation service?
Data Structure: Member Knowledge Graph
Member Knowledge Graph
A central repository of information about our members
Member Knowledge Graph
We collect data from:
-member profiles
-member interactions on the app
-posts you’ve liked
-events you’ve bookmarked
-community team notes
-external data sources that we match to our internal data
Member Knowledge Graph
All this information is expressed in graph form
[Figure: example graph with nodes such as Sleeveless shirts, Laundry, The color blue, Motorbikes, Cell phones, Jackets]
Member Knowledge Graph
Connecting members through shared skills or interests
[Figure: members linked through shared nodes such as Laundry and The color blue]
Member Knowledge Graph
We assume that people who have lots of shared skills or interests may
be good recommendations for each other
Member Knowledge Graph
How do we calculate member similarity using common skills
or interests?
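One naive baseline, before turning to embeddings, is to score two members by the overlap of their interest sets. The sketch below uses Jaccard similarity; the member names and interests are invented for illustration, and this is not presented as the method used in the talk.

```python
def jaccard(a: set, b: set) -> float:
    """Fraction of the combined interests that two members share (0 to 1)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical members and interests, made up for this example.
interests = {
    "alice": {"laundry", "the color blue", "motorbikes"},
    "bob": {"the color blue", "jackets", "cell phones"},
    "carol": {"laundry", "the color blue"},
}


def rank_by_overlap(member: str, interests: dict) -> list:
    """Other members sorted from most to least similar to `member`."""
    scores = [
        (jaccard(interests[member], other_set), other)
        for other, other_set in interests.items()
        if other != member
    ]
    return [name for _, name in sorted(scores, reverse=True)]


ranking = rank_by_overlap("alice", interests)
```

A score like this only sees direct overlap between two members; it does not use the wider structure of the graph, which is what the embedding approach is meant to capture.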
Embeddings
Word Embeddings for Text
Learn vector representations of words that capture semantic meanings
Word Embeddings for Text
Process: train a neural network
Outcome: each word can be represented by an N-dimensional vector
Word Embeddings for Text
Ultimately the results of this model can be used for many NLP tasks
such as word similarity, clustering, classification
Node Embeddings for Graphs
Can we do something similar for graph networks?
Node Embeddings for Graphs
Instead of words we have nodes
Each node can be represented with a
vector after training a neural
network model
Node Embeddings for Graphs
Node = person
Edge = Social interaction between 2 people
OR
Edge = Similarity between 2 people’s skills
OR
Edge = Similarity between 2 people’s interests
OR
Edge = Some other function…
Node Embeddings for Graphs
Nodes with similar vectors
(measured by something like cosine
similarity) should be clustered close
together
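As a minimal sketch of the similarity measure mentioned here, cosine similarity between two node vectors can be computed as follows. The vectors are plain Python lists and the example values are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 2-dimensional embeddings for two nodes.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction score close to 1 regardless of length, while orthogonal vectors score 0.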
Node Embeddings for Graphs
Luckily, someone’s already written a paper about this…
word2vec vs node2vec
word2vec:
Given a corpus, maximize P(word | context words)
CBOW: predict a word from its surrounding context words
Skip-gram: predict the surrounding context words from a word
word2vec vs node2vec
node2vec:
We want to maximize the log-probability of observing a network
neighborhood N_S(u) for a node u, conditioned on its feature
representation
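Written out, following the notation of the node2vec paper (Grover and Leskovec, 2016): f maps each node in the vertex set V to a d-dimensional feature vector, and N_S(u) is the neighborhood of node u generated by a sampling strategy S.

```latex
\max_{f} \sum_{u \in V} \log \Pr\big(N_S(u) \mid f(u)\big),
\qquad f : V \to \mathbb{R}^{d}, \quad N_S(u) \subset V
```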
word2vec vs node2vec
node2vec:
In order to make this problem tractable, we assume:
- Conditional independence of neighborhood nodes
- Symmetry in feature space (a source node and neighborhood node
have a symmetric effect over each other)
- Conditional likelihood of every node pair written as a softmax function
word2vec vs node2vec
node2vec:
With these two assumptions, the original objective simplifies to:
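A sketch of the derivation, assuming the notation of the node2vec paper: f(u) is the feature vector of node u, N_S(u) its sampled neighborhood, and V the set of all nodes. Conditional independence factorizes the neighborhood likelihood, symmetry gives each pair a softmax over dot products, and taking logs yields the last line.

```latex
\begin{align}
\Pr\big(N_S(u) \mid f(u)\big) &= \prod_{n_i \in N_S(u)} \Pr\big(n_i \mid f(u)\big) \\
\Pr\big(n_i \mid f(u)\big) &= \frac{\exp\big(f(n_i) \cdot f(u)\big)}{\sum_{v \in V} \exp\big(f(v) \cdot f(u)\big)} \\
\max_{f} \sum_{u \in V} \log \Pr\big(N_S(u) \mid f(u)\big)
  &= \max_{f} \sum_{u \in V} \Big[ -\,|N_S(u)| \log Z_u + \sum_{n_i \in N_S(u)} f(n_i) \cdot f(u) \Big],
  \qquad Z_u = \sum_{v \in V} \exp\big(f(u) \cdot f(v)\big)
\end{align}
```

The paper states the first term simply as -log Z_u. Because Z_u sums over every node, it is expensive to compute on large graphs, and the paper approximates it with negative sampling.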
word2vec vs node2vec
node2vec:
If we know the representation for node v (the gray node), can we
predict its neighborhood (x1, x2, x3)?
node2vec Algorithm
Problem:
There’s no obvious way to identify separate neighborhoods
in a graph like you can identify sentences in a document
node2vec Algorithm
Solution:
Simulate many random walks around each node to define
possible “neighborhoods” around the node
node2vec Algorithm
Each walk is a directed subgraph analogous to a “sentence”
in a text corpus
node2vec Algorithm
From the graph on the left we can create multiple potential
neighborhoods and use them as “training data”
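To illustrate how walks become "training data", here is a minimal sketch that simulates uniform random walks on a small adjacency-list graph. The graph, the function names and the uniform choice of the next node are assumptions made for this example; node2vec itself biases the walk with the P and Q parameters.

```python
import random


def random_walk(graph, start, length, rng):
    """Walk `length` nodes from `start`, picking each next node uniformly at random."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def simulate_walks(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node; each walk is one 'sentence'."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # vary the order in which start nodes are visited
        for node in nodes:
            walks.append(random_walk(graph, node, length, rng))
    return walks


# Hypothetical undirected graph as adjacency lists.
graph = {
    "u": ["a", "b"],
    "a": ["u", "b"],
    "b": ["u", "a", "c"],
    "c": ["b"],
}
walks = simulate_walks(graph, walks_per_node=2, length=5)
```

Each element of `walks` is a list of node IDs that can be treated like a sentence of words.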
node2vec Algorithm
In order to efficiently explore many possible neighborhoods of node u, the
random walk algorithm takes two parameters, P and Q, which
determine how “locally” or “globally” the walk explores the neighborhood of
node u
High P → less likely to revisit an already-visited node
High Q → more likely to visit close-by nodes
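The effect of P and Q can be sketched with the second-order transition weights described in the node2vec paper: having just moved from node `prev` to node `curr`, each candidate next node gets weight 1/P if it is `prev`, 1 if it is also a neighbor of `prev`, and 1/Q otherwise. The example assumes an unweighted graph stored as sets of neighbors; the graph and names are hypothetical.

```python
def transition_weights(graph, prev, curr, p, q):
    """Unnormalized weights for the step out of `curr`, given the walk came from `prev`."""
    weights = {}
    for nxt in graph[curr]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # going back to the node just visited
        elif nxt in graph[prev]:
            weights[nxt] = 1.0      # stays within one hop of prev
        else:
            weights[nxt] = 1.0 / q  # moves farther away from prev
    return weights


def transition_probabilities(graph, prev, curr, p, q):
    """Normalize the weights so they sum to 1."""
    weights = transition_weights(graph, prev, curr, p, q)
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Hypothetical graph: t - v, t - x1, v - x1, v - x2
graph = {
    "t": {"v", "x1"},
    "v": {"t", "x1", "x2"},
    "x1": {"t", "v"},
    "x2": {"v"},
}
weights = transition_weights(graph, prev="t", curr="v", p=4.0, q=0.5)
probs = transition_probabilities(graph, prev="t", curr="v", p=4.0, q=0.5)
```

With P = 4 the walk rarely steps back to `t`, and with Q = 0.5 it favors `x2`, the node that leads away from where it came from. Raising Q above 1 would reverse that preference and keep the walk close to `t`.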
node2vec Algorithm
Combinations of P, Q can emulate different sampling strategies:
Breadth First Search: Neighborhood is restricted to immediate neighbors of
the node u
Depth First Search: Neighborhood consists of nodes sampled at increasing
distances from u
node2vec
The node2vec algorithm can be decomposed into 3 steps:
a) Precompute transition probabilities for the random walk simulation
b) For every node, simulate r random walks of fixed length l
c) Feed random walks into a word2vec model and solve with SGD
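For step (c), each walk is treated like a sentence. The sketch below shows only how walks could be turned into (center, context) training pairs for a skip-gram model with a context window; the function name and window size are assumptions, and the actual training with SGD would be left to a word2vec implementation.

```python
def skipgram_pairs(walks, window):
    """Pair every node in a walk with the nodes at most `window` steps away from it."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


# One hypothetical walk of length 4.
walks = [["u", "a", "b", "c"]]
pairs = skipgram_pairs(walks, window=1)
```

Nodes that often appear near each other on walks end up in many pairs together, which is what pushes their vectors close during training.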
node2vec
Once we train a model and find embeddings for each node we can do:
- Clustering using node embeddings
- Node classification
- Link prediction
- Community Detection
Clustering off node2vec
For each WeWork location we can run node2vec on its social graph…
Clustering off node2vec
Using the data in the Member Knowledge Graph…
Clustering off node2vec
Map every member to a vector…
Clustering off node2vec
Then retrieve the most similar members for each member…
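A minimal sketch of this retrieval step, assuming embeddings are held in a dictionary from member ID to vector and that similarity is measured with cosine similarity. The member names and vectors are invented, and a brute-force scan like this is only meant for illustration.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


def top_k_similar(embeddings, member, k):
    """Return the k other members whose vectors are closest to `member`'s vector."""
    target = embeddings[member]
    scored = (
        (cosine(target, vec), other)
        for other, vec in embeddings.items()
        if other != member
    )
    return [other for _, other in heapq.nlargest(k, scored)]


# Hypothetical 2-dimensional member embeddings.
embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.1, 0.9],
    "carol": [0.8, 0.3],
    "dave": [0.2, 0.8],
}
recommendations = top_k_similar(embeddings, "alice", k=2)
```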
Clustering off node2vec
And use this to power different kinds of member recommendations:
-General member recommendations during onboarding
-Facilitated introductions between WeWork Labs members
Thanks for Listening!

Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu