GrapeVin
Anasuya Das
Insight Data Science
Can wine recommendations be crowd sourced?
130,000 reviews
82,342 unique wines
655 unique users
http://GrapeVin.us
Content based recommender
Deep ruby color leads to a nose of currant and cedar chest. The complex body contains plum, black cherry and just the right amount of oak.
Deep ruby color, currant, dark cherry, vanilla, cedar, full bodied, oak
Get most similar wines in latent semantic space
[Plot: wine reviews projected onto component 2 and component 3, with clusters labeled Pinot noir, Champagne, and Bordeaux]
Wine reviews cluster by the type of wine
The story so far …
Preserved retinotopic organization in early visual cortex followin
implications for visual function?
Anasuya Das1,2, Elisha P. Merriam3, David J. Heeger3, Krystel R.
1Flaum Eye Institute and 2Centre for Visual Science, University of Rochester, 3Center for N
Background
Damage to the primary visual cortex (V1) or its afferents produces a severe loss of vision in the contralateral visual hemifield (VF), called cortical blindness (CB). Studies in non-human primates with lesions to V1 observe reduced but organized retinotopic activity in the lesioned visual cortex (Schmid, 2009 & 2010). These and other studies suggest that residual visual processing is mediated by either spared V1 or extra-geniculo-calcarine input to extrastriate visual areas (reviewed in Das and Huxlin, 2010). Human fMRI studies of CB have examined single subjects, and the lesion characteristics of these subjects have varied across studies. The retinotopic organization of the damaged visual cortex in humans is not well characterized.
Questions
How does V1 damage affect the retinotopic organization of spared visual cortex? How is this related to spared visual function?
Group 1: Retinotopic organization is preserved around lesion
But extrastriate cortex has greater representation of central field than spared V1 (CB8)
[Figure: cortical maps of the intact and damaged hemispheres, with labels V1, V2d, V2v, V3d, V3v, V3a, V4, hMT+, LO1/LO2, calcarine, and pole]
Questions?
Cross validation: How accurately can the star rating of
a recommended wine be predicted?
red, spicy, oak
Train classifier
Get similar wines
Predict ratings
[Histogram: rating prediction accuracy (x-axis) versus number of users (y-axis)]
Train a Naive Bayes classifier to learn the preferences of each user.
Test on the reviews written by other users for the recommended wines.
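As a rough illustration of the per-user classifier described above, the sketch below trains a multinomial Naive Bayes model with add-one smoothing on one user's reviews and predicts a star rating for an unseen review. The deck does not say which Naive Bayes variant or which features were used; the bag-of-words features, the names `train_nb` and `predict_nb`, and the toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(reviews):
    """Fit a multinomial Naive Bayes model on (tokens, star_rating) pairs."""
    class_counts = Counter()             # number of reviews per star rating
    word_counts = defaultdict(Counter)   # word frequencies per star rating
    vocab = set()
    for tokens, stars in reviews:
        class_counts[stars] += 1
        word_counts[stars].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the star rating with the highest log posterior."""
    class_counts, word_counts, vocab = model
    total_reviews = sum(class_counts.values())
    best_stars, best_score = None, -math.inf
    for stars, n_reviews in class_counts.items():
        score = math.log(n_reviews / total_reviews)   # log prior
        total_words = sum(word_counts[stars].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities
            p = (word_counts[stars][tok] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_stars, best_score = stars, score
    return best_stars


# Toy training data for one hypothetical user: (review tokens, stars)
user_reviews = [
    (["red", "spicy", "oak"], 5),
    (["red", "oak", "plum"], 5),
    (["sweet", "flat", "thin"], 2),
    (["sweet", "thin"], 2),
]
model = train_nb(user_reviews)
predicted = predict_nb(model, ["spicy", "oak", "cherry"])
```

In the setup the slide describes, the model would be trained on one user's reviews and then scored against ratings that other users gave to the recommended wines.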
Algorithm
Remove stop words
Lemmatize using WordNet
Use synsets to convert adverbs to adjectives
Detect language and filter
Unicode, HTML scrubbing
Truncated SVD, a.k.a. Latent Semantic Indexing, on 132,000 reviews x 20,000 words
TF-IDF matrix (n reviews x m columns) ~ n x r components
Cosine similarity in lower dimensional space
Recommend top 10 most similar and highest rated wines
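The pipeline above (TF-IDF, truncated SVD, cosine similarity, top-N) can be sketched on a toy corpus as follows. This is only a minimal sketch assuming NumPy is available; the deck does not name the library used, and the helper names `tfidf_matrix`, `lsi_embed`, and `recommend`, the IDF variant, and the rule of ranking by similarity and breaking ties by rating are assumptions made for this example.

```python
import numpy as np


def tfidf_matrix(docs):
    """Build a small TF-IDF matrix (rows: reviews, columns: words)."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: j for j, w in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for w in doc:
            tf[i, index[w]] += 1
    df = (tf > 0).sum(axis=0)              # document frequency per word
    idf = np.log(len(docs) / df) + 1.0     # one common IDF variant (assumed)
    return tf * idf


def lsi_embed(X, r):
    """Project the rows of X onto the top r singular directions."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :r] * s[:r]                # n x r matrix of components


def recommend(Z, ratings, query, top_n=10):
    """Rank other wines by cosine similarity to `query`, ties by rating."""
    norms = np.linalg.norm(Z, axis=1)
    sims = Z @ Z[query] / (norms * norms[query])
    order = sorted(
        (i for i in range(len(Z)) if i != query),
        key=lambda i: (-sims[i], -ratings[i]),
    )
    return order[:top_n]


# Toy reviews (already tokenized) and hypothetical star ratings
docs = [
    ["ruby", "currant", "cedar", "oak"],
    ["currant", "plum", "oak", "cherry"],
    ["citrus", "crisp", "apple"],
    ["crisp", "apple", "mineral"],
]
ratings = [4, 5, 3, 4]
Z = lsi_embed(tfidf_matrix(docs), r=2)
recs = recommend(Z, ratings, query=0, top_n=2)
```

At the scale on the slide, a full SVD would be too expensive, so a sparse truncated solver would be the more likely choice; the toy version uses a full SVD only to keep the example short.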
Cross validation: What is the probability of recommending a wine that is already reviewed?
[Histogram: P of recommending already reviewed wine (x-axis) versus number of users (y-axis)]
Data:
SuperUsers: Top 100 users with the most reviews
Wines: Top 100 wines reviewed by SuperUsers
Method: For each wine reviewed by a SuperUser, recommend the 20 most similar wines based on the remaining 99 users
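One possible reading of this check is sketched below: for each wine a held-out user reviewed, take the top recommendations and count how many of them the same user has also reviewed. The function name `already_reviewed_rate`, the precomputed `neighbors` lists, and the toy wine ids are hypothetical; the deck does not spell out how the probability was computed.

```python
def already_reviewed_rate(user_wines, neighbors, top_n=20):
    """Fraction of recommendations that the user has already reviewed.

    user_wines: set of wine ids reviewed by the held-out user.
    neighbors:  dict mapping wine id -> list of similar wine ids, ranked
                most similar first (built from the other users' reviews).
    """
    hits = total = 0
    for wine in user_wines:
        # Never recommend the query wine itself
        recs = [w for w in neighbors.get(wine, []) if w != wine][:top_n]
        hits += sum(1 for w in recs if w in user_wines)
        total += len(recs)
    return hits / total if total else 0.0


# Hypothetical toy data: wine ids and ranked neighbor lists
neighbors = {
    "A": ["B", "C", "D"],
    "B": ["A", "D", "E"],
    "C": ["A", "B", "E"],
}
rate = already_reviewed_rate({"A", "B"}, neighbors, top_n=2)
```

Computing this rate once per SuperUser would give the per-user values that a histogram like the one above summarizes.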
Text processing steps
1. Convert to lower case
2. Remove stop words, punctuation, and HTML code
3. Deal with broken Unicode characters and replace them with plain text
4. Detect language and include only reviews in English
5. Lemmatize using WordNet (works only on nouns and adjectives)
6. Use synsets and pertainyms to convert adverbs to adjectives
7. Use bigrams
8. Vectorize using term frequency-inverse document frequency (TF-IDF)
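Several of the steps above (lower-casing, HTML scrubbing, stop word and punctuation removal, bigrams) can be sketched with the Python standard library alone. Language detection and the WordNet-based steps are left out because they need external packages. The `preprocess` name and the tiny stop word list are assumptions for illustration, not the deck's actual code.

```python
import html
import re

# Tiny illustrative stop word list; a real pipeline would use a fuller one
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "just"}


def preprocess(review):
    """Lowercase, scrub HTML, drop stop words and punctuation, add bigrams."""
    text = html.unescape(review)            # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)    # strip HTML tags
    text = text.lower()
    tokens = re.findall(r"[a-z]+", text)    # keep runs of letters only
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Bigrams are formed after stop word removal (a design choice here)
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


tokens = preprocess("<p>Deep ruby color &amp; a nose of currant.</p>")
```

Because the bigrams are built after stop words are dropped, a pair such as `color_nose` can join words that were not adjacent in the original sentence; building bigrams first would be an equally reasonable choice.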
Selecting k components
[Plot: explained variance (%) versus number of components]
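A common way to pick k from a curve like this is to take the smallest number of components whose cumulative explained variance reaches a chosen target. The sketch below assumes that all singular values are available and that each component's share is its squared singular value over the total; the function name `components_for_variance`, the target values, and the singular values are hypothetical.

```python
from itertools import accumulate


def components_for_variance(singular_values, target=0.8):
    """Smallest k whose components explain at least `target` of the variance.

    Each component's share is taken as its squared singular value divided
    by the sum of all squared singular values.
    """
    squared = [s * s for s in sorted(singular_values, reverse=True)]
    total = sum(squared)
    for k, cumulative in enumerate(accumulate(squared), start=1):
        if cumulative / total >= target:
            return k
    return len(squared)


# Hypothetical singular values, for illustration only
k = components_for_variance([4.0, 2.0, 2.0, 1.0], target=0.9)
```

With a truncated decomposition only the leading singular values are known, so the total variance would have to come from elsewhere, for example from the TF-IDF matrix itself.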
Future directions
1. Incorporate keyword search
2. Scale up and increase inventory
3. Scrape wine pricing information and local availability
4. Analyze tasting notes by vineyard or geographical region
- do soil and climate really impact how wines taste?
