Admixture of Poisson MRFs: A New Topic
Model with Word Dependencies
David Inouye*, Pradeep Ravikumar, Inderjit Dhillon
April 30, 2015
* Presenter
David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
Analyzing Large Collections of Documents
1. Collection of Documents
   Examples: research papers, news articles, Twitter posts
2. Digital Representation
   Bag of Words Matrix (rows: Doc 1, Doc 2, Doc 3, Doc 4, ...; columns: “networks”, “learning”, “programming”, …):
   - Removes order and syntax information
   - Unrealistic but powerful
3. Model Computation
4. Summary
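The bag-of-words step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' preprocessing: the helper name `bag_of_words`, the whitespace tokenizer, and the toy documents are all assumptions made for the example.

```python
from collections import Counter

import numpy as np


def bag_of_words(docs):
    """Return (vocabulary, n x p count matrix) for a list of strings."""
    # Hypothetical tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    index = {word: j for j, word in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)), dtype=int)
    for i, tokens in enumerate(tokenized):
        for word, count in Counter(tokens).items():
            X[i, index[word]] = count
    return vocab, X


docs = ["learning networks learning", "networks learning learning"]
vocab, X = bag_of_words(docs)
# The two documents differ only in word order, so their rows are identical.
```

Because only counts survive, any reordering of a document's words yields the same row, which is the "removes order and syntax information" property listed above.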
Research Paper Example - Top Words
Titles of Research Papers:
1. Machine Learning (ICML, NIPS)
2. Communication Networks (INFOCOM)
3. Programming Languages (PLDI, CAV, POPL, OOPSLA)
[Figure: the document-analysis pipeline applied to these titles, with Top Words (Frequency) as the model]
Research Paper Example - Topic Modeling
[Figure: the same pipeline and research paper titles, with Topic Modeling in place of Top Words (Frequency)]
Applications for Topic Modeling
Applications
1. Summarize/Visualize [Hall et al. 2008]
2. Word sense disambiguation [Boyd-Graber et al. 2007]
3. Multi-lingual understanding [Mimno et al. 2009]
4. Information retrieval [Wei & Croft 2006]
Different domains
1. Genetics [Pritchard et al. 2000 (14,000 citations)]
2. Computer vision [Li et al. 2010]
3. Social networks [Airoldi et al. 2008]
4. Social science surveys [Roberts et al. 2014]
5. Social E-commerce [Hu et al. 2014]
Brief History - Latent Semantic Analysis (LSA)
1. Digital Representation: n × p bag-of-words matrix (rows: Doc 1, Doc 2, Doc 3, Doc 4, ...; columns: “networks”, “learning”, “programming”, …)
2. Singular Value Decomposition: the n × p matrix is approximated by U Σ V^T, with U of size n × k, Σ of size k × k, and V^T of size k × p
3. Low Dimensional Document Representation
4. “Latent Topic”
Positive and negative values difficult to interpret
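A small NumPy sketch of the LSA factorization, assuming a toy 4 × 3 count matrix invented for illustration. It takes the rows of U Σ as the document representation and the rows of V^T as the latent topics, which is the usual reading of LSA rather than something the slide spells out.

```python
import numpy as np

# Toy 4 x 3 bag-of-words matrix (rows: documents, columns: words).
X = np.array([[3, 0, 1],
              [2, 0, 0],
              [0, 4, 1],
              [0, 3, 2]], dtype=float)

k = 2
U, S, Vt = np.linalg.svd(X, full_matrices=False)
U_k, S_k, Vt_k = U[:, :k], S[:k], Vt[:k, :]

doc_repr = U_k * S_k                   # n x k low-dimensional documents
latent_topics = Vt_k                   # k x p "latent topics"
X_approx = U_k @ np.diag(S_k) @ Vt_k   # rank-k reconstruction of X

# Singular vectors beyond the first mix positive and negative entries.
has_mixed_signs = bool((latent_topics < 0).any() and (latent_topics > 0).any())
```

For a count matrix like this one, the second singular vector must contain both signs because it is orthogonal to the first, which has entries of a single sign. That is the interpretability problem the slide points out.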
Brief History - Probabilistic Topic Models
1. Digital Representation: n × p bag-of-words matrix (rows: Doc 1, Doc 2, Doc 3, Doc 4, ...; columns: “networks”, “learning”, “programming”, …)
2. Probabilistic Topic Models: the matrix is related through a probability model to an n × k matrix and a k × p matrix
3. Topic Weights per Document (n × k)
4. “Topics” (Word weights) (k × p)
Probability vectors are much easier to interpret
LDA - Added Bayesian priors for regularization
Comparison of 2D Projections
SVD dimensions are difficult to interpret
APM has smooth distribution compared to LDA
[Figure: 2D projections of the documents under SVD, LDA, and APM. Legend: comm-net.6978, mach-learn.8925, prog-lang.6618]
Brief History - Extensions/Variants
1. Add time information [Blei & Lafferty 2006]
2. Add author information [Rosen-Zvi et al. 2004]
3. Add document category information [Mcauliffe & Blei 2008]
4. Automatically discover number of topics [Teh et al. 2006]
5. Model correlation between topics [Blei & Lafferty 2006]
6. . . .
7. . . .
Previous models - topics only have weights for single words
Our model - topics have weights for pairs of words
Interpreting Topics
[Figures: topic visualizations for LDA with 3, 6, and 30 topics, and for APM with 3 topics]
Overview of APM
[Diagram: Admixture of Poisson MRFs (APM) and its building blocks: Mixture, Admixture, Multinomial, LDA, Gaussian MRF, Poisson MRF]
Mixtures
Multiple sub-populations
The sub-populations are usually unknown a priori
Each individual from the population comes from exactly one
sub-population
Figure source: Kalai, Moitra, and Valiant. Disentangling Gaussians. Communications
of the ACM. 2012.
Admixtures
Mixtures - Draws from single
component distribution. (Top)
Admixtures - Draws from a
distribution whose parameters are a
convex combination of component
parameters. (Bottom)
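The difference between the two definitions can be sketched as follows. This is an illustration under stated assumptions, not the model from the talk: each component is taken to be a vector of independent Poisson means over two words, the admixture weights are drawn from a Dirichlet, and the names `draw_mixture` and `draw_admixture` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical components ("topics"), each a vector of Poisson means
# over a 2-word vocabulary (x1, x2).
components = np.array([[8.0, 1.0],
                       [1.0, 8.0]])
k = components.shape[0]


def draw_mixture():
    """Pick exactly one component, then draw counts from it."""
    j = rng.integers(k)
    return rng.poisson(components[j])


def draw_admixture(alpha=1.0):
    """Draw weights on the simplex, mix the parameters, then draw counts."""
    w = rng.dirichlet(alpha * np.ones(k))
    mixed_params = w @ components   # convex combination of parameters
    return w, mixed_params, rng.poisson(mixed_params)


x_mix = draw_mixture()
w, mixed_params, x_admix = draw_admixture()
```

A mixture draw always uses one row of `components` unchanged, while an admixture draw uses parameters anywhere in the convex hull of the rows.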
[Figure: two panels with axes x1 and x2. Top: mixture components and "documents". Bottom: dense and sparse "topics" and "documents".]
Gaussian MRFs
Allows for dependencies between variables
What if the data dimension is large?
If dimension is 1000, 1000^2 / 2 = 500,000 parameters
Assume some conditional independence between variables.
[Figure: three panels, Independent PMRF, Positive Dependency PMRF, and Negative Dependency PMRF, each with axes Count of Word 1 and Count of Word 2 ranging from 0 to 8]
1. Each conditional (“slice”) of a PMRF is 1-D Poisson.
2. Distinct from Gaussian MRF
3. Positive dependencies can model word co-occurrence.
Poisson MRFs [Yang et al., 2012]
P(A | B, C) P(B | A, C) P(C | A, B)
P(A, B, C) ??
If we assume the node conditional distributions are Poisson,
does there exist a joint MRF distribution
that has these conditionals?
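Poisson MRF joint distribution:
$$\Pr_{\mathrm{PMRF}}(x \mid \theta, \Theta) \propto \exp\left\{ \theta^T x + x^T \Theta x - \sum_{s=1}^{p} \ln(x_s!) \right\}.$$
Node conditionals are 1-D Poissons:
$$\Pr(x_s \mid x_{-s}, \theta_s, \Theta_s) \propto \exp\{ \underbrace{(\theta_s + x^T \Theta_s)}_{\eta_s} \, x_s - \ln(x_s!) \}.$$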
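The node conditional can be evaluated directly: it is a Poisson distribution with rate exp(η_s). The sketch below follows the node-conditional form on the slide and assumes Θ has a zero diagonal, so a word's own count does not enter its η_s. The function names and the toy parameters are hypothetical.

```python
import math

import numpy as np


def node_conditional_rate(x, theta, Theta, s):
    """Poisson rate exp(eta_s) with eta_s = theta_s + x^T Theta_s."""
    eta_s = theta[s] + x @ Theta[:, s]
    return math.exp(eta_s)


def node_conditional_pmf(value, x, theta, Theta, s):
    """Pr(x_s = value | x_-s) under the 1-D Poisson node conditional."""
    rate = node_conditional_rate(x, theta, Theta, s)
    return math.exp(-rate) * rate**value / math.factorial(value)


theta = np.array([0.5, 0.5])
Theta_neg = np.array([[0.0, -0.2],
                      [-0.2, 0.0]])   # zero diagonal, negative dependency

# With a negative edge weight, a larger count of word 2 lowers the
# conditional rate of word 1.
rate_low = node_conditional_rate(np.array([0, 0]), theta, Theta_neg, s=0)
rate_high = node_conditional_rate(np.array([0, 5]), theta, Theta_neg, s=0)
```

Flipping the sign of the off-diagonal entries would raise the rate instead, which is how a positive dependency expresses co-occurrence.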
Admixture of Poisson MRFs (APM) [Inouye et al. 2014]
APM replaces standard Multinomial with Poisson MRF
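$$\Pr_{\mathrm{APM}}(x, w, \theta^{1 \dots k}, \Theta^{1 \dots k}) = \Pr_{\mathrm{PMRF}}\left( x \,\middle|\, \bar{\theta} = \sum_{j=1}^{k} w_j \theta^j, \; \bar{\Theta} = \sum_{j=1}^{k} w_j \Theta^j \right) \Pr_{\mathrm{Dir}}(w) \prod_{j=1}^{k} \Pr(\theta^j, \Theta^j)$$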
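A minimal sketch of the admixing step: the document-specific parameters are w-weighted sums of the k topic parameters, which are then plugged into the exponent of the PMRF joint (θ̄ᵀx + xᵀΘ̄x − Σ ln x_s!). The function names and the toy values are assumptions for illustration; the priors and the normalizing constant are left out.

```python
import math

import numpy as np


def admix_parameters(w, thetas, Thetas):
    """Return (theta_bar, Theta_bar), the w-weighted sums over k topics."""
    theta_bar = np.einsum("j,js->s", w, thetas)     # shape (p,)
    Theta_bar = np.einsum("j,jst->st", w, Thetas)   # shape (p, p)
    return theta_bar, Theta_bar


def pmrf_log_unnormalized(x, theta, Theta):
    """theta^T x + x^T Theta x - sum_s ln(x_s!)."""
    log_factorials = sum(math.lgamma(v + 1) for v in x)
    return float(theta @ x + x @ Theta @ x - log_factorials)


# Hypothetical example with k = 2 topics over p = 2 words.
thetas = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
Thetas = np.array([[[0.0, 0.0], [0.0, 0.0]],
                   [[0.0, -0.5], [-0.5, 0.0]]])
w = np.array([0.25, 0.75])

theta_bar, Theta_bar = admix_parameters(w, thetas, Thetas)
value = pmrf_log_unnormalized(np.array([1, 2]), theta_bar, Theta_bar)
```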
APM Algorithm
1. Optimization problem is not convex
2. Want to exploit parallel computing
3. Large optimization problem: APM has O(kp^2) parameters
versus O(kp) for LDA
LDA(k = 5, p = 1000) ⇒ 5,000 parameters
APM(k = 5, p = 1000) ⇒ 5,000,000 parameters
APM(k = 5, NNZ(Θ) = 10 per word) ⇒ 50,000 free
parameters
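The three counts above follow from simple arithmetic, reproduced here as a quick check. The variable names are illustrative, and the dense APM count keeps only the leading k·p² term for the edge matrices.

```python
k, p = 5, 1000
nnz_per_word = 10   # assumed number of nonzero edge weights kept per word

lda_params = k * p                         # one weight per word per topic
apm_dense_params = k * p * p               # k dense p x p edge matrices
apm_sparse_params = k * p * nnz_per_word   # edge weights under sparsity
```

Under the sparsity assumption the APM parameter count is 10 times that of LDA rather than 1,000 times.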
Parallel Alternating Newton-like Algorithm
1. Split the algorithm into alternating convex problems
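$$\arg\min_{\Phi^1, \Phi^2, \cdots, \Phi^p} \; -\frac{1}{n} \sum_{s=1}^{p} \left[ \operatorname{tr}(\Psi^s \Phi^s) - \sum_{i=1}^{n} \exp(z_i^T \Phi^s w_i) \right] + \sum_{s=1}^{p} \lambda \, \| \operatorname{vec}(\Phi^s)_{\setminus 1} \|_1$$
$$\arg\min_{w_1, w_2, \cdots, w_n \in \Delta^k} \; -\frac{1}{n} \sum_{i=1}^{n} \left[ \psi_i^T w_i - \sum_{s=1}^{p} \exp(z_i^T \Phi^s w_i) \right]$$
where
$$z_i = [1 \;\; x_i^T]^T, \quad \phi_s^j = [\theta_s^j \;\; (\Theta_s^j)^T]^T, \quad \Phi^s = [\phi_s^1 \;\; \phi_s^2 \;\; \cdots \;\; \phi_s^k],$$
$$\Psi^s = f(X, W), \quad \psi_i = f(X, \Phi^{1 \dots p}).$$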
2. Subproblems in summation can be computed in parallel
3. Use fast Newton-like optimization method [Hsieh et al. 2014]
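The separability in point 2 can be illustrated with a sketch that evaluates the Φ subproblem one word at a time. Several choices here are assumptions, not the authors' implementation: the data-fit term is written out as the Poisson form Σ_i x_is z_iᵀΦˢw_i in place of tr(ΨˢΦˢ), the ℓ1 penalty skips the first row of Φˢ (the θ entries), the zero diagonal of Θ is not enforced, and the MATLAB workers from the talk are replaced by a Python thread pool. No Newton-like solver is included; only the objective is evaluated.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def word_objective(Phi_s, X, W, s, lam):
    """Objective term for word s, with Phi_s of shape (p + 1, k)."""
    n = X.shape[0]
    Z = np.hstack([np.ones((n, 1)), X])           # rows are z_i^T
    eta = np.einsum("ia,aj,ij->i", Z, Phi_s, W)   # z_i^T Phi_s w_i
    neg_log_lik = -np.mean(X[:, s] * eta - np.exp(eta))
    penalty = lam * np.abs(Phi_s[1:, :]).sum()    # skip the theta row
    return neg_log_lik + penalty


rng = np.random.default_rng(0)
n, p, k, lam = 6, 3, 2, 0.5
X = rng.poisson(2.0, size=(n, p)).astype(float)
W = rng.dirichlet(np.ones(k), size=n)             # rows lie on the simplex
Phis = [np.zeros((p + 1, k)) for _ in range(p)]

# The p word-level terms share no variables, so they can run in parallel.
with ThreadPoolExecutor() as pool:
    values = list(pool.map(lambda s: word_objective(Phis[s], X, W, s, lam), range(p)))
total = sum(values)
```

The same pattern applies to the second subproblem, where the n terms indexed by document i are independent given all Φˢ.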
Timing Results on Wikipedia Dataset (k = 5, λ = 0.5)
[Figure: APM Training Time on Wikipedia Dataset. Bar chart of Time (hrs), on an axis from 0 to 4, for the 1st Iter. and the Avg. Next 3 Iter. in three settings:
 n = 20,000, p = 5,000, # of Words = 50M
 n = 100,000, p = 5,000, # of Words = 133M
 n = 20,000, p = 10,000, # of Words = 57M
 Bar labels as extracted: 1, 3.1, 3.4, 0.6, 2.2, 2.2]
Algorithm scales approximately as O(np^2)
Parallel Speedup
[Figure: Parallel Speedup on BNC Dataset. Speedup (0 to 20) versus # of MATLAB Workers (0 to 20), comparing Perfect Speedup with Actual Speedup]
BNC dataset has n = 4049 and p = 1646
Speedup could be O(min(n, p)) on a distributed system
Evaluating APM: No Direct Evaluation of Edge Parameters
Previous metrics evaluate the similarity of word pairs
[Newman et al. 2010, Mimno et al. 2011, Aletras and Stevenson 2013]
Averaged statistic for all (10 choose 2) pairs of top words computed
Attempted to correlate with human judgment
Unlike previous topic models, APM explicitly models
dependencies between words
How can we semantically evaluate the parameters for these
dependencies?
Evocation [Boyd-Graber et al. 2006]
Evocation denotes the idea of which words “evoke” or “bring
to mind” other words
Different types of evocation:
1. Rose - Flower (example)
2. Brave - Noble (kind)
3. Yell - Talk (manner)
4. Eggs - Bacon (co-occurrence)
5. Snore - Sleep (setting)
6. Wet - Desert (antonymy)
7. Work - Lazy (exclusivity)
8. Banana - Kiwi (likeness)
Distinct from word similarity or synonymy
Collected human scores for approximately 15% of word pairs
Evocation Metric Illustration
[Figure: three "Word Pair | H | M" tables over the words w1 to w4, where H is the human score and M is the model weight.
 Table 1 (all pairs): w1 ↔ w2, w1 ↔ w3, w1 ↔ w4, w2 ↔ w3, w2 ↔ w4, w3 ↔ w4
 Table 2 (Rank by model weights M): w2 ↔ w3, w2 ↔ w4, w3 ↔ w4, w1 ↔ w3, w1 ↔ w2, w1 ↔ w4
 Table 3 (Sum top-m human scores H): w2 ↔ w4, w3 ↔ w4, w1 ↔ w3, w1 ↔ w4]
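The two steps named on this slide, ranking by model weight M and summing the top-m human scores H, can be sketched as below. The function name, the toy weights and scores, and the choice to count a pair without a human score as 0 are all assumptions for illustration and may differ from the metric as actually computed by the authors.

```python
def evocation_score(model_weights, human_scores, m):
    """Sum human scores H over the m pairs with the largest model weight M."""
    ranked = sorted(model_weights, key=model_weights.get, reverse=True)
    # Assumption: pairs with no collected human score contribute 0.
    return sum(human_scores.get(pair, 0) for pair in ranked[:m])


# Hypothetical weights and scores for the six pairs among w1..w4.
M = {("w1", "w2"): 0.2, ("w1", "w3"): 0.4, ("w1", "w4"): 0.1,
     ("w2", "w3"): 0.9, ("w2", "w4"): 0.8, ("w3", "w4"): 0.6}
H = {("w2", "w4"): 50, ("w3", "w4"): 30, ("w1", "w3"): 10, ("w1", "w4"): 5}

score = evocation_score(M, H, m=3)
```

A model scores well when the pairs it weights most heavily are also the pairs humans rated as strongly evocative.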
Models for Comparison
APM: Admixture of Poisson MRFs
APM-LowReg: Very small regularization parameter
APM-HeldOut: Chooses λ from held-out documents
CTM: Correlated Topic Models
HDP: Hierarchical Dirichlet Process (Non-parametric)
LDA: Latent Dirichlet Allocation
RSM: Replicated Softmax (Undirected Topic Model)
RND: Random baseline
Evocation Metric Results
[Figure: Evocation (m = 50), on an axis from 0 to 1600, for k = 1, 3, 5, 10, 25, 50 under Evoc-1 (Avg. Evoc. of Topics) and Evoc-2 (Evoc. of Avg. Topic). Models: APM, APM-LowReg, APM-HeldOut, CTM, HDP, LDA, RSM, RND]
Evocation Metric Top Word Pairs
Table: Top 20 Word Pairs for Best LDA
Human Score   Word Pair                          Human Score   Word Pair
100           run.v ↔ car.n                      38            woman.n ↔ man.n
82            teach.v ↔ school.n                 38            give.v ↔ church.n
69            school.n ↔ class.n                 38            wife.n ↔ man.n
63            van.n ↔ car.n                      38            engine.n ↔ car.n
51            hour.n ↔ day.n                     35            publish.v ↔ book.n
50            teach.v ↔ student.n                32            west.n ↔ state.n
44            house.n ↔ government.n             32            year.n ↔ day.n
44            week.n ↔ day.n                     25            member.n ↔ give.v
38            university.n ↔ institution.n       25            dog.n ↔ animal.n
38            state.n ↔ government.n             25            seat.n ↔ car.n
Table: Top 20 Word Pairs for Best APM
Human Score   Word Pair                          Human Score   Word Pair
100           telephone.n ↔ call.n               57            question.n ↔ answer.n
97            husband.n ↔ wife.n                 57            prison.n ↔ cell.n
82            residential.a ↔ home.n             51            mother.n ↔ baby.n
76            politics.n ↔ political.a           50            sun.n ↔ earth.n
75            steel.n ↔ iron.n                   50            west.n ↔ east.n
75            job.n ↔ employment.n               44            weekend.n ↔ sunday.n
75            room.n ↔ bedroom.n                 41            wine.n ↔ drink.v
72            aunt.n ↔ uncle.n                   38            south.n ↔ north.n
72            printer.n ↔ print.v                38            morning.n ↔ afternoon.n
60            love.v ↔ love.n                    38            engine.n ↔ car.n
Current and Future Work
[Figure: three word visualizations whose vocabularies suggest communication networks (e.g. wireless, routing, traffic, protocol), machine learning (e.g. neural, bayesian, kernel, inference), and programming languages (e.g. programming, language, verification, garbage)]
1. Visualization
2. Better inference of parameters
3. Extension to other domains
Thanks for listening!

More Related Content

PDF
Dynamic Topic Modeling via Non-negative Matrix Factorization (Dr. Derek Greene)
PPTX
Analysis of Metadata and Topic Modeling for
POTX
LDA Beginner's Tutorial
PDF
Topic Modeling for Learning Analytics Researchers LAK15 Tutorial
PDF
Towards Automated Classification of Discussion Transcripts: A Cognitive Prese...
PDF
Automated Cognitive Presence Detection in Online Discussion Transcripts
PDF
Automated Content Analysis of Discussion Transcripts
PDF
Automated content analysis of cognitive presence: Improving the quality of in...
Dynamic Topic Modeling via Non-negative Matrix Factorization (Dr. Derek Greene)
Analysis of Metadata and Topic Modeling for
LDA Beginner's Tutorial
Topic Modeling for Learning Analytics Researchers LAK15 Tutorial
Towards Automated Classification of Discussion Transcripts: A Cognitive Prese...
Automated Cognitive Presence Detection in Online Discussion Transcripts
Automated Content Analysis of Discussion Transcripts
Automated content analysis of cognitive presence: Improving the quality of in...

What's hot (20)

PPTX
An Evolution of Deep Learning Models for AI2 Reasoning Challenge
PPT
Opinion mining for social media and news items in Romanian
PDF
Practical machine learning - Part 1
PPTX
Question answering
PDF
Question Answering - Application and Challenges
PDF
Deep learning Type Inference for Dynamic Programming Languages
PDF
Open domain Question Answering System - Research project in NLP
PPTX
PDF
ESWC 2014 Tutorial part 3
PPT
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
PPTX
From TREC to Watson: is open domain question answering a solved problem?
PDF
LinkedUp kickoff meeting session 4
PPTX
Keynote reusability measurement and social community analysis from mooc con...
PDF
Research on Recommender Systems: Beyond Ratings and Lists
PPT
Fuschi current Research and Developments
PDF
Learning Analytics for Communities of Inquiry
PDF
A Novel Model of Cognitive Presence Assessment Using Automated Learning Analy...
PPTX
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
PDF
Challenges in transfer learning in nlp
DOC
Machine learning and Bioinformatics (IMC007)
An Evolution of Deep Learning Models for AI2 Reasoning Challenge
Opinion mining for social media and news items in Romanian
Practical machine learning - Part 1
Question answering
Question Answering - Application and Challenges
Deep learning Type Inference for Dynamic Programming Languages
Open domain Question Answering System - Research project in NLP
ESWC 2014 Tutorial part 3
WP3 Further specification of Functionality and Interoperability - Gradmann / ...
From TREC to Watson: is open domain question answering a solved problem?
LinkedUp kickoff meeting session 4
Keynote reusability measurement and social community analysis from mooc con...
Research on Recommender Systems: Beyond Ratings and Lists
Fuschi current Research and Developments
Learning Analytics for Communities of Inquiry
A Novel Model of Cognitive Presence Assessment Using Automated Learning Analy...
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
Challenges in transfer learning in nlp
Machine learning and Bioinformatics (IMC007)
Ad

Viewers also liked (6)

PPTX
Topic modeling using big data analytics
PPTX
Project Proposal Topics Modeling (Ir)
PDF
Tutorial on Relationship Mining In Online Social Networks
PDF
Using Machine Learning to aid Journalism at the New York Times
PDF
Introduction à l'analyse de réseaux avec R
PDF
Optimal Transport between Copulas for Clustering Time Series
Topic modeling using big data analytics
Project Proposal Topics Modeling (Ir)
Tutorial on Relationship Mining In Online Social Networks
Using Machine Learning to aid Journalism at the New York Times
Introduction à l'analyse de réseaux avec R
Optimal Transport between Copulas for Clustering Time Series
Ad

Similar to Admixture of Poisson MRFs: A New Topic Model with Word Dependencies (20)

PDF
Make our Scientific Datasets Accessible and Interoperable on the Web
PPTX
Linked Open Data (LOD) part 1
PPTX
Neural Text Embeddings for Information Retrieval (WSDM 2017)
PDF
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
PDF
Mining Users Rare Sequential Topic Patterns from Tweets based on Topic Extrac...
PPTX
Social Phrases Having Impact in Altmetrics - SOPHIA
PDF
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
PDF
Survey On Building A Database Driven Reverse Dictionary
PPT
Eprints Application Profile
PPT
Names project update
PDF
Toxic Comment Classification
PDF
Linked sensor data
PDF
Query-Focused Extractive Text Summarization for Multi-Topic Document
PDF
Profile-based Dataset Recommendation for RDF Data Linking
PDF
Mining knowledge graphs to map heterogeneous relations between the internet o...
PDF
The RDFIndex-MTSR 2013
PPT
Eprints Special Session - DC-2006, Mexico
PPT
Inter-university Upper atmosphere Global Observation NETwork (IUGONET)
PDF
Information_Retrieval_Models_Nfaoui_El_Habib
PPTX
A pragmatic view on Semantic Technologies
Make our Scientific Datasets Accessible and Interoperable on the Web
Linked Open Data (LOD) part 1
Neural Text Embeddings for Information Retrieval (WSDM 2017)
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
Mining Users Rare Sequential Topic Patterns from Tweets based on Topic Extrac...
Social Phrases Having Impact in Altmetrics - SOPHIA
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Survey On Building A Database Driven Reverse Dictionary
Eprints Application Profile
Names project update
Toxic Comment Classification
Linked sensor data
Query-Focused Extractive Text Summarization for Multi-Topic Document
Profile-based Dataset Recommendation for RDF Data Linking
Mining knowledge graphs to map heterogeneous relations between the internet o...
The RDFIndex-MTSR 2013
Eprints Special Session - DC-2006, Mexico
Inter-university Upper atmosphere Global Observation NETwork (IUGONET)
Information_Retrieval_Models_Nfaoui_El_Habib
A pragmatic view on Semantic Technologies

Recently uploaded (20)

PPTX
Acceptance and paychological effects of mandatory extra coach I classes.pptx
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PDF
Introduction to Data Science and Data Analysis
PDF
Votre score augmente si vous choisissez une catégorie et que vous rédigez une...
PDF
Global Data and Analytics Market Outlook Report
PDF
Jean-Georges Perrin - Spark in Action, Second Edition (2020, Manning Publicat...
PPTX
IMPACT OF LANDSLIDE.....................
PPTX
Managing Community Partner Relationships
PDF
REAL ILLUMINATI AGENT IN KAMPALA UGANDA CALL ON+256765750853/0705037305
PPTX
QUANTUM_COMPUTING_AND_ITS_POTENTIAL_APPLICATIONS[2].pptx
PDF
Systems Analysis and Design, 12th Edition by Scott Tilley Test Bank.pdf
PDF
Business Analytics and business intelligence.pdf
PPT
lectureusjsjdhdsjjshdshshddhdhddhhd1.ppt
PPTX
sac 451hinhgsgshssjsjsjheegdggeegegdggddgeg.pptx
PPTX
CYBER SECURITY the Next Warefare Tactics
PPT
ISS -ESG Data flows What is ESG and HowHow
PPTX
Leprosy and NLEP programme community medicine
PDF
[EN] Industrial Machine Downtime Prediction
PPTX
STERILIZATION AND DISINFECTION-1.ppthhhbx
DOCX
Factor Analysis Word Document Presentation
Acceptance and paychological effects of mandatory extra coach I classes.pptx
IBA_Chapter_11_Slides_Final_Accessible.pptx
Introduction to Data Science and Data Analysis
Votre score augmente si vous choisissez une catégorie et que vous rédigez une...
Global Data and Analytics Market Outlook Report
Jean-Georges Perrin - Spark in Action, Second Edition (2020, Manning Publicat...
IMPACT OF LANDSLIDE.....................
Managing Community Partner Relationships
REAL ILLUMINATI AGENT IN KAMPALA UGANDA CALL ON+256765750853/0705037305
QUANTUM_COMPUTING_AND_ITS_POTENTIAL_APPLICATIONS[2].pptx
Systems Analysis and Design, 12th Edition by Scott Tilley Test Bank.pdf
Business Analytics and business intelligence.pdf
lectureusjsjdhdsjjshdshshddhdhddhhd1.ppt
sac 451hinhgsgshssjsjsjheegdggeegegdggddgeg.pptx
CYBER SECURITY the Next Warefare Tactics
ISS -ESG Data flows What is ESG and HowHow
Leprosy and NLEP programme community medicine
[EN] Industrial Machine Downtime Prediction
STERILIZATION AND DISINFECTION-1.ppthhhbx
Factor Analysis Word Document Presentation

Admixture of Poisson MRFs: A New Topic Model with Word Dependencies

  • 1. Admixture of Poisson MRFs: A New Topic Model with Word Dependencies David Inouye*, Pradeep Ravikumar, Inderjit Dhillon April 30, 2015 * Presenter David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 2. Analyzing Large Collections of Documents Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Digital Representation Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 2 Model Computation 3 Summary 4 Collection of Documents 1 ) Examples: 1. Research papers 2. News articles 3. Twitter posts 1 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 3. Analyzing Large Collections of Documents Model Computation 3 Summary 4 Collection of Documents 1 ) Examples: 1. Research papers 2. News articles 3. Twitter posts 1 Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Digital Representation Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 2 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 4. Analyzing Large Collections of Documents Summary 4 Collection of Documents 1 ) Examples: 1. Research papers 2. News articles 3. Twitter posts 1 Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Digital Representation Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 2 Model Computation 3 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 5. Analyzing Large Collections of Documents Collection of Documents 1 ) Examples: 1. Research papers 2. News articles 3. Twitter posts 1 Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Digital Representation Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 2 Model Computation 3 Summary 4 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 6. Research Paper Example - Top Words Top Words (Frequency) Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Collection of Documents Digital Representation Model Computation Summary Examples: 1. Research papers 2. News articles 3. Twitter posts Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 1 Titles of Research Papers: 1. Machine Learning (ICML, NIPS) 2. Communication Networks (INFOCOM) 3. Programming Languages (PLDI, CAV, POPL, OOPSLA) 2 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 7. Research Paper Example - Topic Modeling Topic Modeling Doc 1 Doc 2 Doc 3 Doc 4 ... “networks” “learning” “program m ing” … Collection of Documents Digital Representation Model Computation Summary Examples: 1. Research papers 2. News articles 3. Twitter posts Bag of Words Matrix: - Removes order and syntax information - Unrealistic but powerful 1 Titles of Research Papers: 1. Machine Learning (ICML, NIPS) 2. Communication Networks (INFOCOM) 3. Programming Languages (PLDI, CAV, POPL, OOPSLA) 2 David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 8. Applications for Topic Modeling Applications 1. Summarize/Visualize [Hall et al. 2008] 2. Word sense disambiguation [Boyd-Graber et al. 2007] 3. Multi-lingual understanding [Mimno et al. 2009] 4. Information retrieval [Wei & Croft 2006] Different domains 1. Genetics [Pritchard et al. 2000 (14,000 citations)] 2. Computer vision [Li et al. 2010] 3. Social networks [Airoldi et al. 2008] 4. Social science surveys [Roberts et al. 2014] 5. Social E-commerce [Hu et al. 2014] David Inouye*, Pradeep Ravikumar, Inderjit Dhillon Admixture of Poisson MRFs
  • 9-12. Brief History - Latent Semantic Analysis (LSA). (1) Digital Representation: the $n \times p$ matrix of documents (Doc 1, Doc 2, Doc 3, Doc 4, ...) against words (“networks”, “learning”, “programming”, …). (2) Singular Value Decomposition: $U \Sigma V^T$ with $U$ of size $n \times k$, $\Sigma$ of size $k \times k$, and $V^T$ of size $k \times p$. (3) Low Dimensional Document Representation. (4) “Latent Topic”. Positive and negative values are difficult to interpret.
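A small numerical sketch of the LSA step, assuming NumPy is available. The toy count matrix and the choice k = 2 are made up for illustration; only `numpy.linalg.svd` and basic array operations are used.

```python
import numpy as np

# Toy n x p count matrix (hypothetical): 4 documents, 3 words.
X = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 2.0],
])
k = 2  # number of latent dimensions kept (assumed for the example)

# Thin SVD: X = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the top-k singular triplets.
doc_repr = U[:, :k] * s[:k]   # n x k low-dimensional document representation
topics = Vt[:k, :]            # k x p "latent topics"
X_k = doc_repr @ topics       # rank-k approximation of X

# Although X is nonnegative, the factors contain negative entries,
# which is what makes the latent dimensions hard to read as topics.
has_negative = bool((topics < 0).any() or (doc_repr < 0).any())
```

The approximation error of `X_k` equals the norm of the discarded singular values, so the truncation is the best rank-k fit even though the individual factors are not interpretable as word weights.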
  • 13. Brief History - Probabilistic Topic Models. (1) Digital Representation: the $n \times p$ matrix of documents against words. (2) Probabilistic Topic Models: the matrix is related, through a probability model, to (3) Topic Weights per Document ($n \times k$) and (4) “Topics” (word weights, $k \times p$). Probability vectors are much easier to interpret. LDA added Bayesian priors for regularization.
  • 14. Comparison of 2D Projections. SVD dimensions are difficult to interpret. APM has a smoother distribution than LDA. [Figure: 2D projections under SVD, LDA, and APM; legend: comm-net.6978, mach-learn.8925, prog-lang.6618.]
  • 15-16. Brief History - Extensions/Variants. 1. Add time information [Blei & Lafferty 2006] 2. Add author information [Rosen-Zvi et al. 2004] 3. Add document category information [McAuliffe & Blei 2008] 4. Automatically discover number of topics [Teh et al. 2006] 5. Model correlation between topics [Blei & Lafferty 2006] 6. ... 7. ... Previous models: topics only have weights for single words. Our model: topics have weights for pairs of words.
  • 17-20. Interpreting Topics. [Figures, built up across the slides: LDA 3 topics, LDA 6 topics, LDA 30 topics, APM 3 topics.]
  • 21-22. Overview of APM. [Diagram: Admixture of Poisson MRFs (APM), with labels Mixture, Admixture, Multinomial, LDA, Gaussian MRF, Poisson MRF.]
  • 23. Mixtures. Multiple sub-populations. The sub-populations are usually unknown a priori. Each individual from the population comes from exactly one sub-population. Figure source: Kalai, Moitra, and Valiant. Disentangling Gaussians. Communications of the ACM. 2012.
  • 24. Admixtures. Mixtures: each draw comes from a single component distribution (top figure). Admixtures: each draw comes from a distribution whose parameters are a convex combination of the component parameters (bottom figure). [Figure: axes x1 and x2; labels "Documents", Mixture Components, Dense "Topic", Sparse "Document", Dense "Document", Sparse "Topic".]
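The difference between the two constructions shows up already at the level of parameters. The following is a minimal sketch with hypothetical helper names (`mixture_params`, `admixture_params`) and made-up two-dimensional component parameters; it does not sample observations, only the parameter vector that a draw would use.

```python
import random


def mixture_params(components, weights, rng):
    """Mixture: a draw uses exactly one component's parameter vector."""
    j = rng.choices(range(len(components)), weights=weights, k=1)[0]
    return list(components[j])


def admixture_params(components, weights):
    """Admixture: a draw uses a convex combination of all component parameters."""
    assert min(weights) >= 0 and abs(sum(weights) - 1.0) < 1e-9
    dim = len(components[0])
    return [sum(w * c[d] for w, c in zip(weights, components)) for d in range(dim)]


# Two hypothetical components in a 2-D parameter space.
components = [[8.0, 1.0], [1.0, 8.0]]
w = [0.25, 0.75]

rng = random.Random(0)
mix = mixture_params(components, w, rng)   # equals one of the two components
admix = admixture_params(components, w)    # lies between the two components
```

A mixture can only ever return one of the listed component vectors, while an admixture can land anywhere in their convex hull, depending on the weights.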
  • 25. Overview of APM. [Diagram: Admixture of Poisson MRFs (APM), with labels Mixture, Admixture, Multinomial, LDA, Gaussian MRF, Poisson MRF.]
  • 26. Gaussian MRFs. Allow for dependencies between variables. What if the data dimension is large? If the dimension is 1000, there are $1000^2 / 2 = 500{,}000$ parameters. Assume some conditional independence between variables.
  • 27. PMRF examples. 1. Each conditional (“slice”) of a PMRF is a 1-D Poisson. 2. Distinct from Gaussian MRF. 3. Positive dependencies can model word co-occurrence. [Figure: three panels, Independent PMRF, Positive Dependency PMRF, and Negative Dependency PMRF, each plotting Count of Word 1 against Count of Word 2 on axes from 0 to 8.]
  • 28. Poisson MRFs [Yang et al., 2012]. Given the node conditionals P(A | B, C), P(B | A, C), and P(C | A, B), what is the joint P(A, B, C)? If we assume the node conditional distributions are Poisson, does there exist a joint MRF distribution that has these conditionals? Poisson MRF joint distribution: $\Pr_{\mathrm{PMRF}}(x \mid \theta, \Theta) \propto \exp\{\theta^T x + x^T \Theta x - \sum_{s=1}^{p} \ln(x_s!)\}$. Node conditionals are 1-D Poissons: $\Pr(x_s \mid x_{-s}, \theta_s, \Theta_s) \propto \exp\{\eta_s x_s - \ln(x_s!)\}$, where $\eta_s = \theta_s + x^T \Theta_s$.
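The two formulas on this slide can be transcribed directly. This is a sketch under stated assumptions: the function names are hypothetical, Θ is taken to have a zero diagonal so that x_s does not enter its own natural parameter, and the joint is left unnormalized because the slide gives it only up to proportionality.

```python
import math


def pmrf_unnormalized_log_prob(x, theta, Theta):
    """theta^T x + x^T Theta x - sum_s ln(x_s!), as written on the slide."""
    p = len(x)
    linear = sum(theta[s] * x[s] for s in range(p))
    quadratic = sum(x[s] * Theta[s][t] * x[t] for s in range(p) for t in range(p))
    log_factorials = sum(math.lgamma(x[s] + 1) for s in range(p))
    return linear + quadratic - log_factorials


def node_conditional_pmf(s, x, theta, Theta, value):
    """Poisson pmf of x_s = value given the other counts.

    The natural parameter is eta_s = theta_s + x^T Theta_s (column s of Theta),
    so the Poisson rate is exp(eta_s). Assumes Theta[s][s] == 0.
    """
    eta = theta[s] + sum(x[t] * Theta[t][s] for t in range(len(x)))
    rate = math.exp(eta)
    return math.exp(value * eta - rate - math.lgamma(value + 1))


# Hypothetical two-word model with a positive dependency between the words.
theta = [0.0, 0.0]
Theta = [[0.0, 0.1], [0.1, 0.0]]
x = [3, 0]

# Conditional distribution of word 2's count given word 1 appeared 3 times.
pmf = [node_conditional_pmf(1, x, theta, Theta, v) for v in range(60)]
```

With a positive off-diagonal entry, a larger count for word 1 raises the conditional rate of word 2, which is how positive dependencies model co-occurrence.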
  • 29. Overview of APM. [Diagram: Admixture of Poisson MRFs (APM), with labels Mixture, Admixture, Multinomial, LDA, Gaussian MRF, Poisson MRF.]
  • 30. Admixture of Poisson MRFs (APM) [Inouye et al. 2014]. APM replaces the standard Multinomial with a Poisson MRF: $\Pr_{\mathrm{APM}}(x, w, \theta^{1 \ldots k}, \Theta^{1 \ldots k}) = \Pr_{\mathrm{PMRF}}\big(x \mid \bar{\theta} = \sum_{j=1}^{k} w_j \theta^j,\ \bar{\Theta} = \sum_{j=1}^{k} w_j \Theta^j\big)\, \Pr_{\mathrm{Dir}}(w)\, \prod_{j=1}^{k} \Pr(\theta^j, \Theta^j)$.
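The document-specific parameters in this formula are weighted sums of the k topic parameters, with the weights drawn from a Dirichlet. A minimal sketch, assuming a symmetric Dirichlet with made-up concentration values and toy topic parameters; `sample_dirichlet` and `mix_parameters` are hypothetical helper names, and the Dirichlet draw uses the standard normalized-Gamma construction.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw w ~ Dir(alpha) by normalizing independent Gamma(alpha_j, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mix_parameters(w, thetas, Thetas):
    """Return (theta_bar, Theta_bar) = (sum_j w_j theta^j, sum_j w_j Theta^j)."""
    k, p = len(w), len(thetas[0])
    theta_bar = [sum(w[j] * thetas[j][s] for j in range(k)) for s in range(p)]
    Theta_bar = [
        [sum(w[j] * Thetas[j][s][t] for j in range(k)) for t in range(p)]
        for s in range(p)
    ]
    return theta_bar, Theta_bar


# Two hypothetical topics over p = 2 words.
thetas = [[1.0, 2.0], [3.0, 4.0]]
Thetas = [
    [[0.0, 0.2], [0.2, 0.0]],
    [[0.0, -0.2], [-0.2, 0.0]],
]

rng = random.Random(0)
w = sample_dirichlet([1.0, 1.0], rng)          # admixture weights for one document
theta_bar, Theta_bar = mix_parameters(w, thetas, Thetas)
```

The resulting pair (theta_bar, Theta_bar) is what would be plugged into the Poisson MRF for that document; topics with opposite-sign dependencies can partially cancel in the mixed matrix.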
  • 31. APM Algorithm. 1. Optimization problem is not convex. 2. Want to exploit parallel computing. 3. Large optimization problem: APM has $O(kp^2)$ parameters versus $O(kp)$ for LDA. LDA(k = 5, p = 1000) ⇒ 5,000 parameters. APM(k = 5, p = 1000) ⇒ 5,000,000 parameters. APM(k = 5, NNZ(Θ) = 10 per word) ⇒ 50,000 free parameters.
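The three parameter counts on this slide follow from simple products. The sketch below reproduces the arithmetic; the function names are hypothetical, the counts are leading-order only (matching the slide's figures), and the sparse case assumes p = 1000 as in the other two lines.

```python
def lda_param_count(k, p):
    """k topics, one weight per word."""
    return k * p


def apm_dense_param_count(k, p):
    """k topics, one weight per word pair (leading-order count)."""
    return k * p * p


def apm_sparse_param_count(k, p, nnz_per_word):
    """k topics, only nnz_per_word nonzero pair weights per word."""
    return k * p * nnz_per_word


lda = lda_param_count(5, 1000)
apm_dense = apm_dense_param_count(5, 1000)
apm_sparse = apm_sparse_param_count(5, 1000, 10)
```

Sparsity brings the APM count to within a factor of ten of LDA's in this example, which is why the regularization on Θ matters computationally.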
  • 32. Parallel Alternating Newton-like Algorithm. 1. Split the algorithm into alternating convex problems: $\arg\min_{\Phi_1, \Phi_2, \cdots, \Phi_p} -\frac{1}{n} \sum_{s=1}^{p} \big[ \mathrm{tr}(\Psi_s \Phi_s) - \sum_{i=1}^{n} \exp(z_i^T \Phi_s w_i) \big] + \sum_{s=1}^{p} \lambda \|\mathrm{vec}(\Phi_s)_{\setminus 1}\|_1$ and $\arg\min_{w_1, w_2, \cdots, w_n \in \Delta^k} -\frac{1}{n} \sum_{i=1}^{n} \big[ \psi_i^T w_i - \sum_{s=1}^{p} \exp(z_i^T \Phi_s w_i) \big]$, where $z_i = [1 \;\; x_i^T]^T$, $\phi_s^j = [\theta_s^j \;\; (\Theta_s^j)^T]^T$, $\Phi_s = [\phi_s^1 \;\; \phi_s^2 \; \cdots \; \phi_s^k]$, $\Psi_s = f(X, W)$, and $\psi_i = f(X, \Phi_{1 \ldots k})$. 2. Subproblems in summation can be computed in parallel. 3. Use fast Newton-like optimization method [Hsieh et al. 2014].
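To make the first subproblem concrete, here is a sketch that evaluates the objective for a single word s. It rests on two assumptions that the slide does not spell out: the trace term tr(Ψ_s Φ_s) is taken to equal Σ_i x_is · z_i^T Φ_s w_i (the Poisson log-likelihood term, which corresponds to Ψ_s = Σ_i x_is w_i z_i^T), and the L1 penalty is taken to skip the first row of Φ_s (the θ entries). The function names and toy data are hypothetical, and this only evaluates the objective; it is not the Newton-like solver.

```python
import math


def eta(z, Phi, w):
    """z^T Phi w for z of length p+1, Phi of shape (p+1) x k, w of length k."""
    return sum(z[a] * Phi[a][j] * w[j] for a in range(len(z)) for j in range(len(w)))


def node_objective(s, X, W, Phi, lam):
    """Regularized negative Poisson pseudo-log-likelihood for word s.

    X: n documents of p counts, W: n weight vectors of length k,
    Phi: (p+1) x k matrix for word s, lam: L1 regularization weight.
    The ln(x!) terms are dropped because they do not depend on Phi.
    """
    n = len(X)
    total = 0.0
    for x, w in zip(X, W):
        z = [1.0] + [float(v) for v in x]   # z_i = [1, x_i^T]^T
        e = eta(z, Phi, w)
        total += x[s] * e - math.exp(e)
    # Assumed: penalize only the pairwise rows, not the first (theta) row.
    l1 = sum(abs(v) for row in Phi[1:] for v in row)
    return -total / n + lam * l1


# Toy data: n = 2 documents, p = 2 words, k = 2 topics.
X = [[1, 0], [2, 1]]
W = [[0.5, 0.5], [1.0, 0.0]]
Phi_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
Phi_dep = [[0.1, 0.2], [0.0, 0.0], [0.5, -0.5]]

baseline = node_objective(0, X, W, Phi_zero, lam=0.1)
```

Because each word s has its own Φ_s and its own term in the sum, these objectives can be evaluated and minimized independently, which is the source of the parallelism noted in point 2.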
  • 33. Timing Results on Wikipedia Dataset (k = 5, λ = 0.5). [Bar chart: APM Training Time on Wikipedia Dataset; y-axis Time (hrs), 0 to 4; series 1st Iter. and Avg. Next 3 Iter.; settings n = 20,000, p = 5,000, # of Words = 50M; n = 100,000, p = 5,000, # of Words = 133M; n = 20,000, p = 10,000, # of Words = 57M; bar labels as extracted: 1, 3.1, 3.4, 0.6, 2.2, 2.2.] Algorithm scales approximately as $O(np^2)$.
  • 34. Parallel Speedup. [Line plot: Parallel Speedup on BNC Dataset; x-axis # of MATLAB Workers, 0 to 20; y-axis Speedup, 0 to 20; curves Perfect Speedup and Actual Speedup.] BNC dataset has n = 4049 and p = 1646. Speedup could be O(min(n, p)) on a distributed system.
  • 35. Evaluating APM: No Direct Evaluation of Edge Parameters. Previous metrics evaluate the similarity of word pairs [Newman et al. 2010, Mimno et al. 2011, Aletras and Court 2013]: an averaged statistic is computed over all $\binom{10}{2}$ pairs of top words, and the metrics attempt to correlate with human judgment. Unlike previous topic models, APM explicitly models dependencies between words. How can we semantically evaluate the parameters for these dependencies?
  • 36. Evocation [Boyd-Graber et al. 2006]. Evocation denotes the idea of which words “evoke” or “bring to mind” other words. Different types of evocation: 1. Rose - Flower (example) 2. Brave - Noble (kind) 3. Yell - Talk (manner) 4. Eggs - Bacon (co-occurrence) 5. Snore - Sleep (setting) 6. Wet - Desert (antonymy) 7. Work - Lazy (exclusivity) 8. Banana - Kiwi (likeness). Distinct from word similarity or synonymy. Human scores were collected for approximately 15% of word pairs.
  • 37. Evocation Metric Illustration. Rank word pairs by model weights M, then sum the human scores H of the top-m pairs. [Figure: tables with columns Word Pair, H, and M over the pairs w1 ↔ w2, w1 ↔ w3, w1 ↔ w4, w2 ↔ w3, w2 ↔ w4, w3 ↔ w4, shown before ranking, after ranking by M, and after keeping the top pairs.]
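The ranking-and-summing procedure on this slide can be sketched as follows. The function name `evocation_score`, the dictionary layout, and the toy weights and scores are hypothetical; treating word pairs without a human score as contributing 0 is an assumption of this sketch, not something the slide states.

```python
def evocation_score(model_weights, human_scores, m):
    """Rank word pairs by model weight M, then sum human scores H of the top m.

    model_weights: dict mapping a word pair (tuple) to the model's weight M.
    human_scores: dict mapping a word pair to its human evocation score H.
    Pairs with no human score contribute 0 (assumption).
    """
    ranked = sorted(model_weights, key=model_weights.get, reverse=True)
    return sum(human_scores.get(pair, 0) for pair in ranked[:m])


# Toy example with three word pairs.
M = {("w1", "w2"): 0.1, ("w1", "w3"): 0.9, ("w2", "w3"): 0.5}
H = {("w1", "w3"): 75, ("w1", "w2"): 10}

score = evocation_score(M, H, m=2)
```

A model scores well when the pairs it weights most heavily are also the pairs humans rate as strongly evocative, so the metric evaluates the edge parameters directly rather than per-word weights.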
  • 38. Models for Comparison. APM: Admixture of Poisson MRFs. APM-LowReg: Very small regularization parameter. APM-HeldOut: Chooses λ from held-out documents. CTM: Correlated Topic Models. HDP: Hierarchical Dirichlet Process (Non-parametric). LDA: Latent Dirichlet Allocation. RSM: Replicated Softmax (Undirected Topic Model). RND: Random baseline.
  • 39. Evocation Metric Results. [Bar charts: Evocation (m = 50), y-axis 0 to 1600; panels Evoc-1 (Avg. Evoc. of Topics) and Evoc-2 (Evoc. of Avg. Topic), each at k = 1, 3, 5, 10, 25, 50; models APM, APM-LowReg, APM-HeldOut, CTM, HDP, LDA, RSM, RND.]
  • 40. Evocation Metric Top Word Pairs.
    Table: Top 20 Word Pairs for Best LDA
      Human Score  Word Pair                         Human Score  Word Pair
      100          run.v ↔ car.n                     38           woman.n ↔ man.n
      82           teach.v ↔ school.n                38           give.v ↔ church.n
      69           school.n ↔ class.n                38           wife.n ↔ man.n
      63           van.n ↔ car.n                     38           engine.n ↔ car.n
      51           hour.n ↔ day.n                    35           publish.v ↔ book.n
      50           teach.v ↔ student.n               32           west.n ↔ state.n
      44           house.n ↔ government.n            32           year.n ↔ day.n
      44           week.n ↔ day.n                    25           member.n ↔ give.v
      38           university.n ↔ institution.n      25           dog.n ↔ animal.n
      38           state.n ↔ government.n            25           seat.n ↔ car.n
    Table: Top 20 Word Pairs for Best APM
      Human Score  Word Pair                         Human Score  Word Pair
      100          telephone.n ↔ call.n              57           question.n ↔ answer.n
      97           husband.n ↔ wife.n                57           prison.n ↔ cell.n
      82           residential.a ↔ home.n            51           mother.n ↔ baby.n
      76           politics.n ↔ political.a          50           sun.n ↔ earth.n
      75           steel.n ↔ iron.n                  50           west.n ↔ east.n
      75           job.n ↔ employment.n              44           weekend.n ↔ sunday.n
      75           room.n ↔ bedroom.n                41           wine.n ↔ drink.v
      72           aunt.n ↔ uncle.n                  38           south.n ↔ north.n
      72           printer.n ↔ print.v               38           morning.n ↔ afternoon.n
      60           love.v ↔ love.n                   38           engine.n ↔ car.n
  • 41. Current and Future Work. 1. Visualization 2. Better inference of parameters 3. Extension to other domains. [Figure: three word visualizations over the vocabulary of the research paper titles. Words in the first include routing, wireless, packet, multicast, congestion; in the second, neural, bayesian, kernel, inference, clustering; in the third, programming, language, verification, garbage, collection.]
  • 42. Thanks for listening! [Diagram: Admixture of Poisson MRFs (APM), with labels Mixture, Admixture, Multinomial, LDA, Gaussian MRF, Poisson MRF.]