Topic modeling with Poisson factorization (2)
Tomonari Masada @ Nagasaki University
March 7, 2017
1 ELBO
To obtain update equations, we introduce auxiliary latent variables $Z$ [1, 2, 3, 4]. $z_{dkv}$ is the number of tokens of the $v$th word in the $d$th document that are assigned to the $k$th topic, and $z_{dkv}$ is sampled from the Poisson distribution $\text{Poisson}(\theta_{dk}\beta_{kv})$. The constraint $\sum_k z_{dkv} = n_{dv}$ can be expressed with the degenerate probability mass function $p(n_{dv}|z_{dv}) = \mathbb{I}(n_{dv} = \sum_k z_{dkv})$, where $\mathbb{I}(\cdot)$ denotes the indicator function. The full joint distribution is given below.
$$
\begin{aligned}
p(N, Z, \Theta, \beta, \phi; \alpha, s, r) &= p(\beta; \alpha)\, p(\phi; s, r)\, p(\Theta; s, \phi)\, p(N|Z)\, p(Z|\Theta, \beta) \\
&= \prod_k p(\beta_k; \alpha) \times \prod_k p(\phi_k; s, r) \times \prod_k p(\theta_k; s, \phi_k) \times \prod_d p(n_d|z_d)\, p(z_d|\theta_d, \beta) \\
&= \prod_k \frac{\Gamma(V\alpha)}{\Gamma(\alpha)^V}\prod_v \beta_{kv}^{\alpha-1} \times \prod_k \frac{r^s}{\Gamma(s)}\phi_k^{s-1} e^{-r\phi_k} \times \prod_k\prod_d \frac{\phi_k^s}{\Gamma(s)}\theta_{dk}^{s-1} e^{-\phi_k\theta_{dk}} \\
&\quad \times \prod_d\prod_v \mathbb{I}\Big(n_{dv} = \sum_k z_{dkv}\Big)\prod_k \frac{(\theta_{dk}\beta_{kv})^{z_{dkv}}\, e^{-\theta_{dk}\beta_{kv}}}{z_{dkv}!} \qquad (1)
\end{aligned}
$$
The generative model is fully described in Eq. (1).
We adopt variational Bayesian inference for the posterior inference. The evidence lower bound (ELBO) for the model is obtained as follows:
$$
\begin{aligned}
\log p(N) &= \log \sum_Z\int p(N, Z, \Theta, \beta, \phi)\, d\Theta\, d\beta\, d\phi \\
&\geq \sum_Z\int q(Z)q(\Theta)q(\beta)q(\phi)\log p(N, Z, \Theta, \beta, \phi)\, d\Theta\, d\beta\, d\phi - \sum_Z\int q(Z)q(\Theta)q(\beta)q(\phi)\log\big[q(Z)q(\Theta)q(\beta)q(\phi)\big]\, d\Theta\, d\beta\, d\phi \\
&= \sum_Z\int q(Z)q(\Theta)q(\beta)\log p(Z|\Theta, \beta)\, d\Theta\, d\beta + \sum_Z q(Z)\log p(N|Z) \\
&\quad + \int q(\Theta)q(\phi)\log p(\Theta|\phi)\, d\Theta\, d\phi + \int q(\beta)\log p(\beta)\, d\beta + \int q(\phi)\log p(\phi)\, d\phi \\
&\quad - \sum_Z q(Z)\log q(Z) - \int q(\Theta)\log q(\Theta)\, d\Theta - \int q(\beta)\log q(\beta)\, d\beta - \int q(\phi)\log q(\phi)\, d\phi\,, \qquad (2)
\end{aligned}
$$
where the approximate posterior is assumed to factorize as $q(Z, \Theta, \beta, \phi) = q(Z)\,q(\Theta)\,q(\beta)\,q(\phi)$.
We assume the following forms for the factors of the approximate posterior (a code sketch of these parameters follows the list).
• $q(z_{dv})$ is the multinomial distribution $\text{Mult}(n_{dv}, \omega_{dv})$, where $\omega_{dkv}$ is the probability that a token of the $v$th word in the $d$th document is assigned to the $k$th topic among the $K$ topics. Note that $\sum_k z_{dkv} = n_{dv}$ holds.
• $q(\theta_{dk})$ is the gamma distribution $\text{Gamma}(a_{dk}, b_{dk})$.
• $q(\beta_k)$ is the asymmetric Dirichlet distribution $\text{Dirichlet}(\xi_k)$.
• $q(\phi_k)$ is the gamma distribution $\text{Gamma}(\mu_k, \nu_k)$.
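As a concrete illustration of these assumptions, the following is a minimal NumPy sketch of how the variational parameters could be laid out and initialized. All array names, shapes, and initial values here are my own illustrative choices, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes and hyperparameter values.
D, V, K = 100, 500, 10        # documents, vocabulary size, topics
s, r, alpha = 0.1, 0.1, 0.1   # model hyperparameters

# q(theta_dk) = Gamma(a_dk, b_dk): one shape/rate pair per (document, topic).
a = s + rng.gamma(1.0, 0.1, size=(D, K))
b = np.ones((D, K))

# q(beta_k) = Dirichlet(xi_k): one parameter vector over the vocabulary per topic.
xi = alpha + rng.gamma(1.0, 0.1, size=(K, V))

# q(phi_k) = Gamma(mu_k, nu_k): one shape/rate pair per topic.
mu = s + rng.gamma(1.0, 0.1, size=K)
nu = np.full(K, r)

# omega_{dkv}: topic responsibilities, only meaningful where n_{dv} > 0.
# A dense array is used for clarity; a sparse layout is preferable in practice.
omega = np.full((D, K, V), 1.0 / K)
```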
2 Auxiliary latent variables
The update equation for $\omega_{dkv}$ can be obtained as follows. The second term of the ELBO in Eq. (2) can be rewritten as
$$
\sum_Z q(Z)\log p(N|Z) = \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\log \mathbb{I}\Big(n_{dv} = \sum_k z_{dkv}\Big) = 0\,, \qquad (3)
$$
because $\sum_k z_{dkv} = n_{dv}$ holds under $q(z_{dv})$. Even when $q(z_{dv})$ is not assumed to be multinomial, this term causes no problem as long as every sample from $q(z_{dv})$ satisfies $\sum_k z_{dkv} = n_{dv}$.
The sixth term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\begin{aligned}
\sum_Z q(Z)\log q(Z) &= \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\log\bigg[\frac{n_{dv}!}{\prod_k z_{dkv}!}\prod_k \omega_{dkv}^{z_{dkv}}\bigg] \\
&= \sum_d\sum_v \log(n_{dv}!) - \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) + \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k z_{dkv}\log\omega_{dkv} \\
&= \sum_d\sum_v \log(n_{dv}!) - \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) + \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\log\omega_{dkv} \qquad (4)
\end{aligned}
$$
The first term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\begin{aligned}
&\sum_Z\int q(Z)q(\Theta)q(\beta)\log p(Z|\Theta, \beta)\, d\Theta\, d\beta \\
&= \sum_Z\int q(Z)q(\Theta)q(\beta)\sum_d\sum_v\sum_k \log\big[(\theta_{dk}\beta_{kv})^{z_{dkv}}\, e^{-\theta_{dk}\beta_{kv}}\big]\, d\Theta\, d\beta - \sum_Z q(Z)\sum_d\sum_v\sum_k \log(z_{dkv}!) \\
&= \sum_d\sum_v\sum_k\sum_{z_{dv}} q(z_{dv})\, z_{dkv}\int q(\theta_{dk})\log\theta_{dk}\, d\theta_{dk} + \sum_d\sum_v\sum_k\sum_{z_{dv}} q(z_{dv})\, z_{dkv}\int q(\beta_k)\log\beta_{kv}\, d\beta_k \\
&\quad - \sum_d\sum_v\sum_k\int q(\beta_k)\bigg[\int q(\theta_{dk})\,\theta_{dk}\, d\theta_{dk}\bigg]\beta_{kv}\, d\beta_k - \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) \\
&= \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\big[\psi(a_{dk}) - \log b_{dk}\big] + \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \\
&\quad - \sum_d\sum_v\sum_k \frac{a_{dk}}{b_{dk}}\cdot\frac{\xi_{kv}}{\sum_v \xi_{kv}} - \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) \qquad (5)
\end{aligned}
$$
Therefore, the terms relevant to $\omega$ in the ELBO are summed up as follows:
$$
\begin{aligned}
\mathcal{L}(\omega) &= \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\big[\psi(a_{dk}) - \log b_{dk}\big] + \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \\
&\quad - \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) + \sum_d\sum_v\sum_{z_{dv}} q(z_{dv})\sum_k \log(z_{dkv}!) - \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\log\omega_{dkv} \\
&= \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\big[\psi(a_{dk}) - \log b_{dk}\big] + \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \\
&\quad - \sum_d\sum_v\sum_k n_{dv}\omega_{dkv}\log\omega_{dkv} \qquad (6)
\end{aligned}
$$
By introducing Lagrange multipliers for the constraints $\sum_k \omega_{dkv} = 1$, we obtain the update equation
$$
\omega_{dkv} \propto \frac{\exp\big(\psi(a_{dk})\big)}{b_{dk}}\cdot\frac{\exp\big(\psi(\xi_{kv})\big)}{\exp\big(\psi(\sum_v \xi_{kv})\big)}\,.
$$
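The following is a minimal sketch of this update in NumPy/SciPy, using the dense array layout assumed in the earlier sketch; the function name is my own, and $\psi$ is `scipy.special.digamma`.

```python
import numpy as np
from scipy.special import digamma

def update_omega(a, b, xi):
    """omega_dkv proportional to exp(psi(a_dk))/b_dk * exp(psi(xi_kv))/exp(psi(sum_v xi_kv))."""
    # E_q[log theta_dk] = psi(a_dk) - log b_dk, shape (D, K)
    e_log_theta = digamma(a) - np.log(b)
    # E_q[log beta_kv] = psi(xi_kv) - psi(sum_v xi_kv), shape (K, V)
    e_log_beta = digamma(xi) - digamma(xi.sum(axis=1, keepdims=True))
    # Unnormalized log omega_dkv, shape (D, K, V)
    log_omega = e_log_theta[:, :, None] + e_log_beta[None, :, :]
    # Normalize over the topic index k for every (d, v) pair.
    log_omega -= log_omega.max(axis=1, keepdims=True)   # numerical stability
    omega = np.exp(log_omega)
    omega /= omega.sum(axis=1, keepdims=True)
    return omega
```

Only the entries with $n_{dv} > 0$ enter the remaining updates, so in practice the computation would be restricted to the nonzero counts.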
3 Gamma posterior 1
The third term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\begin{aligned}
\int q(\phi_k)q(\theta_{dk})\log p(\theta_{dk}; s, \phi_k)\, d\theta_{dk}\, d\phi_k &= \int q(\phi_k)q(\theta_{dk})\log\bigg[\frac{\phi_k^s}{\Gamma(s)}\theta_{dk}^{s-1} e^{-\phi_k\theta_{dk}}\bigg]\, d\theta_{dk}\, d\phi_k \\
&= s\big[\psi(\mu_k) - \log\nu_k\big] - \log\Gamma(s) + (s-1)\big[\psi(a_{dk}) - \log b_{dk}\big] - \frac{a_{dk}}{b_{dk}}\cdot\frac{\mu_k}{\nu_k} \qquad (7)
\end{aligned}
$$
The seventh term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\int q(\theta_{dk})\log q(\theta_{dk})\, d\theta_{dk} = \int q(\theta_{dk})\log\bigg[\frac{b_{dk}^{a_{dk}}}{\Gamma(a_{dk})}\theta_{dk}^{a_{dk}-1} e^{-b_{dk}\theta_{dk}}\bigg]\, d\theta_{dk} = -a_{dk} + \log b_{dk} - \log\Gamma(a_{dk}) + (a_{dk}-1)\psi(a_{dk}) \qquad (8)
$$
Collecting the terms of the ELBO that depend on $a_{dk}$ and $b_{dk}$, we obtain
$$
\begin{aligned}
\mathcal{L}(a_{dk}, b_{dk}) &= \sum_v n_{dv}\omega_{dkv}\big[\psi(a_{dk}) - \log b_{dk}\big] - \sum_v \frac{a_{dk}}{b_{dk}}\cdot\frac{\xi_{kv}}{\sum_v \xi_{kv}} + (s-1)\big[\psi(a_{dk}) - \log b_{dk}\big] - \frac{a_{dk}}{b_{dk}}\cdot\frac{\mu_k}{\nu_k} \\
&\quad + a_{dk} - \log b_{dk} + \log\Gamma(a_{dk}) - (a_{dk}-1)\psi(a_{dk}) \\
&= \Big(\sum_v n_{dv}\omega_{dkv} - a_{dk} + s\Big)\psi(a_{dk}) + \log\Gamma(a_{dk}) + a_{dk} - \Big(\sum_v n_{dv}\omega_{dkv} + s\Big)\log b_{dk} - \frac{a_{dk}}{b_{dk}}\Big(\frac{\mu_k}{\nu_k} + 1\Big) \qquad (9)
\end{aligned}
$$
$$
\frac{\partial \mathcal{L}(a_{dk}, b_{dk})}{\partial a_{dk}} = \Big(\sum_v n_{dv}\omega_{dkv} - a_{dk} + s\Big)\psi'(a_{dk}) + 1 - \frac{1}{b_{dk}}\Big(\frac{\mu_k}{\nu_k} + 1\Big) \qquad (10)
$$
$$
\frac{\partial \mathcal{L}(a_{dk}, b_{dk})}{\partial b_{dk}} = -\Big(\sum_v n_{dv}\omega_{dkv} + s\Big)\frac{1}{b_{dk}} + \frac{a_{dk}}{b_{dk}^2}\Big(\frac{\mu_k}{\nu_k} + 1\Big) \qquad (11)
$$
Both $\partial\mathcal{L}(a_{dk}, b_{dk})/\partial a_{dk} = 0$ and $\partial\mathcal{L}(a_{dk}, b_{dk})/\partial b_{dk} = 0$ are satisfied when $a_{dk} = \sum_v n_{dv}\omega_{dkv} + s$ and $b_{dk} = \mu_k/\nu_k + 1$.
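Under the same assumed array layout, a sketch of this closed-form update (the function name and the `einsum` formulation are mine):

```python
import numpy as np

def update_theta_posterior(n, omega, mu, nu, s):
    """Update q(theta_dk) = Gamma(a_dk, b_dk).

    n     : (D, V) array of word counts n_dv
    omega : (D, K, V) array of responsibilities omega_dkv
    """
    # a_dk = sum_v n_dv * omega_dkv + s
    a = np.einsum('dv,dkv->dk', n, omega) + s
    # b_dk = mu_k / nu_k + 1, which is identical for every document d
    b = np.broadcast_to(mu / nu + 1.0, a.shape).copy()
    return a, b
```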
4 Gamma posterior 2
The fifth term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\int q(\phi_k)\log p(\phi_k; s, r)\, d\phi_k = \int q(\phi_k)\log\bigg[\frac{r^s}{\Gamma(s)}\phi_k^{s-1} e^{-r\phi_k}\bigg]\, d\phi_k = s\log r - \log\Gamma(s) + (s-1)\big[\psi(\mu_k) - \log\nu_k\big] - r\frac{\mu_k}{\nu_k} \qquad (12)
$$
The ninth term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\int q(\phi_k)\log q(\phi_k)\, d\phi_k = \int q(\phi_k)\log\bigg[\frac{\nu_k^{\mu_k}}{\Gamma(\mu_k)}\phi_k^{\mu_k-1} e^{-\nu_k\phi_k}\bigg]\, d\phi_k = -\mu_k + \log\nu_k - \log\Gamma(\mu_k) + (\mu_k-1)\psi(\mu_k) \qquad (13)
$$
Collecting the terms of the ELBO that depend on $\mu_k$ and $\nu_k$, where $D$ denotes the number of documents, we obtain
$$
\begin{aligned}
\mathcal{L}(\mu_k, \nu_k) &= Ds\big[\psi(\mu_k) - \log\nu_k\big] - \frac{\mu_k}{\nu_k}\sum_d \frac{a_{dk}}{b_{dk}} + (s-1)\big[\psi(\mu_k) - \log\nu_k\big] - r\frac{\mu_k}{\nu_k} \\
&\quad + \mu_k - \log\nu_k + \log\Gamma(\mu_k) - (\mu_k-1)\psi(\mu_k) \qquad (14)
\end{aligned}
$$
$$
\frac{\partial \mathcal{L}(\mu_k, \nu_k)}{\partial \mu_k} = (Ds + s - \mu_k)\psi'(\mu_k) - \frac{1}{\nu_k}\Big(\sum_d \frac{a_{dk}}{b_{dk}} + r\Big) + 1 \qquad (15)
$$
$$
\frac{\partial \mathcal{L}(\mu_k, \nu_k)}{\partial \nu_k} = -\frac{Ds + s}{\nu_k} + \frac{\mu_k}{\nu_k^2}\Big(\sum_d \frac{a_{dk}}{b_{dk}} + r\Big) \qquad (16)
$$
Both $\partial\mathcal{L}(\mu_k, \nu_k)/\partial\mu_k = 0$ and $\partial\mathcal{L}(\mu_k, \nu_k)/\partial\nu_k = 0$ are satisfied when $\mu_k = Ds + s$ and $\nu_k = \sum_d a_{dk}/b_{dk} + r$.
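Analogously, a sketch of this closed-form update for $q(\phi_k)$ under the same assumptions (the function name is mine):

```python
import numpy as np

def update_phi_posterior(a, b, s, r):
    """Update q(phi_k) = Gamma(mu_k, nu_k) from the current q(theta_dk) parameters."""
    D, K = a.shape
    mu = np.full(K, D * s + s)        # mu_k = D*s + s
    nu = (a / b).sum(axis=0) + r      # nu_k = sum_d a_dk / b_dk + r
    return mu, nu
```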
5 Dirichlet posterior
The fourth term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\int q(\beta_k)\log p(\beta_k)\, d\beta_k = \int q(\beta_k)\log\bigg[\frac{\Gamma(V\alpha)}{\Gamma(\alpha)^V}\prod_v \beta_{kv}^{\alpha-1}\bigg]\, d\beta_k = \log\Gamma(V\alpha) - V\log\Gamma(\alpha) + (\alpha-1)\sum_v\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \qquad (17)
$$
The eighth term of the ELBO in Eq. (2) can be rewritten as follows:
$$
\int q(\beta_k)\log q(\beta_k)\, d\beta_k = \int q(\beta_k)\log\bigg[\frac{\Gamma(\sum_v \xi_{kv})}{\prod_v \Gamma(\xi_{kv})}\prod_v \beta_{kv}^{\xi_{kv}-1}\bigg]\, d\beta_k = \log\Gamma\Big(\sum_v \xi_{kv}\Big) - \sum_v\log\Gamma(\xi_{kv}) + \sum_v(\xi_{kv}-1)\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \qquad (18)
$$
Collecting the terms of the ELBO that depend on $\xi_k$, we obtain
$$
\begin{aligned}
\mathcal{L}(\xi_k) &= \sum_v\sum_d n_{dv}\omega_{dkv}\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] + (\alpha-1)\sum_v\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \\
&\quad - \log\Gamma\Big(\sum_v \xi_{kv}\Big) + \sum_v \log\Gamma(\xi_{kv}) - \sum_v(\xi_{kv}-1)\Big[\psi(\xi_{kv}) - \psi\Big(\sum_v \xi_{kv}\Big)\Big] \qquad (19)
\end{aligned}
$$
Differentiating with respect to $\xi_{kv}$, the $\psi$ terms coming from the log-gamma functions cancel with those produced by the product rule, leaving
$$
\frac{\partial \mathcal{L}(\xi_k)}{\partial \xi_{kv}} = \sum_{v'}\Big(\sum_d n_{dv'}\omega_{dkv'} + \alpha - \xi_{kv'}\Big)\frac{\partial}{\partial \xi_{kv}}\Big[\psi(\xi_{kv'}) - \psi\Big(\sum_{v''} \xi_{kv''}\Big)\Big] \qquad (20)
$$
Every factor $\sum_d n_{dv'}\omega_{dkv'} + \alpha - \xi_{kv'}$ vanishes when $\xi_{kv} = \sum_d n_{dv}\omega_{dkv} + \alpha$ for all $v$, which sets all of these derivatives to zero. Therefore, we obtain the update equation $\xi_{kv} = \sum_d n_{dv}\omega_{dkv} + \alpha$.
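A sketch of this update under the same assumed layout (the function name is mine):

```python
import numpy as np

def update_beta_posterior(n, omega, alpha):
    """Update q(beta_k) = Dirichlet(xi_k).

    n     : (D, V) array of word counts
    omega : (D, K, V) array of responsibilities
    """
    # xi_kv = sum_d n_dv * omega_dkv + alpha
    return np.einsum('dv,dkv->kv', n, omega) + alpha
```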
6 Summary
$$
\omega_{dkv} \propto \frac{\exp\big(\psi(a_{dk})\big)}{b_{dk}}\cdot\frac{\exp\big(\psi(\xi_{kv})\big)}{\exp\big(\psi(\sum_v \xi_{kv})\big)} \qquad (21)
$$
$$
a_{dk} = \sum_v n_{dv}\omega_{dkv} + s \qquad (22)
$$
$$
b_{dk} = \frac{\mu_k}{\nu_k} + 1 \qquad (23)
$$
$$
\xi_{kv} = \sum_d n_{dv}\omega_{dkv} + \alpha \qquad (24)
$$
$$
\mu_k = Ds + s \qquad (25)
$$
$$
\nu_k = \sum_d \frac{a_{dk}}{b_{dk}} + r \qquad (26)
$$
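Putting Eqs. (21)–(26) together, one way to organize the coordinate-ascent iteration is sketched below. It relies on the hypothetical helper functions from the earlier code blocks; the update order shown is one reasonable choice, and in practice the loop would run until the ELBO or the parameters stop changing.

```python
def run_cavi(n, a, b, xi, mu, nu, s, r, alpha, n_iters=100):
    """Cycle through the closed-form updates of Eqs. (21)-(26)."""
    for _ in range(n_iters):
        omega = update_omega(a, b, xi)                        # Eq. (21)
        a, b = update_theta_posterior(n, omega, mu, nu, s)    # Eqs. (22)-(23)
        xi = update_beta_posterior(n, omega, alpha)           # Eq. (24)
        mu, nu = update_phi_posterior(a, b, s, r)             # Eqs. (25)-(26)
    return omega, a, b, xi, mu, nu
```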
References
[1] Allison June-Barlow Chaney, Hanna M. Wallach, Matthew Connelly, and David M. Blei. Detecting and characterizing events. EMNLP, pp. 1142–1152, 2016.
[2] David B. Dunson and Amy H. Herring. Bayesian latent variable models for mixed discrete
outcomes. Biostatistics, Vol. 6, No. 1, pp. 11–25, 2005.
[3] Prem Gopalan, Laurent Charlin, and David M. Blei. Content-based recommendations with
Poisson factorization. NIPS, pp. 3176–3184, 2014.
[4] Prem Gopalan, Jake M. Hofman, and David M. Blei. Scalable recommendation with hierarchical
Poisson factorization. UAI, pp. 326–335, 2015.