http://www.cs.umd.edu/linqs

Computing Marginal Distributions over Continuous Markov Networks for Statistical Relational Learning
Matthias Bröcheler and Lise Getoor (NIPS 2010)
Supported by NSF Grant No. 0937094


Problem?
Computing marginal distributions in constrained continuous MRFs (CCMRFs).

Motivation?
CCMRFs have many applications; probabilistic soft logic is one of them.

Contributions?
An analysis of the theoretical and practical aspects of computing marginals in CCMRFs.

What's a CCMRF?
A constrained continuous Markov random field is given by:
- random variables X = \{X_1, .., X_n\} with domains D_i \subset \mathbb{R} and joint domain D = \times_{i=1}^n D_i,
- potential functions \phi = \{\phi_1, .., \phi_m\} with \phi_j : D \to [0, M],
- weights \Lambda = \{\lambda_1, .., \lambda_m\},
- equality constraints A : D \to \mathbb{R}^{k_A}, a \in \mathbb{R}^{k_A}, and inequality constraints B : D \to \mathbb{R}^{k_B}, b \in \mathbb{R}^{k_B}, which restrict the domain to

    \tilde{D} = D \cap \{x \mid A(x) = a \wedge B(x) \le b\}.

The probability measure P over X is defined through the density

    f(x) = \frac{1}{Z(\Lambda)} \exp\Big[-\sum_{j=1}^m \lambda_j \phi_j(x)\Big],  with  f(x) = 0 \ \forall x \notin \tilde{D},

and normalization constant

    Z(\Lambda) = \int_{\tilde{D}} \exp\Big[-\sum_{j=1}^m \lambda_j \phi_j(x)\Big]\, dx.
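To make the definition concrete, here is a minimal Python sketch of a CCMRF with a box domain, linear inequality constraints, and hinge-style potentials. The class and all names are illustrative, not from any released implementation, and equality constraints are omitted for brevity.

```python
import numpy as np

class CCMRF:
    """Hypothetical container for a CCMRF with a box domain and linear
    inequality constraints B x <= b; equality constraints are omitted."""

    def __init__(self, potentials, weights, lower, upper, B=None, b=None):
        self.potentials = potentials          # callables phi_j : R^n -> [0, M]
        self.weights = np.asarray(weights)    # lambda_j
        self.lower = np.asarray(lower)        # box domain D = x_i [lower_i, upper_i]
        self.upper = np.asarray(upper)
        self.B, self.b = B, b                 # inequality constraints B x <= b

    def in_domain(self, x):
        """Membership test for the constrained domain D~."""
        if np.any(x < self.lower) or np.any(x > self.upper):
            return False
        if self.B is not None and np.any(self.B @ x > self.b):
            return False
        return True

    def unnormalized_density(self, x):
        """exp(-sum_j lambda_j phi_j(x)) on D~, zero outside it."""
        if not self.in_domain(x):
            return 0.0
        energy = sum(w * phi(x) for w, phi in zip(self.weights, self.potentials))
        return float(np.exp(-energy))
```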
What does it look like?
A three-variable example: X = \{X_1, X_2, X_3\} with unit-interval domains D_i = [0, 1], potentials

    \phi_1(x) = x_1,  \quad \phi_2(x) = \max(0, x_1 - x_2),  \quad \phi_3(x) = \max(0, x_2 - x_3),

weights \Lambda = \{1, 2, 1\}, and the inequality constraint x_1 + x_3 \le 1.

[Figure: the constrained domain over (X_1, X_2, X_3) with the highest-probability region marked, and the induced marginal density f over X_1; marginal probabilities such as P(0.4 \le X_2 \le 0.6) are areas under such marginal densities.]
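Continuing the sketch above (same imports), a naive self-normalized importance-sampling check of a marginal probability in this example. This brute-force estimator is only viable because the example has three dimensions, which is exactly why the sampling machinery below is needed.

```python
# Monte Carlo check of P(0.4 <= X2 <= 0.6) in the running example.
rng = np.random.default_rng(0)

model = CCMRF(
    potentials=[lambda x: x[0],                      # phi_1(x) = x_1
                lambda x: max(0.0, x[0] - x[1]),     # phi_2(x) = max(0, x_1 - x_2)
                lambda x: max(0.0, x[1] - x[2])],    # phi_3(x) = max(0, x_2 - x_3)
    weights=[1.0, 2.0, 1.0],                         # Lambda = {1, 2, 1}
    lower=[0.0, 0.0, 0.0], upper=[1.0, 1.0, 1.0],
    B=np.array([[1.0, 0.0, 1.0]]), b=np.array([1.0]),  # x_1 + x_3 <= 1
)

# Uniform proposal over the box; the self-normalized importance weights are
# the unnormalized density (zero outside D~), so Z(Lambda) cancels.
xs = rng.uniform(0.0, 1.0, size=(100_000, 3))
w = np.array([model.unnormalized_density(x) for x in xs])
hit = (xs[:, 1] >= 0.4) & (xs[:, 1] <= 0.6)
print("P(0.4 <= X2 <= 0.6) ~", w[hit].sum() / w.sum())
```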
In Theory…
Computing the marginal probability density function

    f_{X'}(x') = \int_{y \in \times_{i : X_i \notin X'} D_i} f(x', y)\, dy

for a subset X' \subset X under the probability measure defined by a CCMRF is #P-hard in the worst case.

Let's approximate!

Hit-and-Run Sampling (Lovász & Vempala '04)
1. Sample a random direction d at the current point p.
2. Compute the line segment through p along d that lies inside \tilde{D}.
3. Induce a density on that line segment from f.
4. Sample the next point q from the induced density (see the code sketch after the complexity bound below).

The complexity of computing an approximate distribution \sigma^* using hit-and-run sampling such that the total variation distance of \sigma^* and P is less than \epsilon is

    O^*\big(\tilde{n}^3 (k_B + \tilde{n} + m)\big),

where \tilde{n} = n - k_A, under the assumptions that we start from an initial distribution \sigma such that the density function d\sigma/dP is bounded by M except on a set S with \sigma(S) \le \epsilon/s.
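A minimal sketch of one hit-and-run step over the CCMRF class above, assuming equality constraints have already been eliminated (step 2 of the practical algorithm below) and approximating steps 3 and 4 with a grid over the chord; the helper names are illustrative.

```python
def _linear_bounds(model, p, d):
    """Yield (coeff, bound) pairs so that feasibility of p + t*d along the
    chord means coeff * t <= bound."""
    for i in range(p.size):
        yield d[i], model.upper[i] - p[i]     # p_i + t*d_i <= upper_i
        yield -d[i], p[i] - model.lower[i]    # p_i + t*d_i >= lower_i
    if model.B is not None:
        for row, bi in zip(model.B, model.b):
            yield row @ d, bi - row @ p       # row . (p + t*d) <= b_i

def hit_and_run_step(model, p, rng, grid=256):
    """One hit-and-run step (steps 1-4 above) from a feasible point p."""
    # 1. Sample a uniformly random direction on the unit sphere.
    d = rng.standard_normal(p.size)
    d /= np.linalg.norm(d)
    # 2. Intersect the line {p + t*d} with the constrained domain D~.
    t_lo, t_hi = -np.inf, np.inf
    for coeff, bound in _linear_bounds(model, p, d):
        if coeff > 0:
            t_hi = min(t_hi, bound / coeff)
        elif coeff < 0:
            t_lo = max(t_lo, bound / coeff)
    # 3./4. Discretize the induced one-dimensional density on the chord and
    # sample from it (the paper induces f on the line in closed form instead).
    ts = np.linspace(t_lo, t_hi, grid)
    w = np.array([model.unnormalized_density(p + t * d) for t in ts])
    t_next = rng.choice(ts, p=w / w.sum())
    return p + t_next * d
```

Iterating hit_and_run_step and histogramming a coordinate of the resulting samples yields the marginal estimates discussed in the convergence analysis.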
In Practice…
Algorithm:
1. Start at the MAP state.
2. Reduce dimensionality via linear algebra: the k_A equality constraints are eliminated, so sampling proceeds in \tilde{n} = n - k_A dimensions.
3. Get out of corners. Near a corner of \tilde{D}, most sampled directions leave only a tiny feasible segment; the corner heuristic reflects the direction d_i on the hyperplane W_k x = z_k of the blocking constraint (see the sketch after this list):

    d_{i+1} = d_i + 2\, \frac{z_k - W_k d_i}{\|W_k\|^2}\, W_k^T

4. Induce f on the line segment efficiently.
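A one-line sketch of that reflection as reconstructed from the poster; W_k is taken to be the row vector of the active constraint with bound z_k, and the names are illustrative.

```python
def corner_reflect(d, W_k, z_k):
    """Corner heuristic, as reconstructed from the poster:
    d_{i+1} = d_i + 2 * (z_k - W_k . d_i) / ||W_k||^2 * W_k."""
    return d + 2.0 * (z_k - W_k @ d) / (W_k @ W_k) * W_k
```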
Why CCMRF?
Probabilistic soft logic (PSL) is a declarative language for collective probabilistic reasoning about similarity or uncertainty in relational domains. PSL focuses on statistical relational learning problems with continuous random variables and supports sets and aggregation. PSL programs are grounded into CCMRFs for inference. For example, a collective document-classification program:

    w1 : class(B,C) ∧ A.text ≈ B.text → class(A,C)
    w2 : class(B,C) ∧ link(A,B) → class(A,C)
    Constraint: functional(class)
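To connect the two formalisms: in PSL's published semantics, soft truth values live in [0, 1], logical operators follow the Lukasiewicz t-norm, and each ground rule contributes its distance to satisfaction as a hinge potential \phi_j. The sketch below shows this for one hypothetical grounding of rule w1; the indices and the observed similarity value are illustrative.

```python
def lukasiewicz_and(a, b):
    """Lukasiewicz t-norm: soft conjunction of truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)

def ground_rule_potential(body_idx, head_idx, sim):
    """One hypothetical grounding of w1: class(B,C) AND A.text~B.text -> class(A,C).
    Its potential is the rule's distance to satisfaction, max(0, I(body) - I(head));
    body_idx/head_idx index truth-value variables in x, and sim is an observed
    similarity truth value (illustrative)."""
    def phi(x):
        body = lukasiewicz_and(x[body_idx], sim)
        return max(0.0, body - x[head_idx])
    return phi
```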
Experimental Results
Setup: collective classification of 1717 Wikipedia articles with 20% seed documents, using tf/idf-weighted cosine similarity as the baseline and comparing against a PSL program with learned weights under K-fold cross-validation.

    Folds | Improvement over baseline | P(null hypothesis) | Relative std. deviation difference Δ(σ)
    ------|---------------------------|--------------------|----------------------------------------
    20    | 41.4%                     | 1.95E-09           | 38.3%
    25    | 31.7%                     | 2.40E-13           | 41.2%
    30    | 39.1%                     | 1.00E-16           | 43.5%
    35    | 46.1%                     | 4.54E-08           | 39.0%

The relative standard-deviation difference is used as an indicator of confidence,

    \Delta(\sigma) = 2\, \frac{\sigma_- - \sigma_+}{\sigma_+ + \sigma_-},

with the hypothesis \Delta(\sigma) \gg 0.

Convergence Analysis
[Figure: KL divergence (0.05 to 5, log scale) against number of samples (30,000 to 3,000,000, log scale), showing the average KL divergence, the lowest-quartile KL divergence (322-413 RVs), and the highest-quartile KL divergence (174-224 RVs), all decreasing as samples accumulate.]
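The convergence study compares sampled marginals against reference marginals via KL divergence. Below is a minimal sketch of such a measurement between two histogram estimates, assuming discretized marginals; the smoothing constant is illustrative.

```python
def kl_divergence(p_counts, q_counts, eps=1e-12):
    """KL(P || Q) between two histogram estimates of a marginal, e.g. the
    running sampler's histogram vs. a long reference run (eps smooths empty bins)."""
    p = p_counts / p_counts.sum()
    q = q_counts / q_counts.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```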