COMPANY PROPRIETARY INFORMATION
Haystax Advanced Analytics Lab
Automated Construction of
Bayesian Networks from
Qualitative Knowledge
• Dr. Ed Wright
• Dr. Bob Schrag
Haystax Advanced Threat Analytics
Outline
• Intro
• Fusion, situation assessment
• Bayesian Networks
• Challenges – where do all the numbers come from?
• Qualitative representation of Common Patterns
• Concepts, Indicators, Summary, Mitigation and Relevance
• Qualitative representation: default values => parameters for the CPTs
• Some implementation issues
• Examples
• BN - Chest Xray
• Complex Model
• Implementation
• Future enhancements
• Conclusion
Information Fusion - Situation Assessment
• Making inferences from heterogeneous, incomplete, contradictory information
• Intelligence data fusion, Political forecasts, Medical diagnosis, Marketing, …
• Characteristics
• Hypotheses of Interest – that cannot be directly observed
• Indicating hypotheses – that are likely true (or false) if a Hypothesis of Interest is true
• Evidence / information related to one or more of the hypotheses
• Incomplete knowledge and Uncertainty
Information Fusion - Example
Hypotheses of Interest
• Will the Blue insurgency succeed in country Orange?
Indicating hypotheses
• Blue is popular
• Blue has the military capacity to succeed
• Blue has adequate military communications
• Blue has adequate weapons
• Blue has operational success
• Blue has operational failure
• Blue leadership is adequate
• Orange is Popular
Evidence
• Intel reports, radio intercepts, news reports, blogs, Twitter, …
Incomplete knowledge and Uncertainty
[Figure: network diagram. Hypothesis nodes: Blue Insurgency Succeeds, Blue is Popular, Blue has Military Capability, Blue has Adequate Leadership, Orange is Vulnerable, Blue has military Comms, Blue has Adequate Weapons, Blue has Operational Success, Blue has Operational Failure. Evidence nodes: radio intercept, Intel. Report, Twitter Sentiment, News Report (four instances).]
Bayesian Networks for Information Fusion
• Probabilistic Model
• Executable model
• Uncertainty representation is built in
• Explicit / Efficient representation
• Makes assumptions explicit
• Facilitates communication between analysts
• Support for learning, and for encoding prior knowledge
• Inference propagates in all directions
• Computational model
• ‘What if’ analysis
• Sensitivity analysis
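[Figure: example network with nodes Hypothesis, Indicator1 and Indicator2, shown with their probability tables.]
Hypothesis (prior)
  true   false
  0.1    0.9
Indicator1, given Hypothesis
  Hypothesis   true   false
  true         0.8    0.2
  false        0.1    0.9
Indicator2, given Hypothesis and Indicator1
  Hypothesis   Indicator1   true   false
  T            T            0.96   0.04
  T            F            0.8    0.2
  F            T            0.8    0.2
  F            F            0.1    0.9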
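The tables on this slide are small enough to check by brute force. The following is a minimal Python sketch of "inference propagates in all directions", using only the numbers transcribed from the slide. The function names are illustrative and are not from the deck, and the reading of Indicator2 as having both Hypothesis and Indicator1 as parents is taken from the table header.

```python
# Prior and CPT values transcribed from the slide.
P_H = 0.1                                   # P(Hypothesis = true)
P_I1_GIVEN_H = {True: 0.8, False: 0.1}      # P(Indicator1 = true | Hypothesis)
# P(Indicator2 = true | Hypothesis, Indicator1)
P_I2_GIVEN = {(True, True): 0.96, (True, False): 0.8,
              (False, True): 0.8, (False, False): 0.1}


def joint(h, i1, i2):
    """Joint probability of one complete assignment of the three nodes."""
    p = P_H if h else 1 - P_H
    p1 = P_I1_GIVEN_H[h]
    p *= p1 if i1 else 1 - p1
    p2 = P_I2_GIVEN[(h, i1)]
    p *= p2 if i2 else 1 - p2
    return p


def posterior_hypothesis(i1=None, i2=None):
    """P(Hypothesis = true | evidence), by enumerating all eight worlds.

    Pass True/False to fix an indicator, or None to leave it unobserved.
    """
    num = den = 0.0
    for h in (True, False):
        for a in (True, False):
            for b in (True, False):
                if i1 is not None and a != i1:
                    continue
                if i2 is not None and b != i2:
                    continue
                p = joint(h, a, b)
                den += p
                if h:
                    num += p
    return num / den


print(posterior_hypothesis())          # prior belief in the hypothesis
print(posterior_hypothesis(i1=True))   # belief after observing Indicator1
```

Observing Indicator1 = true raises the belief in the hypothesis from 0.1 to 0.08 / (0.08 + 0.09), roughly 0.47. Enumeration is only practical for toy networks; a BN engine such as Netica uses more efficient algorithms.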
Challenges – Where do all of the numbers come from?
• Each node requires a local probability table
• No parents -> Prior distribution
• Node with parents -> Conditional Probability Table (CPT)
• One row in the table for each combination of the states of the parents
• Where do the numbers come from?
• Learning
• Expert knowledge from Subject Matter Experts
• Combination of knowledge and learning
• DARPA Program Manager: ‘It is a humongous knowledge engineering challenge!’
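[Figure: four binary parents (Parent1 to Parent4, each t/f) with one binary child (Child t/f).]
CPT for Child: one row per combination of parent states, each row needing a probability for true and for false.
  Parent1   Parent2   Parent3   Parent4   |  true   false
  T         T         T         T         |
  T         T         T         F         |
  T         T         F         T         |
  T         T         F         F         |
  T         F         T         T         |
  T         F         T         F         |
  T         F         F         T         |
  T         F         F         F         |
  F         T         T         T         |
  F         T         T         F         |
  F         T         F         T         |
  F         T         F         F         |
  F         F         T         T         |
  F         F         T         F         |
  F         F         F         T         |
  F         F         F         F         |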
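The size of the problem is easy to quantify: a binary child with n binary parents needs 2^n CPT rows. A short Python sketch (the helper name `cpt_rows` is illustrative) that enumerates the rows in the same T-before-F order as the table on this slide:

```python
from itertools import product


def cpt_rows(parents):
    """Return one tuple of parent states per CPT row (True listed before False)."""
    return list(product([True, False], repeat=len(parents)))


rows = cpt_rows(["Parent1", "Parent2", "Parent3", "Parent4"])
print(len(rows))   # number of rows an expert would have to fill in
for n in (4, 8, 12):
    # Row count doubles with every additional binary parent.
    print(n, "parents ->", 2 ** n, "rows")
```

Each row still needs a probability elicited or learned, which is why the row count rather than the node count drives the knowledge engineering effort.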
Patterns
• Concepts – Hypotheses
• Indicator
• Evidence for or against one or more hypotheses
• Also a hypothesis
• We need a CPT function -> NoisyOR
• Mitigation, Relevance
• Also a hypothesis
• Summary
• Also a Hypothesis
[Figure: pattern diagrams built from nodes labeled Hypotheses, Indicator, Mitigation and Summary.]
Qualitative assessment: Default Parameter Values
• Strength
• Setting an evidence node true => same as applying a likelihood to the hypothesis
• Strength of evidence:
• Use ratios
• Strong = 8:1
• Medium = 4:1
• Weak = 2:1
• Absolute = ∞
• Polarity. If negative polarity, swap the parent state for the CPT calculation
[Figure: Hypothesis -> Indicator.]
Beliefs (%) with no evidence applied:
  Node         true   false
  Hypothesis   12.0   88.0
  Indicator    18.4   81.6
Belief in Hypothesis (%) after the indicator is set to true, by indicator type:
  Indicator type           true   false
  Absolute Indicator       100    0
  Absolute Neg Indicator   0      100
  Strong Indicator         52.2   47.8
  Strong Neg Indicator     1.68   98.3
  Medium Indicator         35.3   64.7
  Medium Neg Indicator     3.30   96.7
  Weak Indicator           21.4   78.6
  Weak Neg Indicator       6.38   93.6
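The beliefs on this slide can be reproduced with a single application of Bayes' rule. The sketch below assumes that the strength ratio is the likelihood ratio P(Indicator | Hypothesis) : P(Indicator | not Hypothesis), and that P(Indicator | not Hypothesis) is 0.1 (the default leak quoted elsewhere in the deck). Both are assumptions chosen because they match the displayed numbers; the deck does not state them in this form. Names are illustrative.

```python
RATIOS = {"strong": 8.0, "medium": 4.0, "weak": 2.0}
LEAK = 0.1   # ASSUMED value of P(indicator = true | hypothesis = false)


def posterior(prior, strength, positive=True, leak=LEAK):
    """P(hypothesis = true | indicator = true) for one indicator."""
    p_if_true = RATIOS[strength] * leak    # P(indicator | hypothesis true)
    p_if_false = leak                      # P(indicator | hypothesis false)
    if not positive:
        # Negative polarity: swap the parent state used for the CPT row.
        p_if_true, p_if_false = p_if_false, p_if_true
    num = prior * p_if_true
    return num / (num + (1.0 - prior) * p_if_false)


for strength in ("strong", "medium", "weak"):
    for positive in (True, False):
        label = strength + (" positive" if positive else " negative")
        print(label, round(100 * posterior(0.12, strength, positive), 2))
```

With a prior of 12% this gives approximately 52.2, 35.3 and 21.4 for positive indicators and 1.68, 3.30 and 6.38 for negative ones, in line with the slide. An absolute indicator corresponds to P(indicator | hypothesis false) = 0, which drives the posterior to 100%; it is left out of the sketch to avoid dividing by an infinite ratio.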
CPT Generation - Indicators for Multiple Hypotheses
Need a function: F(a set of states of the parent nodes ) => one row of the CPT
Using Strength and polarity
[Figure: Hypothesis1 to Hypothesis4 are parents of the indicator ind2no; the arcs are labeled, in order, strong positive, medium positive, weak positive and strong negative.]
Beliefs (%), true / false:
  Node          No evidence    ind2no set to true
  Hypothesis1   12.0 / 88.0    32.8 / 67.2
  Hypothesis2   5.00 / 95.0    9.32 / 90.7
  Hypothesis3   8.00 / 92.0    11.3 / 88.7
  Hypothesis4   85.0 / 15.0    59.1 / 40.9
  ind2no        31.0 / 69.0    100 / 0
CPT Generation - NoisyOR (other functions are possible, e.g. NoisyAnd)
• NoisyOR for CPT
• Need strength(s) and Leak
• Default leak = 0.1
• Can be overridden
• Strength
• Strength of evidence - Use ratios
• Strong = 8:1
• Medium = 4:1
• Weak = 2:1
• Absolute = inf
• Polarity. If negative polarity, swap the parent state for the CPT calculation
• Absolute strength: replace the row where the absolute parent is false with [0, 1.0]
P(E = true | C) = 1 - (1 - p_0) \prod_{i \in True(C)} (1 - p_i)
where
E is the child node
C is the set of parent nodes
True(C) is the set of parents whose state is true for the CPT element being calculated
p_i is the causal strength of parent i
p_0 is the leak
Source: Noisy-Or, -And, -Max and -Sum Nodes in Netica, 2008-02-08, © 2000-2008 Norsys Software Corp.
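A leaky noisy-OR row generator can be sketched in a few lines of Python. The mapping from a strength ratio to the causal strength is not spelled out on the slide; the sketch ASSUMES p_i = ratio × leak (0.8, 0.4, 0.2 for the default leak of 0.1), because that choice reproduces the four-hypothesis example (priors 12, 5, 8 and 85%, arcs strong positive, medium positive, weak positive, strong negative). Absolute strength is not handled, and all function names are illustrative.

```python
from itertools import product

RATIOS = {"strong": 8.0, "medium": 4.0, "weak": 2.0}


def noisy_or_row(states, parents, leak=0.1):
    """P(E = true) for one CPT row.

    states  -- tuple of parent truth values for this row
    parents -- list of (strength, positive) pairs, in the same order
    """
    q = 1.0 - leak                                   # P(E = false) from the leak alone
    for state, (strength, positive) in zip(states, parents):
        active = state if positive else not state    # negative polarity swaps the state
        if active:
            p_i = RATIOS[strength] * leak            # ASSUMED ratio-to-strength mapping
            q *= 1.0 - p_i
    return 1.0 - q


def noisy_or_cpt(parents, leak=0.1):
    """Full CPT as {parent_states: P(E = true)}."""
    return {s: noisy_or_row(s, parents, leak)
            for s in product([True, False], repeat=len(parents))}


def beliefs_given_child_true(priors, parents, leak=0.1):
    """Return P(child = true) and P(parent_k = true | child = true).

    Parents are treated as independent a priori (root nodes).
    """
    p_child = 0.0
    joint_true = [0.0] * len(priors)
    for states, p_e in noisy_or_cpt(parents, leak).items():
        w = p_e
        for s, prior in zip(states, priors):
            w *= prior if s else 1.0 - prior
        p_child += w
        for k, s in enumerate(states):
            if s:
                joint_true[k] += w
    return p_child, [j / p_child for j in joint_true]


parents = [("strong", True), ("medium", True), ("weak", True), ("strong", False)]
p_child, post = beliefs_given_child_true([0.12, 0.05, 0.08, 0.85], parents)
print(round(100 * p_child, 1), [round(100 * p, 1) for p in post])
```

Under these assumptions the indicator's prior belief comes out near 31% and the four posteriors near 32.8, 9.3, 11.3 and 59.1%, matching the multiple-hypothesis slide to the displayed precision. With a single strong parent the same mapping gives P(E | parent true) = 0.82, slightly above an exact 8:1 ratio against a 0.1 leak; the deck does not say which variant the construction software uses.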
CPT Generation: Relevance and Mitigation
[Figure: Mitigation pattern. Nodes: Hypothesis, Indicator with Mitigation, Mitigation, Evidence for or against Mitigation.]
Belief in true (%) for each node, in four scenarios:
  Scenario                                        Evidence for Mitigation   Mitigation   Indicator with Mitigation   Hypothesis
  No evidence                                     18.4                      12.0         22.2                        12.0
  Indicator true, Mitigation false                10.0                      0            100                         52.2
  Indicator true, Mitigation true                 80.0                      100          100                         12.0
  Indicator true, Evidence for Mitigation true    100                       74.8         100                         22.1
[Figure: corresponding Relevance pattern with nodes Hypothesis, Relevance and Indicator with Relevance.]
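The four scenarios above are consistent with a simple CPT for the mitigated indicator, which the Python sketch below reconstructs. This is an inference from the displayed beliefs, not something the deck states: it ASSUMES that when Mitigation is true the indicator is 50/50 regardless of the hypothesis, that when Mitigation is false it behaves as a strong positive indicator (0.8 / 0.1), and that Evidence for Mitigation is a strong positive indicator of Mitigation. Node keys and function names are illustrative.

```python
from itertools import product

P_H, P_M = 0.12, 0.12          # priors shown on the slide


def p_indicator(h, m):
    """ASSUMED P('Indicator with Mitigation' = true | Hypothesis, Mitigation)."""
    if m:
        return 0.5             # mitigated: indicator says nothing about the hypothesis
    return 0.8 if h else 0.1   # unmitigated: strong positive indicator


def p_evidence_for_mitigation(m):
    """ASSUMED strong positive indicator of Mitigation."""
    return 0.8 if m else 0.1


def query(target, **evidence):
    """P(target = true | evidence), enumerating H, M, I and EvM."""
    num = den = 0.0
    for h, m, i, e in product([True, False], repeat=4):
        world = {"H": h, "M": m, "I": i, "EvM": e}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = (P_H if h else 1 - P_H) * (P_M if m else 1 - P_M)
        p_i = p_indicator(h, m)
        p *= p_i if i else 1 - p_i
        p_e = p_evidence_for_mitigation(m)
        p *= p_e if e else 1 - p_e
        den += p
        if world[target]:
            num += p
    return num / den


print(round(100 * query("I"), 1))                       # indicator, no evidence
print(round(100 * query("H", I=True, M=False), 1))      # mitigation ruled out
print(round(100 * query("H", I=True, M=True), 1))       # mitigation confirmed
print(round(100 * query("M", I=True, EvM=True), 1))     # mitigation, both observed
```

Under these assumptions the sketch gives beliefs close to the slide's 22.2, 52.2, 12.0 and 74.8%. When Mitigation is true the indicator no longer moves the hypothesis, which is the behavior the pattern is meant to capture.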
CPT Generation: Relevance and Mitigation
• Desired(?): applying evidence does not change the mitigation
Beliefs (%) in the three panels, true / false:
  Node                        Panel 1        Panel 2        Panel 3
  Indicator with Mitigation   1.62 / 98.4    100 / 0        100 / 0
  Evidence in Mitigation      0 / 100        0 / 100        0 / 100
  Mitigation                  2.94 / 97.1    91.0 / 9.01    2.94 / 97.1
  Hypothesis                  0.50 / 99.5    9.46 / 90.5    97.1 / 2.93
Need to do the algebra to figure out what numbers will result in no change to the mitigation when evidence is applied.
Note: when evidence is applied elsewhere in the network, mitigation will change.
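One way to carry out that algebra, as a sketch under stated assumptions rather than the authors' derivation: suppose Hypothesis H and Mitigation M are independent before any evidence is applied, and that the indicator's CPT row for M = true is a single value m that does not depend on H. Writing \bar{p} for the probability of the indicator when the mitigation is false:

```latex
\bar{p} \;=\; \sum_{h \in \{t,f\}} P(I{=}t \mid H{=}h,\, M{=}f)\, P(H{=}h)

P(M{=}t \mid I{=}t) \;=\;
  \frac{m \, P(M{=}t)}{m \, P(M{=}t) + \bar{p}\,\bigl(1 - P(M{=}t)\bigr)}

P(M{=}t \mid I{=}t) \;=\; P(M{=}t)
  \quad\Longleftrightarrow\quad m = \bar{p}
  \qquad \text{(for } 0 < P(M{=}t) < 1 \text{)}
```

As a hypothetical example, with unmitigated rows of 0.8 and 0.1 and P(H = t) = 0.12, \bar{p} = 0.8 × 0.12 + 0.1 × 0.88 = 0.184. Because \bar{p} depends on the current belief in H, a fixed m can only balance the mitigation for one value of P(H), which agrees with the note that evidence applied elsewhere in the network will still change the mitigation.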
Patterns - Extras
• Copy of / opposite of
• Rare event
• Override the default leak
• Target beliefs
[Figure: Hypothesis linked to Opposite and Synonym nodes by a deterministic relationship.]
[Figure: pattern diagram with Hypotheses, Indicator, Mitigation and Summary nodes, plus a TgtBelCnstrnt node.]
Target beliefs: when the prior P(t) is too small or too large, calculate the CPT for an artificial evidence node (TgtBelCnstrnt) that, when applied, brings P(t) to the desired value.
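The CPT for such an artificial evidence node follows from the odds form of Bayes' rule: the required likelihood ratio is the target odds divided by the prior odds. A minimal Python sketch; the function names and the choice to scale the larger likelihood to 1.0 are assumptions, since only the ratio of the two entries matters.

```python
def target_belief_cpt(prior, target):
    """Return (P(e = true | t), P(e = true | not t)) for an artificial
    evidence node e such that observing e = true moves P(t) from
    `prior` to `target`."""
    if not (0.0 < prior < 1.0 and 0.0 < target < 1.0):
        raise ValueError("prior and target must lie strictly between 0 and 1")
    # Required likelihood ratio = target odds / prior odds.
    lr = (target / (1.0 - target)) / (prior / (1.0 - prior))
    # Scale so that the larger of the two likelihoods is 1.0 (arbitrary choice).
    return (1.0, 1.0 / lr) if lr >= 1.0 else (lr, 1.0)


def apply_evidence(prior, cpt):
    """Posterior P(t) after setting the artificial evidence node to true."""
    p_if_true, p_if_false = cpt
    num = prior * p_if_true
    return num / (num + (1.0 - prior) * p_if_false)


cpt = target_belief_cpt(prior=0.30, target=0.01)
print(cpt, apply_evidence(0.30, cpt))
```

The prior passed in has to be the node's belief in the compiled network, that is, after the rest of the CPTs have been generated, so the constraint is computed as a last step.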
Patterns - Extras
• Hypothesis Copy
If a node is a parent to a summary node, the effect of mitigation or relevance on that node gets ignored.
The construction software automatically detects this and introduces a Hypothesis copy as the parent of the original hypothesis and of the summary.
[Figure: before and after. Before: Hypotheses, Mitigation, Indicator and Summary nodes. After: a Hyp Copy node is added, linked by a deterministic relationship.]
Visualization
Positive Influence
Negative Influence
Absolute Influence
Strong Influence
Moderate Influence
Weak Influence
[Figure: example network whose arcs are labeled by strength and polarity: Absolute, Positive (1 arc); Absolute, Negative (2); Strong, Positive (3); Moderate, Positive (2); Weak, Negative (1).]
Example - Chest X-Ray
Qualitative statements
Tuberculosis is a medium positive indicator of VisitToAsia
LungCancer is a strong positive indicator of Smoking
Bronchitis is a weak positive indicator of Smoking
XRay is a strong positive indicator of TuberculosisOrCancer
Dyspnea is a strong positive indicator of Bronchitis
Dyspnea is a strong positive indicator of TuberculosisOrCancer
TuberculosisOrCancer is an OR summary of [Tuberculosis, LungCancer]
VisitToAsia prior: 0.01
Smoking prior: 0.5
Tuberculosis targetBelief: 0.01
LungCancer leak: 0.01
Bronchitis leak: 0.30
XRay leak: 0.05
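Statements of this shape are regular enough to be parsed mechanically. The Python sketch below handles only the "X is a <strength> <polarity> indicator of Y" form; the grammar is inferred from the statements on this slide, and the pattern, function name and output keys are assumptions rather than the authors' input format.

```python
import re

# Grammar inferred from the slide's indicator statements (an assumption).
PATTERN = re.compile(
    r"^(?P<indicator>\w+) is an? "
    r"(?P<strength>absolute|strong|medium|moderate|weak) "
    r"(?P<polarity>positive|negative) indicator of (?P<hypothesis>\w+)$",
    re.IGNORECASE,
)


def parse_statement(text):
    """Parse one indicator statement; return None if it does not match."""
    m = PATTERN.match(text.strip())
    if m is None:
        return None
    return {
        "indicator": m.group("indicator"),
        "hypothesis": m.group("hypothesis"),
        "strength": m.group("strength").lower(),
        "positive": m.group("polarity").lower() == "positive",
    }


print(parse_statement("LungCancer is a strong positive indicator of Smoking"))
print(parse_statement("TuberculosisOrCancer is an OR summary of [Tuberculosis, LungCancer]"))
```

Summary statements and the numeric lines (prior, targetBelief, leak) would need patterns of their own; here they simply return None.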
Beliefs (%) in the original model:
  Visit to Asia: Visited Asia within the last 3 y... 1.00, no visit 99.0
  Smoking: smoker 50.0, nonsmoker 50.0
  Tuberculosis: present 1.04, absent 99.0
  Lung Cancer: present 5.50, absent 94.5
  Bronchitis: present 45.0, absent 55.0
  Tuberculosis or Cancer: true 6.48, false 93.5
  XRay: abnormal 11.0, normal 89.0
  Dyspnea: present 43.6, absent 56.4
Original model: Based on Lauritzen & Spiegelhalter 1988. Distributed by Norsys Software Corp.
Example - Chest X-Ray
Beliefs (%) for each node, first state / second state:
  Node                        Original model   Automatically constructed
  Visit to Asia               1.00 / 99.0      0.64 / 99.4
  Smoking                     50.0 / 50.0      50.0 / 50.0
  Tuberculosis                1.04 / 99.0      1.0 / 99.0
  Lung Cancer                 5.50 / 94.5      4.96 / 95.0
  Bronchitis                  45.0 / 55.0      51.0 / 49.0
  Tuberculosis or Cancer      6.48 / 93.5      5.91 / 94.1
  XRay                        11.0 / 89.0      7.25 / 92.8
  Dyspnea                     43.6 / 56.4      48.8 / 51.2
  TgtBelCnstForTuberculosis   (not present)    100 / 0
Original model: Based on Lauritzen & Spiegelhalter 1988. Distributed by Norsys Software Corp.
Second model: automatically constructed from the qualitative representation.
Example – Complex Model
• Bayesian Network representing concepts and
relationships defined in a set of source documents
• 700(+) concepts
• The Government Customer: “for the first time, we
have a computational model of the concepts in this
document!”
Model Development Process
1. Extract key concept nodes.
2. Extract influence arcs, specifying…
  • Strength: Absolute, strong, moderate, weak, …
  • Polarity: Positive, negative
3. Create master influence graph.
  • Assemble graph from extracted arcs.
  • Review extracted concepts, influences.
  • Normalize extracted concept names, definitions.
  • Insert missing concepts, influences.
  • Add missing logical structure: AND, OR, Opposite.
4. Compile into standard Bayesian network (BN).
  • Collect concepts’ parent nodes, build conditional probability tables (CPTs) per influence specs.
  • Insert pattern-required nodes for mitigation, relevance.
5. Exploit, improve the model.
  • Run the model against test cases.
  • Review model inferences with SMEs.
  • Revise as appropriate.
Master Influence Graph
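;; Each entry has the form (Concept relation...), where a relation is
;; (:IndicatedBy (:Strength (Indicator ...) ...)) or
;; (:ImpliedByDisjunction (Part) ...). Indicators may nest their own relations.
(defparameter *Influences*
  '((VisitToAsia
     (:IndicatedBy
      (:Moderately (Tuberculosis))))
    (Smoking
     (:IndicatedBy
      (:Strongly (LungCancer))
      (:Weakly (Bronchitis
                (:IndicatedBy
                 (:Strongly (Dyspnea)))))))
    (TuberculosisOrCancer
     (:ImpliedByDisjunction
      (Tuberculosis)
      (LungCancer))
     (:IndicatedBy
      (:Strongly
       (XRay)
       (Dyspnea))))))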
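To compile the nested specification into a network, the nesting first has to be flattened into individual arcs. The sketch below mirrors the s-expression as Python tuples and walks it recursively. The Python data layout, the function name and the (parent, child, label) arc format are illustrative assumptions; the authors' implementation works on the Lisp form.

```python
# Hypothetical Python mirror of *Influences*: a concept is (name, relations),
# a relation is (keyword, body).
INFLUENCES = [
    ("VisitToAsia", [
        ("IndicatedBy", [("Moderately", [("Tuberculosis", [])])]),
    ]),
    ("Smoking", [
        ("IndicatedBy", [
            ("Strongly", [("LungCancer", [])]),
            ("Weakly", [("Bronchitis", [
                ("IndicatedBy", [("Strongly", [("Dyspnea", [])])]),
            ])]),
        ]),
    ]),
    ("TuberculosisOrCancer", [
        ("ImpliedByDisjunction", [("Tuberculosis", []), ("LungCancer", [])]),
        ("IndicatedBy", [("Strongly", [("XRay", []), ("Dyspnea", [])])]),
    ]),
]


def flatten(concepts):
    """Return a flat list of (parent, child, label) arcs."""
    arcs = []
    for name, relations in concepts:
        for keyword, body in relations:
            if keyword == "IndicatedBy":
                # The hypothesis is the BN parent of each of its indicators.
                for strength, indicators in body:
                    arcs += [(name, ind, strength) for ind, _ in indicators]
                    arcs += flatten(indicators)      # nested indicator specs
            elif keyword == "ImpliedByDisjunction":
                # Each disjunct is a BN parent of the OR summary node.
                arcs += [(part, name, "OR") for part, _ in body]
                arcs += flatten(body)
    return arcs


ARCS = flatten(INFLUENCES)
for arc in ARCS:
    print(arc)
```

The result is the eight arcs of the Chest X-Ray network, each tagged with its strength or with OR for the summary node. Grouping the arcs by child then gives the parent list needed to build each CPT.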
[Figure: the original Chest X-Ray network (Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Tuberculosis or Cancer, XRay, Dyspnea) with its beliefs, repeated from the Chest X-Ray example.]
Implementation
• Netica, NeticaJ API
• Jython
• Integrate with Java/NeticaJ
• Higher level tools on top of NeticaJ
• GraphViz
• Netica does not do graph layout
• Build the BN in Netica (all the nodes are on top of each other)
• Extract nodes and links
• Use GraphViz to lay out the graph
• Update each Node in Netica with new coordinates
• Franz Lisp / AllegroGraph – Netica API
• Master Influence Graph
• Ingesting data and applying evidence to the network
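The GraphViz round trip described above can be sketched without touching the Netica API: write the extracted links as DOT, run the `dot` layout program with its `plain` output format, and read back the node coordinates. The function names are illustrative, and the Netica calls that extract the links and update each node are deliberately not shown.

```python
import shlex


def to_dot(arcs):
    """Render (parent, child) pairs as DOT source for GraphViz."""
    lines = ["digraph bn {"]
    lines += [f'  "{parent}" -> "{child}";' for parent, child in arcs]
    lines.append("}")
    return "\n".join(lines)


def parse_plain(plain_text):
    """Extract {node_name: (x, y)} from GraphViz `-Tplain` output.

    In that format each node line reads: node <name> <x> <y> <width> <height> ...
    """
    coords = {}
    for line in plain_text.splitlines():
        fields = shlex.split(line)          # handles quoted node names
        if fields and fields[0] == "node":
            coords[fields[1]] = (float(fields[2]), float(fields[3]))
    return coords


print(to_dot([("Smoking", "LungCancer"), ("Smoking", "Bronchitis")]))
```

A typical use would save the DOT text to a file, run `dot -Tplain` on it, pass the captured output to `parse_plain`, and then write the scaled coordinates back to each node through whatever API the BN tool provides.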
Limitations & Future Enhancements
• Limitations
• Binary Nodes
• Limited (but extendable) set of patterns
• Future Enhancements
• Strength of mitigation / relevance
• Richer set of qualitative statements
• Additional CPT models (NoisyAnd, …)
• Global / local mitigation
• Visual editor
Conclusion
For a large set of information fusion problems, it is possible to shortcut the ‘humongous knowledge engineering challenge’ by automatically generating a usable Bayesian Network from English-like qualitative statements (and a few numbers).
The resulting Bayesian Network is immediately useful and can be a starting point for further knowledge refinement.
Thank You
Contact us: info@haystax.com
Visit us: www.haystax.com
8251 Greensboro Drive, Suite 1111
McLean, VA 22012

Haystax bayesian networks

  • 1. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Automated Construction of Bayesian Networks from Qualitative Knowledge • Dr. Ed Wright • Dr. Bob Schrag Haystax Advanced Threat Analytics
  • 2. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Outline • Intro • Fusion, situation assessment • Bayesian Networks • Challenges – where do all the numbers come from? • Qualitative representation of Common Patterns • Concepts, Indicators, Summary, Mitigation and Relevance • Qualitative representation: default values => parameters for the CPTs • Some implementation issues • Examples • BN - Chest Xray • Complex Model • Implementation • Future enhancements • Conclusion
  • 3. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Information Fusion - Situation Assessment • Making inferences from heterogeneous, incomplete, contradictory information • Intelligence data fusion, Political forecasts, Medical diagnosis, Marketing, … • Characteristics • Hypotheses of Interest – that can not be directly observed • Indicating hypotheses – that are likely true (or false) if a Hypothesis of Interest is true • Evidence / information related to one or more of the hypotheses • Incomplete knowledge and Uncertainty
  • 4. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Information Fusion - Example Hypotheses of Interest Will the Blue insurgency succeed in country Orange? Indicating hypotheses Blue is popular Blue has the military capacity to succeed Blue has adequate military communications Blue has adequate weapons Blue has operational success Blue has operational failure Blue leadership is adequate Orange is Popular Evidence Intel reports, radio intercepts, news reports, blogs, twitter, … Incomplete knowledge and Uncertainty Blue Insurgency Succeeds Blue is Popular Blue has Military Capability Blue has Adequate Leadership Orange is Vulnerable Blue has military Comms Blue has Adequate Weapons Blue has Operational Success Blue has Operational Failure radio intercept News Report News Report Intel. Report Twitter Sentiment News Report News Report
  • 5. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Bayesian Networks for Information Fusion • Probabilistic Model • Executable model • Uncertainty representation is built in • Explicit / Efficient representation • Makes assumptions explicit • Facilitates communication between analysts • Support for learning, and for encoding prior knowledge • Inference propagates in all directions • Computational model • ‘What if’ analysis • Sensitivity analysis Hypothesis Indicator1 true false 0.1 0.9 true false true 0.8 0.2 false 0.1 0.9 Indicator2 Hypothesis Indicator1 true false T T 0.96 0.04 T F 0.8 0.2 F T 0.8 0.2 F F 0.1 0.9
  • 6. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Challenges – Where do all of the numbers come from? • Each node requires a local probability table • No parents - > Prior distribution • Node with Parents – Conditional Probability Table (CPT) • One row in the table for each combination of the states of the parent • Where do the numbers come from? • Learning • Expert knowledge from Subject Matter Experts • Combination Knowledge + Learning • DARPA Program Manager: ‘It is a humongous knowledge engineering challenge!’ Parent1 t/f Parent2 t/f Parent3 t/f Parent4 t/f Child t/f Parent1 Parent2 Parent3 Parent4 true false T T T T T T T F T T F T T T F F T F T T T F T F T F F T T F F F F T T T F T T F F T F T F T F F F F T T F F T F F F F T F F F F
  • 7. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Patterns • Concepts – Hypotheses • Indicator • Evidence for or against one or more hypotheses • Also a hypothesis • We need a CPT function -> NoisyOR • Mitigation, Relevance • Also a hypothesis • Summary • Also a Hypothesis Hypotheses Indicator Hypotheses Indicator Mitigation Indicator Summary IndicatorIndicator
  • 8. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab Qualitative assessment: Default Parameter Values • Strength • Setting an evidence node true => same as applying likelihood to the hypothesis • Strength of evidence: • Use ratios • Strong = 8:1 • Medium = 4:1 • Weak = 2:1 • Absolute =  • Polarity. If negative polarity, swap the parent state for the CPT calculation Hypothesis Indicator Hypothesis true false 12.0 88.0 Indicator true false 18.4 81.6 Hypothesis true false 100 0 Absolute Indicator true false 100 0 Hypothesis true false 0 100 Absolute Neg Indicator true false 100 0 Hypothesis true false 52.2 47.8 Strong Indicator true false 100 0 Hypothesis true false 1.68 98.3 Strong Neg Indicator true false 100 0 Hypothesis true false 35.3 64.7 Medium Indicator true false 100 0 Hypothesis true false 3.30 96.7 Medium Neg Indicator true false 100 0 Hypothesis true false 21.4 78.6 Weak Indicator true false 100 0 Hypothesis true false 6.38 93.6 Weak Neg Indicator true false 100 0
  • 9. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab CPT Generation - Indicators for Multiple Hypotheses Need a function: F(a set of states of the parent nodes ) => one row of the CPT Using Strength and polarity strong, positive medium, positive weak, positive strong, negative Hypothesis1 true false 32.8 67.2 Hypothesis2 true false 9.32 90.7 Hypothesis3 true false 11.3 88.7 Hypothesis4 true false 59.1 40.9 ind2no true false 100 0 Hypothesis1 true false 12.0 88.0 Hypothesis2 true false 5.00 95.0 Hypothesis3 true false 8.00 92.0 Hypothesis4 true false 85.0 15.0 ind2no true false 31.0 69.0
  • 10. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab CPT Generation - NoisyOR (Other Functions are possible, i.e. NoisyAnd) • NoisyOR for CPT • Need strength(s) and Leak • Default leak = 0.1 • Can be overridden • Strength • Strength of evidence - Use ratios • Strong = 8:1 • Medium = 4:1 • Weak = 2:1 • Absolute = inf • Polarity. If negative polarity, swap the parent state for the CPT calculation • Absolute strength: replace the row where the absolute parent is false with [0, 1.0] E is the child node C is the set of parent nodes True ( C ) is the set of parents whose state is true for the CPT element being calculated pi is the causal strength p0 is the leak Noisy-Or, -And, -Max and -Sum Nodes in Netica 2008-02-08 © 2000-2008 Norsys Software Corp.
  • 11. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab CPT Generation: Relevance and Mitigation Mitigation true false 12.0 88.0 Indicator with Mitigation true false 22.2 77.8 Hypothesis true false 12.0 88.0 Hypothesis Indicator with Mitigation Mitigation Evidence for or against Mitigation Evidence for Mitigation true false 18.4 81.6 Mitigation true false 0 100 Indicator with Mitigation true false 100 0 Hypothesis true false 52.2 47.8 Evidence for Mitigation true false 10.0 90.0 Mitigation true false 100 0 Indicator with Mitigation true false 100 0 Hypothesis true false 12.0 88.0 Evidence for Mitigation true false 80.0 20.0 Indicator with Mitigation true false 100 0 Hypothesis true false 22.1 77.9 Mitigation true false 74.8 25.2 Hypothesis Relevance Indicator with Relevance Evidence for or Evidence for Mitigation true 100
  • 12. COMPANY PROPRIETARY INFORMATION Haystax Advanced Analytics Lab CPT Generation: Relevance and Mitigation • Desired(?): applying evidence does not change the mitigation Indicator with Mitigation true false 1.62 98.4 Evidence in Mitigation true false 0 100 Mitigation true false 2.94 97.1 Hypothesis true false 0.50 99.5 Indicator with Mitigation true false 100 0 Evidence in Mitigation true false 0 100 Mitigation true false 91.0 9.01 Hypothesis true false 9.46 90.5 Indicator with Mitigation true false 100 0 Evidence in Mitigation true false 0 100 Mitigation true false 2.94 97.1 Hypothesis true false 97.1 2.93 Need to do the algebra, figure out what numbers will result in no change to the mitigation when evidence is applied. Note: when evidence is applied elsewhere in the network, mitigation will change.
  • 13. Patterns - Extras
      • Copy of / opposite of
      • Rare event
      • Override the default leak
      • Target beliefs
          • Prior P(t) is too small or too large
          • TgtBelCnstrnt: calculate the CPT for artificial evidence that, when applied, brings P(t) to the desired value
      • Diagram labels: Hypothesis, Opposite, Synonym (deterministic relationship); Hypotheses, Indicator, Mitigation, Summary; TgtBelCnstrnt
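The target-belief idea can be sketched in Python as a virtual-evidence calculation. This is one plausible reading of the slide, not the authors' documented method. The constraint node is treated as a binary child of the target node that is always observed as true, and its two CPT entries are chosen so that their likelihood ratio moves the current belief to the target. The function names are hypothetical, and scaling the larger entry to 1.0 is an arbitrary choice.

```python
def target_belief_cpt(current, target):
    """Return (P(c=true | T=true), P(c=true | T=false)) for a constraint child c.

    current: belief P(T = true) before the constraint is applied (0 < current < 1).
    target:  desired belief after observing c = true (0 < target < 1).
    """
    # Required likelihood ratio = target odds / current odds.
    ratio = (target / (1.0 - target)) / (current / (1.0 - current))
    if ratio >= 1.0:
        return 1.0, 1.0 / ratio
    return ratio, 1.0


def apply_constraint(current, cpt):
    """Posterior P(T = true | c = true) by Bayes' rule."""
    p_true, p_false = cpt
    return current * p_true / (current * p_true + (1.0 - current) * p_false)


if __name__ == "__main__":
    cpt = target_belief_cpt(current=0.05, target=0.01)
    print(cpt, apply_constraint(0.05, cpt))
```

Only the ratio of the two entries matters for the target node. In a larger network the constraint also shifts the beliefs of the target's neighbours.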
  • 14. Patterns - Extras
      • Hypothesis Copy
          • If a node is a parent to a summary node, the effect of mitigation or relevance on that node gets ignored
          • The construction software automatically detects this and introduces a Hypothesis copy as the parent of the original hypothesis and of the summary
      • Diagram (before): Hypotheses, Mitigation, Indicator, Summary
      • Diagram (after): Hypotheses, Mitigation, Indicator, Summary, Hyp Copy (deterministic relationship)
  • 15. Visualization
      • Legend: Positive Influence, Negative Influence, Absolute Influence, Strong Influence, Moderate Influence, Weak Influence
      • Example arc labels: Absolute, Positive; Absolute, Negative; Weak, Negative; Strong, Positive; Moderate, Positive
  • 16. Example - Chest X-Ray
      • Qualitative statements
          • Tuberculosis is a medium positive indicator of VisitToAsia
          • LungCancer is a strong positive indicator of Smoking
          • Bronchitis is a weak positive indicator of Smoking
          • XRay is a strong positive indicator of TuberculosisOrCancer
          • Dyspnea is a strong positive indicator of Bronchitis
          • Dyspnea is a strong positive indicator of TuberculosisOrCancer
          • TuberculosisOrCancer is an OR summary of [Tuberculosis, LungCancer]
          • VisitToAsia prior: 0.01
          • Smoking prior: 0.5
          • Tuberculosis targetBelief: 0.01
          • LungCancer leak: 0.01
          • Bronchitis leak: 0.30
          • XRay leak: 0.05
      • Original model (beliefs in percent)
          • Visit to Asia: Visited Asia within the last 3 y... 1.00, no visit 99.0
          • Smoking: smoker 50.0, nonsmoker 50.0
          • Tuberculosis: present 1.04, absent 99.0
          • Lung Cancer: present 5.50, absent 94.5
          • Bronchitis: present 45.0, absent 55.0
          • Tuberculosis or Cancer: true 6.48, false 93.5
          • XRay: abnormal 11.0, normal 89.0
          • Dyspnea: present 43.6, absent 56.4
      • Original model: based on Lauritzen & Spiegelhalter 1988. Distributed by Norsys Software Corp.
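The qualitative indicator statements on the Chest X-Ray slide are regular enough to parse mechanically. The sketch below is a hypothetical illustration. The `Indicator` record, the regular expression and `parse_statement` are assumptions made for this example and do not reproduce the authors' grammar. The ratio table repeats the strength ratios listed on the NoisyOR slide. Only "indicator of" statements are handled; summary, prior, targetBelief and leak statements would need further patterns.

```python
import re
from dataclasses import dataclass

# Strength ratios as listed on the NoisyOR slide.
STRENGTH_RATIO = {
    "strong": 8.0,
    "medium": 4.0,
    "weak": 2.0,
    "absolute": float("inf"),
}


@dataclass(frozen=True)
class Indicator:
    indicator: str   # node that serves as the indicator
    hypothesis: str  # node being indicated
    strength: str    # "strong", "medium", "weak" or "absolute"
    polarity: str    # "positive" or "negative"


PATTERN = re.compile(
    r"^(\w+) is an? (strong|medium|weak|absolute) "
    r"(positive|negative) indicator of (\w+)$",
    re.IGNORECASE,
)


def parse_statement(text):
    """Parse one '<X> is a <strength> <polarity> indicator of <Y>' statement."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized statement: {text!r}")
    indicator, strength, polarity, hypothesis = match.groups()
    return Indicator(indicator, hypothesis, strength.lower(), polarity.lower())


if __name__ == "__main__":
    stmt = parse_statement("XRay is a strong positive indicator of TuberculosisOrCancer")
    print(stmt, STRENGTH_RATIO[stmt.strength])
```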
  • 17. Example - Chest X-Ray
      • Belief in the true (present / abnormal / visited) state, in percent: original model vs. automatically constructed model
          • VisitToAsia: 1.00 vs. 0.64
          • Smoking: 50.0 vs. 50.0
          • Tuberculosis: 1.04 vs. 1.0
          • LungCancer: 5.50 vs. 4.96
          • Bronchitis: 45.0 vs. 51.0
          • TuberculosisOrCancer: 6.48 vs. 5.91
          • XRay: 11.0 vs. 7.25
          • Dyspnea: 43.6 vs. 48.8
          • TgtBelCnstForTuberculosis (constructed model only): true 100, false 0
      • Original model: based on Lauritzen & Spiegelhalter 1988. Distributed by Norsys Software Corp.
      • Second model: automatically constructed from the qualitative representation.
  • 18. Example – Complex Model
      • Bayesian Network representing concepts and relationships defined in a set of source documents
      • 700(+) concepts
      • The Government Customer: “for the first time, we have a computational model of the concepts in this document!”
  • 19. Model Development Process
      1. Extract key concept nodes.
      2. Extract influence arcs, specifying…
          • Strength: Absolute, strong, moderate, weak, …
          • Polarity: Positive, negative
      3. Create master influence graph.
          • Assemble graph from extracted arcs.
          • Review extracted concepts, influences.
          • Normalize extracted concept names, definitions.
          • Insert missing concepts, influences.
          • Add missing logical structure: AND, OR, Opposite.
      4. Compile into standard Bayesian network (BN).
          • Collect concepts’ parent nodes, build conditional probability tables (CPTs) per influence specs.
          • Insert pattern-required nodes for mitigation, relevance.
      5. Exploit, improve the model.
          • Run the model against test cases.
          • Review model inferences with SMEs.
          • Revise as appropriate.
  • 20. Master Influence Graph

      (defparameter *Influences*
        '((VisitToAsia
           (:IndicatedBy (:Moderately (Tuberculosis))))
          (Smoking
           (:IndicatedBy (:Strongly (LungCancer))
                         (:Weakly (Bronchitis
                                   (:IndicatedBy (:Strongly (Dyspnea)))))))
          (TuberculosisOrCancer
           (:ImpliedByDisjunction (Tuberculosis) (LungCancer))
           (:IndicatedBy (:Strongly (XRay) (Dyspnea))))))

      • Figure: original Chest X-Ray network with beliefs (same values as on slide 16)
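To show how a nested specification like the one on the Master Influence Graph slide could be turned into a flat arc list, here is a minimal Python sketch. The Python data structure is a hypothetical transliteration of the Lisp form, and `flatten` and `all_arcs` are illustrative names rather than the authors' code. It assumes that arcs run from a hypothesis to each of its indicators, and from each disjunct to its OR summary.

```python
# Hypothetical transliteration of the *Influences* s-expression.
# An indicator is either a bare name or a (name, spec) pair with a nested spec.
INFLUENCES = [
    ("VisitToAsia", {"IndicatedBy": {"Moderately": ["Tuberculosis"]}}),
    ("Smoking", {"IndicatedBy": {
        "Strongly": ["LungCancer"],
        "Weakly": [("Bronchitis", {"IndicatedBy": {"Strongly": ["Dyspnea"]}})],
    }}),
    ("TuberculosisOrCancer", {
        "ImpliedByDisjunction": ["Tuberculosis", "LungCancer"],
        "IndicatedBy": {"Strongly": ["XRay", "Dyspnea"]},
    }),
]


def flatten(concept, spec, arcs):
    """Append (parent, child, label) arcs for one concept, recursing into nested specs."""
    for strength, indicators in spec.get("IndicatedBy", {}).items():
        for item in indicators:
            name, sub_spec = item if isinstance(item, tuple) else (item, {})
            arcs.append((concept, name, strength))  # hypothesis -> indicator
            flatten(name, sub_spec, arcs)
    for part in spec.get("ImpliedByDisjunction", []):
        arcs.append((part, concept, "OR"))          # disjunct -> summary
    return arcs


def all_arcs(influences):
    """Flatten every top-level concept into one arc list."""
    arcs = []
    for concept, spec in influences:
        flatten(concept, spec, arcs)
    return arcs


if __name__ == "__main__":
    for arc in all_arcs(INFLUENCES):
        print(arc)
```

For the Chest X-Ray specification this gives eight arcs, the same number of links as in the original network.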
  • 21. Implementation
      • Netica, NeticaJ API
      • Jython
          • Integrate with Java/NeticaJ
          • Higher level tools on top of NeticaJ
      • GraphViz
          • Netica does not do graph layout
          • Build the BN in Netica (all the nodes are on top of each other)
          • Extract nodes and links
          • Use GraphViz to lay out the graph
          • Update each node in Netica with new coordinates
      • Franz Lisp / AllegroGraph – Netica API
          • Master Influence Graph
          • Ingesting data and applying evidence to the network
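The GraphViz layout round trip described on the Implementation slide could look roughly like the Python sketch below. It is an assumption-laden illustration, not the authors' Jython code. The helper names are hypothetical, the `dot` executable must be installed for `layout` to run, and node names are assumed to contain no spaces. Extracting nodes from Netica and writing coordinates back are not shown, because the slide does not describe those calls.

```python
import subprocess


def to_dot(nodes, links):
    """Render node names and (parent, child) links as a GraphViz DOT digraph."""
    lines = ["digraph BN {"]
    lines += [f'  "{name}";' for name in nodes]
    lines += [f'  "{parent}" -> "{child}";' for parent, child in links]
    lines.append("}")
    return "\n".join(lines)


def parse_plain(plain_text):
    """Extract {node_name: (x, y)} from `dot -Tplain` output.

    In the plain format each node line reads:
    node <name> <x> <y> <width> <height> <label> ...
    """
    coords = {}
    for line in plain_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "node":
            coords[parts[1]] = (float(parts[2]), float(parts[3]))
    return coords


def layout(nodes, links):
    """Run GraphViz `dot` and return the node coordinates it computed."""
    result = subprocess.run(
        ["dot", "-Tplain"],
        input=to_dot(nodes, links),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_plain(result.stdout)
```

The plain output format reports positions in inches with the origin at the lower left, so coordinates would probably need scaling, and possibly a vertical flip, before being applied to another tool's canvas.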
  • 22. Limitations & Future Enhancements
      • Limitations
          • Binary nodes
          • Limited (but extendable) set of patterns
      • Future Enhancements
          • Strength of mitigation / relevance
          • Richer set of qualitative statements
          • Additional CPT models (NoisyAnd, …)
          • Global / local mitigation
          • Visual editor
  • 23. Conclusion
      • For a large set of information fusion problems, it is possible to shortcut the ‘humongous knowledge engineering challenge’ by automatically generating a usable Bayesian Network from English-like qualitative statements (and a few numbers).
      • The resulting Bayesian Network is immediately useful and can be a starting point for further knowledge refinement.
  • 24. Thank You
      • Contact us: info@haystax.com
      • Visit us: www.haystax.com
      • 8251 Greensboro Drive, Suite 1111, McLean, VA 22012