Processing events in probabilistic risk assessment
9th International Conference on Semantic Technologies for Intelligence, Defense, and Security (STIDS). November 20, 2014
Annotated presentation—see Notes Page view.
Three event-informed person risk models
1. MC (“Carbon”): Information disclosure risk
   - Belief that a (candidate) member person P will disclose an organization’s private information
   Life (“macro”) events
   - Education, employment
   - Crime, civil judgment
   - Bankruptcy, credit
   - …
2. MS (“Silicon”): IT system insider exploitation risk
   - Belief that a user will access, disclose, or destroy an organization’s computer network-resident information
   Computer network (“micro”) events
   - Log in after hours
   - Access “decoy” file
   - Copy file to…
     - External location
     - Thumb drive
3. MG = MC • MS
NOTE: Carbon and Silicon are names of Haystax Analytic Products
Theme
Issue: Apply event evidence to person attribute concept random variables (RVs) in a risk assessment Bayesian network (BN), modeling events’ changing relevance over time.
Given:
- Person P
- Events E, in P’s past or present
- Generic person BN B
- Risk-related person attribute concept RVs (Boolean)
- Concept-relating probabilistic influences
- A reference time t (in an ordered set T of such points)
Develop:
- Person-specific BN BP reflecting E
- Beliefs in P’s attribute concept at t, per BP
- (P’s historical risk profile over T)
Elided B with ingested event categories (MC)
[Figure: BN excerpt. Concept nodes shown: Trustworthy, Reliable, CommitsMisdemeanor, CommittedToSchool, …, CommittedToCareer. Ingested event categories: school events, employment events, law enforcement events, …]
Approaches to realizing BP
1. Event “ingestion”: For each event e in E, …
   - Include a new event RV δ indicating person attribute concept π in BP
   - Specify per-event half-life decay as a new temporal relevance RV ρ
   - Enter a hard evidence finding on δ
   - Appropriate when events of a given type τ are individually salient
   - Feasible when |E| << |nodes(B)|
[Diagram: ingestion BN fragment with nodes π (concept), ρ (relevance), and δ (event)]
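The per-event half-life decay named in this slide can be illustrated with a plain exponential weight. This is a minimal sketch, assuming relevance halves every half-life interval and that an ongoing or just-ended event has full relevance; the function name `temporal_relevance` is hypothetical, and the deck does not say how ρ is actually parameterized inside the BN.

```python
def temporal_relevance(delta_days: float, half_life_days: float) -> float:
    """Nominal relevance of an event delta_days after its terminus.

    Assumption: plain exponential decay that halves every half_life_days.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    if delta_days <= 0:
        return 1.0  # assumption: ongoing or just-ended events are fully relevant
    return 0.5 ** (delta_days / half_life_days)


if __name__ == "__main__":
    half_life = 6 * 365  # the 6-year half life that appears in the deck's ingestion rule
    for years in (0, 3, 6, 12):
        print(years, round(temporal_relevance(years * 365, half_life), 3))
```

Under this assumption an event is weighted 0.5 one half-life after its terminus and 0.25 after two.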
Life events timeline (MC)
Three event-informed person risk models
1. MC (“Carbon”): Information disclosure risk
   - 100s of RVs
   - B extracted from official policy / guidelines (under in situ test)
   Life (“macro”) events
   - 10s of types
   - 10s of events / person
   - 10s of years of data
   Ingestion only (“hard” salience)
   - 10s of rules
2. MS (“Silicon”): IT system insider exploitation risk
   - 10s of RVs
   - B eyeballed (preliminary proof of concept)
   Computer network (“micro”) events
   - 10s of types
   - 100Ks of events / person
   - 1.5 years of data
   Summarization, primarily (“soft” salience)
   - 1s of ingestion rules
3. MG = MC • MS
Approaches to realizing BP
2. Event “summarization”: For each event type τ represented in E, …
   - Include an event “summary” RV Δ indicating π in B
   - Develop a likelihood summarizing the impact of events of type τ collected into temporal buckets
   - Enter a likelihood finding on Δ
   - Appropriate when the salience of events of type τ tends to depend on trends w.r.t. an individual or a population thereof
   - Useful when ¬(|E| << |nodes(B)|)
[Diagram: summarization BN fragment with nodes π (concept), ρ (relevance), Δ (summary), and events δ1, δ2, …, δn]
Summarization elements (per RV) 
Summarize events over a practically unlimited duration, by using temporal 
buckets of geometrically increasing size. 
Infer salience from event volume variation w.r.t. a person’s own and the 
population’s history. 
Weight buckets per desired temporal relevance decay.
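As a rough illustration of the bucketing idea, the sketch below counts one event type's occurrences in the most recent n days for bucket sizes that grow geometrically (1, 4, 16, 64, matching the chart legends in this deck). The function names, the default growth factor of 4, and the input format (event ages in whole days before the reference time) are assumptions made for the example.

```python
from typing import Dict, Iterable, List


def bucket_sizes(base: int = 4, count: int = 4) -> List[int]:
    """Geometrically increasing bucket sizes in days: 1, 4, 16, 64 by default."""
    return [base ** i for i in range(count)]


def bucket_counts(event_ages_days: Iterable[int], sizes: List[int]) -> Dict[int, int]:
    """Count events within the most recent n days, for each bucket size n.

    event_ages_days holds, per event, how many whole days before the reference
    time it occurred (0 = the reference day). A bucket labeled n covers ages
    0 through n - 1, so every shorter bucket is nested inside the longer ones.
    """
    ages = list(event_ages_days)
    return {n: sum(1 for age in ages if 0 <= age < n) for n in sizes}


if __name__ == "__main__":
    ages = [0, 0, 2, 3, 10, 40, 70]  # made-up event ages in days
    print(bucket_counts(ages, bucket_sizes()))  # {1: 2, 4: 4, 16: 5, 64: 6}
```

Because the sizes grow geometrically, a handful of buckets covers a long history: ten buckets with factor 4 already span more than 700 years of days.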
MS summarization metric: Count (CopyDecoyToExternal)
[Figure: count (0 to 600) by day, one series per bucket size (1, 4, 16, and 64 days)]
MS summarization metric: Variation re self (CopyDecoyToExternal)
[Figure: variation re self (0 to 1) by day, one series per bucket size (1, 4, 16, and 64 days)]
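The editor's notes for this slide describe the self-variation metric as the difference between adjacent buckets, divided by the shorter bucket's count, normalized to [0, 1] with a sigmoid. The sketch below follows that wording literally. Everything else is an assumption: the function name, the logistic midpoint of 3.0 (the ratio that steady activity would produce when buckets grow by a factor of 4), the steepness, the zero-count handling, and the direction of the score, which the deck does not specify.

```python
import math


def self_variation(shorter_count: int, longer_count: int,
                   midpoint: float = 3.0, steepness: float = 1.0) -> float:
    """Logistic-normalized ratio of 'old' events to 'new' events.

    shorter_count: events in the shorter (more recent) bucket.
    longer_count:  events in the adjacent longer bucket, which contains the shorter one.
    """
    if shorter_count < 0 or longer_count < shorter_count:
        raise ValueError("expected 0 <= shorter_count <= longer_count")
    older = longer_count - shorter_count  # events in the longer bucket only
    if older == 0:
        ratio = 0.0
    elif shorter_count == 0:
        return 1.0  # assumption: treat an unbounded ratio as the maximum score
    else:
        ratio = older / shorter_count
    # Clamp the exponent so math.exp cannot overflow for extreme parameters.
    exponent = max(-50.0, min(50.0, -steepness * (ratio - midpoint)))
    return 1.0 / (1.0 + math.exp(exponent))
```

With these defaults, steady activity across a 4-day and a 16-day bucket (for example 10 and 40 events) scores 0.5, and the score moves away from 0.5 as recent activity departs from the longer-run level.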
MS summarization metric: Variation re all (CopyDecoyToExternal)
[Figure: variation re all (0 to 1) by day, one series per bucket size (1, 4, 16, and 64 days)]
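The editor's notes say only that variation with respect to the population is "a simple comparison against a computed statistic (e.g., mean)". One way such a comparison could look is sketched below; the choice of the mean as the statistic, the scaling by the population standard deviation, the logistic squash, and the name `population_variation` are all assumptions rather than the authors' method.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def population_variation(person_count: float, population_counts: Sequence[float]) -> float:
    """Score in [0, 1] for how one person's bucket count compares with the population.

    Assumptions (not from the deck): compare against the population mean, scale
    by the population standard deviation, and squash with a logistic function.
    """
    if not population_counts:
        raise ValueError("population_counts must not be empty")
    center = mean(population_counts)
    spread = pstdev(population_counts) or 1.0  # fall back to 1.0 when all counts are equal
    z = (person_count - center) / spread
    z = max(-50.0, min(50.0, z))  # keep math.exp within range
    return 1.0 / (1.0 + math.exp(-z))
```

A person whose count equals the population mean scores 0.5; counts above the mean score higher, and counts below it score lower.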
MS summarization metric: Variations mean (CopyDecoyToExternal)
[Figure: variations mean (0 to 1) by day, one series per bucket size (1, 4, 16, and 64 days)]
MS summarization metric: Suspicion warrant (CopyDecoyToExternal)
[Figure: suspicion warrant (0 to 1) by day, days 1 to 63]
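Per the editor's notes, the two variations are averaged per bucket and the per-bucket means are then collapsed into a single suspicion warrant score for the day, with buckets weighted according to the desired temporal relevance decay. The sketch below uses a weighted average with half-life weights on bucket size. The weighted-average form, the 16-day half life, and the function name are assumptions; the deck does not give the collapse formula.

```python
from typing import Dict


def suspicion_warrant(self_var: Dict[int, float], pop_var: Dict[int, float],
                      half_life_days: float = 16.0) -> float:
    """Collapse per-bucket variations into one score in [0, 1].

    self_var and pop_var map bucket size in days to a variation in [0, 1].
    Assumption: each bucket's weight halves for every half_life_days of bucket size.
    """
    if not self_var or self_var.keys() != pop_var.keys():
        raise ValueError("both inputs need the same, non-empty set of bucket sizes")
    weighted_sum = 0.0
    weight_total = 0.0
    for size in sorted(self_var):
        variations_mean = (self_var[size] + pop_var[size]) / 2.0
        weight = 0.5 ** (size / half_life_days)
        weighted_sum += weight * variations_mean
        weight_total += weight
    return weighted_sum / weight_total
```

The resulting score would then be entered as a likelihood finding on the summary RV Δ for that day.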
Computer network events timeline (MS)
Influence graph specification (MS)
(defparameter *Influences*
  '((ExploitsITSystemAsInsider
     (:ImpliedByDisjunction
      (CommitsITExploitation
       (:ImpliedBy (DestroysInformationUnauthorized)
                   (AccessesInformationUnauthorized)    ; Ingested: HandlesKeylogger_Event
                   (DisclosesInformationUnauthorized)   ; Ingested: CopyFileToWikileaks_Event
                   (StealsInformation)))                ; Ingested: CopyFileToCompetitor_Event
      (WarrantsITExploitationSuspicion
       (:ImpliedBy (WarrantsInformationDestructionSuspicion
                    (:IndicatedBy (:Strongly (DeleteFileOnOthersPC_Summary))
                                  (:Moderately (DeleteFileOnLabsPC_Summary))))
                   (WarrantsUnauthorizedInformationAccessSuspicion
                    (:IndicatedBy (:Moderately (AfterHoursLogin_Summary))
                                  (:Weakly (OpenFileOnOthersPC_Summary))))
                   (WarrantsUnauthorizedInformationDisclosureSuspicion
                    (:IndicatedBy (:Strongly (CopyOthersFileToThumb_Summary)
                                             (CopyDecoyToExternal_Summary))
                                  (:Moderately (OpenDecoyFile_Summary)
                                               (AcquireDecoyFile_Summary)
                                               (CopyFileToExternal_Summary))
                                  (:Weakly (CopyFromThumbToOwnPC_Summary)
                                           (CopyOwnFileToThumb_Summary)
                                           (CopyOthersFileToExternal_Summary)))))
       (:RelevantIf (:Locally (:Absolutely (Untrustworthy))))
       (:MitigatedBy (:Locally (:Strongly (HasRole-ITAdmin)))))))))
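To make the shape of this specification easier to see, the sketch below re-encodes its :IndicatedBy branches as nested Python data and flattens them into (concept, strength, summary RV) triples. The dict layout and the function name are hypothetical; the deck's own representation is the quoted Lisp list, and how that list is compiled into BN conditional probabilities is not shown.

```python
from typing import Dict, List, Tuple

# The :IndicatedBy branches of the MS influence graph, transcribed from the Lisp spec.
INDICATED_BY: Dict[str, Dict[str, List[str]]] = {
    "WarrantsInformationDestructionSuspicion": {
        "Strongly": ["DeleteFileOnOthersPC_Summary"],
        "Moderately": ["DeleteFileOnLabsPC_Summary"],
    },
    "WarrantsUnauthorizedInformationAccessSuspicion": {
        "Moderately": ["AfterHoursLogin_Summary"],
        "Weakly": ["OpenFileOnOthersPC_Summary"],
    },
    "WarrantsUnauthorizedInformationDisclosureSuspicion": {
        "Strongly": ["CopyOthersFileToThumb_Summary", "CopyDecoyToExternal_Summary"],
        "Moderately": ["OpenDecoyFile_Summary", "AcquireDecoyFile_Summary",
                       "CopyFileToExternal_Summary"],
        "Weakly": ["CopyFromThumbToOwnPC_Summary", "CopyOwnFileToThumb_Summary",
                   "CopyOthersFileToExternal_Summary"],
    },
}


def indicator_triples(spec: Dict[str, Dict[str, List[str]]]) -> List[Tuple[str, str, str]]:
    """Flatten the nested spec into (concept, strength, summary RV) triples."""
    return [
        (concept, strength, summary_rv)
        for concept, by_strength in spec.items()
        for strength, summary_rvs in by_strength.items()
        for summary_rv in summary_rvs
    ]


if __name__ == "__main__":
    for triple in indicator_triples(INDICATED_BY):
        print(triple)
```

A flat list like this makes it easy to check which summary RVs exist and how strongly each is declared to indicate its concept.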
Computer network events timeline (MS)
Combined timeline (MG = MC • MS)
Ingestion issue: Interacting temporal relevance nodes
Temporal relevance nodes participate in belief propagation in BP, so their beliefs (and hence the effective temporal relevance) can depart from the nominal specification.
Relevance nodes of multiple temporally and/or semantically close events reinforce each other, inducing temporal relevance beyond the nominal specification.
- 5 simultaneous events decay only 6% after one half-life interval.
- We might naively expect 50%.
Summarization largely insulates a temporal relevance node from surrounding belief propagation.
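A toy calculation can give some intuition for why several events decay more slowly together than one does alone. The sketch below combines independent pieces of support with a noisy-OR. This is an illustrative stand-in chosen for simplicity, not the deck's BN, and it does not reproduce the 6% figure above, which comes from belief propagation in the full person-specific network.

```python
from typing import Iterable


def noisy_or(probabilities: Iterable[float]) -> float:
    """Probability that at least one of several independent causes is active."""
    remaining = 1.0  # probability that none of the causes is active
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        remaining *= 1.0 - p
    return 1.0 - remaining


if __name__ == "__main__":
    # One event whose support has decayed to half: combined support is 0.5.
    print(noisy_or([0.5]))
    # Five such events: combined support is still about 0.97.
    print(noisy_or([0.5] * 5))
```

In this toy, each event has individually lost half its support, yet the combined support has dropped by only about 3%, which is the same qualitative effect the slide reports.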
Supporting software “stack”
Allegro Common Lisp® (ACL)
AllegroGraph® Lisp direct client
Allegro Prolog macros (e.g., select) 
Lisp macros (e.g., iterate-cursor) 
ACL API to the Netica® API 
Netica® API
Ingestion rule (MC)
(defIngestionRule RestrainingOrder
  (+process-reportedEvent ?person ?*asOfDate)
  (reportedEvent ?person
                 ?*asOfDate
                 ?event
                 !agent:ProtectiveRestrainingOrder
                 ?*startDate
                 ?*endDate
                 ?*ongoing?
                 ?*reportDate)
  (lisp (create-EventConceptIndication
         ?person
         :IndicatedConcept CommitsDomesticViolence
         :+IndicatingEvent ?event
         :Terminus :end
         :DeltaDays (- ?*asOfDate ?*endDate)
         :HalfLife (* 6 365)
         :Strength :strong
         :Polarity :positive)))
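For readers less familiar with Allegro Prolog, the record this rule builds can be approximated in Python. The sketch below is only an analogy: the dataclass, the function name, and the use of `datetime.date` arithmetic for the day difference are assumptions, and the identifiers passed in the usage example are made up. It mirrors the keyword arguments passed to create-EventConceptIndication but does not touch a BN.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventConceptIndication:
    """Python stand-in for the arguments of create-EventConceptIndication."""
    person: str
    indicated_concept: str
    indicating_event: str
    terminus: str
    delta_days: int
    half_life_days: int
    strength: str
    polarity: str


def restraining_order_indication(person: str, event: str,
                                 end_date: date, as_of_date: date) -> EventConceptIndication:
    """Build the indication the RestrainingOrder rule describes, as of a given date."""
    return EventConceptIndication(
        person=person,
        indicated_concept="CommitsDomesticViolence",
        indicating_event=event,
        terminus="end",
        delta_days=(as_of_date - end_date).days,  # days elapsed since the event ended
        half_life_days=6 * 365,
        strength="strong",
        polarity="positive",
    )


if __name__ == "__main__":
    # Hypothetical identifiers and dates, for illustration only.
    print(restraining_order_indication("data:P", "data:PRestrainingOrder",
                                       date(2010, 1, 1), date(2014, 1, 1)))
```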
Ontology and data specifications (MC)
(defOntologyInstance !data:P (Person))
(defOntologyInstance !data:PHighSchoolAttendance
  (SchoolAttendance)
  (riskRatingSubject !data:P)
  (schoolCredentialAward !data:PDiplomaAward)
  (startDate "2000-09-04")
  (endDate "2004-06-15"))
(defOntologyInstance !data:PDiplomaAward
  (SchoolCredentialAward)
  (riskRatingSubject !data:P)
  (startDate "2004-06-15")
  (schoolCredentialAwarded HighSchoolDiploma))
(defOntologyInstance !data:PEmployment
  (Employment)
  (riskRatingSubject !data:P)
  (startDate "2004-07-05")
  (endDate "2009-09-05"))
(defOntologyInstance !data:PMisdemeanorAssault
  (PoliceOffense)
  (riskRatingSubject !data:P)
  (offenseChargeSchedule Misdemeanor)
  (startDate "2007-06-30"))
(defOntologyClass Person (Thing)
  (hasGender Gender :Functional))
(defOntologyClass Gender (Thing)
  (:enumeration Male Female OtherGender))
(defOntologyType Date !xsd:date)
(defOntologyClass Event (Thing)
  (riskRatingSubject Person :Functional)
  (startDate Date (:cardinality 1))
  (endDate Date :Functional)
  (sourceReport Report :Functional))
(defOntologyClass PointEvent (Event)
  (hasConsequentEvent Event))
(defOntologyClass DurativeEvent (Event)
  (hasSubEvent Event))
(defOntologyClass ProtectiveRestrainingOrder (PointEvent))
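The instance data above can be mirrored in a few lines of Python to show how a reference time t selects the events in P's past or present. This is a sketch for illustration: the `Event` dataclass and the `events_as_of` helper are hypothetical and are not part of the authors' ontology tooling; only the event names and dates come from the slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    name: str
    subject: str
    start_date: date                  # exactly one start date, as in (:cardinality 1)
    end_date: Optional[date] = None   # at most one end date, as in :Functional


P_EVENTS: List[Event] = [
    Event("PHighSchoolAttendance", "P", date(2000, 9, 4), date(2004, 6, 15)),
    Event("PDiplomaAward", "P", date(2004, 6, 15)),
    Event("PEmployment", "P", date(2004, 7, 5), date(2009, 9, 5)),
    Event("PMisdemeanorAssault", "P", date(2007, 6, 30)),
]


def events_as_of(events: List[Event], as_of: date) -> List[Event]:
    """Events that had started on or before the reference time, oldest first."""
    started = [event for event in events if event.start_date <= as_of]
    return sorted(started, key=lambda event: event.start_date)


if __name__ == "__main__":
    for event in events_as_of(P_EVENTS, date(2005, 1, 1)):
        print(event.name, event.start_date, event.end_date)
```

As of 2005-01-01 the misdemeanor charge has not yet occurred, so only the school attendance, diploma award, and employment events would be available for ingestion.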
Thank you.
Questions?
Extras…
Approaches to realizing BP
1. Event “ingestion”: For each event e in E, …
   - Include a new event RV δ indicating person attribute concept π in BP
   - Specify per-event half-life decay as a new temporal relevance RV ρ
   - Enter a hard evidence finding on δ
   - Appropriate when events of a given type τ are individually salient
   - Feasible when |E| << |nodes(B)|
2. Event “summarization”: For each event type τ represented in E, …
   - Include an event “summary” RV Δ indicating π in B
   - Develop a likelihood summarizing the impact of events of type τ collected into geometrically larger buckets
   - Enter a likelihood finding on Δ
   - Appropriate when the salience of events of type τ tends to depend on trends w.r.t. an individual or a population thereof
   - Needed when ¬(|E| << |nodes(B)|)
Approaches to realizing BP
[Diagram: ingestion BN fragment with nodes π (concept), ρ (relevance), and δ (event); summarization BN fragment with nodes π (concept), ρ (relevance), Δ (summary), and events δ1, δ2, …, δn]
BN fragment patterns
[Diagram: ingestion (π, ρ, δ) and multi-ingestion (π, ρ, Δ, δ1, δ2, …, δn), the latter a bridge to summarization]
Life events timeline (MC)
MS summarization metric: Count (CopyDecoyToExternal): event type instance count
MS summarization metric: Variation re self (CopyDecoyToExternal): event type historical variation re self
MS summarization metric: Variation re all (CopyDecoyToExternal): event type historical variation re all
MS summarization metric: Suspicion warrant (CopyDecoyToExternal): event type summary RV likelihood (suspicion warrant)

Editor's Notes

  • #2: This work has been conducted at Haystax Technology’s headquarters in McLean, VA USA.
  • #3: Our paper begins with some generic motivation for person risk assessment and then moves to actual models we’ve built. We’ll try to make it real right away in this talk. Our model “Carbon” (because we are assessing risk for carbon-based life forms) assesses information disclosure risk based on a person’s life (or “macro”) events. Our model “Silicon” (because computers are silicon-based) assesses IT system exploitation risk based on a person’s computer network (or “micro”) events. Carbon is older and more mature (under deployment). Silicon is an exploratory proof of concept. MG (no brand name, yet) is an early exercise.
  • #4: To formalize things (just) a little bit, we have this problem statement. We’ll jump right into a Carbon example with a historical risk profile.
  • #5: Our risk assessment model core is a Bayesian network (BN) using binary random variables (top). The full Carbon BN includes too many nodes to show here. As our top-level proxy for information disclosure risk, we offer the person attribute concept Trustworthy. Person attribute concepts serve both as hypotheses and as indicators. BN arcs point in the causal direction, from hypotheses to indicators. Indication strength is suggested by line thickness, polarity by other line format. BN is per qualitative specification, forthcoming, of which this is a graphical rendition. Carbon “ingests” events (as suggested at bottom) to realize a person-specific BN.
  • #6: A concept like CommitsMisdemeanor is indicated by an event like MisdemeanorAssault.
  • #7: The (horizontal) time axis does double duty, for events (bars) and beliefs (lines). Hint: compare most rapidly changing beliefs to events (visual sensitivity analysis). We have applied the model to develop beliefs at the plotted time points. Note how beliefs in CommittedToSchool and CommittedToCareer tend to build while the related (HighSchoolAttendance and Employment) events are ongoing. Influence interactions in B cause belief in CommittedToCareer to grow even while P is still in high school. (We tend to believe that someone who does well in school will also do well in a career.) Belief in CommittedToSchool increases when P graduates but then becomes less relevant per half lives specified in ingestion rules for school-related events. The 2007 MisdemeanorAssault charge decreases belief in all the other, positive concept RVs. See also Lisp macro calls expressing associated event data, forthcoming. Questions? (We will see this format again.) Source: STIDS-2014-A
  • #8: Our Carbon and Silicon risk models have been driven by qualitatively different event sets. While the event set for Carbon is sensitive, we can tell you that the event set for Silicon is a synthetic insider threat dataset that had been generated by US CERT for DARPA’s ADAMS program. Because Silicon must address so many more events—and because these events have qualitatively different salience—we invented a different scheme to specialize the BN w.r.t. a given person’s events.
  • #9: Remember what kinds of events Silicon is dealing with…
  • #10: That is, we use a fixed BN with “summary” RVs for the different event types.
  • #11: In summarization, we attain three objectives simultaneously: event volume compaction, event temporal relevance decay, and event volume variation characterization.
  • #12: The paper’s figure 4 exhibits key metrics for a US CERT dataset person. Here, we have contrived data for a given event type (for a made-up person) to have a linear increase after 33 days (of 64 plotted). Each temporal bucket labeled n counts events in only the most recent n days.
  • #13: The difference between adjacent buckets tells us what is “new” vs. “old” in the longer bucket. The ratio of this difference to the shorter bucket tells us this person’s variation w.r.t. this bucket size. We normalize this ratio to the range [0, 1] using a sigmoid function. Note that different-sized buckets have different derivatives w.r.t. time.
  • #14: Variation w.r.t. the population uses a simple comparison against a computed statistic (e.g., mean).
  • #15: We average the foregoing variations on our way to informing Big Delta (Δ). (Alternatively, we could include separate RVs in B.)
  • #16: Finally, we collapse variation means for all the buckets into a single “suspicion warrant” score for a given day and enter this as a likelihood finding on Big Delta (Δ).
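The last two steps (averaging the variations, then collapsing across buckets and entering a likelihood finding) might look roughly like the sketch below. The plain arithmetic means, the mapping of the score to a likelihood vector (score, 1 - score), and the function names are all assumptions. The update shown is Bayes' rule for a single Boolean variable in isolation, whereas in the real network the finding on Big Delta (Δ) would propagate through B.

```python
def suspicion_warrant(person_var, population_var):
    """Collapse per-bucket variations into one score in [0, 1].

    Both arguments map bucket size -> variation in [0, 1]. Each bucket's
    two variations are averaged, then the bucket means are averaged
    (assumed collapse rule; the notes do not specify one).
    """
    per_bucket = [(person_var[n] + population_var[n]) / 2.0 for n in person_var]
    return sum(per_bucket) / len(per_bucket)


def apply_likelihood_finding(prior_true, score):
    """Posterior P(true) for a Boolean RV given likelihoods (score, 1 - score)."""
    numerator = prior_true * score
    denominator = numerator + (1.0 - prior_true) * (1.0 - score)
    if denominator == 0.0:
        raise ValueError("finding is incompatible with the prior")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical variations for two bucket sizes.
    score = suspicion_warrant({1: 0.9, 2: 0.7}, {1: 0.8, 2: 0.6})
    print(score, apply_likelihood_finding(0.1, score))
```

A score of 0.5 leaves the prior unchanged under this mapping, which is one reason to prefer a likelihood finding over hard evidence for a graded score.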
  • #17: This is just a reminder…
  • #18: All that leads to this belief timeline plot for our made-up person.
  • #19: From the appendix (topic of a future paper…)
  • #20: Compare this to the following slide about MG, where we link Carbon and Silicon via the concepts Trustworthy and Untrustworthy. Keep your eye on the (red) belief line for Silicon’s top-level concept, ExploitsITSystemAsInsider. (It’s going to turn umber…)
  • #21: …and get a lot steeper at the marked point—because we have given this person some life events that affect not only his trustworthiness but (in MG) also our belief that he exploits the IT system as an insider. So, we have presented some results from each of our three models.
  • #22: The paper discusses strengths and weaknesses in ingestion’s and summarization’s different treatments of temporal relevance. We’ve reduced that to one “issue” slide here—so that we can say more at this venue about our supporting semantic technology.
  • #23: We exploit semantic technology for event ontology definition and application, and for ingestion rule definition. AllegroGraph® is an RDF triple store management system that happens to be written in Allegro Common Lisp®. While Franz supports AllegroGraph® clients for a number of different languages, the direct (vs. remote) Lisp client benefits us in that it shares memory with AllegroGraph® itself. Allegro Prolog®, written in and included in Allegro Common Lisp®, is a logic programming facility that the Lisp direct client extends with Lisp macros and Prolog predicates affording access (alternatively to SPARQL) to AllegroGraph® triple stores.
  • #24: Because Allegro Prolog® supports calls to Lisp functions from within logic programming rules, our ingestion rules can invoke the Allegro Common Lisp® API to the Netica® API to augment an existing generic person Bayesian network (BN) model B, adding random variables (RVs) corresponding to a person P’s events E and yielding a person-specific BN BP. defIngestionRule is a macro wrapping Allegro Prolog® <-, registering the ingestion rule and performing static analysis to ensure well-formedness. We call +process-reportedEvent, the ingestion rule predicate, to launch the ingestion process for a given person as of a given time. When reportedEvent succeeds, our create-EventConceptIndication is called with the bound logic variables. Allegro Prolog® includes predicate-level functors supporting logical operations (e.g., and, or), backtracking control (varieties of if, cut), and Lisp calls evaluated at predicate level for their truth values (i.e., not just executed for side effect, as here). Under AllegroGraph®’s direct Lisp client, user-defined Allegro Prolog® rules (and so ingestion rules and their supporting predicates) may include any RDF resources (i.e., URIs) or literals in their heads or bodies. So, the language of ingestion rules is relatively expressive.
  • #25: With its signature treatment of programs as data (both expressed as lists), Lisp has long been a favorite language for creating embedded knowledge representation languages and supporting utilities. We exploit this facility in designing our models’ ontologies for person-related events—using Lisp macros to express class, property, and individual (instance) definitions. Macro calls here add triples to a specified graph in an active store. Store-resident triples may be serialized to a standard OWL file in (e.g.) RDF/XML format, then viewed in an available ontology browser (e.g., Protégé). For a specified class (e.g., Person), an object or datatype property (e.g., hasGender or startDate) is created per the type (e.g., Gender or Date) specified. OWL closed enumeration classes (e.g., Gender) are supported, as are OWL property types (e.g., Functional) and restrictions (e.g., cardinality). Validation machinery ensures a specified ontology’s global consistency with respect to effective cardinalities allowed. The framework validates any loaded dataset with respect to declared subject and object classes, literal data types, and property types (e.g., Functional) and restrictions (e.g., cardinality).
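To make the dataset validation idea concrete, here is a minimal sketch of one such check: flagging subjects that carry more than one value for a property declared Functional. It uses plain Python tuples rather than AllegroGraph®, treats distinct values as distinct individuals (a closed-world style assumption), and every name and sample triple in it is hypothetical.

```python
from collections import defaultdict


def functional_violations(triples, functional_properties):
    """Return {(subject, property): values} wherever a Functional property
    has more than one distinct value for the same subject.

    triples is an iterable of (subject, property, object) tuples.
    """
    values = defaultdict(set)
    for subject, prop, obj in triples:
        if prop in functional_properties:
            values[(subject, prop)].add(obj)
    return {key: objs for key, objs in values.items() if len(objs) > 1}


if __name__ == "__main__":
    # Hypothetical data: person1 wrongly has two hasGender values.
    data = [
        ("person1", "hasGender", "Male"),
        ("person1", "hasGender", "Female"),
        ("person2", "hasGender", "Female"),
        ("person2", "startDate", "2007-01-01"),
    ]
    print(functional_violations(data, {"hasGender"}))
```

Checks for declared subject and object classes or literal data types would follow the same pattern: group the loaded triples by the declaration they must satisfy, then report the groups that do not conform.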
  • #28: For comparison’s sake…
  • #29: For comparison’s sake…
  • #31: STIDS-2014-A