Ontology-based Adaptive Systems of Cyber Defense
Noam Ben-Asher∗‡, Alessandro Oltramari†, Robert F. Erbacher∗, Cleotilde Gonzalez†
∗U.S. Army Research Laboratory Adelphi, MD, USA
nbenash@us.ibm.com, robert.f.erbacher.civ@mail.mil
†Carnegie Mellon University Pittsburgh, PA, USA
aoltrama@andrew.cmu.edu, coty@cmu.edu
‡IBM T.J.Watson Research Center, Yorktown Heights, NY
Abstract—In this paper we outline a holistic approach
for understanding and simulating human decision making in
knowledge-intensive tasks. To this purpose, we integrate semantic
and cognitive models in a hybrid computational architecture. The
contribution of the paper is twofold: first we describe a packet-
centric ontology to represent network traffic. We show how the
ontology is used to describe real-world network traffic and also
serve as a basis for higher level ontologies of cyber operation,
threat and risk. Second, we demonstrate how the combination
of the packet-centric ontology with an adaptive cognitive agent
with learning capabilities, can be used to understand the human
defender reasoning processes when monitoring network traffic.
Through simulation experiments we evaluated the proposed
hybrid computational architecture and demonstrate its ability
to successfully detect malicious port scanning within legitimate
network traffic. We discuss the implications of these findings
for improving our understanding of the cognitive processes and
knowledge requirements of the cyber defender, as well as the
possible use of the hybrid architecture as a cognitively inspired
decision support tool.
I. INTRODUCTION
Disruption of computers and the loss of sensitive infor-
mation through cyber-attacks are becoming a widespread
threat and a critical concern for citizens, organizations, and
governments. Even with recent advances in information and
network security and the development of new monitoring
and threat detection tools, many of the tasks performed by
cyber-defenders (i.e., security analysts) remain challenging,
resulting in weak and uncertain cyber-defense. The analytical
capabilities of the human decision maker are needed and indis-
pensable for the process of cyber-defense [1]. Security analysts
transform network traffic data into cyber situation awareness, a
high level of processing that is difficult to automate [2]. This
process may be seen as analogous to the Data-Information-
Knowledge-Wisdom (DIKW) hierarchical model that is central
for information and knowledge management [3]. Within this
context, cognition serves as the driver that governs the transi-
tions between the different levels of information representation
[4]. While there is a large body of research on technologies
that detect port scanning [5], there is a limited understanding
of the cognitive processes cyber security analysts use to detect
port scanning and specifically how these cognitive abilities
interact with and information representation. In this regard,
the contribution of this paper is twofold: first we describe a
packet-level ontology that represents network traffic. Second,
we demonstrate how the integration of this ontology with a
computational cognitive agent can be used to understand the
human analyst reasoning process, which may then serve as
guide to develop decision support technology for the analyst.
II. KNOWLEDGE MODEL
From a cyber security standpoint, variations in network traf-
fic are the primary prompts of analyst’s behavioral responses;
nevertheless, full situational awareness can emerge only from
a projection of observations and decisions into a more com-
prehensive context that includes knowledge about threat and
attack types, executable defensive maneuvers, system vulner-
abilities, risk mitigation and time constraints, among others.
In this regard, building a rigorous model of this complex
context is a key requirement for the study of human decision
making in cyber security. Computational ontologies are the
knowledge component in this holistic approach, as they can
provide a machine-readable semantic representation of cyber
scenarios. In virtue of their logical properties and schematic
structure, ontologies can be used by automatic reasoners in
dynamic tasks: in particular, in our work we apply ontology-
based reasoning to a detection task, where an agent simulates
a human analyst’s cognitive capabilities, including the capa-
bility of using domain knowledge and temporal information
to reason about perceived events [6]. To this purpose, we
engineered a packet-centric ontology of network traffic, a
module of a larger ontology framework called CRATELO [7],
the suite of modular ontologies under development in the U.S.
Army Research Laboratory Cyber Security Collaborative Al-
liance. CRATELO is constituted of several domain ontologies
(collectively indicated as OSCO), integrated on the basis of
DOLCE top level [8] extended with a security-related middle
layer. These top, middle and domain level ontologies currently
add up to 330 classes, connected by 162 relationships (132
object properties and 30 datatype properties) and encoded in
OWL-DL. The packet-centric ontology presented in this paper,
henceforth abbreviated to PACO, is a partition of OSCO1
.
Our reseach efforts in developing CRATELO are inspired by
Obrst and colleagues’s proposal of a wide-ranging ontology
framework of cyber security [9], that spans from top-level,
system-oriented ontologies and human factors ontologies. In
1CRATELO stands for ‘Three Levels Ontology for the ARL
Collaborative Research Alliance’. OSCO stands for ontology of
cyber operations. For more details about the program see also:
http://www.arl.army.mil/www/default.cfm?page=1417
this long-term endeavour, we have been working with ARL
domain experts and cyber analysts to distill the necessary
knowledge of the cyber domain. As the state of the art shows,
a preliminary step in understanding any new domain is to
produce accessible definitions and classifications of entities
[10]: discussions on cyber security often begin with the
difficulties created by misused terminology (such as char-
acterizing cyber espionage as an attack). In this regard, the
Joint Chiefs of Staff created a list of cyber term definitions
(allegedly extended and refined for a classified version). None
of these definitions, however, were formulated as an ontology.
Likewise, various agencies and corporations (NIST, MITRE,
Verizon) have formulated enumerations of types of malware,
vulnerabilities, and exploitations. In particular MITRE, which
has been very active in the field, maintains two dictionar-
ies, CVE (Common Vulnerabilities and Exposure) and CWE
(Common Weakness Enumeration), a classification of attack
patterns (CAPEC - Common Attack Pattern Enumeration and
Classification), and an XML-structured language to represent
cyber threat information (STIX - Structure Threat Information
Expression).
Despite of the important role played by these and further
initiatives, the lack of a shared formal semantics make ter-
minologies hard to define, sustain, and port into a machine-
processable format: here we try to overcome these problems,
embracing a holistic approach to model cyber security factors.
In fact, if the ontology outlined in this paper is tailored to a
packet-centric model of network traffic, it can be framed at a
higher level of conceptualization by means of the integration
with CRATELO: for instance, when modeling the behavior of
a cyber analysts during an attack, packets can be seen as parts
of the evidence collection process, and specific attributes of
packets (e.g. internal or external IP addresses, low or high
packet rate, etc.) may hint to specific intentions of the adver-
sary (also called anti-goals). As mentioned at the beginning of
the section, ontologies can serve as knowledge bases to agents:
conversely, the dynamics of the agent’s decision process and
learning from experience are captured by an Instance-based
Learning (IBL) cognitive model [11], which is a computational
representation of the processes that guide human behavior.
Next section reviews what cognitive models are, and how they
can be used to study human decision making.
III. COGNITIVE MODEL
In a dynamic decision making setting, cognitive architec-
tures, such as ACT-R [12], SOAR [13] and others, have
been commonly used to provide an integrated representation
of human cognition. Cognitive models, constructed using
these architectures, allow for a careful examination of various
cognitive processes that drive human decision making [11].
Cognitive models based on IBL theory (IBLT) focus on
decision making and learning from experience in dynamic
settings [11]. IBLT emerging from ACT-R, proposes a generic
decision-making process that recognizes decision situations,
generates instances through the interaction with the decision
task, and finishes with reinforcement of the instance leading
to desired outcomes. According to IBLT, the decision maker
represents decision making situations as instances stored in
memory. An instance is composed of three parts: (1) situation
(S) a set of attributes representing a situation; (2) decision (D)
that is made in the particular situation; and (3) utility (U) that is
the experienced outcome from a decision. The IBLT decision
cycle includes several stages: recognition, judgment, choice,
and execution. In the Recognition stage, a decision maker
identifies relevant attributes for a specific decision situation.
Judgment stage determines the relevancy of past experiences
(instances) in current decision making situation. The activation
of instances in memory is a representation of relevancy. Acti-
vation is influenced by the recency and frequency an instance
occurred in the past and the similarity between the current
decision situation and the situation stored in the instance. This
activation mechanism is a simplification of the mechanism
originally proposed in the ACT-R architecture. Memory ac-
tivation determines the probability that an instance will be
retrieved from memory and participate in the next phase. In
the absence of previous experiences that may be relevant to
the current situation, pre-defined heuristics are triggered for
decision making. In the Choice, the retrieved instances and
their retrieval probability are used to calculate the expected
utility for each of the decision options, and the option with
the highest expected utility is chosen. Finally, in the Execution,
feedback regarding the last decision is provided to the decision
maker [11]. In this work, we chose IBL to model the decision
making as it captures the adaptive human decision making
and learning processes in dynamic environment as well as the
transition between exploration and maximization.
Agents based on IBL models successfully account for
human decision making and behavior in a variety of tasks.
Lejarraga et al. [14] demonstrate that a single IBL model
constructed for a specific repeated binary choice task can be
generalized to different variants of repeated tasks requiring a
binary decision as well as to probability learning tasks. More
specifically, IBL models can reflect human behavior in simple
stimulus-response practice and skill acquisition tasks and train-
ing. Furthermore, the experience-based learning process of an
IBL model was successfully extended to include descriptive
information and biases as risk aversion [15]. A pair of IBL
models successfully consider the dynamics of cooperation in
iterated Prisoner’s Dilemma as well as reciprocity and other
complex social interactions [16], [17].
IV. A PACKET-CENTRIC NETWORK ONTOLOGY
In this section we describe the structure of PACO, and how
it can be used to instantiate thousands of packets generated
by capturing actual network traffic. As Fig. 1 shows, the
class ‘PacketTransmission’ is considered the atomic element
of a ‘NetworkSession’. Intuitively, this means that without an
actual exchange of packets between a source and a destination
node, no network session can be deemed as properly complete.
In fact, there are additional features of network sessions:
for instance, when considering TCP connections, a complete
handshake with SYN, SYN+ACK and ACK packets transmis-
Fig. 1. A Prot´eg´e visualization of PACO. From the bottom-left corner (clockwise): 1) The DL expressivity derived by the HermiT 1.3.8 reasoner; 2) the
backbone taxonomy of classes; 3) an informal definition of packet transmission as value of annotation property; 4) property restrictions.
sion is necessary to enable a packet transmission between two
nodes, though this is not the case for communication protocols
like UDP, where handshake dialogues are not supported. Fol-
lowing the actual packet transmission between the two network
nodes and after the data are exchanged, a session is usually
resetted (although this final stage is not essential to qualify
it as complete - and session can also end due to a timeout).
In summary, when a communication between a source and a
destination node is established, a complete network session
consists of the transmission of a unit of data from source A to
destination B, and of the transmission of a unit of data from
source B to destination A. From the ontological standpoint,
this constraint is represented by the cardinality restriction ‘min
2’ on the object property ‘has member’ holding between ‘Net-
workSession’ and ‘PacketTransmission’ classes, respectively
the domain and the range of ‘has member’.
Apart from network-specific information associated to source
and destination nodes, like IP and port numbers, communi-
cation protocols, packet size, etc., we have introduced a data
property ‘has time stamp’ that assigns a specific time stamp
to each network event and a data property ‘has order’ that
binds each individual network event to its relative position in
a given sequence (the first event, the second event, etc.). This
twofold modeling choice provides us with a flexible model
of temporal knowledge: 1) it pinpoints the discrete temporal
coordinates of each event according to a universal time format
(based on the XML schema specifications2
); 2) it allows for
representing and reasoning over qualitative temporal relations
like ‘before’, ‘after’, and ‘overlap’, as defined by Allen’s
temporal axioms [18]). Figure 2 shows a situation where the
ordinal scale of the packet is captured (i.e., the 1024th
packet)
but the time stamp is not represented: the reason is that the
2http://www.w3.org/TR/xmlschema11-2/
former is more appropriate than the latter for the simulation
experiment reported in the next section, since the dataset was
collected with a rate of about 83 packets per second. In other
words, in our specific cyber scenario knowing the sequence of
events is more meaningful than knowing the real time stamps
from the defender’s perspective, although - to be general
enough - the ontology has to support both representational
formats. As depicted in Fig. 2, the role of a packet in the
handshake sequence can be captured by three booleans data
properties, respectively ‘has tcp.flags.syn’, ‘has tcp.flags.ack’
and ‘has tcp.flags.reset’. In the ‘PacketTrasmission1024’ case,
however it is unclear whether this packet represents the first
stage of a handshake or is part of a port scanning [19]. This
can be resolved by evaluating the properties of the proceeding
packet exchange (i.e., session) between the two nodes. As the
next section will show, we conducted an experiment to elicit
relevant information from instantiated ontology, and make the
resulting knowledge chunks available to the cognitive model
of a cyber defender. This process of knowledge elicitation
from PACO is driven by a set of SPARQL queries3
, properly
designed to extract and present relevant information that an
agent can use to decide whether a specific event is a threat or
not. For instance, the query in Fig. 3 is designed to collect all
the pairs of distinct source and destination ports in the dataset
of network events: on the basis of the retrieved information,
an analyst can gauge the volume of network traffic on a per
unique port basis; moreover, Fig. 4 represents a query built to
assess how many times a given source has sent a packet to a
closed port. In the latter case, the returned result, around one
thousand times, can be used as a clue of the maliciousness of
the source: so many attempts of communication with closed
ports may, in fact, suggest a port scanning attack. Note that
3http://www.w3.org/TR/rdf-sparql-query/
Fig. 2. A Prot´eg´e visualization of a specific instance of the ‘PacketTransmission’ class.
both queries have been used dynamically in the experiment
described in the next section, where the goal is to replicate the
analyst’s incremental understanding of the considered cyber
scenario.
Following a basic modeling strategy, in PACO we directly
assign specific data sizes to each network event through
the data property ‘has frame length’: an alternative option
would have been to introduce the class ‘Packet’ (a subclass
of ‘information object’ in DOLCE), and use the object prop-
erty ‘participation’ to link ‘Packet’ and ‘PacketTransmission’,
switching the domain of the data property ‘has frame length’
from PacketTransmission’ to ‘Packet’. At the current stage of
development, representing the data contents of packet trans-
missions doesn’t add any fundamental benefit to our modeling
framework, although we don’t exclude this option in the future.
Additional semantic structures of PACO concern network
topology and services: for instance, every network node runs
a set of services, and each service uses an official commu-
nication port and a specific protocol to establish a network
session with another node. It follows that when a port is open,
a service is running on a node, and if a port is closed, no
services are currently running for that particular node. Thanks
to the interoperability between PACO and CRATELO, services
can be modeled in the context of user’s actions: for instance,
a system administrator can decide to start or stop an HTTP
service, or access to the event log service on a server. By and
large, the originality of our approach relies on the flexibility in
the granularity of the representation: PACO is only a module
of a more comprehensive framework that sees the detection as
a socio-technical task, where packet-centric information can
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX IBLOd: <http://www.cra.psu.edu/IBLOd#>
SELECT DISTINCT ?srcport ?dstport
WHERE {{?event IBLOd:member IBLOd:NetworkTraffic-041215;
IBLOd:has_src_port ?srcport;
IBLOd:has_dst_port ?dstport;
IBLOd:has_source_node ?s;
IBLOd:has_destination_node ?d;
IBLOd:has_order ?order.
FILTER(?order >= "1"ˆˆxsd:positiveInteger &&
?order<= "4735"ˆˆxsd:positiveInteger).}}
Fig. 3. A SPARQL query that returns all the distinct combinations of source
and destination ports for a packets exchange sequence between two nodes.
be used by the decision maker at the cyber operation level. In
principle, using CRATELO we can also model beliefs, goals
and emotions of defenders and attackers, although it’s beyond
the scope of the current work to address these dimensions.
V. USING HYBRID MODELS IN CYBER DEFENSE
Next, we examine the interplay between knowledge and
cognition in cyber defense by integrating the packet-centric
ontology with cognitive agents who make decisions regarding
the state of a network into a hybrid computational architecture.
For the packet-centric knowledge-base we use PACO and the
agents are computational models of the IBL theory.
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX IBLOd: <http://www.cra.psu.edu/IBLOd#>
SELECT (COUNT(?order) AS ?numberOfACKResponses)
WHERE {?event IBLOd:member IBLOd:NetworkTraffic-041215;
IBLOd:has_source_node ?sn;
IBLOd:has_destination_node ?dn;
IBLOd:has_tcp.flags.syn false;
IBLOd:has_tcp.flags.ack true;
IBLOd:has_tcp.flags.reset true;
IBLOd:has_order ?order.
FILTER (?order >= "1"ˆˆxsd:positiveInteger &&
?order <= "4735"ˆˆxsd:positiveInteger).}
Fig. 4. A SPARQL query that returns the number of times a source node
sent packets to a closed port on the destination node.
A. Port Scanning Scenario
Port scanning is designed to probe network nodes for open
ports. The existence of an open port can provide some indi-
cation on the availability of services. This type of information
gathering can be part of a defensive or offensive operation.
From the attacker’s perspective, a port scan is useful for gather-
ing relevant information for launching a successful attack and
indeed most attacks are preceded by some form of scanning
activity (reconnaissance), particularly vulnerability scanning
[20]. Therefore, the defender will try to detect external scans
while the attacker interest is to perform a scan without being
detected [21].
In this work, we assume first that the attacker uses external
resources to identify the attack IP address (i.e., the target).
Following, the attacker identifies port ranges to scan on the
specific target. These are the ports for services for which the
attacker has sophisticated attacks available. We also assume,
that the target is using standard ports and not randomized
ports. Thus, knowing that a port is open provides an accurate
indication that a service is running on the target.
B. Cognitive Models for Port Scanning Detection
To better understand the interplay between cognition and
knowledge and how semantic information supports the ongo-
ing work of the cyber defender, we developed two cognitive
models for cyber defender agents. Both agents observe a
situation, make decisions whether there is a scan or not,
and learn from feedback and past experiences. However, the
one agent operates without the knowledge based provided by
PACO, while the other is querying PACO to acquire temporal
information and situational awareness.
1) Experience Only Agent: To examine the interplay be-
tween information, cognition and knowledge, we initially con-
structed an agent using an IBL model which classifies network
events based on their attributes and learns from experience
only. The decision making process of this IBL agent depends
on the low level network traffic information, and the agent
could learn only from its own experiences without the ability
to acquire knowledge by querying the ontology. The situation
as observed by the agent in this condition is given by
Si = {p, sIP, dIP, SY N, ACK, RST} (1)
TABLE I
PAYOFF MATRIX WHICH DETERMINING THE FEEDBACK FOR AN AGENT
MAKING A DECISION IN A GIVEN SITUATION
Agent’s Decision
Scan No Scan
Packet Type
Scan Hit: 10 Miss: -10
No Scan False alarm: -5 Correct Rejection: 5
Where p is the protocol type (e.g., TCP, HTTP) of the
packet, sIP and dIP are the source and destination IP
addresses of the packet. SY N, ACK and RST are 1-bit
boolean flags that indicate on the state of a connection.
The agent observed a situation Si and made a decision
which corresponds to classifying a packet as being part of a
scan or not. This decision process involves retrieving relevant
instances (i.e., past experiences) from the agent’s memory,
computing retrieval probability for each of the instances and,
choosing the decision option that yields the highest expected
utility, based on the previous decisions recorded in the in-
stances. The process of choosing the option with the highest
expected utility is influenced by the recency and frequency of
past experiences, memory decay (d) and a noise parameter for
capturing the variability in memory activation (σ) [11].
After making a decision, the agent received a utility feed-
back, representing the outcome of the decision in a given
situation. The experienced utility (i.e., payoff) is determined
based on the payoff matrix illustrated in Table I. The payoff
that an agent receives following a decision, is determined
by the accuracy of the decision, based on the ground truth,
detailed in section V-C. The payoffs in the matrix emphasize
the positive and negative utilities from hits and misses over
correct rejections and false alarms.
2) Semantic Information and Experience Agent: In contrast
to the previous agent model, this agent can send SPARQL
queries to the PACO ontology, that provides specific knowl-
edge of the scenario, temporal information and augmented
situational awareness. As such, this model observes the same
situation as the Experience Only agent: however, instead of
using this information to make a decision, the agent uses the
information to generate queries (which, in turn, provides richer
information). Using PACO, the agent can generalize from
and reason about the characteristics of a sequence of packets
transferred from one network node to the other. Therefore,
the situation observed by the agent consist of the outputs
from multiple queries regarding the conversation between two
specific IP addresses, where one is the source and the other
is the destination. The situation for any packet, transmitted
between a source and a destination IP addresses, is given by
Si = {p, sPorts, dPorts, avgSY N, avgACK-RST} (2)
Where the attributes of the situation represent properties of
a communication between source and destination IPs, using
protocol p. The communication consists of a sequence of
packets exchanged between the two network nodes up to the
current packet. Thus, the agent can examine each packet within
the context of a sequence. Given the source IP of the current
package, attribute sPorts indicates on the average number of
ports in the source node that sent packets to the destination
node. Similarly, attribute dPorts indicates how many ports
in the destination node received packets from the source.
The attribute avgSY N describes the average ratio between
SYN packets and normal traffic recived from the source of
the packet. Attribute avgACK-RST provides complementary
information, the average ratio of between ACK-RST packets
and normal traffic the destination sent back to the source. This
type of answer indicates that the packet was sent to a closed
port (i.e., a port that is not used by any service on the target
node).
Based on the set of attributes described above, the Semantic
information and Experience agent classified packets. The
Semantic information and Experience agent received feedback
for these decisions using the same payoff matrix as the
Experience Only agent.
C. Simulation Experiment
We evaluated the differences between the two agent models
through simulation experiment. In the experiment, agents
classified the packets captured from the traffic in a small
network with 16 nodes (i.e., unique IP addresses). The cap-
tured communication between the network nodes included
4735 packets. The nodes used several types of protocols to
exchange packets, for example SMB and SSL. However, the
majority of the traffic (99.56%) used the TCP protocol. Within
this network, the adversary was located in a node with the IP
address of 192.168.1.8. The adversary used a specific port to
scan the 1000 common ports of the target node (192.168.1.3)
using Nmap defaults [22]. This information was not provided
to the agents and served as the ground truth for evaluating the
detection performance of the agents and providing them with
feedback. The captured network traffic was converted into an
XML data structure that was used to populate PACO and the
Semantic Information and Experience agent could then query
using SPARQL. The output of the SPARQL queries served as
the attributes of a situation as described in Eq. 2.
The values of the free parameters across the two agents
were kept the same, with d = 1.5 for memory decay and
σ = .25 for noise. These values are considered to be the
ACT-R defaults and are commonly used for IBL models as
well [23]. Each agent classified the 4735 packets and received
feedback following each decision, and this was repeated for
20 iterations.
To compare the performance of the Experience Only agent
with the Semantic Information and Experience agent we used
the following metrics:
1) Correct packet classification indicates on the propor-
tion of packets classified correctly as being a Scan or
No Scan packet.
2) Correct detection of scanning sequence indicates on
the proportion of conversations between two IPs that
were correctly classified as scans.
Fig. 5. Proportions of hits and false alarms for the two agents.
3) Learned classification rule indicates on the decision
rule the agents constructed from the repeated experi-
ences.
VI. RESULTS
In this section, we show our experimental results and ana-
lyze the observed trends based on the performance comparison
of the two modeling approaches.
Correct packet classification When analyzing the ability
of the agents to classify correctly a scan packet, and as
seen in Fig. 5, we find that the Experience Only agent
(mean=.999, SD=0) and the Semantic Information and Experi-
ence agent (mean=.992, SD=.002) performed similarly with a
minor advantage to the Experience Only agent, t(38)=.387,
p=ns. However, the Semantic Information and Experience
agent (mean=.050, SD=.077) generated a significantly higher
number of false alerts compared to the Experience Only agent
(mean=.004, SD=0), t(38)=2.661, p=.011.
Correct detection of scanning sequence utilizes the classi-
fication of a packet as belonging to a scan or to normal traffic
between two network nodes. This high level decision aims
to answer the question whether network node A is scanning
network node B. With respect to this question, if the network
traffic from node A to node B includes one or more packets
that are classified as scan packets, then node A is scanning
node B. When analyzing the ability of the two agents to answer
the question whether node A is scanning node B, both agents
detected that the adversary was scanning a specific network
node (i.e., 192.168.1.8
SY Nscan
→ 192.168.1.3). However, the
Experience Only agent detected on average additional 2.3 out
of 22 sequences between network nodes as scans (i.e., 10%
false scans), while the decisions of the Semantic Information
and Experience agent yielded 0 false classification of packet
sequences. Despite the higher false classification rate of inde-
vidual packets the Semantic Information and Experience agent
had, all these false classified packets belonged to the responses
of the scanned node (ACK packets) to the adversary scan (i.e.,
192.168.1.3
ACKresponse
→ 192.168.1.8).
Fig. 6. Detection outcomes of the Experience Only agent during a single
iteration with black arrows highlighting false classification of packets, red
cross marks indicating on sequences of packets that were incorrectly classified
as scans and green check mark for correct classification.
Fig. 7. Detection outcomes of the Semantic Information and Experience agent
during a single iteration with green check mark indicating on the correct
classification of a packet sequence.
Figures 6 and 7 illustrate the interplay between packet
classification and sequence classification. In both figures, we
present the same network sequences and how the packets were
classified by each agent. As seen in Fig. 6, the Experience Only
agent generated a very low number of false alerts (highlighted
by arrows). However, these packets corresponding to these
alarms were distributed across multiple sequences. As a result,
the entire sequence was classified as a scan. On the other
hand, and as seen in Fig. 7, the false alerts generated by the
Semantic Information and Experience agent were all part of
the communication between the scanned node to the source of
the scan. Note that both agents were able to separate between
legitimate traffic between 192.168.1.8 and 192.168.1.3 that
was not part of the scan and used UDP and other protocols.
The Learned classification rule used by each agent can be
formalized by examining the instances stored in the memory of
each agent, and their activation. The combination of attributes
and decision in highly activated instances represent beliefs re-
garding a relationship between a situation and the appropriate
decision. The decision rule formulated by the Experience only
agent was that any TCP packet with a SYN flag is part of
an ongoing scan between the source of the packet and its
destination. This rule yields high accuracy in detecting scan
packets as all the scan packets had a SYN flag. However,
packets with SYN flag are also part of legitimate handshake
between network node and for that reason the Experience Only
agent detected a higher proportion of packets sequences as
scans. In contrast, the Semantic Information and Experience
agent observed the temporal properties of a packet sequence.
The decision rule formulated by this agent suggests that a
scan packet uses TCP protocol and is part of a sequence
of packets in which the source node is using a low number
of ports to send packets to a high number of destination
ports and the average number of SYN packets sent to a
port is very close to 1. In addition, the rule constructed by
the Semantic Information and Experience agent indicates the
based on experience, the target node of the packet is very likely
to respond to the current packet with a ACK-RST packet,
indicating that the destination of the packets coming from the
source node tends to be a closed port.
VII. DISCUSSION AND CONCLUSION
Analytical capabilities of the human decision maker are
needed and are indispensable when ensuring the security of
any cyber infrastructure. It is the human abilities to inte-
grate information, to reason, to learn and to quickly adjust
to changes that make such significant contribution to cyber
security. The understanding of these processes relies on our
integration of knowledge from human cognitive theories and
knowledge-based technologies. In this study we propose an
architecture to combine cognitive models and ontologies in
the domain of cyber defense.
We developed a packet-centric ontology PACO which allows
us to represent and capture the atomic elements of network
communication, i.e., packets and sequences of packets. PACO
serves as the basis for more holistic semantic representations
of cyber operation, cyber assets, threats and risks, available
through CRATELO. We also developed an IBL cognitive model
capable of accessing the information in PACO and using it
when detecting adversarial port scan. When making decisions,
the ability of the IBL agent to access PACO and retrieve
information improved its performance, compared to the same
IBL agent that did not utilize PACO. We show that when
answering the questions ’Is IP A scanning IP B?’, an agent
with access to a packet-centric ontology delivers a much
lower false alerts rate and by that show superior performance.
Overall, the access to semantic information allowed the agent
to acquire better situation awareness by incorporating sum-
marized information into the decision making process. PACO
extended the agents ability to inspect temporal relationship
between a packet sent from a specific source and previous
replays of packet’s destination to communication coming from
that source. Such reasoning requires a representation of a
source and a destination, as well as the ability to switch
between these roles in order to observe the response patterns.
The agents explored rules in the form of IF a situation
THEN a decision, and learned which rule maximizes their
payoff. While the attributes of the situation part are influenced
by the availability of information, the cutoff values of the
attributes were learned from experience. Furthermore, the
decision rule that the agent with access to a packet-centric
ontology learned from experience is valid and useful beyond
the limited scope of the network scenario we used in the study.
However, the existence of knowledge is a precondition
rather than a guarantee for improvement: correctly querying
the information is the key for the major improvement. In the
process of modeling, we used domain experts to construct
the queries that aggregate and retrieve information. By using
cognitive agent we were able to test different queries and
combinations of attributes, to identify representations that
facilitate the decision making process of a network defender.
While PACO has the potential of representing packet level
information for complex and diverse network communication,
the current cognitive model was developed to accommodate
a simplistic network scenario. Port scanning can take many
forms (vertical and horizontal scans), can use different pro-
tocols and can be highly distributed over time (i.e., low-
and-slow scan). Therefore, although we used a high fidelity
network traffic, future research should scale up the volume of
traffic as well as the complexity of the network scan. Such
additions will likely challenge the cognitive agent. However,
providing the agent access to the middle and high levels
of CRATELO might be the key component for the agent’s
success in more complex and challenging tasks. The benefit
of pairing cognitive agents and ontologies goes beyond the
ability to gauge into the decision making process of the human
analyst. Such combination can serve as an initial step towards
the development of cognitively inspired decision aid tool for
automating some tasks that are currently performed by human
analyst.
ACKNOWLEDGMENT
This research was sponsored by the Army Research Lab-
oratory and was accomplished under Cooperative Agreement
Number W911NF-13-2-0045 (ARL Cyber Security CRA). The
views and conclusions contained in this document are those
of the authors and should not be interpreted as representing
the official policies, either expressed or implied, of the Army
Research Laboratory or the U.S. Government. The U.S. Gov-
ernment is authorized to reproduce and distribute reprints for
Government purposes notwithstanding any copyright notation
here on.
REFERENCES
[1] C. Gonzalez, N. Ben-Asher, A. Oltramari, and C. Lebiere, “Cognition
and technology,” in Cyber Defense and Situational Awareness, ser.
Advances in Information Security, A. Kott, C. Wang, and R. F. Erbacher,
Eds. Springer International Publishing, 2014, vol. 62, pp. 93–117.
[2] A. DAmico and K. Whitley, “The real work of computer network defense
analysts,” in VizSEC 2007. Springer, 2008, pp. 19–37.
[3] J. E. Rowley, “The wisdom hierarchy: representations of the dikw
hierarchy,” Journal of information science, 2007.
[4] N. Ben-Asher and C. Gonzalez, “Effects of cyber security knowledge
on attack detection,” Computers in Human Behavior, vol. 48, pp. 51–61,
2015.
[5] C. B. Lee, C. Roedel, and E. Silenok, “Detection and characterization
of port scan attacks,” Technical report, Univeristy of California, Depart-
ment of Computer Science and Engineering, 2003.
[6] A. Oltramari, N. Ben-Asher, L. Cranor, L. Bauer, and N. Christin, “Gen-
eral requirements of a hybrid-modeling framework for cyber security,”
in Military Communications Conference (MILCOM). IEEE, 2014, pp.
129–135.
[7] A. Oltramari, L. F. Cranor, R. J. Walls, and P. McDaniel, “Building an
ontology of cyber security,” in 9th International Conference on Semantic
Technologies for Defense, Intelligence and Security (STIDS), 2014, pp.
54–61.
[8] C. Masolo, S. Borgo, A. Gangemi, N. Guarino, A. Oltramari, R. Oltra-
mari, L. Schneider, L. P. Istc-cnr, and I. Horrocks, “Wonderweb deliv-
erable d17. the wonderweb library of foundational ontologies and the
dolce ontology,” 2002.
[9] L. Obrst, P. Chase, and R. Markeloff, “Developing an ontology of the
cyber security domain.” in 7th International Conference on Semantic
Technologies for Defense, Intelligence and Security (STIDS), 2012, pp.
49–56.
[10] D. A. Mundie and D. M. McIntire, “The mal: A malware analysis
lexicon,” DTIC Document, Tech. Rep., 2013.
[11] C. Gonzalez, J. F. Lerch, and C. Lebiere, “Instance-based learning in
dynamic decision making,” Cognitive Science, vol. 27, no. 4, pp. 591–
635, 2003.
[12] J. R. Anderson and C. Lebiere, The atomic components of thought.
Lawrence Erlbaum Associates Publishers, 1998.
[13] J. E. Laird, A. Newell, and P. S. Rosenbloom, “Soar: An architecture
for general intelligence,” Artificial intelligence, vol. 33, no. 1, pp. 1–64,
1987.
[14] T. Lejarraga, V. Dutt, and C. Gonzalez, “Instance-based learning: A gen-
eral model of repeated binary choice,” Journal of Behavioral Decision
Making, vol. 25, no. 2, pp. 143–153, 2012.
[15] N. Ben-Asher, V. Dutt, and C. Gonzalez, “Accounting for the integration
of descriptive and experiential information in a repeated prisoner’s
dilemma using an instance-based learning model,” in 22th Behavior
Representation in Modeling & Simulation (BRiMS) Conference, 2013,
pp. 11–14.
[16] C. Gonzalez, N. Ben-Asher, J. M. Martin, and V. Dutt, “A cognitive
model of dynamic cooperation with varied interdependency informa-
tion,” Cognitive science, 2014.
[17] C. Gonzalez and N. Ben-Asher, “Learning to cooperate in the prison-
ers dilemma: Robustness of predictions of an instance-based learning
model,” in 35th annual meeting of the Cognitive Science Society (CogSci
2014), 2014, pp. 2287–2292.
[18] J. F. Allen, “Maintaining knowledge about temporal intervals,” Commu-
nications of the ACM, vol. 26, no. 11, pp. 832–843, 1983.
[19] M. De Vivo, E. Carrasco, G. Isern, and G. O. de Vivo, “A review of
port scanning techniques,” ACM SIGCOMM Computer Communication
Review, vol. 29, no. 2, pp. 41–48, 1999.
[20] E. M. Hutchins, M. J. Cloppert, and R. M. Amin, “Intelligence-driven
computer network defense informed by analysis of adversary campaigns
and intrusion kill chains,” Leading Issues in Information Warfare &
Security Research, vol. 1, p. 80, 2011.
[21] M. H. Bhuyan, D. Bhattacharyya, and J. K. Kalita, “Surveying port scans
and their detection methodologies,” The Computer Journal, vol. 10, pp.
1565–1581, 2011.
[22] Nmap network mapper. [Online]. Available: https://nmap.org/
[23] N. Ben-Asher, J.-H. Cho, and S. Adalı, “Cognitive leadership frame-
work using instance-based learning,” in 24th Conference on Behavior
Representation in Modeling and Simulation (BRiMS), March 2015.

More Related Content

DOC
Comparison of relational and attribute-IEEE-1999-published ...
PDF
A systematic review on sequence-to-sequence learning with neural network and ...
PDF
Paper id 252014107
DOC
Technical Paper.doc.doc
PDF
Soft computing
DOC
Soft computing from net
DOC
Only Abstract
PDF
ON SOFT COMPUTING TECHNIQUES IN VARIOUS AREAS
Comparison of relational and attribute-IEEE-1999-published ...
A systematic review on sequence-to-sequence learning with neural network and ...
Paper id 252014107
Technical Paper.doc.doc
Soft computing
Soft computing from net
Only Abstract
ON SOFT COMPUTING TECHNIQUES IN VARIOUS AREAS

What's hot (20)

PPSX
Phd defence presentation
PPTX
Drug Target Interaction (DTI) prediction (MSc. thesis)
PDF
8421ijbes01
PDF
A scenario based approach for dealing with
DOC
ETRnew.doc.doc
PDF
Ch35473477
PPTX
VLSI IN NEURAL NETWORKS
PDF
Analysis of Neocognitron of Neural Network Method in the String Recognition
PDF
Symbolic-Connectionist Representational Model for Optimizing Decision Making ...
PDF
pln_ecan_agi_16
PDF
Nature Inspired Reasoning Applied in Semantic Web
PDF
Soft computing
 
PDF
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
PDF
Artificial Neural Networks: Applications In Management
PDF
Machine Learning and Reasoning for Drug Discovery
DOCX
Soft computing abstracts
PDF
A Stacked Generalization Ensemble Approach for Improved Intrusion Detection
PDF
Computer Aided Development of Fuzzy, Neural and Neuro-Fuzzy Systems
PDF
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
PDF
IRJET-In sequence Polemical Pertinence via Soft Enumerating Repertoire
Phd defence presentation
Drug Target Interaction (DTI) prediction (MSc. thesis)
8421ijbes01
A scenario based approach for dealing with
ETRnew.doc.doc
Ch35473477
VLSI IN NEURAL NETWORKS
Analysis of Neocognitron of Neural Network Method in the String Recognition
Symbolic-Connectionist Representational Model for Optimizing Decision Making ...
pln_ecan_agi_16
Nature Inspired Reasoning Applied in Semantic Web
Soft computing
 
Model of Differential Equation for Genetic Algorithm with Neural Network (GAN...
Artificial Neural Networks: Applications In Management
Machine Learning and Reasoning for Drug Discovery
Soft computing abstracts
A Stacked Generalization Ensemble Approach for Improved Intrusion Detection
Computer Aided Development of Fuzzy, Neural and Neuro-Fuzzy Systems
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
IRJET-In sequence Polemical Pertinence via Soft Enumerating Repertoire
Ad

Similar to Army Study: Ontology-based Adaptive Systems of Cyber Defense (20)

PPTX
Situational awareness for computer network security
PDF
ONTOLOGY-BASED MODEL FOR SECURITY ASSESSMENT: PREDICTING CYBERATTACKS THROUGH...
PDF
Cyber Defense And Situational Awareness Alexander Kott Cliff Wang
PDF
AI-Driven Logical Argumentation in Active Cyber Defense
PPTX
Cyber Threat Hunting with Phirelight
DOCX
L. Marinos and I. Askoxylakis (Eds.) HASHCII 2013, LNCS 8030.docx
PDF
Cyber Threat Hunting Workshop.pdf
PDF
Cyber Threat Hunting Workshop.pdf
PPTX
Cyber Threat Intelligence, CTI Lifecycle and CTI Framework
PPTX
Cyber Threat Hunting Workshop
PDF
Detecting network attacks model based on a convolutional neural network
PDF
[Bucharest] Attack is easy, let's talk defence
PPTX
Network security situational awareness
PDF
research project Generative oversasmling
PDF
Cyber Threat Intelligence - It's not just about the feeds
PDF
Empowering Cyber Threat Intelligence with AI
PPTX
TakeDownCon Rocket City: Research Advancements Towards Protecting Critical As...
PDF
Cyber Threat Intelligence
PDF
Kb2417221726
PDF
Network Intrusion Detection using MRF Technique
Situational awareness for computer network security
ONTOLOGY-BASED MODEL FOR SECURITY ASSESSMENT: PREDICTING CYBERATTACKS THROUGH...
Cyber Defense And Situational Awareness Alexander Kott Cliff Wang
AI-Driven Logical Argumentation in Active Cyber Defense
Cyber Threat Hunting with Phirelight
L. Marinos and I. Askoxylakis (Eds.) HASHCII 2013, LNCS 8030.docx
Cyber Threat Hunting Workshop.pdf
Cyber Threat Hunting Workshop.pdf
Cyber Threat Intelligence, CTI Lifecycle and CTI Framework
Cyber Threat Hunting Workshop
Detecting network attacks model based on a convolutional neural network
[Bucharest] Attack is easy, let's talk defence
Network security situational awareness
research project Generative oversasmling
Cyber Threat Intelligence - It's not just about the feeds
Empowering Cyber Threat Intelligence with AI
TakeDownCon Rocket City: Research Advancements Towards Protecting Critical As...
Cyber Threat Intelligence
Kb2417221726
Network Intrusion Detection using MRF Technique
Ad

More from RDECOM (7)

PDF
Historic Ballistic Facility
PDF
Individual Differences and Human Variability
PDF
Analyzing Production Processes of Energetic Materials using Ultrasound Techno...
PDF
Analyzing Production Processes of Energetic Materials using Ultrasound Techno...
PDF
AEWE 2015 Participating Technologies
PDF
Systems Book
PPTX
RDECOM AUSA Sustainment Briefing
Historic Ballistic Facility
Individual Differences and Human Variability
Analyzing Production Processes of Energetic Materials using Ultrasound Techno...
Analyzing Production Processes of Energetic Materials using Ultrasound Techno...
AEWE 2015 Participating Technologies
Systems Book
RDECOM AUSA Sustainment Briefing

Recently uploaded (20)

PPTX
endocrine - management of adrenal incidentaloma.pptx
PPT
Animal tissues, epithelial, muscle, connective, nervous tissue
PDF
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
PPT
Cell Structure Description and Functions
PDF
Unit 5 Preparations, Reactions, Properties and Isomersim of Organic Compounds...
PPTX
Introduction to Immunology (Unit-1).pptx
PDF
Integrative Oncology: Merging Conventional and Alternative Approaches (www.k...
PPTX
congenital heart diseases of burao university.pptx
PPTX
Presentation1 INTRODUCTION TO ENZYMES.pptx
PDF
Cosmology using numerical relativity - what hapenned before big bang?
PPTX
Preformulation.pptx Preformulation studies-Including all parameter
PPT
Mutation in dna of bacteria and repairss
PPTX
PMR- PPT.pptx for students and doctors tt
PDF
Science Form five needed shit SCIENEce so
PPTX
TORCH INFECTIONS in pregnancy with toxoplasma
PDF
From Molecular Interactions to Solubility in Deep Eutectic Solvents: Explorin...
PPT
Enhancing Laboratory Quality Through ISO 15189 Compliance
PPTX
A powerpoint on colorectal cancer with brief background
PDF
BET Eukaryotic signal Transduction BET Eukaryotic signal Transduction.pdf
PPTX
Understanding the Circulatory System……..
endocrine - management of adrenal incidentaloma.pptx
Animal tissues, epithelial, muscle, connective, nervous tissue
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
Cell Structure Description and Functions
Unit 5 Preparations, Reactions, Properties and Isomersim of Organic Compounds...
Introduction to Immunology (Unit-1).pptx
Integrative Oncology: Merging Conventional and Alternative Approaches (www.k...
congenital heart diseases of burao university.pptx
Presentation1 INTRODUCTION TO ENZYMES.pptx
Cosmology using numerical relativity - what hapenned before big bang?
Preformulation.pptx Preformulation studies-Including all parameter
Mutation in dna of bacteria and repairss
PMR- PPT.pptx for students and doctors tt
Science Form five needed shit SCIENEce so
TORCH INFECTIONS in pregnancy with toxoplasma
From Molecular Interactions to Solubility in Deep Eutectic Solvents: Explorin...
Enhancing Laboratory Quality Through ISO 15189 Compliance
A powerpoint on colorectal cancer with brief background
BET Eukaryotic signal Transduction BET Eukaryotic signal Transduction.pdf
Understanding the Circulatory System……..

Army Study: Ontology-based Adaptive Systems of Cyber Defense

  • 1. Ontology-based Adaptive Systems of Cyber Defense Noam Ben-Asher∗‡, Alessandro Oltramari†, Robert F. Erbacher∗, Cleotilde Gonzalez† ∗U.S. Army Research Laboratory Adelphi, MD, USA nbenash@us.ibm.com, robert.f.erbacher.civ@mail.mil †Carnegie Mellon University Pittsburgh, PA, USA aoltrama@andrew.cmu.edu, coty@cmu.edu ‡IBM T.J.Watson Research Center, Yorktown Heights, NY Abstract—In this paper we outline a holistic approach for understanding and simulating human decision making in knowledge-intensive tasks. To this purpose, we integrate semantic and cognitive models in a hybrid computational architecture. The contribution of the paper is twofold: first we describe a packet- centric ontology to represent network traffic. We show how the ontology is used to describe real-world network traffic and also serve as a basis for higher level ontologies of cyber operation, threat and risk. Second, we demonstrate how the combination of the packet-centric ontology with an adaptive cognitive agent with learning capabilities, can be used to understand the human defender reasoning processes when monitoring network traffic. Through simulation experiments we evaluated the proposed hybrid computational architecture and demonstrate its ability to successfully detect malicious port scanning within legitimate network traffic. We discuss the implications of these findings for improving our understanding of the cognitive processes and knowledge requirements of the cyber defender, as well as the possible use of the hybrid architecture as a cognitively inspired decision support tool. I. INTRODUCTION Disruption of computers and the loss of sensitive infor- mation through cyber-attacks are becoming a widespread threat and a critical concern for citizens, organizations, and governments. Even with recent advances in information and network security and the development of new monitoring and threat detection tools, many of the tasks performed by cyber-defenders (i.e., security analysts) remain challenging, resulting in weak and uncertain cyber-defense. The analytical capabilities of the human decision maker are needed and indis- pensable for the process of cyber-defense [1]. Security analysts transform network traffic data into cyber situation awareness, a high level of processing that is difficult to automate [2]. This process may be seen as analogous to the Data-Information- Knowledge-Wisdom (DIKW) hierarchical model that is central for information and knowledge management [3]. Within this context, cognition serves as the driver that governs the transi- tions between the different levels of information representation [4]. While there is a large body of research on technologies that detect port scanning [5], there is a limited understanding of the cognitive processes cyber security analysts use to detect port scanning and specifically how these cognitive abilities interact with and information representation. In this regard, the contribution of this paper is twofold: first we describe a packet-level ontology that represents network traffic. Second, we demonstrate how the integration of this ontology with a computational cognitive agent can be used to understand the human analyst reasoning process, which may then serve as guide to develop decision support technology for the analyst. II. KNOWLEDGE MODEL From a cyber security standpoint, variations in network traf- fic are the primary prompts of analyst’s behavioral responses; nevertheless, full situational awareness can emerge only from a projection of observations and decisions into a more com- prehensive context that includes knowledge about threat and attack types, executable defensive maneuvers, system vulner- abilities, risk mitigation and time constraints, among others. In this regard, building a rigorous model of this complex context is a key requirement for the study of human decision making in cyber security. Computational ontologies are the knowledge component in this holistic approach, as they can provide a machine-readable semantic representation of cyber scenarios. In virtue of their logical properties and schematic structure, ontologies can be used by automatic reasoners in dynamic tasks: in particular, in our work we apply ontology- based reasoning to a detection task, where an agent simulates a human analyst’s cognitive capabilities, including the capa- bility of using domain knowledge and temporal information to reason about perceived events [6]. To this purpose, we engineered a packet-centric ontology of network traffic, a module of a larger ontology framework called CRATELO [7], the suite of modular ontologies under development in the U.S. Army Research Laboratory Cyber Security Collaborative Al- liance. CRATELO is constituted of several domain ontologies (collectively indicated as OSCO), integrated on the basis of DOLCE top level [8] extended with a security-related middle layer. These top, middle and domain level ontologies currently add up to 330 classes, connected by 162 relationships (132 object properties and 30 datatype properties) and encoded in OWL-DL. The packet-centric ontology presented in this paper, henceforth abbreviated to PACO, is a partition of OSCO1 . Our reseach efforts in developing CRATELO are inspired by Obrst and colleagues’s proposal of a wide-ranging ontology framework of cyber security [9], that spans from top-level, system-oriented ontologies and human factors ontologies. In 1CRATELO stands for ‘Three Levels Ontology for the ARL Collaborative Research Alliance’. OSCO stands for ontology of cyber operations. For more details about the program see also: http://www.arl.army.mil/www/default.cfm?page=1417
  • 2. this long-term endeavour, we have been working with ARL domain experts and cyber analysts to distill the necessary knowledge of the cyber domain. As the state of the art shows, a preliminary step in understanding any new domain is to produce accessible definitions and classifications of entities [10]: discussions on cyber security often begin with the difficulties created by misused terminology (such as char- acterizing cyber espionage as an attack). In this regard, the Joint Chiefs of Staff created a list of cyber term definitions (allegedly extended and refined for a classified version). None of these definitions, however, were formulated as an ontology. Likewise, various agencies and corporations (NIST, MITRE, Verizon) have formulated enumerations of types of malware, vulnerabilities, and exploitations. In particular MITRE, which has been very active in the field, maintains two dictionar- ies, CVE (Common Vulnerabilities and Exposure) and CWE (Common Weakness Enumeration), a classification of attack patterns (CAPEC - Common Attack Pattern Enumeration and Classification), and an XML-structured language to represent cyber threat information (STIX - Structure Threat Information Expression). Despite of the important role played by these and further initiatives, the lack of a shared formal semantics make ter- minologies hard to define, sustain, and port into a machine- processable format: here we try to overcome these problems, embracing a holistic approach to model cyber security factors. In fact, if the ontology outlined in this paper is tailored to a packet-centric model of network traffic, it can be framed at a higher level of conceptualization by means of the integration with CRATELO: for instance, when modeling the behavior of a cyber analysts during an attack, packets can be seen as parts of the evidence collection process, and specific attributes of packets (e.g. internal or external IP addresses, low or high packet rate, etc.) may hint to specific intentions of the adver- sary (also called anti-goals). As mentioned at the beginning of the section, ontologies can serve as knowledge bases to agents: conversely, the dynamics of the agent’s decision process and learning from experience are captured by an Instance-based Learning (IBL) cognitive model [11], which is a computational representation of the processes that guide human behavior. Next section reviews what cognitive models are, and how they can be used to study human decision making. III. COGNITIVE MODEL In a dynamic decision making setting, cognitive architec- tures, such as ACT-R [12], SOAR [13] and others, have been commonly used to provide an integrated representation of human cognition. Cognitive models, constructed using these architectures, allow for a careful examination of various cognitive processes that drive human decision making [11]. Cognitive models based on IBL theory (IBLT) focus on decision making and learning from experience in dynamic settings [11]. IBLT emerging from ACT-R, proposes a generic decision-making process that recognizes decision situations, generates instances through the interaction with the decision task, and finishes with reinforcement of the instance leading to desired outcomes. According to IBLT, the decision maker represents decision making situations as instances stored in memory. An instance is composed of three parts: (1) situation (S) a set of attributes representing a situation; (2) decision (D) that is made in the particular situation; and (3) utility (U) that is the experienced outcome from a decision. The IBLT decision cycle includes several stages: recognition, judgment, choice, and execution. In the Recognition stage, a decision maker identifies relevant attributes for a specific decision situation. Judgment stage determines the relevancy of past experiences (instances) in current decision making situation. The activation of instances in memory is a representation of relevancy. Acti- vation is influenced by the recency and frequency an instance occurred in the past and the similarity between the current decision situation and the situation stored in the instance. This activation mechanism is a simplification of the mechanism originally proposed in the ACT-R architecture. Memory ac- tivation determines the probability that an instance will be retrieved from memory and participate in the next phase. In the absence of previous experiences that may be relevant to the current situation, pre-defined heuristics are triggered for decision making. In the Choice, the retrieved instances and their retrieval probability are used to calculate the expected utility for each of the decision options, and the option with the highest expected utility is chosen. Finally, in the Execution, feedback regarding the last decision is provided to the decision maker [11]. In this work, we chose IBL to model the decision making as it captures the adaptive human decision making and learning processes in dynamic environment as well as the transition between exploration and maximization. Agents based on IBL models successfully account for human decision making and behavior in a variety of tasks. Lejarraga et al. [14] demonstrate that a single IBL model constructed for a specific repeated binary choice task can be generalized to different variants of repeated tasks requiring a binary decision as well as to probability learning tasks. More specifically, IBL models can reflect human behavior in simple stimulus-response practice and skill acquisition tasks and train- ing. Furthermore, the experience-based learning process of an IBL model was successfully extended to include descriptive information and biases as risk aversion [15]. A pair of IBL models successfully consider the dynamics of cooperation in iterated Prisoner’s Dilemma as well as reciprocity and other complex social interactions [16], [17]. IV. A PACKET-CENTRIC NETWORK ONTOLOGY In this section we describe the structure of PACO, and how it can be used to instantiate thousands of packets generated by capturing actual network traffic. As Fig. 1 shows, the class ‘PacketTransmission’ is considered the atomic element of a ‘NetworkSession’. Intuitively, this means that without an actual exchange of packets between a source and a destination node, no network session can be deemed as properly complete. In fact, there are additional features of network sessions: for instance, when considering TCP connections, a complete handshake with SYN, SYN+ACK and ACK packets transmis-
  • 3. Fig. 1. A Prot´eg´e visualization of PACO. From the bottom-left corner (clockwise): 1) The DL expressivity derived by the HermiT 1.3.8 reasoner; 2) the backbone taxonomy of classes; 3) an informal definition of packet transmission as value of annotation property; 4) property restrictions. sion is necessary to enable a packet transmission between two nodes, though this is not the case for communication protocols like UDP, where handshake dialogues are not supported. Fol- lowing the actual packet transmission between the two network nodes and after the data are exchanged, a session is usually resetted (although this final stage is not essential to qualify it as complete - and session can also end due to a timeout). In summary, when a communication between a source and a destination node is established, a complete network session consists of the transmission of a unit of data from source A to destination B, and of the transmission of a unit of data from source B to destination A. From the ontological standpoint, this constraint is represented by the cardinality restriction ‘min 2’ on the object property ‘has member’ holding between ‘Net- workSession’ and ‘PacketTransmission’ classes, respectively the domain and the range of ‘has member’. Apart from network-specific information associated to source and destination nodes, like IP and port numbers, communi- cation protocols, packet size, etc., we have introduced a data property ‘has time stamp’ that assigns a specific time stamp to each network event and a data property ‘has order’ that binds each individual network event to its relative position in a given sequence (the first event, the second event, etc.). This twofold modeling choice provides us with a flexible model of temporal knowledge: 1) it pinpoints the discrete temporal coordinates of each event according to a universal time format (based on the XML schema specifications2 ); 2) it allows for representing and reasoning over qualitative temporal relations like ‘before’, ‘after’, and ‘overlap’, as defined by Allen’s temporal axioms [18]). Figure 2 shows a situation where the ordinal scale of the packet is captured (i.e., the 1024th packet) but the time stamp is not represented: the reason is that the 2http://www.w3.org/TR/xmlschema11-2/ former is more appropriate than the latter for the simulation experiment reported in the next section, since the dataset was collected with a rate of about 83 packets per second. In other words, in our specific cyber scenario knowing the sequence of events is more meaningful than knowing the real time stamps from the defender’s perspective, although - to be general enough - the ontology has to support both representational formats. As depicted in Fig. 2, the role of a packet in the handshake sequence can be captured by three booleans data properties, respectively ‘has tcp.flags.syn’, ‘has tcp.flags.ack’ and ‘has tcp.flags.reset’. In the ‘PacketTrasmission1024’ case, however it is unclear whether this packet represents the first stage of a handshake or is part of a port scanning [19]. This can be resolved by evaluating the properties of the proceeding packet exchange (i.e., session) between the two nodes. As the next section will show, we conducted an experiment to elicit relevant information from instantiated ontology, and make the resulting knowledge chunks available to the cognitive model of a cyber defender. This process of knowledge elicitation from PACO is driven by a set of SPARQL queries3 , properly designed to extract and present relevant information that an agent can use to decide whether a specific event is a threat or not. For instance, the query in Fig. 3 is designed to collect all the pairs of distinct source and destination ports in the dataset of network events: on the basis of the retrieved information, an analyst can gauge the volume of network traffic on a per unique port basis; moreover, Fig. 4 represents a query built to assess how many times a given source has sent a packet to a closed port. In the latter case, the returned result, around one thousand times, can be used as a clue of the maliciousness of the source: so many attempts of communication with closed ports may, in fact, suggest a port scanning attack. Note that 3http://www.w3.org/TR/rdf-sparql-query/
  • 4. Fig. 2. A Prot´eg´e visualization of a specific instance of the ‘PacketTransmission’ class. both queries have been used dynamically in the experiment described in the next section, where the goal is to replicate the analyst’s incremental understanding of the considered cyber scenario. Following a basic modeling strategy, in PACO we directly assign specific data sizes to each network event through the data property ‘has frame length’: an alternative option would have been to introduce the class ‘Packet’ (a subclass of ‘information object’ in DOLCE), and use the object prop- erty ‘participation’ to link ‘Packet’ and ‘PacketTransmission’, switching the domain of the data property ‘has frame length’ from PacketTransmission’ to ‘Packet’. At the current stage of development, representing the data contents of packet trans- missions doesn’t add any fundamental benefit to our modeling framework, although we don’t exclude this option in the future. Additional semantic structures of PACO concern network topology and services: for instance, every network node runs a set of services, and each service uses an official commu- nication port and a specific protocol to establish a network session with another node. It follows that when a port is open, a service is running on a node, and if a port is closed, no services are currently running for that particular node. Thanks to the interoperability between PACO and CRATELO, services can be modeled in the context of user’s actions: for instance, a system administrator can decide to start or stop an HTTP service, or access to the event log service on a server. By and large, the originality of our approach relies on the flexibility in the granularity of the representation: PACO is only a module of a more comprehensive framework that sees the detection as a socio-technical task, where packet-centric information can PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX IBLOd: <http://www.cra.psu.edu/IBLOd#> SELECT DISTINCT ?srcport ?dstport WHERE {{?event IBLOd:member IBLOd:NetworkTraffic-041215; IBLOd:has_src_port ?srcport; IBLOd:has_dst_port ?dstport; IBLOd:has_source_node ?s; IBLOd:has_destination_node ?d; IBLOd:has_order ?order. FILTER(?order >= "1"ˆˆxsd:positiveInteger && ?order<= "4735"ˆˆxsd:positiveInteger).}} Fig. 3. A SPARQL query that returns all the distinct combinations of source and destination ports for a packets exchange sequence between two nodes. be used by the decision maker at the cyber operation level. In principle, using CRATELO we can also model beliefs, goals and emotions of defenders and attackers, although it’s beyond the scope of the current work to address these dimensions. V. USING HYBRID MODELS IN CYBER DEFENSE Next, we examine the interplay between knowledge and cognition in cyber defense by integrating the packet-centric ontology with cognitive agents who make decisions regarding the state of a network into a hybrid computational architecture. For the packet-centric knowledge-base we use PACO and the agents are computational models of the IBL theory.
  • 5. PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX IBLOd: <http://www.cra.psu.edu/IBLOd#> SELECT (COUNT(?order) AS ?numberOfACKResponses) WHERE {?event IBLOd:member IBLOd:NetworkTraffic-041215; IBLOd:has_source_node ?sn; IBLOd:has_destination_node ?dn; IBLOd:has_tcp.flags.syn false; IBLOd:has_tcp.flags.ack true; IBLOd:has_tcp.flags.reset true; IBLOd:has_order ?order. FILTER (?order >= "1"ˆˆxsd:positiveInteger && ?order <= "4735"ˆˆxsd:positiveInteger).} Fig. 4. A SPARQL query that returns the number of times a source node sent packets to a closed port on the destination node. A. Port Scanning Scenario Port scanning is designed to probe network nodes for open ports. The existence of an open port can provide some indi- cation on the availability of services. This type of information gathering can be part of a defensive or offensive operation. From the attacker’s perspective, a port scan is useful for gather- ing relevant information for launching a successful attack and indeed most attacks are preceded by some form of scanning activity (reconnaissance), particularly vulnerability scanning [20]. Therefore, the defender will try to detect external scans while the attacker interest is to perform a scan without being detected [21]. In this work, we assume first that the attacker uses external resources to identify the attack IP address (i.e., the target). Following, the attacker identifies port ranges to scan on the specific target. These are the ports for services for which the attacker has sophisticated attacks available. We also assume, that the target is using standard ports and not randomized ports. Thus, knowing that a port is open provides an accurate indication that a service is running on the target. B. Cognitive Models for Port Scanning Detection To better understand the interplay between cognition and knowledge and how semantic information supports the ongo- ing work of the cyber defender, we developed two cognitive models for cyber defender agents. Both agents observe a situation, make decisions whether there is a scan or not, and learn from feedback and past experiences. However, the one agent operates without the knowledge based provided by PACO, while the other is querying PACO to acquire temporal information and situational awareness. 1) Experience Only Agent: To examine the interplay be- tween information, cognition and knowledge, we initially con- structed an agent using an IBL model which classifies network events based on their attributes and learns from experience only. The decision making process of this IBL agent depends on the low level network traffic information, and the agent could learn only from its own experiences without the ability to acquire knowledge by querying the ontology. The situation as observed by the agent in this condition is given by Si = {p, sIP, dIP, SY N, ACK, RST} (1) TABLE I PAYOFF MATRIX WHICH DETERMINING THE FEEDBACK FOR AN AGENT MAKING A DECISION IN A GIVEN SITUATION Agent’s Decision Scan No Scan Packet Type Scan Hit: 10 Miss: -10 No Scan False alarm: -5 Correct Rejection: 5 Where p is the protocol type (e.g., TCP, HTTP) of the packet, sIP and dIP are the source and destination IP addresses of the packet. SY N, ACK and RST are 1-bit boolean flags that indicate on the state of a connection. The agent observed a situation Si and made a decision which corresponds to classifying a packet as being part of a scan or not. This decision process involves retrieving relevant instances (i.e., past experiences) from the agent’s memory, computing retrieval probability for each of the instances and, choosing the decision option that yields the highest expected utility, based on the previous decisions recorded in the in- stances. The process of choosing the option with the highest expected utility is influenced by the recency and frequency of past experiences, memory decay (d) and a noise parameter for capturing the variability in memory activation (σ) [11]. After making a decision, the agent received a utility feed- back, representing the outcome of the decision in a given situation. The experienced utility (i.e., payoff) is determined based on the payoff matrix illustrated in Table I. The payoff that an agent receives following a decision, is determined by the accuracy of the decision, based on the ground truth, detailed in section V-C. The payoffs in the matrix emphasize the positive and negative utilities from hits and misses over correct rejections and false alarms. 2) Semantic Information and Experience Agent: In contrast to the previous agent model, this agent can send SPARQL queries to the PACO ontology, that provides specific knowl- edge of the scenario, temporal information and augmented situational awareness. As such, this model observes the same situation as the Experience Only agent: however, instead of using this information to make a decision, the agent uses the information to generate queries (which, in turn, provides richer information). Using PACO, the agent can generalize from and reason about the characteristics of a sequence of packets transferred from one network node to the other. Therefore, the situation observed by the agent consist of the outputs from multiple queries regarding the conversation between two specific IP addresses, where one is the source and the other is the destination. The situation for any packet, transmitted between a source and a destination IP addresses, is given by Si = {p, sPorts, dPorts, avgSY N, avgACK-RST} (2) Where the attributes of the situation represent properties of a communication between source and destination IPs, using protocol p. The communication consists of a sequence of packets exchanged between the two network nodes up to the current packet. Thus, the agent can examine each packet within
  • 6. the context of a sequence. Given the source IP of the current package, attribute sPorts indicates on the average number of ports in the source node that sent packets to the destination node. Similarly, attribute dPorts indicates how many ports in the destination node received packets from the source. The attribute avgSY N describes the average ratio between SYN packets and normal traffic recived from the source of the packet. Attribute avgACK-RST provides complementary information, the average ratio of between ACK-RST packets and normal traffic the destination sent back to the source. This type of answer indicates that the packet was sent to a closed port (i.e., a port that is not used by any service on the target node). Based on the set of attributes described above, the Semantic information and Experience agent classified packets. The Semantic information and Experience agent received feedback for these decisions using the same payoff matrix as the Experience Only agent. C. Simulation Experiment We evaluated the differences between the two agent models through simulation experiment. In the experiment, agents classified the packets captured from the traffic in a small network with 16 nodes (i.e., unique IP addresses). The cap- tured communication between the network nodes included 4735 packets. The nodes used several types of protocols to exchange packets, for example SMB and SSL. However, the majority of the traffic (99.56%) used the TCP protocol. Within this network, the adversary was located in a node with the IP address of 192.168.1.8. The adversary used a specific port to scan the 1000 common ports of the target node (192.168.1.3) using Nmap defaults [22]. This information was not provided to the agents and served as the ground truth for evaluating the detection performance of the agents and providing them with feedback. The captured network traffic was converted into an XML data structure that was used to populate PACO and the Semantic Information and Experience agent could then query using SPARQL. The output of the SPARQL queries served as the attributes of a situation as described in Eq. 2. The values of the free parameters across the two agents were kept the same, with d = 1.5 for memory decay and σ = .25 for noise. These values are considered to be the ACT-R defaults and are commonly used for IBL models as well [23]. Each agent classified the 4735 packets and received feedback following each decision, and this was repeated for 20 iterations. To compare the performance of the Experience Only agent with the Semantic Information and Experience agent we used the following metrics: 1) Correct packet classification indicates on the propor- tion of packets classified correctly as being a Scan or No Scan packet. 2) Correct detection of scanning sequence indicates on the proportion of conversations between two IPs that were correctly classified as scans. Fig. 5. Proportions of hits and false alarms for the two agents. 3) Learned classification rule indicates on the decision rule the agents constructed from the repeated experi- ences. VI. RESULTS In this section, we show our experimental results and ana- lyze the observed trends based on the performance comparison of the two modeling approaches. Correct packet classification When analyzing the ability of the agents to classify correctly a scan packet, and as seen in Fig. 5, we find that the Experience Only agent (mean=.999, SD=0) and the Semantic Information and Experi- ence agent (mean=.992, SD=.002) performed similarly with a minor advantage to the Experience Only agent, t(38)=.387, p=ns. However, the Semantic Information and Experience agent (mean=.050, SD=.077) generated a significantly higher number of false alerts compared to the Experience Only agent (mean=.004, SD=0), t(38)=2.661, p=.011. Correct detection of scanning sequence utilizes the classi- fication of a packet as belonging to a scan or to normal traffic between two network nodes. This high level decision aims to answer the question whether network node A is scanning network node B. With respect to this question, if the network traffic from node A to node B includes one or more packets that are classified as scan packets, then node A is scanning node B. When analyzing the ability of the two agents to answer the question whether node A is scanning node B, both agents detected that the adversary was scanning a specific network node (i.e., 192.168.1.8 SY Nscan → 192.168.1.3). However, the Experience Only agent detected on average additional 2.3 out of 22 sequences between network nodes as scans (i.e., 10% false scans), while the decisions of the Semantic Information and Experience agent yielded 0 false classification of packet sequences. Despite the higher false classification rate of inde- vidual packets the Semantic Information and Experience agent had, all these false classified packets belonged to the responses of the scanned node (ACK packets) to the adversary scan (i.e., 192.168.1.3 ACKresponse → 192.168.1.8).
  • 7. Fig. 6. Detection outcomes of the Experience Only agent during a single iteration with black arrows highlighting false classification of packets, red cross marks indicating on sequences of packets that were incorrectly classified as scans and green check mark for correct classification. Fig. 7. Detection outcomes of the Semantic Information and Experience agent during a single iteration with green check mark indicating on the correct classification of a packet sequence. Figures 6 and 7 illustrate the interplay between packet classification and sequence classification. In both figures, we present the same network sequences and how the packets were classified by each agent. As seen in Fig. 6, the Experience Only agent generated a very low number of false alerts (highlighted by arrows). However, these packets corresponding to these alarms were distributed across multiple sequences. As a result, the entire sequence was classified as a scan. On the other hand, and as seen in Fig. 7, the false alerts generated by the Semantic Information and Experience agent were all part of the communication between the scanned node to the source of the scan. Note that both agents were able to separate between legitimate traffic between 192.168.1.8 and 192.168.1.3 that was not part of the scan and used UDP and other protocols. The Learned classification rule used by each agent can be formalized by examining the instances stored in the memory of each agent, and their activation. The combination of attributes and decision in highly activated instances represent beliefs re- garding a relationship between a situation and the appropriate decision. The decision rule formulated by the Experience only agent was that any TCP packet with a SYN flag is part of an ongoing scan between the source of the packet and its destination. This rule yields high accuracy in detecting scan packets as all the scan packets had a SYN flag. However, packets with SYN flag are also part of legitimate handshake between network node and for that reason the Experience Only agent detected a higher proportion of packets sequences as scans. In contrast, the Semantic Information and Experience agent observed the temporal properties of a packet sequence. The decision rule formulated by this agent suggests that a scan packet uses TCP protocol and is part of a sequence of packets in which the source node is using a low number of ports to send packets to a high number of destination ports and the average number of SYN packets sent to a port is very close to 1. In addition, the rule constructed by the Semantic Information and Experience agent indicates the based on experience, the target node of the packet is very likely to respond to the current packet with a ACK-RST packet, indicating that the destination of the packets coming from the source node tends to be a closed port. VII. DISCUSSION AND CONCLUSION Analytical capabilities of the human decision maker are needed and are indispensable when ensuring the security of any cyber infrastructure. It is the human abilities to inte- grate information, to reason, to learn and to quickly adjust to changes that make such significant contribution to cyber security. The understanding of these processes relies on our integration of knowledge from human cognitive theories and knowledge-based technologies. In this study we propose an architecture to combine cognitive models and ontologies in the domain of cyber defense. We developed a packet-centric ontology PACO which allows us to represent and capture the atomic elements of network communication, i.e., packets and sequences of packets. PACO serves as the basis for more holistic semantic representations of cyber operation, cyber assets, threats and risks, available through CRATELO. We also developed an IBL cognitive model capable of accessing the information in PACO and using it when detecting adversarial port scan. When making decisions, the ability of the IBL agent to access PACO and retrieve information improved its performance, compared to the same IBL agent that did not utilize PACO. We show that when answering the questions ’Is IP A scanning IP B?’, an agent with access to a packet-centric ontology delivers a much lower false alerts rate and by that show superior performance. Overall, the access to semantic information allowed the agent to acquire better situation awareness by incorporating sum- marized information into the decision making process. PACO extended the agents ability to inspect temporal relationship between a packet sent from a specific source and previous replays of packet’s destination to communication coming from that source. Such reasoning requires a representation of a source and a destination, as well as the ability to switch between these roles in order to observe the response patterns. The agents explored rules in the form of IF a situation THEN a decision, and learned which rule maximizes their
  • 8. payoff. While the attributes of the situation part are influenced by the availability of information, the cutoff values of the attributes were learned from experience. Furthermore, the decision rule that the agent with access to a packet-centric ontology learned from experience is valid and useful beyond the limited scope of the network scenario we used in the study. However, the existence of knowledge is a precondition rather than a guarantee for improvement: correctly querying the information is the key for the major improvement. In the process of modeling, we used domain experts to construct the queries that aggregate and retrieve information. By using cognitive agent we were able to test different queries and combinations of attributes, to identify representations that facilitate the decision making process of a network defender. While PACO has the potential of representing packet level information for complex and diverse network communication, the current cognitive model was developed to accommodate a simplistic network scenario. Port scanning can take many forms (vertical and horizontal scans), can use different pro- tocols and can be highly distributed over time (i.e., low- and-slow scan). Therefore, although we used a high fidelity network traffic, future research should scale up the volume of traffic as well as the complexity of the network scan. Such additions will likely challenge the cognitive agent. However, providing the agent access to the middle and high levels of CRATELO might be the key component for the agent’s success in more complex and challenging tasks. The benefit of pairing cognitive agents and ontologies goes beyond the ability to gauge into the decision making process of the human analyst. Such combination can serve as an initial step towards the development of cognitively inspired decision aid tool for automating some tasks that are currently performed by human analyst. ACKNOWLEDGMENT This research was sponsored by the Army Research Lab- oratory and was accomplished under Cooperative Agreement Number W911NF-13-2-0045 (ARL Cyber Security CRA). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Gov- ernment is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation here on. REFERENCES [1] C. Gonzalez, N. Ben-Asher, A. Oltramari, and C. Lebiere, “Cognition and technology,” in Cyber Defense and Situational Awareness, ser. Advances in Information Security, A. Kott, C. Wang, and R. F. Erbacher, Eds. Springer International Publishing, 2014, vol. 62, pp. 93–117. [2] A. DAmico and K. Whitley, “The real work of computer network defense analysts,” in VizSEC 2007. Springer, 2008, pp. 19–37. [3] J. E. Rowley, “The wisdom hierarchy: representations of the dikw hierarchy,” Journal of information science, 2007. [4] N. Ben-Asher and C. Gonzalez, “Effects of cyber security knowledge on attack detection,” Computers in Human Behavior, vol. 48, pp. 51–61, 2015. [5] C. B. Lee, C. Roedel, and E. Silenok, “Detection and characterization of port scan attacks,” Technical report, Univeristy of California, Depart- ment of Computer Science and Engineering, 2003. [6] A. Oltramari, N. Ben-Asher, L. Cranor, L. Bauer, and N. Christin, “Gen- eral requirements of a hybrid-modeling framework for cyber security,” in Military Communications Conference (MILCOM). IEEE, 2014, pp. 129–135. [7] A. Oltramari, L. F. Cranor, R. J. Walls, and P. McDaniel, “Building an ontology of cyber security,” in 9th International Conference on Semantic Technologies for Defense, Intelligence and Security (STIDS), 2014, pp. 54–61. [8] C. Masolo, S. Borgo, A. Gangemi, N. Guarino, A. Oltramari, R. Oltra- mari, L. Schneider, L. P. Istc-cnr, and I. Horrocks, “Wonderweb deliv- erable d17. the wonderweb library of foundational ontologies and the dolce ontology,” 2002. [9] L. Obrst, P. Chase, and R. Markeloff, “Developing an ontology of the cyber security domain.” in 7th International Conference on Semantic Technologies for Defense, Intelligence and Security (STIDS), 2012, pp. 49–56. [10] D. A. Mundie and D. M. McIntire, “The mal: A malware analysis lexicon,” DTIC Document, Tech. Rep., 2013. [11] C. Gonzalez, J. F. Lerch, and C. Lebiere, “Instance-based learning in dynamic decision making,” Cognitive Science, vol. 27, no. 4, pp. 591– 635, 2003. [12] J. R. Anderson and C. Lebiere, The atomic components of thought. Lawrence Erlbaum Associates Publishers, 1998. [13] J. E. Laird, A. Newell, and P. S. Rosenbloom, “Soar: An architecture for general intelligence,” Artificial intelligence, vol. 33, no. 1, pp. 1–64, 1987. [14] T. Lejarraga, V. Dutt, and C. Gonzalez, “Instance-based learning: A gen- eral model of repeated binary choice,” Journal of Behavioral Decision Making, vol. 25, no. 2, pp. 143–153, 2012. [15] N. Ben-Asher, V. Dutt, and C. Gonzalez, “Accounting for the integration of descriptive and experiential information in a repeated prisoner’s dilemma using an instance-based learning model,” in 22th Behavior Representation in Modeling & Simulation (BRiMS) Conference, 2013, pp. 11–14. [16] C. Gonzalez, N. Ben-Asher, J. M. Martin, and V. Dutt, “A cognitive model of dynamic cooperation with varied interdependency informa- tion,” Cognitive science, 2014. [17] C. Gonzalez and N. Ben-Asher, “Learning to cooperate in the prison- ers dilemma: Robustness of predictions of an instance-based learning model,” in 35th annual meeting of the Cognitive Science Society (CogSci 2014), 2014, pp. 2287–2292. [18] J. F. Allen, “Maintaining knowledge about temporal intervals,” Commu- nications of the ACM, vol. 26, no. 11, pp. 832–843, 1983. [19] M. De Vivo, E. Carrasco, G. Isern, and G. O. de Vivo, “A review of port scanning techniques,” ACM SIGCOMM Computer Communication Review, vol. 29, no. 2, pp. 41–48, 1999. [20] E. M. Hutchins, M. J. Cloppert, and R. M. Amin, “Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains,” Leading Issues in Information Warfare & Security Research, vol. 1, p. 80, 2011. [21] M. H. Bhuyan, D. Bhattacharyya, and J. K. Kalita, “Surveying port scans and their detection methodologies,” The Computer Journal, vol. 10, pp. 1565–1581, 2011. [22] Nmap network mapper. [Online]. Available: https://nmap.org/ [23] N. Ben-Asher, J.-H. Cho, and S. Adalı, “Cognitive leadership frame- work using instance-based learning,” in 24th Conference on Behavior Representation in Modeling and Simulation (BRiMS), March 2015.