International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015
DOI:10.5121/ijcsa.2015.5307
A SERIAL COMPUTING MODEL OF AGENT
ENABLED MINING OF GLOBALLY STRONG
ASSOCIATION RULES
G.S.Bhamra1, A.K.Verma2 and R.B.Patel3
1 M. M. University, Mullana, Haryana, 133207 - India
2 Thapar University, Patiala, Punjab, 147004 - India
3 Chandigarh College of Engineering & Technology, Chandigarh - 160019 - India
ABSTRACT
The intelligent agent based model is a popular approach for constructing Distributed Data Mining (DDM)
systems that address scalable mining over large-scale and ever-increasing distributed data. In an agent based
distributed system, a variety of agents coordinate and communicate with each other to perform the various
tasks of the Data Mining (DM) process. This study presents a serial computing model of a multi-agent system
(MAS) called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR), based
on the serial itinerary of mobile agents. A running environment is also designed for the implementation
and performance study of the AeMGSAR system.
KEYWORDS
Knowledge Discovery, Association Rules, Intelligent Agents, Multi-Agent System
1. INTRODUCTION
Data Mining (DM) techniques are used to extract interesting and valid data patterns implicitly
stored in large databases [1], [2]. Intelligent software agent technology is an interdisciplinary
technology dealing with the development and efficient utilization of autonomous software objects
called agents, which have access to geographically distributed and heterogeneous resources. Agents
are autonomous, adaptive, reactive, pro-active, social, cooperative, collaborative and flexible.
They also support temporal continuity and mobility within the network. An intelligent agent with
the mobility feature is known as a Mobile Agent (MA). An MA migrates from node to node in a
heterogeneous network without losing its operability. On reaching a network node, the MA is
delivered to an Agent Execution Environment (AEE), where its executable parts start
running. Upon completion of the desired task, it delivers the results to the home node. A Mobile
Agent Platform (MAP), or Agent Execution Environment (AEE), is a server application that
provides the functionality MAs need to authenticate, execute, communicate, migrate to
other platforms, and use system resources in a secure way. A Multi Agent System (MAS) is a
distributed application comprising multiple interacting intelligent agent components [3].
Let $DB = \{T_j,\ j = 1 \ldots D\}$ be a transactional dataset of size $D$, where each transaction $T$ is assigned
an identifier (TID), and let $I = \{d_i,\ i = 1 \ldots m\}$ be the $m$ data items in $DB$. A set of items in a particular
transaction $T$ is called an itemset or pattern. An itemset $P = \{d_i,\ i = 1 \ldots k\}$, which is a set of $k$ data
items in a particular transaction $T$ with $P \subseteq I$, is called a k-itemset. The support of an itemset,

$$s(P) = \frac{\text{No\_of\_T\_containing\_P}}{D}\,\%$$

is the frequency of occurrence of itemset $P$ in $DB$, where
No_of_T_containing_P is the support count (sup_count) of itemset $P$. Frequent Itemsets (FIs)
are the itemsets that appear in $DB$ frequently, i.e., if $s(P) \geq$ min_th_sup (the given minimum
threshold support), then $P$ is a frequent k-itemset. Finding such FIs plays an essential role in
mining the interesting relationships among itemsets. Frequent Itemset Mining (FIM) is the task
of finding the set of all the subsets of FIs in a transactional database [2].
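The support definition can be illustrated with a minimal Python sketch. The function names and the five-transaction toy dataset are hypothetical and chosen only for illustration; they are not part of the AeMGSAR implementation, which is written in Java.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset` (sup_count)."""
    target = frozenset(itemset)
    return sum(1 for t in transactions if target <= frozenset(t))


def support_percent(itemset, transactions):
    """Support s(P) expressed as a percentage of the dataset size D."""
    return 100.0 * support_count(itemset, transactions) / len(transactions)


# Hypothetical toy dataset with D = 5 transactions over items 1..4.
toy_db = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3, 4}]
min_th_sup = 60.0  # assumed threshold, in percent

print(support_count({1, 2}, toy_db))                   # 3
print(support_percent({1, 2}, toy_db))                 # 60.0
print(support_percent({1, 2}, toy_db) >= min_th_sup)   # True: {1, 2} is frequent
```

With this threshold, the itemset {1, 2} is frequent, whereas {4}, which occurs in only one transaction, is not.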
Association Rules (ARs) are used to discover the associations among items in a database [4]. An AR is
an implication of the form $P \Rightarrow Q$ [support, confidence], where $P \subset I$, $Q \subset I$ and $P \cap Q = \emptyset$. An
AR is measured in terms of its support and confidence factor, where the support of the rule,
$s(P \Rightarrow Q)$, is the probability of both $P$ and $Q$ appearing in $T$, i.e., $p(P \cup Q)$, and the
confidence of the rule, $c(P \Rightarrow Q)$, is the conditional probability of $Q$ given $P$, i.e., $p(Q \mid P)$.
An AR is said to be strong if $s(P \Rightarrow Q) \geq$ min_th_sup (the given minimum threshold support) and
$c(P \Rightarrow Q) \geq$ min_th_conf (the given minimum threshold confidence). Association Rule Mining (ARM)
is today one of the most important DM tasks. In ARM, all the strong ARs are generated
from the FIs. ARM can be viewed as a two-step process [5], [6].
1. Find all the frequent k-itemsets ($L_k$).
2. Generate strong ARs from $L_k$:
a. For each frequent itemset $l \in L_k$, generate all non-empty subsets of $l$.
b. For every non-empty subset $s$ of $l$, output the rule "$s \Rightarrow (l - s)$" if

$$\frac{\text{sup\_count}(l)}{\text{sup\_count}(s)} \geq \text{min\_th\_conf}$$
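As a worked illustration of step 2(b), consider the following hypothetical numbers (the itemsets, counts and threshold are assumptions chosen only to show the arithmetic, not data from this study):

```latex
% Assumed values: l = \{a, b, c\}, sup_count(l) = 40, min_th_conf = 70\%
\begin{align*}
s = \{a, b\},\ \text{sup\_count}(s) = 50:\quad
  & \frac{\text{sup\_count}(l)}{\text{sup\_count}(s)} = \frac{40}{50} = 80\% \geq 70\%
  && \Rightarrow\ \text{output } \{a, b\} \Rightarrow \{c\} \\
s = \{c\},\ \text{sup\_count}(s) = 100:\quad
  & \frac{\text{sup\_count}(l)}{\text{sup\_count}(s)} = \frac{40}{100} = 40\% < 70\%
  && \Rightarrow\ \text{discard } \{c\} \Rightarrow \{a, b\}
\end{align*}
```

Both candidate rules share the same support, since it depends only on $l$, but only the first meets the confidence threshold.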
Distributed Association Rule Mining (DARM) is the task of generating the globally strong
association rules from the global FIs in a distributed environment. A few preliminary notations
and definitions, required for defining DARM and for making this study self-contained, are as follows:
• $S = \{S_i,\ i = 1 \ldots n\}$, $n$ distributed sites.
• $S_{CENTRAL}$, central site.
• $DB_i = \{T_j,\ j = 1 \ldots D_i\}$, horizontally partitioned dataset of size $D_i$ at the local site $S_i$, where each transaction $T_j$ is assigned an identifier (TID).
• $DB = \bigcup_{i=1}^{n} DB_i$, the aggregated dataset of size $D = \sum_{i=1}^{n} D_i$, with $DB_i \cap DB_j = \emptyset$ for $i \neq j$.
• $I = \{d_i,\ i = 1 \ldots m\}$, total $m$ data items in each $DB_i$.
• $L^{FI}_{k(i)}$, local frequent k-itemsets at site $S_i$.
• $L^{FISC}_{k(i)}$, list of support counts $\forall\, Itemset \in L^{FI}_{k(i)}$.
• $L^{LSAR}_{i}$, list of locally strong association rules at site $S_i$.
• $L^{TLSAR} = \bigcup_{i=1}^{n} L^{LSAR}_{i}$, list of total locally strong association rules.
• $L^{TFI}_{k} = \bigcup_{i=1}^{n} L^{FI}_{k(i)}$, list of total frequent k-itemsets.
• $L^{GFI}_{k} = \bigcap_{i=1}^{n} L^{FI}_{k(i)}$, list of global frequent k-itemsets.
• $L^{GSAR}_{CENTRAL}$, list of globally strong association rules.
The Local Knowledge Base (LKB) at site $S_i$ comprises $L^{FI}_{k(i)}$, $L^{FISC}_{k(i)}$ and $L^{LSAR}_{i}$, which can provide
reference to the local supervisor for local decisions. The Global Knowledge Base (GKB) at $S_{CENTRAL}$
comprises $L^{TLSAR}$, $L^{TFI}_{k}$, $L^{GFI}_{k}$ and $L^{GSAR}_{CENTRAL}$ for global decision making [7]. Like ARM, the DARM
task can also be viewed as a two-step process [6]:
1. Find the global frequent k-itemsets ($L^{GFI}_{k}$) from the distributed local frequent k-itemsets
($L^{FI}_{k(i)}$) of the partitioned datasets.
2. Generate globally strong association rules ($L^{GSAR}_{CENTRAL}$) from $L^{GFI}_{k}$.
The existing agent based systems specifically dealing with the DARM task are: Knowledge
Discovery Management System (KDMS) [8], Efficient Distributed Data Mining using Intelligent
Agents [9], Mobile Agent based Distributed Data Mining [10], An Agent based Framework for
Association Rule Mining of Distributed Data (AFARMDD) [11], [12], and Multi-Agent Distributed
Association Rule Miner (MADARM) [13]. All these systems are academic research projects.
A qualitative comparison of these DARM frameworks is provided in [14]. Most of the existing
agent based frameworks for the DARM task are only prototype models and lack an appropriate
underlying AEE, scalability, privacy preserving techniques, global knowledge generation and
implementation using real datasets.
The rest of the paper is organised as follows. Section 2 describes the running environment for the
proposed system along with the various algorithms involved. The serial computing model of AeMGSAR
is presented in Section 3, where the algorithms for all the agents involved in this system are also discussed.
Section 4 describes the implementation and performance study of the system, and the
article is concluded in Section 5.
2. ENVIRONMENT FOR THE PROPOSED SYSTEM
Every MAS needs an underlying AEE to provide a running infrastructure on which agents can be
deployed and tested. A running environment has been designed in Java. The various attributes of an
MA are encapsulated within a data structure known as AgentProfile. It contains the name of the MA
(AgentName), the version number (AgentVersion), the entire byte code (BC), the list of nodes to be
visited by the MA, i.e., the itinerary plan ($L_{NODES}$), the type of the itinerary (ItinType), which can be
serial or parallel, a reference to the current execution state (AObject), and an additional data structure
known as Briefcase that acts as the result bag of the MA to store the final resultant knowledge ($Result\_S_i$)
at a particular site. The computational time (CPUTime) taken by an MA at a particular site is also
stored in $Result\_S_i$. In addition to results, Briefcase also contains the system time for the start of the
agent's journey ($TripTime_{start}$), the system time for the end of the journey ($TripTime_{end}$) and the total round trip
time of the MA (TripTime), calculated using $TripTime \leftarrow TripTime_{end} - TripTime_{start}$. The stationary as well
as mobile agents involved in the model are discussed later. This environment consists
of the following three components:
• Data Mining Agent Execution Environment (DM_AEE): It is the key component and
acts as a server. DM_AEE is deployed at each distributed site $S_i$ and is responsible for
receiving, executing and migrating all the visiting DM agents. It receives the incoming
AgentProfile at site $S_i$, retrieves the entire BC of the agent and saves it as
AgentName.class in the local file system of site $S_i$; after that, execution of the agent is
started using AObject. The steps are shown in Algorithm 1.
• Agent Launcher (AL): It acts as a client at the agent launching station ($S_{CENTRAL}$) and launches
the goal-oriented DM agents on behalf of the user, through a user interface, to the
DM_AEE running at the distributed sites. The Agent Pool (or Zone) at $S_{CENTRAL}$ is a repository
of all mobile as well as stationary agents (SAs). AL first reads AgentName and stores it
in AgentProfile. The entire BC of AgentName is loaded from the Agent Pool and
stored in AgentProfile. $L_{NODES}$ and ItinType are retrieved and stored in AgentProfile.
$TripTime_{start}$ is maintained in Briefcase, which is then added to AgentProfile. In the
serial computing model, i.e., if ItinType = Serial, AL dispatches a single specific MA
along with $L_{NODES}$, and it travels from node to node. AgentVersion is set to 1 for this
agent. AL also contacts the Result Manager (RM) for processing the Briefcase of an agent.
Detailed steps are given in Algorithm 2.
• Result Manager (RM): It manages and processes the Briefcase of all MAs. RM is
contacted either by an MA for submitting its results or by AL for processing the results of a
specific MA. On completion of its itinerary, each DM agent submits its results to RM, which
computes the total round trip time (TripTime) of that MA and saves it in the Briefcase of that
agent. If ItinType = Serial, it then saves the updated AgentProfile of the agent at $S_{CENTRAL}$.
When it is contacted by AL for processing the results of a specific agent, it sends back the
AgentProfile of that agent. The steps are defined in Algorithm 3.
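The AgentProfile and Briefcase structures described above might be modelled roughly as follows. This is a Python sketch under stated assumptions, not the authors' Java implementation: the field names are snake_case renderings of the attributes named in the text, the `close_trip` helper is hypothetical, and times are plain integers (nanoseconds assumed).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Briefcase:
    """Result bag of a mobile agent (field names are illustrative)."""
    start_trip_time: Optional[int] = None   # TripTime_start
    end_trip_time: Optional[int] = None     # TripTime_end
    trip_time: Optional[int] = None         # TripTime
    # Result_S_i per visited site, e.g. {"S1": {"CPUTime": 250}}
    results: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def close_trip(self) -> int:
        """TripTime <- TripTime_end - TripTime_start, as computed by the RM."""
        self.trip_time = self.end_trip_time - self.start_trip_time
        return self.trip_time


@dataclass
class AgentProfile:
    agent_name: str        # AgentName
    agent_version: int     # AgentVersion
    byte_code: bytes       # BC
    nodes: List[str]       # L_NODES, the itinerary plan (IP addresses)
    itin_type: str         # ItinType: "Serial" or "Parallel"
    briefcase: Briefcase = field(default_factory=Briefcase)
    agent_object: Any = None   # AObject, reference to the current execution state


profile = AgentProfile("LFIGA", 1, b"", ["192.168.46.212", "192.168.46.189"], "Serial")
profile.briefcase.start_trip_time = 1_000
profile.briefcase.results["S1"] = {"CPUTime": 250}
profile.briefcase.end_trip_time = 4_500
print(profile.briefcase.close_trip())  # 3500
```

Keeping the Briefcase inside the AgentProfile mirrors the description above, where the updated Briefcase is added back to the AgentProfile before every transfer.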
Algorithm 1 DATA MINING AGENT EXECUTION ENVIRONMENT (DM_AEE)
procedure DM_AEE( )
    while TRUE do
        AgentProfile ← listen and receive AgentProfile at S_i
        AgentName ← get AgentName from AgentProfile
        BC ← retrieve the BC of agent from AgentProfile
        save the BC as AgentName.class in the local file system of S_i
        AObject ← get AObject from AgentProfile        ▷ current state
        AObject.run()                                   ▷ start executing mobile agent
    end while
end procedure
Algorithm 2 AGENT LAUNCHER (AL)
procedure AL( )
    option ← read option (dispatch / result)
    switch option do
        case dispatch                                   ▷ dispatch the mobile agent to DM_AEE
            AgentName ← read mobile agent's name
            add AgentName to AgentProfile
            BC ← load entire byte code of AgentName from AgentPool
            add BC to AgentProfile
            L_NODES ← read itinerary (IP addresses) of mobile agent
            ItinType ← read ItinType (Serial / Parallel)
            add ItinType to AgentProfile
            if ItinType = "Serial" then                 ▷ serial itinerary
                AgentVersion ← 1
                add AgentVersion to AgentProfile
                add L_NODES to AgentProfile
                switch AgentName do
                    case LFIGA
                        minthrsup ← read minimum threshold support
                        AObject ← new LFIGA(AgentProfile, minthrsup)
                    end case
                    case LKGA
                        minthrconf ← read minimum threshold confidence
                        AObject ← new LKGA(AgentProfile, minthrconf)
                    end case
                    case TFICA
                        AObject ← new TFICA(AgentProfile)
                    end case
                    case LKCA
                        AObject ← new LKCA(AgentProfile)
                    end case
                    case GKDA
                        L_GSAR_CENTRAL ← load L_GSAR_CENTRAL generated by GKGA at S_CENTRAL
                        add L_GSAR_CENTRAL to Briefcase
                        add updated Briefcase to AgentProfile
                        AObject ← new GKDA(AgentProfile)
                    end case
                end switch
                add AObject to AgentProfile             ▷ current state
                transfer AgentProfile to DM_AEE at first IP address in L_NODES
            end if
        end case
        case result                                     ▷ process the result of mobile agent
            AgentName ← read mobile agent's name
            ItinType ← read mobile agent's ItinType
            add AgentName to L_AgentInfo
            add ItinType to L_AgentInfo
            ▷ result processing for serial itinerary agents
            if ItinType = "Serial" then
                AgentProfile ← contact RM for L_AgentInfo
                Briefcase ← retrieve Briefcase from AgentProfile
                switch AgentName do
                    case LFIGA
                        process the Briefcase of LFIGA
                    end case
                    case LKGA
                        process the Briefcase of LKGA
                    end case
                    case TFICA
                        call GFIGA(Briefcase)           ▷ stationary agent
                    end case
                    case LKCA
                        call GKGA(Briefcase)            ▷ stationary agent
                    end case
                    case GKDA
                        process the Briefcase of GKDA
                    end case
                end switch
            end if
        end case
    end switch
end procedure
Algorithm 3 RESULT MANAGER (RM)
procedure RM( )
    while TRUE do
        listen and receive the incoming request
        if contacted by a mobile agent for submitting results from site S_i then
            AgentProfile ← receive the incoming AgentProfile from site S_i
            ItinType ← retrieve ItinType from AgentProfile
            Briefcase ← retrieve mobile agent's Briefcase from AgentProfile
            TripTime_start ← retrieve TripTime_start from Briefcase
            TripTime_end ← retrieve TripTime_end from Briefcase
            TripTime ← TripTime_end − TripTime_start
            add TripTime to Briefcase
            add updated Briefcase to AgentProfile
            if ItinType = "Serial" then
                save AgentProfile at S_CENTRAL
            end if
        end if
        if contacted by AL for processing the results then
            AgentName ← retrieve AgentName from incoming L_AgentInfo
            ItinType ← retrieve ItinType from incoming L_AgentInfo
            if ItinType = "Serial" then
                AgentProfile ← load AgentProfile for AgentName from S_CENTRAL
                dispatch AgentProfile to AL
            end if
        end if
    end while
end procedure
The overall working of the AeMGSAR system may be divided into the following six stages:
1. Request Stage: The request for DARM is initiated at $S_{CENTRAL}$ by AL on behalf of the user
with the necessary credentials.
2. Preparation Stage: Through the user interface, AL reads the agent name and version number;
the itinerary for the MA's journey is obtained in terms of the IP addresses of the distributed nodes
to be visited; any additional data required by a specific MA is obtained; the agent
code for the specific MA is loaded from AgentPool; and, for a serial itinerary, a single specific
MA is dispatched by AL to travel and visit the n distributed sites one after another.
3. Local Mining Stage: The ARM process is performed locally by specific DM agents at each
distributed site and the results are kept as the local knowledge base at that site.
4. Result Collection Stage: Collector agents visit each site, collect the results generated
by the DM agents and submit the results back to RM at $S_{CENTRAL}$.
5. Knowledge Integration and Global Knowledge Generation Stage: Knowledge or result
integration is carried out by the RM with the help of a stationary agent, and global
knowledge in the form of globally strong association rules may be generated with the
help of other stationary agents at $S_{CENTRAL}$.
6. Global Knowledge Dispatching Stage: Global knowledge is dispatched to the distributed
sites by a dispatching agent so that it can be compared with the local knowledge at each site.
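The serial itinerary at the heart of these stages can be sketched as a single-process simulation. The sketch is in Python rather than the Java of the actual system, and it makes strong simplifying assumptions: there is no networking or byte-code transfer, sites are entries of an in-memory dictionary, and `run_serial_itinerary` and `task` are hypothetical names standing in for an MA and its local work.

```python
import time


def run_serial_itinerary(nodes, sites, task):
    """Visit `nodes` one after another and collect per-site results in a briefcase."""
    briefcase = {"results": {}, "visited": []}
    briefcase["start_trip_time"] = time.perf_counter_ns()
    remaining = list(nodes)                      # itinerary still to be visited
    while remaining:
        site = remaining.pop(0)                  # first address is the site being visited
        start_cpu = time.perf_counter_ns()
        outcome = task(sites[site])              # local work, e.g. local mining
        cpu_time = time.perf_counter_ns() - start_cpu
        briefcase["results"][site] = {"result": outcome, "CPUTime": cpu_time}
        briefcase["visited"].append(site)
    # Itinerary exhausted: the agent would now report back to the Result Manager.
    briefcase["end_trip_time"] = time.perf_counter_ns()
    briefcase["trip_time"] = briefcase["end_trip_time"] - briefcase["start_trip_time"]
    return briefcase


# Hypothetical local datasets; the task simply counts the transactions at each site.
sites = {"S1": [{1, 2}, {2, 3}], "S2": [{1}], "S3": [{1, 2, 3}, {3}, {2}]}
bag = run_serial_itinerary(["S1", "S2", "S3"], sites, task=len)
print(bag["visited"])
print({s: r["result"] for s, r in bag["results"].items()})
```

Because one agent carries one briefcase through every hop, the round trip time in this model grows with the number of sites on the itinerary.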
Figure 1. AeMGSAR Serial Computing Model
3. SERIAL COMPUTING MODEL OF AEMGSAR
The serial computing model of the AeMGSAR system is shown in Figure 1. It consists of seven
agents in total: five are MAs dispatched from $S_{CENTRAL}$ with a serial itinerary and multi-hop migration,
and the other two are intelligent SAs running at $S_{CENTRAL}$ to perform different tasks. The CPU time
taken by an MA while processing at each site, along with some other specific information, is
carried back in the result bag to $S_{CENTRAL}$. The agents numbered 1-5 below visit the n sites serially; other
parameters are collected from different resources. The detailed relationships among these agents and
the working behaviour of each agent are as follows:
1. Local Frequent Itemset Generator Agent (LFIGA): This is an MA that carries the
AgentProfile and min_th_sup. LFIGA generates and stores $L^{FI}_{k(i)}$ and $L^{FISC}_{k(i)}$ at site $S_i$ by
scanning the local $DB_i$ at that site under the constraint of min_th_sup. It carries back the
computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. This agent embeds
the Apriori algorithm [15] for generating all the frequent k-itemset lists. It may be
equipped with decision making capability to select other FIM algorithms based on the
density of the dataset at a particular site. More details are available in Algorithm 4.
2. Local Knowledge Generator Agent (LKGA): This is an MA that carries the
AgentProfile and min_th_conf. LKGA applies the constraint of min_th_conf to generate and
store $L^{LSAR}_{i}$ by using the $L^{FI}_{k(i)}$ and $L^{FISC}_{k(i)}$ lists already generated by the LFIGA agent at site $S_i$.
The $L^{LSAR}_{i}$ list also contains the support and confidence of each association rule along with the site
name. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$.
Detailed steps are given in Algorithm 7.
3. Total Frequent Itemset Collector Agent (TFICA): This is an MA that carries the
AgentProfile. TFICA collects the list of local frequent k-itemsets ($L^{FI}_{k(i)}$) generated by the LFIGA
agent and carries back the list of total frequent k-itemsets ($L^{TFI}_{k}$) in the result bag to RM at
$S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational
time (CPUTime) at each site $S_i$ and $TripTime_{end}$. It executes Algorithm 8.
4. Local Knowledge Collector Agent (LKCA): This is an MA that carries the AgentProfile.
LKCA collects the list of locally strong association rules ($L^{LSAR}_{i}$) generated by the LKGA
agent and carries back the list of total locally strong association rules ($L^{TLSAR}$) in the result
bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the
computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. The steps are shown in
Algorithm 9.
5. Global Knowledge Dispatcher Agent (GKDA): This is an MA that carries the
AgentProfile containing the global knowledge ($L^{GSAR}_{CENTRAL}$). It dispatches the global knowledge to
every site for further decision making and for comparison with the local knowledge at that
site. It executes Algorithm 12.
6. Global Frequent Itemset Generator Agent (GFIGA): It is a stationary agent at $S_{CENTRAL}$,
mainly used for processing the result bag of TFICA, i.e., the total frequent k-itemset list
($L^{TFI}_{k}$) collected by TFICA, to generate the global frequent itemset list, $L^{GFI}_{k}$. More details
are available in Algorithm 10.
7. Global Knowledge Generator Agent (GKGA): It is also a stationary agent at $S_{CENTRAL}$,
mainly used for processing the $L^{GFI}_{k}$ list and the $L^{TLSAR}$ list to compile the global knowledge,
i.e., the list of globally strong association rules, $L^{GSAR}_{CENTRAL}$. Detailed steps are shown in
Algorithm 11.
Algorithm 4 LOCAL FREQUENT ITEMSET GENERATOR AGENT (LFIGA)
Input:
• AgentProfile, a collection of agent attributes set by the AL
• min_th_sup, the given minimum threshold support
Output: L_FI&SC, the list of frequent itemsets and their support counts
procedure LFIGA(AgentProfile, min_th_sup)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    DB_i ← load DB_i from local file system of site S_i
    T ← DB_i.get(0)                                 ▷ no. of records
    I ← DB_i.get(1)                                 ▷ no. of items
    DB[T][I] ← DB_i.get(3)                          ▷ itemset data bank
    minsupcount ← (T × min_th_sup) / 100
    ▷ generate frequent 1-itemset list (FIL_1) and support count list (FISC_1)
    CFIL_1 ← {1, 2, 3, ..., I}                      ▷ candidate frequent 1-itemsets
    for i ← 1, I do                                 ▷ initialize the support count array SCFIL_1 to zero
        SCFIL_1[i] ← 0
    end for
    k ← 1
    for all candidate c ∈ CFIL_1 do                 ▷ find support count for every candidate
        for all transaction t ∈ DB do
            if c ⊆ t then
                SCFIL_1[k] ← SCFIL_1[k] + 1
            end if
        end for
        k ← k + 1
    end for
    ▷ prune CFIL_1 to generate FIL_1 and FISC_1
    for k ← 1, I do
        if SCFIL_1[k] ≥ minsupcount then
            add c_k ∈ CFIL_1 to FIL_1
            add SCFIL_1[k] to FISC_1
        end if
    end for
    if FIL_1 ≠ ∅ then
        add FIL_1 to L_FI
        add FISC_1 to L_FISC
    end if
    k ← 2
    while FIL_(k-1) ≠ ∅ do
        CFIL_k ← call GenerateCFIL(FIL_(k-1))       ▷ see Algorithm 5
        for i ← 1, CFIL_k.length do                 ▷ initialize the array SCFIL_k to zero
            SCFIL_k[i] ← 0
        end for
        i ← 1
        for all candidate c ∈ CFIL_k do             ▷ find support count for every candidate
            for all transaction t ∈ DB do           ▷ scan DB
                if c ⊆ t then
                    SCFIL_k[i] ← SCFIL_k[i] + 1
                end if
            end for
            i ← i + 1
        end for
        ▷ prune CFIL_k to generate FIL_k and FISC_k
        for i ← 1, SCFIL_k.length do
            if SCFIL_k[i] ≥ minsupcount then
                add c_i ∈ CFIL_k to FIL_k
                add SCFIL_k[i] to FISC_k
            end if
        end for
        if FIL_k ≠ ∅ then
            add FIL_k to L_FI
            add FISC_k to L_FISC
        end if
        k ← k + 1
    end while
    add T to L_FI&SC
    add L_FI to L_FI&SC
    add L_FISC to L_FI&SC
    save L_FI&SC in the local file system of this site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES  ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                             ▷ itinerary not empty
        AObject ← new LFIGA(AgentProfile, min_th_sup)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
Algorithm 5 GENERATECFIL
Input: L_(k-1), frequent (k-1)-itemsets
Output: C_k, candidate frequent k-itemsets
procedure GENERATECFIL(L_(k-1))
    for all itemset l_1 ∈ L_(k-1) do
        for all itemset l_2 ∈ L_(k-1) do
            if (l_1[1] = l_2[1]) ∧ (l_1[2] = l_2[2]) ∧ ... ∧ (l_1[k-2] = l_2[k-2]) ∧ (l_1[k-1] < l_2[k-1]) then
                c ← l_1 ⊗ l_2                       ▷ join step: generate candidates
                if HASINFREQUENTSUBSET(c, L_(k-1)) then    ▷ see Algorithm 6
                    delete c                        ▷ prune step
                else
                    add c to C_k
                end if
            end if
        end for
    end for
    return C_k
end procedure
Algorithm 6 HASINFREQUENTSUBSET
Input: c, candidate k-itemset; L_(k-1), frequent (k-1)-itemsets
Output: TRUE if c has an infrequent (k-1)-subset, FALSE otherwise
procedure HASINFREQUENTSUBSET(c, L_(k-1))
    for all (k-1)-subset s of c do
        if s ∉ L_(k-1) then
            return TRUE
        end if
    end for
    return FALSE
end procedure
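Taken together, Algorithms 4 to 6 amount to level-wise Apriori mining at one site. The following Python sketch shows that core only, under stated assumptions: it omits the agent bookkeeping (Briefcase, itinerary, timing), uses sets and dictionaries instead of the arrays of Algorithm 4, assumes sortable item identifiers, and runs on a hypothetical toy dataset. It is an illustration, not the authors' Java code.

```python
from itertools import combinations


def has_infrequent_subset(candidate, prev_frequent):
    """True if some (k-1)-subset of the k-item candidate is not frequent."""
    k = len(candidate)
    return any(frozenset(s) not in prev_frequent
               for s in combinations(candidate, k - 1))


def generate_candidates(prev_frequent):
    """Join and prune steps: build candidate k-itemsets from frequent (k-1)-itemsets."""
    ordered = sorted(tuple(sorted(s)) for s in prev_frequent)
    candidates = set()
    for l1 in ordered:
        for l2 in ordered:
            # Join only itemsets that agree on all but their last item.
            if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
                c = l1 + (l2[-1],)
                if not has_infrequent_subset(c, prev_frequent):
                    candidates.add(frozenset(c))
    return candidates


def apriori(transactions, min_th_sup):
    """Return {frequent itemset: support count}; min_th_sup is a percentage."""
    db = [frozenset(t) for t in transactions]
    min_sup_count = len(db) * min_th_sup / 100
    candidates = {frozenset([i]) for t in db for i in t}
    frequent = {}
    while candidates:
        counts = {c: sum(1 for t in db if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_sup_count}
        if not level:
            break
        frequent.update(level)
        candidates = generate_candidates(set(level))
    return frequent


# Hypothetical toy dataset; a 60% threshold means a support count of at least 3.
toy_db = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3, 4}]
frequent = apriori(toy_db, min_th_sup=60)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data, the three single items and the three pairs over {1, 2, 3} are frequent, while {1, 2, 3} itself is generated as a candidate but rejected because it occurs in only two transactions.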
Algorithm 7 LOCAL KNOWLEDGE GENERATOR AGENT (LKGA)
Input:
• AgentProfile, a collection of agent attributes set by the AL
• min_th_conf, the given minimum threshold confidence
Output: L_LSAR, the list of locally strong association rules
procedure LKGA(AgentProfile, min_th_conf)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L_FI&SC ← load L_FI&SC from local file system of this site S_i
    T ← L_FI&SC.get(0)                              ▷ no. of records
    L_FI ← L_FI&SC.get(1)                           ▷ frequent k-itemset list
    L_FISC ← L_FI&SC.get(2)                         ▷ support count list
    for k ← 2, L_FI.size do
        L_k ← L_FI.get(k)                           ▷ get frequent k-itemset list
        for all l ∈ L_k do
            l_subsets ← generate all non-empty subsets of l
            l_spcount ← get support count of l from L_FISC
            AR_support ← (l_spcount / T) × 100      ▷ support of the association rule
            for all non-empty subset s ∈ l_subsets do
                s_spcount ← get support count of s from L_FISC
                AR_conf ← (l_spcount / s_spcount) × 100    ▷ confidence of the association rule
                if AR_conf ≥ min_th_conf then
                    AR_strong ← "s ⇒ l − s [AR_support %, AR_conf %]"
                    print AR_strong
                    add l to AR_strong
                    S_i^IP ← get IP address of this site S_i
                    add S_i^IP to AR_strong
                    add AR_strong to L_LSAR
                end if
            end for
        end for
    end for
    save L_LSAR in the local file system of this site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES  ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                             ▷ itinerary not empty
        AObject ← new LKGA(AgentProfile, min_th_conf)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
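The rule generation loop of Algorithm 7 can be sketched in Python as follows. The sketch makes several assumptions that are not in the original: support counts arrive as a dictionary keyed by itemset, each rule is a small dictionary, the site is identified by a plain label, and only proper subsets are used as antecedents so that the consequent l − s is never empty. The agent bookkeeping (Briefcase, itinerary, timing) is left out.

```python
from itertools import combinations


def strong_rules(support_counts, n_transactions, min_th_conf, site="S1"):
    """Generate locally strong rules s => (l - s) from frequent itemsets.

    support_counts: {frozenset itemset: support count}, covering every frequent
    itemset and hence every subset of each one. min_th_conf is a percentage.
    """
    rules = []
    for itemset, l_count in support_counts.items():
        if len(itemset) < 2:
            continue                                  # rules need at least two items
        support = 100.0 * l_count / n_transactions    # AR_support
        for size in range(1, len(itemset)):           # proper, non-empty subsets only
            for antecedent in combinations(sorted(itemset), size):
                s = frozenset(antecedent)
                confidence = 100.0 * l_count / support_counts[s]   # AR_conf
                if confidence >= min_th_conf:
                    rules.append({
                        "antecedent": s,
                        "consequent": itemset - s,
                        "itemset": itemset,
                        "support": support,
                        "confidence": confidence,
                        "site": site,
                    })
    return rules


# Hypothetical support counts over 5 transactions.
counts = {
    frozenset({1}): 4, frozenset({2}): 4, frozenset({3}): 4,
    frozenset({1, 2}): 3, frozenset({1, 3}): 3, frozenset({2, 3}): 3,
}
rules = strong_rules(counts, n_transactions=5, min_th_conf=70)
for r in rules:
    print(sorted(r["antecedent"]), "=>", sorted(r["consequent"]),
          f'[{r["support"]}%, {r["confidence"]}%]')
```

Every rule in this example has 60% support and 75% confidence, so raising min_th_conf to 80 would leave no strong rules.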
Algorithm 8 TOTAL FREQUENT ITEMSET COLLECTOR AGENT (TFICA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: L_FI, the list of locally frequent itemsets
procedure TFICA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L_FI&SC ← load L_FI&SC from local file system of this site S_i
    L_FI ← L_FI&SC.get(1)                           ▷ frequent k-itemset list
    add L_FI to Result_S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES  ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                             ▷ itinerary not empty
        AObject ← new TFICA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
Algorithm 9 LOCAL KNOWLEDGE COLLECTOR AGENT (LKCA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: L_LSAR, the list of locally strong association rules
procedure LKCA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L_LSAR ← load L_LSAR from local file system of this site S_i
    add L_LSAR to Result_S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES  ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                             ▷ itinerary not empty
        AObject ← new LKCA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
Algorithm 10 GLOBAL FREQUENT ITEMSET GENERATOR AGENT (GFIGA)
Input: Briefcase, result bag of TFICA agent
Output: L_GFI, the list of global frequent itemsets
procedure GFIGA(Briefcase)
    CPUTime_start ← get system time
    L_TFI ← retrieve total frequent itemsets ⋃(i=1..n) L_FI(i) from Briefcase
    L_GFI ← retrieve global frequent itemsets ⋂(i=1..n) L_FI(i) from Briefcase
    print L_GFI
    save L_GFI in the local file system of site S_CENTRAL
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    print CPUTime
    return L_GFI
end procedure
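The union and intersection at the core of Algorithm 10 can be sketched in Python as below. The function name, the dictionary keyed by site label and the three small local lists are hypothetical; the Briefcase handling and timing of the real agent are omitted.

```python
from functools import reduce


def total_and_global_frequent(local_frequent):
    """Return (L_TFI, L_GFI) from {site: set of frozenset itemsets}.

    L_TFI is the union of the local lists; L_GFI is their intersection,
    i.e. the itemsets that are frequent at every site.
    """
    site_sets = list(local_frequent.values())
    total = reduce(set.union, site_sets, set())
    globally = reduce(set.intersection, site_sets) if site_sets else set()
    return total, globally


# Hypothetical local frequent itemset lists collected from three sites.
local_frequent = {
    "S1": {frozenset({1}), frozenset({2}), frozenset({1, 2}), frozenset({1, 2, 3})},
    "S2": {frozenset({1}), frozenset({2}), frozenset({1, 2})},
    "S3": {frozenset({1}), frozenset({2}), frozenset({3}), frozenset({1, 2})},
}
l_tfi, l_gfi = total_and_global_frequent(local_frequent)
print(len(l_tfi))                                              # 5
print(sorted((sorted(s) for s in l_gfi), key=lambda x: (len(x), x)))  # [[1], [2], [1, 2]]
```

An itemset such as {1, 2, 3}, which is frequent at only one site, appears in the total list but not in the global list.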
Algorithm 11 GLOBAL KNOWLEDGE GENERATOR AGENT (GKGA)
Input: Briefcase, result bag of LKCA agent
Output: L_GSAR_CENTRAL, the list of globally strong association rules
procedure GKGA(Briefcase)
    CPUTime_start ← get system time
    L_TLSAR ← retrieve total strong rules ⋃(i=1..n) L_LSAR(i) from Briefcase
    L_GFI ← load global frequent itemsets L_GFI from S_CENTRAL
    for all AR_strong ∈ L_TLSAR do
        L ← get frequent itemset from AR_strong
        if L ∈ L_GFI then
            print AR_strong along with the site address (S_i^IP)
            add AR_strong to L_GSAR_CENTRAL
        end if
    end for
    save L_GSAR_CENTRAL in the local file system of site S_CENTRAL
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    print CPUTime
    return L_GSAR_CENTRAL
end procedure
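The filtering step of Algorithm 11 keeps a locally strong rule only if the frequent itemset it was generated from is globally frequent. A minimal Python sketch follows, assuming (hypothetically) that each rule is a dictionary carrying its text, its source itemset and its site label; file handling and timing are omitted.

```python
def globally_strong_rules(total_local_rules, global_frequent):
    """Keep the locally strong rules whose source itemset is globally frequent."""
    return [rule for rule in total_local_rules
            if rule["itemset"] in global_frequent]


# Hypothetical inputs: rules collected from the sites and the global frequent itemsets.
l_tlsar = [
    {"rule": "1 => 2", "itemset": frozenset({1, 2}), "site": "S1"},
    {"rule": "1 => 2", "itemset": frozenset({1, 2}), "site": "S2"},
    {"rule": "1,2 => 3", "itemset": frozenset({1, 2, 3}), "site": "S1"},
]
l_gfi = {frozenset({1}), frozenset({2}), frozenset({1, 2})}

l_gsar = globally_strong_rules(l_tlsar, l_gfi)
for rule in l_gsar:
    print(rule["rule"], "from", rule["site"])
```

The rule built from {1, 2, 3} is dropped because that itemset is not in the global list, while the same rule reported by two sites is kept once per site, matching the per-site printing in Algorithm 11.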
Algorithm 12 GLOBAL KNOWLEDGE DISPATCHER AGENT (GKDA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: dispatch L_GSAR_CENTRAL at each distributed site S_i
procedure GKDA(AgentProfile)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    L_GSAR_CENTRAL ← get L_GSAR_CENTRAL from Briefcase
    save L_GSAR_CENTRAL in the local file system of site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES  ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then                             ▷ itinerary not empty
        AObject ← new GKDA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
Figure 2. Control Panel of AeMGSAR
4. IMPLEMENTATION AND PERFORMANCE STUDY
All the agents, as well as the control panel shown in Figure 2, are implemented in Java. A synthetic
dataset ($DB_i$) is stored at each of the three distributed sites $S_1$, $S_2$ and $S_3$, with 3500, 3850
and 3900 transactions respectively and 10 items each, generated using the Transactional Data Set
Generator (TDSG) tool [16]. Binary and transactional versions of these datasets are shown in
Appendix A. The required configuration of the system is shown in Table 1, with additional
deployment of DM_AEE at each distributed site and of AL and RM at $S_{CENTRAL}$. The round trip time
taken by the various MAs is shown in Figure 3. The CPU time consumed by the various MAs at sites
$S_1$, $S_2$ and $S_3$ is shown in Figure 4, Figure 5 and Figure 6, respectively. The CPU time for GFIGA
and GKGA is 101357102 nanoseconds and 33317458 nanoseconds, respectively. $L^{FI}_{k(i)}$ and
$L^{FISC}_{k(i)}$ at the distributed sites, generated by the LFIGA agent with 20% min_th_sup, are shown in
Appendix B.1, B.2 and B.3. $L^{LSAR}_i$ at the distributed sites, generated by the LKGA agent with 50%
min_th_conf, are shown in Appendix B.4, B.5 and B.6. Globally frequent itemsets generated by
GFIGA at $S_{CENTRAL}$ are shown in Figure 7. Fifteen 2-itemsets and eight 3-itemsets in the
$L^{TFI}_k$ list are globally frequent, whereas the 4-, 5- and 6-itemsets, which are locally frequent,
are not globally frequent. Globally strong association rules ($L^{GSAR}_{CENTRAL}$) generated by GKGA at
$S_{CENTRAL}$ for globally frequent 3-itemsets are shown in Figure 8, and $L^{GSAR}_{CENTRAL}$ for
2-itemsets are shown in Appendix B.7.
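The relation between the total list $L^{TFI}_k$ (union of the local lists) and the global list $L^{GFI}_k$ (intersection of the local lists) explains why some locally frequent itemsets drop out at the central site. The following Java sketch illustrates the two set operations only; the class name, method names and the item labels A to D are hypothetical, and it is not the GFIGA implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class GlobalItemsetsSketch {
    /** Union of the local lists: itemsets frequent at one or more sites. */
    static Set<Set<String>> totalFrequent(List<Set<Set<String>>> localLists) {
        Set<Set<String>> out = new LinkedHashSet<>();
        for (Set<Set<String>> local : localLists) {
            out.addAll(local);
        }
        return out;
    }

    /** Intersection of the local lists: itemsets frequent at every site. */
    static Set<Set<String>> globallyFrequent(List<Set<Set<String>>> localLists) {
        if (localLists.isEmpty()) {
            return new LinkedHashSet<>();
        }
        Set<Set<String>> out = new LinkedHashSet<>(localLists.get(0));
        for (Set<Set<String>> local : localLists.subList(1, localLists.size())) {
            out.retainAll(local);
        }
        return out;
    }

    /** Builds one itemset; a sorted set keeps printing order stable. */
    static Set<String> itemset(String... items) {
        return new TreeSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        Set<Set<String>> s1 = new LinkedHashSet<>();
        s1.add(itemset("A", "B"));
        s1.add(itemset("A", "C"));
        s1.add(itemset("A", "B", "C", "D")); // frequent at site 1 only
        Set<Set<String>> s2 = new LinkedHashSet<>();
        s2.add(itemset("A", "B"));
        s2.add(itemset("A", "C"));
        Set<Set<String>> s3 = new LinkedHashSet<>();
        s3.add(itemset("A", "B"));
        s3.add(itemset("B", "C"));
        List<Set<Set<String>>> sites = Arrays.asList(s1, s2, s3);
        System.out.println(totalFrequent(sites));
        System.out.println(globallyFrequent(sites)); // prints [[A, B]]
    }
}
```

In this toy input the 4-itemset is frequent at a single site, so it appears in the total list but not in the global list, which mirrors the behaviour reported for the 4-, 5- and 6-itemsets.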
On comparing this system with the traditional central data warehouse (DW) based approach for
ARM, where the entire data from the distributed sites is collected centrally in a DW [17], it is found
that the storage cost is reduced, as data is mined locally and only the resultant knowledge is
carried to the central site by mobile agents. As the size of the resultant data carried by mobile
agents is small, the network communication cost is also reduced. Data mining is performed
locally by agents, so the computational cost at the central site is also minimised. AeMGSAR
reflects the global knowledge because all the globally strong association rules generated are also
strong at each distributed site. The system relies upon Java's built-in security system. As MAs are
scalable in nature, performance is not expected to be affected by adding more sites.
Table 1. Network Configuration
| Site Name | Processor | OS | LAN Configuration: IP (a) | LAN Configuration: Network |
|---|---|---|---|---|
| $S_{CENTRAL}$ | Intel (b) | MS (c) | 192.168.46.5 | NW (d) |
| $S_1$ | Intel (b) | MS (c) | 192.168.46.212 | NW (d) |
| $S_2$ | Intel (b) | MS (c) | 192.168.46.189 | NW (d) |
| $S_3$ | Intel (b) | MS (c) | 192.168.46.213 | NW (d) |
a. IP address with Mask: 255.255.255.0 and Gateway 192.168.46.1
b. Intel Pentium Dual Core (3.40 GHz, 3.40 GHz) with 512 MB RAM
c. Microsoft Windows XP Professional ver. 2002
d. Network Speed: 100 Mbps and Network Adaptor: 82566DM-2 Gigabit NIC
Figure 3. Round Trip time taken by various MAs
Figure 4. CPU Time taken by various MAs at site $S_1$
Figure 5. CPU Time taken by various MAs at site $S_2$
Figure 6. CPU Time taken by various MAs at site $S_3$
Figure 7. Lists of global frequent k-itemsets at $S_{CENTRAL}$
Figure 8. Globally strong association rules for globally frequent 3-itemsets
5.CONCLUSION
Mobile agents are well suited to designing distributed applications, and the amalgamation of
DDM and agent technology gives favourable results. Most of the existing agent based
frameworks for the DARM task are only prototype models and lack an appropriate underlying
execution environment, scalability, privacy preserving techniques, global knowledge generation
and implementation using real datasets. In this study, a scalable MAS called Agent enabled
Mining of Globally Strong Association Rules (AeMGSAR) is presented, based on the serial
itinerary of the mobile agents. In this system the overall task of mining the globally strong
association rules is divided into subtasks which are handled by various mobile as well as
stationary agents. An AEE is also designed for the implementation and performance study of the
AeMGSAR system. The serial itinerary used for mobile agent migration increases the overall cost
of the DARM task, so a parallel computing model could be designed in which clones of each
mobile agent are dispatched in parallel to all distributed sites.
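The difference between a serial itinerary and the parallel model suggested above can be illustrated with a small Java sketch. It is only an analogy under stated assumptions: threads stand in for agent clones, the per-site work is passed in as a function, and the names `ParallelDispatchSketch` and `dispatchAll` are hypothetical rather than part of AeMGSAR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelDispatchSketch {
    /** Runs the same task once per site concurrently and returns results in site order. */
    static <R> List<R> dispatchAll(List<String> siteIps, Function<String, R> taskAtSite) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, siteIps.size()));
        try {
            List<Callable<R>> clones = new ArrayList<>();
            for (String ip : siteIps) {
                clones.add(() -> taskAtSite.apply(ip)); // one clone per site
            }
            List<R> results = new ArrayList<>();
            // invokeAll waits for all clones and keeps the order of the task list.
            for (Future<R> f : pool.invokeAll(clones)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("dispatch failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a serial itinerary the total time is roughly the sum of the per-site times plus the migration times, whereas in this arrangement it is bounded by the slowest site, which is the motivation for the parallel model.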
REFERENCES
[1] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth & R. Uthurusamy, (1996) Advances in Knowledge
Discovery and Data Mining, AAAI/MIT Press.
[2] J. Han & M. Kamber, (2006) Data Mining: Concepts and Techniques, 2nd ed. Morgan Kaufmann.
[3] G. S. Bhamra, R. B. Patel & A. K. Verma, (2014) “Intelligent Software Agent Technology: An
Overview”, International Journal of Computer Applications (IJCA), vol. 89, no. 2, pp. 19–31.
[4] R. Agrawal, T. Imielinski & A. Swami, (1993) “Mining association rules between sets of items in large
databases”, in Proceedings of the ACM-SIGMOD International Conference on Management of Data,
pp. 207–216.
[5] R. Agrawal & J. C. Shafer, (1996) “Parallel mining of association rules”, IEEE Transactions on
Knowledge and Data Engineering, vol. 8, no. 6, pp. 962–969.
[6] M. J. Zaki, (1999) “Parallel and distributed association mining: a survey”, IEEE Concurrency, vol. 7,
no. 4, pp. 14–25.
[7] X. Wu & S. Zhang, (2003) “Synthesizing high-frequency rules from different data sources”, IEEE
Transactions on Knowledge and Data Engineering, vol. 15, no. 2, pp. 353–367.
[8] Y.-L. Wang, Z.-Z. Li & H.-P. Zhu, (2003) “Mobile agent based distributed and incremental techniques
for association rules”, in Proceedings of the International Conference on Machine Learning and
Cybernetics(ICMLC 2003), vol. 1, pp. 266–271.
[9] C. Aflori & F. Leon, (2004) “Efficient Distributed Data Mining using Intelligent Agents”, in
Proceedings of the 8th International Symposium on Automatic Control and Computer Science, pp. 1–
6.
[10] U. P. Kulkarni, P. D. Desai, T. Ahmed, J. V. Vadavi & A. R. Yardi, (2007) “Mobile Agent Based
Distributed Data Mining”, in Proceedings of the International Conference on Computational
Intelligence and Multimedia Applications (ICCIMA 2007), IEEE Computer Society, pp. 18–24.
[11] G. Hu & S. Ding, (2009a) “An Agent-Based Framework for Association Rules Mining of Distributed
Data”, in Software Engineering Research, Management and Applications 2009, ser. Studies in
Computational Intelligence, R. Lee and N. Ishii, Eds. Springer Berlin - Heidelberg, vol. 253, pp. 13–
26.
[12] G. Hu & S. Ding, (2009b) “Mining of Association Rules from Distributed Data using Mobile
Agents”, in Proceedings of the International Conference on e-Business (ICE-B 2009), pp. 21–26.
[13] A. O. Ogunde, O. Folorunso, A. S. Sodiya, J. A. Oguntuase & G. O. Ogunleye, (2011) “Improved
cost models for agent based association rule mining in distributed databases”, Anale. Seria
Informatica, vol. 9, no. 1, pp. 231–250, Available:
http://anale-informatica.tibiscus.ro/download/lucrari/9-1-20-Ogunde.pdf
[14] G. S. Bhamra, A. K. Verma, & R. B. Patel, (2015) “Agent Based Frameworks for Distributed
Association Rule Mining: An Analysis”, International Journal in Foundations of Computer Science &
Technology (IJFCST), vol. 5, no. 1, pp. 11-22.
[15] R. Agrawal & R. Srikant, (1994) “Fast Algorithms for Mining Association Rules in Large Databases”,
in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB’94). Morgan
Kaufmann Publishers Inc., pp. 487–499.
[16] G. S. Bhamra, A. K. Verma, & R. B. Patel, (2011) “TDSGenerator: A Tool for generating synthetic
Transactional Datasets for Association Rules Mining”, International Journal of Computer Science
Issues (IJCSI), vol. 8, no. 2, pp. 184-188.
[17] G. S. Bhamra, A. K. Verma, & R. B. Patel, (2014) “An Investigation into the Central Data Warehouse
based Association Rule Mining”, International Journal of Computer Applications (IJCA), vol. 96, no.
10, pp. 1-12.
AUTHORS
Gurpreet Singh Bhamra is currently working as Assistant Professor at
Department of Computer Science and Engineering, M. M. University, Mullana,
Haryana. He received his B.Sc. (Computer Sc.) and MCA from Kurukshetra
University, Kurukshetra in 1995 and 1998, respectively. He is pursuing a Ph.D.
in the Department of Computer Science and Engineering, Thapar University,
Patiala, Punjab. He has been teaching since 1998. He has published 13 research
papers in International/National Journals and International Conferences. He has
received Best Paper Award for “An Agent enriched Distributed Data Mining on
Heterogeneous Networks”, in “Challenges & Opportunities in Information
Technology” (COIT-2008). He is a Life Member of Computer Society of India. His research interests are in
Distributed Computing, Distributed Data Mining, Mobile Agents and Bio-informatics.
Dr. Anil Kumar Verma is currently working as Associate Professor at
Department of Computer Science & Engineering, Thapar University, Patiala. He
received his B.S., M.S. and Ph.D. in 1991, 2001 and 2008 respectively, majoring in
Computer science and engineering. He has worked as Lecturer at M.M.M.
Engineering College, Gorakhpur from 1991 to 1996. He joined Thapar Institute of
Engineering & Technology in 1996 as a Systems Analyst in the Computer Centre
and is presently associated with the same Institute. He has been a visiting faculty to
many institutions. He has published over 100 papers in refereed journals and
conferences (India and Abroad). He is a MISCI (Turkey), LMCSI (Mumbai),
GMAIMA (New Delhi). He is a certified software quality auditor by MoCIT,
Govt. of India. His research interests include wireless networks, routing algorithms and securing ad hoc
networks and data mining.
Dr. Ram Bahadur Patel is currently working as Professor and Head at Department
of Computer Science & Engineering, Chandigarh College of Engineering &
Technology, Chandigarh. He received PhD from IIT Roorkee in Computer Science &
Engineering, PDF from Highest Institute of Education, Science & Technology
(HIEST), Athens, Greece, MS (Software Systems) from BITS Pilani and B. E. in
Computer Engineering from M. M. M. Engineering College, Gorakhpur, UP. Dr.
Patel has been in teaching and research since 1991. He has supervised 36 M. Tech, 7 M.
Phil. and 8 PhD theses. He is currently supervising 6 PhD students. He has published
130 research papers in International/National Journals and Refereed International
Conferences. He has written 7 text books for engineering courses. He is a member of
ISTE (New Delhi) and IEEE (USA). He is a member of various International Technical Committees
and participates frequently in International Technical Committees in India and abroad. His current
research interests are in Mobile & Distributed Computing, Mobile Agent Security and Fault
Tolerance, and Sensor Networks.
APPENDIX A – SYNTHETIC DATASETS
A.1 BDS3500T10I.txt and corresponding TDS3500T10I.txt ($DB_1$) at site $S_1$
These synthetic binary and transactional datasets of 3500 records are created by the TDSG tool at
site $S_1$. In the binary version each column head represents the item number and each row
represents a transaction, where integer ‘1’ is used for a purchased item and ‘0’ is used if it is not
purchased. The corresponding transactional version has a Transaction ID (TID) for each
transaction, and Itemset is the set of all the purchased items for that particular transaction.
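The mapping from a binary row to its transactional itemset can be sketched in Java as follows. The exact file layout produced by the TDSG tool is not reproduced here, so the sketch assumes whitespace-separated 0/1 cells with items numbered from 1; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class BinaryToTransactional {
    /** Converts one binary row such as "1 0 1" into the item numbers of the purchased items. */
    static List<Integer> toItemset(String binaryRow) {
        // Assumed layout: one cell per item, separated by whitespace.
        String[] cells = binaryRow.trim().split("\\s+");
        List<Integer> items = new ArrayList<>();
        for (int col = 0; col < cells.length; col++) {
            if (cells[col].equals("1")) {
                items.add(col + 1); // columns are numbered from 1
            }
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(toItemset("1 0 1 1 0")); // prints [1, 3, 4]
    }
}
```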
A.2 BDS3850T10I.txt and corresponding TDS3850T10I.txt ($DB_2$) at site $S_2$
These synthetic binary and transactional datasets of 3850 records are created by the TDSG tool at
site $S_2$.
A.3 BDS3900T10I.txt and corresponding TDS3900T10I.txt ($DB_3$) at site $S_3$
These synthetic binary and transactional datasets of 3900 records are created by the TDSG tool at
site $S_3$.
APPENDIX B – RESULTANT KNOWLEDGE OF AeMGSAR SYSTEM
B.1 $L^{FI}_{k(1)}$ and $L^{FISC}_{k(1)}$ at site $S_1$
The list of frequent k-itemsets, i.e., $L^{FI}_{k(1)}$, is represented by column L, and column SC shows
the support count of the corresponding frequent k-itemset, i.e., $L^{FISC}_{k(1)}$, at site $S_1$. These
frequent itemsets and their support counts are obtained by processing the synthetic dataset
($DB_1$) shown in Appendix A.1.
B.2 $L^{FI}_{k(2)}$ and $L^{FISC}_{k(2)}$ at site $S_2$
These frequent itemsets and their support counts are obtained by processing the synthetic dataset
($DB_2$) shown in Appendix A.2.
B.3 $L^{FI}_{k(3)}$ and $L^{FISC}_{k(3)}$ at site $S_3$
These frequent itemsets and their support counts are obtained by processing the synthetic dataset
($DB_3$) shown in Appendix A.3.
B.4 $L^{LSAR}_1$ at site $S_1$
Column L represents a frequent k-itemset, and column AR(support, confidence) shows the list of
locally strong association rules, i.e., $L^{LSAR}_1$, at site $S_1$. Each strong rule has its associated
support and confidence factor. The minimum threshold support is taken as 20% and the minimum
threshold confidence as 50% for generating the strong rules from the data shown in
Appendix B.1.
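How strong rules follow from a frequent itemset and its support counts can be sketched in Java. The sketch applies the standard confidence test, sup_count(l) / sup_count(s) ≥ min_th_conf, to every non-empty proper subset s of a frequent itemset l. It assumes that l already meets the support threshold and that the support count of every subset is available; the class name and the counts used in `main` are hypothetical and are not taken from the appendix tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class StrongRulesSketch {
    /** Returns rules "s => l - s" whose confidence sup_count(l) / sup_count(s) meets minConf. */
    static List<String> rules(List<String> l, Map<Set<String>, Integer> supCount, double minConf) {
        List<String> out = new ArrayList<>();
        int n = l.size();
        int supL = supCount.get(new TreeSet<>(l)); // assumes the count of l is present
        // Each bit mask from 1 to 2^n - 2 selects one non-empty proper subset s of l.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            Set<String> s = new TreeSet<>();
            Set<String> rest = new TreeSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    s.add(l.get(i));
                } else {
                    rest.add(l.get(i));
                }
            }
            double conf = (double) supL / supCount.get(s); // assumes the count of s is present
            if (conf >= minConf) {
                out.add(s + " => " + rest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Set<String>, Integer> sc = new HashMap<>();
        sc.put(new TreeSet<>(Arrays.asList("A")), 1000);      // hypothetical counts
        sc.put(new TreeSet<>(Arrays.asList("B")), 2000);
        sc.put(new TreeSet<>(Arrays.asList("A", "B")), 900);
        System.out.println(rules(Arrays.asList("A", "B"), sc, 0.5)); // prints [[A] => [B]]
    }
}
```

Here A ⇒ B has confidence 900/1000 = 0.9 and is kept, while B ⇒ A has confidence 900/2000 = 0.45 and is rejected at the 50% threshold.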
B.5 $L^{LSAR}_2$ at site $S_2$
Column L represents a frequent k-itemset, and column AR(support, confidence) shows the list of
locally strong association rules, i.e., $L^{LSAR}_2$, at site $S_2$. Each strong rule has its associated
support and confidence factor. The minimum threshold support is taken as 20% and the minimum
threshold confidence as 50% for generating the strong rules from the data shown in
Appendix B.2.
B.6 $L^{LSAR}_3$ at site $S_3$
Column L represents a frequent k-itemset, and column AR(support, confidence) shows the list of
locally strong association rules, i.e., $L^{LSAR}_3$, at site $S_3$. Each strong rule has its associated
support and confidence factor. The minimum threshold support is taken as 20% and the minimum
threshold confidence as 50% for generating the strong rules from the data shown in
Appendix B.3.
B.7 $L^{GSAR}_{CENTRAL}$ at site $S_{CENTRAL}$
Column L represents a globally frequent k-itemset, i.e., an itemset which is locally frequent at all
the distributed sites, and column AR(support, confidence) shows the list of globally strong
association rules, i.e., $L^{GSAR}_{CENTRAL}$, for such itemsets. Each globally strong rule has its
associated support and confidence factor. The minimum threshold support is taken as 20% and the
minimum threshold confidence as 50%. Site represents the IP address of the site where the rule is
locally strong. IP address 192.168.46.212 is used for site $S_1$, 192.168.46.189 for site $S_2$ and
192.168.46.213 for site $S_3$.

A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATION RULES

  • 1. International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015 DOI:10.5121/ijcsa.2015.5307 77 A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATION RULES G.S.Bhamra1 , A. K.Verma2 and R.B.Patel3 1 M. M. University, Mullana, Haryana, 133207 - India 2 Thapar University, Patiala, Punjab, 147004- India 3 Chandigarh College of Engineering & Technology, Chandigarh- 160019- India ABSTRACT The intelligent agent based model is a popular approach in constructing Distributed Data Mining (DDM) systems to address scalable mining over large scale and ever increasing distributed data. In an agent based distributed system, variety of agents coordinate and communicate with each other to perform the various tasks of the Data Mining (DM) process. In this study a serial computing mode of a multi-agent system (MAS) called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR) is presented based on the serial itinerary of the mobile agents. A Running environment is also designed for the implementation and performance study of AeMGSAR system. KEYWORDS Knowledge Discovery, Association Rules, Intelligent Agents, Multi-Agent System 1.INTRODUCTION Data Mining (DM) technique is used to extract some interesting and valid data patterns implicitly stored in large databases [1], [2]. Intelligent software agent technology is an interdisciplinary technology dealing with the development and efficient utilization of autonomous software objects called agents which have access to geographically distributed and heterogeneous resources. They are autonomous, adaptive, reactive, pro-active, social, cooperative, collaborative and flexible. They also support temporal continuity and mobility within the network. An intelligent agent with mobility feature is known as Mobile Agent (MA). MA migrates from node to node in a heterogeneous network without losing its operability. 
On reaching at a network node MA is delivered to an Agent Execution Environment (AEE) where its executable parts are started running. Upon completion of the desired task, it delivers the results to the home node. A Mobile Agent Platform (MAP) or Agent Execution Environment (AEE), is a server application that provides the appropriate functionality to MAs to authenticate, execute, communicate, migrate to other platform, and use system resources in a secure way. A Multi Agent System (MAS) is distributed application comprised of multiple interacting intelligent agent components [3]. Let { }, 1jDB T j D= = K be a transactional dataset of size D where each transaction T is assigned an identifier (TID ) and { },i 1i I d m= = K , total m data items in DB . A set of items in a particular transaction T is called itemset or pattern. An itemset, { },i 1i P d k= = K , which is a set of k data
  • 2. International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015 78 items in a particular transaction T and P I⊆ , is called k-itemset. Support of an itemset, ( ) No_of_T_containing_P %s P D = is the frequency of occurrence of itemset P in DB , where No_of_T_containing_P is the support count (sup_count) of itemset P . Frequent Itemsets (FIs) are the itemset that appear in DB frequently, i.e., if ( ) min_th_sups P ≥ (given minimum threshold support), then P is a frequent k-itemset. Finding such FIs plays an essential role in miming the interesting relationships among itemsets. Frequent Itemset Mining (FIM) is the task of finding the set of all the subsets of FIs in a transactional database [2]. Association Rules (ARs) are used to discover the associations among item in a database [4]. It is an implication of the form [ ]support,confidenceP Q⇒ where, ,P I Q I⊂ ⊂ and P Q∩ = ∅ . An AR is measured in terms of its support and confidence factor where support of the rule ( ( )s P Q⇒ ) is the probability of both P and Q appearing in T , i.e., ( )p P Q∪ and the confidence of the rule ( ( )c P Q⇒ ) is the conditional probability of Q given P , i.e., ( )|p Q P . An AR is said to be strong if ( ) min_th_sups P Q⇒ ≥ (given minimum threshold support) and ( ) min_th_confc P Q⇒ ≥ (given minimum threshold confidence). Association Rule Mining (ARM) today is one of the most important aspects of DM tasks. In ARM all the strong ARs are generated from the FIs. The ARM can be viewed as two step process [5], [6]. 1. Find all the frequent k-itemsets ( k L ) 2. Generate Strong ARs from k L a. For each frequent itemset, k l L∈ , generate all non empty subsets of l . b. For every non empty subset s of l , output the rule “ ( )s l s⇒ − ”, if ( ) ( ) sup_count min_th_conf sup_count l s ≥ Distributed Association Rule Mining (DARM) is the task of generating the globally strong association rules from the global FIs in a distributed environment. 
Few preliminaries notations and definitions required for defining DARM and to make this study self contained are as follows: • { },i 1iS S n= = K , n distributed sites. • CENTRAL S , Central Site. • { }, 1i j i DB T j D= = K , Horizontally partitioned data set of size i D at the local site i S , where each transaction j T is assigned an identifier (TID). • 1 n ii DB DB= = U , the aggregated dataset of size 1 n ii D D= = ∑ , i j DB DB∩ = ∅ • { },i 1i I d m= = K , total m data items in each i DB . • ( ) FI k i L , Local frequent k-itemsets at site i S . • ( ) FISC k i L , List of support count ( ) FI k i Itemset L∀ ∈ . • LSAR i L , List of locally strong association rules at site i S . • 1 nTLSAR LSAR ii L L= = U , List of total locally strong association rules. • ( )1 nTFI FI k k ii L L= = U , List of total frequent k-itemsets.
  • 3. International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015 79 • ( )1 nGFI FI k k ii L L= = I , List of global frequent k-itemsets. • GSAR CENTRAL L , List of Globally strong association rule. Local Knowledge Base (LKB), at site iS , comprises of ( ) FI k i L , ( ) FISC k i L and LSAR i L which can provide reference to the local supervisor for local decisions. Global Knowledge Base (GKB), at CENTRAL S , comprises of TLSAR L , TFI k L , GFI k L and GSAR CENTRAL L for the global decision making [7]. Like ARM, DARM task can also be viewed as two-step process [6]: 1. Find the global frequent k-itemset ( GFI k L ) from the distributed Local frequent k-itemsets ( ( ) FI k i L ) from the partitioned datasets. 2. Generate globally strong association rules ( GSAR CENTRAL L ) from GFI kL . The existing agent based systems specifically dealing with DARM task are: Knowledge Discovery Management System (KDMS) [8], Efficient Distributed Data Mining using Intelligent Agents [9], Mobile Agent based Distributed Data Mining [10], An Agent based Framework for Association Rule Mining of Distributed Data (AFARMDD) [11], [12], Multi-Agent Distributed Association Rule Miner (MADARM) [13]. All these systems are academic research projects. Qualitative comparison of these DARM frameworks is provided in [14]. Most of the existing agent based frameworks for DARM task are only prototype model and lacks the appropriate underlying AEE, scalability, privacy preserving techniques, global knowledge generation and implementation using a real datasets. The rest of the paper is organised as follows. Section 2 described the running environment for the proposed system along with various algorithms involved. Serial computing model of AeMGSAR is presented in Section 3. Algorithms for all the agents involved in this system are also discussed. Section 4 describes the implementation and performance study of the system and finally the article is concluded in Section 5. 
2.ENVIRONMENT FOR THE PROPOSED SYSTEM Every MAS needs an underlying AEE to provide a running infrastructure on which agents can be deployed and tested. A running environment has been designed in Java. Various attributes of the MA are encapsulated within a data structure known as AgentProfile . It contains the name of MA ( AgentName ), version number ( AgentVersion ), entire byte code ( BC ), list of nodes to be visited by MA, i.e., itinerary plan ( NODESL ) , type of the itinerary ( ItinType ) which can be serial or parallel, a reference of current execution state ( AObject ) and an additional data structure known as Briefcase that acts as a result bag of MA to store final resultant knowledge ( iResult_S ) at a particular site. Computational time (CPUTime ) taken by a MA at a particular site is also stored in iResult_S . In addition to results, Briefcase also contains the system time for start of agent journey ( startTripTime ), system time for end of journey ( endTripTime ) and total round trip time of MA (TripTime ) calculated using end startTripTime TripTime TripTime← − . Stationary as well as mobile agents involved in the models would be discussed later on. This environment consists of the following three components:
• Data Mining Agent Execution Environment (DM_AEE): It is the key component and acts as a server. DM_AEE is deployed at each distributed site $S_i$ and is responsible for receiving, executing and migrating all visiting DM agents. It receives the incoming AgentProfile at site $S_i$, retrieves the entire BC of the agent and saves it as AgentName.class in the local file system of site $S_i$; execution of the agent is then started using AObject. Steps are shown in Algorithm 1.
• Agent Launcher (AL): It acts as a client at the agent launching station ($S_{CENTRAL}$) and, on behalf of the user and through a user interface, launches goal oriented DM agents to the DM_AEE running at the distributed sites. The Agent Pool (or Zone) at $S_{CENTRAL}$ is a repository of all mobile as well as stationary agents (SAs). AL first reads AgentName and stores it in AgentProfile. The entire BC of AgentName is loaded from the Agent Pool and stored in AgentProfile. $L_{NODES}$ and ItinType are read and stored in AgentProfile. $TripTime_{start}$ is maintained in Briefcase, which is in turn added to AgentProfile. In the serial computing model, i.e., if ItinType = Serial, AL dispatches a single MA along with $L_{NODES}$, and the MA travels from node to node. AgentVersion is set to 1 for this agent. AL also contacts the Result Manager (RM) for processing the Briefcase of an agent. Detailed steps are given in Algorithm 2.
• Result Manager (RM): It manages and processes the Briefcase of all MAs. RM is contacted either by an MA submitting its results or by AL for processing the results of a specific MA. On completion of its itinerary, each DM agent submits its results to RM, which computes the total round trip time (TripTime) of that MA and saves it in the Briefcase of that agent. If ItinType = Serial, RM saves the updated AgentProfile of the agent at $S_{CENTRAL}$.
When RM is contacted by AL for processing the results of a specific agent, it sends back the AgentProfile of that agent. Steps are defined in Algorithm 3.

Algorithm 1 DATA MINING AGENT EXECUTION ENVIRONMENT (DM_AEE)
procedure DM_AEE( )
    while TRUE do
        AgentProfile ← listen and receive AgentProfile at $S_i$
        AgentName ← get AgentName from AgentProfile
        BC ← retrieve the BC of agent from AgentProfile
        save the BC as AgentName.class in the local file system of $S_i$
        AObject ← get AObject from AgentProfile    ▷ current state
        AObject.run()    ▷ start executing mobile agent
    end while
end procedure

Algorithm 2 AGENT LAUNCHER (AL)
procedure AL( )
    option ← read option (dispatch / result)
    switch option do
        case dispatch    ▷ dispatch the mobile agent to DM_AEE
            AgentName ← read mobile agent's name
            add AgentName to AgentProfile
            BC ← load entire byte code of AgentName from AgentPool
            add BC to AgentProfile
            $L_{NODES}$ ← read itinerary (IP addresses) of mobile agent
            ItinType ← read ItinType (Serial / Parallel)
            add ItinType to AgentProfile
            if ItinType = "Serial" then    ▷ serial itinerary
                AgentVersion ← 1
                add AgentVersion to AgentProfile
                add $L_{NODES}$ to AgentProfile
                switch AgentName do
                    case LFIGA
                        min_th_sup ← read minimum threshold support
                        AObject ← new LFIGA(AgentProfile, min_th_sup)
                    end case
                    case LKGA
                        min_th_conf ← read minimum threshold confidence
                        AObject ← new LKGA(AgentProfile, min_th_conf)
                    end case
                    case TFICA
                        AObject ← new TFICA(AgentProfile)
                    end case
                    case LKCA
                        AObject ← new LKCA(AgentProfile)
                    end case
                    case GKDA
                        $L^{GSAR}_{CENTRAL}$ ← load $L^{GSAR}_{CENTRAL}$ generated by GKGA at $S_{CENTRAL}$
                        add $L^{GSAR}_{CENTRAL}$ to Briefcase
                        add updated Briefcase to AgentProfile
                        AObject ← new GKDA(AgentProfile)
                    end case
                end switch
                add AObject to AgentProfile    ▷ current state
                transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
            end if
        end case
        case result    ▷ process the result of mobile agent
            AgentName ← read mobile agent's name
            ItinType ← read mobile agent's ItinType
            add AgentName to $L_{AgentInfo}$
            add ItinType to $L_{AgentInfo}$
            ▷ result processing for serial itinerary agents
            if ItinType = "Serial" then
                AgentProfile ← contact RM for $L_{AgentInfo}$
                Briefcase ← retrieve Briefcase from AgentProfile
                switch AgentName do
                    case LFIGA
                        process the Briefcase of LFIGA
                    end case
                    case LKGA
                        process the Briefcase of LKGA
                    end case
                    case TFICA
                        call GFIGA(Briefcase)    ▷ stationary agent
                    end case
                    case LKCA
                        call GKGA(Briefcase)    ▷ stationary agent
                    end case
                    case GKDA
                        process the Briefcase of GKDA
                    end case
                end switch
            end if
        end case
    end switch
end procedure

Algorithm 3 RESULT MANAGER (RM)
procedure RM( )
    while TRUE do
        listen and receive the incoming request
        if contacted by a mobile agent for submitting results from site $S_i$ then
            AgentProfile ← receive the incoming AgentProfile from site $S_i$
            ItinType ← retrieve ItinType from AgentProfile
            Briefcase ← retrieve mobile agent's Briefcase from AgentProfile
            $TripTime_{start}$ ← retrieve $TripTime_{start}$ from Briefcase
            $TripTime_{end}$ ← retrieve $TripTime_{end}$ from Briefcase
            TripTime ← $TripTime_{end}$ − $TripTime_{start}$
            add TripTime to Briefcase
            add updated Briefcase to AgentProfile
            if ItinType = "Serial" then
                save AgentProfile at $S_{CENTRAL}$
            end if
        end if
        if contacted by AL for processing the results then
            AgentName ← retrieve AgentName from incoming $L_{AgentInfo}$
            ItinType ← retrieve ItinType from incoming $L_{AgentInfo}$
            if ItinType = "Serial" then
                AgentProfile ← load AgentProfile for AgentName from $S_{CENTRAL}$
                dispatch AgentProfile to AL
            end if
        end if
    end while
end procedure
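The AgentProfile and Briefcase structures, and the trip-time step performed by RM, can be pictured with a minimal Python sketch. This is only an illustration: the system described here is written in Java, the snake_case field names and the helper rm_compute_trip_time are hypothetical, the AObject reference is omitted, and nanosecond timestamps are an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Briefcase:
    """Result bag of a mobile agent (hypothetical field names)."""
    start_trip_time: Optional[int] = None  # TripTime_start, assumed nanoseconds
    end_trip_time: Optional[int] = None    # TripTime_end, assumed nanoseconds
    trip_time: Optional[int] = None        # TripTime, filled in by the RM
    # One Result_S_i entry per visited site, keyed by site address.
    results: Dict[str, Dict[str, Any]] = field(default_factory=dict)


@dataclass
class AgentProfile:
    """Attributes that travel with a mobile agent (AObject is left out)."""
    agent_name: str
    agent_version: int
    byte_code: bytes
    nodes: List[str]   # L_NODES: remaining itinerary as IP addresses
    itin_type: str     # "Serial" or "Parallel"
    briefcase: Briefcase = field(default_factory=Briefcase)


def rm_compute_trip_time(profile: AgentProfile) -> AgentProfile:
    """Mirror the RM step TripTime <- TripTime_end - TripTime_start."""
    bag = profile.briefcase
    if bag.start_trip_time is None or bag.end_trip_time is None:
        raise ValueError("agent has not completed its itinerary")
    bag.trip_time = bag.end_trip_time - bag.start_trip_time
    return profile
```

Keeping the timing fields inside the Briefcase, rather than on the profile itself, matches the description above, where the result bag is the only part that RM updates.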
The overall working of the AeMGSAR system may be divided into the following six stages:
1. Request Stage: A request for DARM is initiated at $S_{CENTRAL}$ by AL on behalf of the user with the necessary credentials.
2. Preparation Stage: Through the user interface, AL reads the agent name and version number; obtains the itinerary for the MA's journey as the IP addresses of the distributed nodes to be visited; obtains any additional data required by the specific MA; and loads the agent code for that MA from AgentPool. For a serial itinerary, a single MA is dispatched by AL to visit the n distributed sites one after another.
3. Local Mining Stage: The ARM process is performed locally by specific DM agents at each distributed site, and the results are kept as the local knowledge base at that site.
4. Result Collection Stage: Collector agents visit each site, collect the results generated by the DM agents and submit them to RM at $S_{CENTRAL}$.
5. Knowledge Integration and Global Knowledge Generation Stage: Knowledge or result integration is carried out by RM with the help of a stationary agent, and global knowledge in the form of globally strong association rules is generated with the help of another stationary agent at $S_{CENTRAL}$.
6. Global Knowledge Dispatching Stage: Global knowledge is dispatched to the distributed sites by a dispatching agent so that it can be compared with the local knowledge at each site.

Figure 1. AeMGSAR Serial Computing Model
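The serial itinerary used in the Preparation and Local Mining stages can be simulated in-process with a short Python sketch. It does not perform real agent migration: the function run_serial_itinerary and the task callback are hypothetical stand-ins, and the sketch only shows the order of visits and the per-site timing bookkeeping.

```python
import time
from typing import Callable, Dict, List


def run_serial_itinerary(nodes: List[str],
                         task: Callable[[str], object]) -> Dict[str, object]:
    """Visit every node in order, as one agent with a serial itinerary.

    `task` stands in for the agent's work at a single site. The returned
    dict plays the role of the Briefcase.
    """
    remaining = list(nodes)  # copy of L_NODES; the caller's list is untouched
    per_site: Dict[str, dict] = {}
    start_trip = time.perf_counter_ns()
    while remaining:
        site = remaining.pop(0)  # "remove first IP address from L_NODES"
        start_cpu = time.perf_counter_ns()
        result = task(site)
        cpu_time = time.perf_counter_ns() - start_cpu
        per_site[site] = {"result": result, "cpu_time_ns": cpu_time}
    trip_time = time.perf_counter_ns() - start_trip
    return {"sites": per_site, "trip_time_ns": trip_time}
```

Because the sites are visited strictly one after another, the round trip time in this sketch is at least the sum of the per-site times.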
3.SERIAL COMPUTING MODEL OF AEMGSAR

The serial computing model of the AeMGSAR system is shown in Figure 1. It consists of seven agents in total: five are MAs dispatched from $S_{CENTRAL}$ with serial-itinerary multi-hop migration, and the other two are intelligent SAs running at $S_{CENTRAL}$ to perform different tasks. The CPU time taken by an MA while processing at each site, along with some other specific information, is carried back in the result bag to $S_{CENTRAL}$. Agents 1-5 in the list below visit the n sites serially; other parameters are collected from different resources. The relationship among these agents and the working behaviour of each agent are as follows:

1. Local Frequent Itemset Generator Agent (LFIGA): This is an MA that carries the AgentProfile and min_th_sup. LFIGA generates and stores $L^{FI}_{k(i)}$ and $L^{FISC}_{k(i)}$ at site $S_i$ by scanning the local $DB_i$ at that site under the constraint of min_th_sup. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. This agent is embedded with the Apriori algorithm [15] for generating all the frequent k-itemset lists. It may be equipped with decision making capability to select other FIM algorithms based on the density of the dataset at a particular site. More details are available in Algorithm 4.
2. Local Knowledge Generator Agent (LKGA): This is an MA that carries the AgentProfile and min_th_conf. LKGA applies the constraint of min_th_conf to generate and store $L^{LSAR}_{i}$ by using the $L^{FI}_{k(i)}$ and $L^{FISC}_{k(i)}$ lists already generated by LFIGA at site $S_i$. The $L^{LSAR}_{i}$ list also stores the support and confidence of each association rule along with the site name. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. Detailed steps are given in Algorithm 7.
3. Total Frequent Itemset Collector Agent (TFICA): This is an MA that carries the AgentProfile. TFICA collects the lists of local frequent k-itemsets ($L^{FI}_{k(i)}$) generated by LFIGA and carries back the list of total frequent k-itemsets $L^{TFI}_{k}$ in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. It executes Algorithm 8.
4. Local Knowledge Collector Agent (LKCA): This is an MA that carries the AgentProfile. LKCA collects the lists of locally strong association rules ($L^{LSAR}_{i}$) generated by LKGA and carries back the list of total locally strong association rules ($L^{TLSAR}$) in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. Steps are shown in Algorithm 9.
5. Global Knowledge Dispatcher Agent (GKDA): This is an MA that carries the AgentProfile containing the global knowledge ($L^{GSAR}_{CENTRAL}$). It dispatches the global knowledge to every site for further decision making and for comparison with the local knowledge at that site. It executes Algorithm 12.
6. Global Frequent Itemset Generator Agent (GFIGA): It is a stationary agent at $S_{CENTRAL}$, mainly used for processing the result bag of TFICA, i.e., the total frequent k-itemset list ($L^{TFI}_{k}$) collected by TFICA, to generate the global frequent itemset list $L^{GFI}_{k}$. More details are available in Algorithm 10.
7. Global Knowledge Generator Agent (GKGA): It is also a stationary agent at $S_{CENTRAL}$, mainly used for processing the $L^{GFI}_{k}$ list and the $L^{TLSAR}$ list to compile the global knowledge,
i.e., the list of globally strong association rules, $L^{GSAR}_{CENTRAL}$. Detailed steps are shown in Algorithm 11.

Algorithm 4 LOCAL FREQUENT ITEMSET GENERATOR AGENT (LFIGA)
Input:
• AgentProfile, a collection of agent attributes set by the AL
• min_th_sup, the given minimum threshold support
Output: $L_{FI\&SC}$, the list of frequent itemsets and their support counts
procedure LFIGA(AgentProfile, min_th_sup)
    $CPUTime_{start}$ ← get system time
    Briefcase ← get Briefcase from AgentProfile
    $DB_i$ ← load $DB_i$ from local file system of site $S_i$
    T ← $DB_i$.get(0)    ▷ no. of records
    I ← $DB_i$.get(1)    ▷ no. of items
    DB[T][I] ← $DB_i$.get(3)    ▷ itemset data bank
    minsupcount ← (T × min_th_sup) / 100
    ▷ generate frequent 1-itemset list ($FIL_1$) and support count list ($FISC_1$)
    $CFIL_1$ ← {1, 2, 3, ..., I}    ▷ candidate frequent 1-itemsets
    for i ← 1, I do    ▷ initialize the support count array $SCFIL_1$ to zero
        $SCFIL_1$[i] ← 0
    end for
    k ← 1
    for all candidate c ∈ $CFIL_1$ do    ▷ find support count for every candidate
        for all transaction t ∈ DB do
            if c ⊆ t then
                $SCFIL_1$[k] ← $SCFIL_1$[k] + 1
            end if
        end for
        k ← k + 1
    end for
    ▷ prune $CFIL_1$ to generate $FIL_1$ and $FISC_1$
    for k ← 1, I do
        if $SCFIL_1$[k] ≥ minsupcount then
            add $c_k$ ∈ $CFIL_1$ to $FIL_1$
            add $SCFIL_1$[k] to $FISC_1$
        end if
    end for
    if $FIL_1$ ≠ ∅ then
        add $FIL_1$ to $L_{FI}$
        add $FISC_1$ to $L_{FISC}$
    end if
    k ← 2
    while $FIL_{k-1}$ ≠ ∅ do
        $CFIL_k$ ← call GENERATECFIL($FIL_{k-1}$)    ▷ see Algorithm 5
        for i ← 1, $CFIL_k$.length do    ▷ initialize the array $SCFIL_k$ to zero
            $SCFIL_k$[i] ← 0
        end for
        i ← 1
        for all candidate c ∈ $CFIL_k$ do    ▷ find support count for every candidate
            for all transaction t ∈ DB do    ▷ scan DB
                if c ⊆ t then
                    $SCFIL_k$[i] ← $SCFIL_k$[i] + 1
                end if
            end for
            i ← i + 1
        end for
        ▷ prune $CFIL_k$ to generate $FIL_k$ and $FISC_k$
        for i ← 1, $SCFIL_k$.length do
            if $SCFIL_k$[i] ≥ minsupcount then
                add $c_i$ ∈ $CFIL_k$ to $FIL_k$
                add $SCFIL_k$[i] to $FISC_k$
            end if
        end for
        if $FIL_k$ ≠ ∅ then
            add $FIL_k$ to $L_{FI}$
            add $FISC_k$ to $L_{FISC}$
        end if
        k ← k + 1
    end while
    add T to $L_{FI\&SC}$
    add $L_{FI}$ to $L_{FI\&SC}$
    add $L_{FISC}$ to $L_{FI\&SC}$
    save $L_{FI\&SC}$ in the local file system of this site $S_i$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    add CPUTime to $Result\_S_i$
    add $Result\_S_i$ to Briefcase
    add updated Briefcase to AgentProfile
    $L_{NODES}$ ← get itinerary list from AgentProfile
    $L_{NODES}$ ← remove first IP address from $L_{NODES}$    ▷ visited site
    add updated $L_{NODES}$ to AgentProfile
    if $L_{NODES}$ ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LFIGA(AgentProfile, min_th_sup)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
    else
        $TripTime_{end}$ ← get system time for end of agent journey
        add $TripTime_{end}$ to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at $S_{CENTRAL}$
    end if
end procedure
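The level-wise mining that LFIGA performs at one site (Algorithm 4, with candidate generation and pruning as in Algorithms 5 and 6) can be condensed into a minimal Python sketch. It is an illustration, not the system's Java code: the function names are hypothetical, transactions are assumed to be in-memory sets of item numbers, and candidates are joined by set union rather than by the prefix-based join of Algorithm 5, which yields the same candidates after pruning.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def generate_candidates(prev_frequent: Set[FrozenSet[int]],
                        k: int) -> Set[FrozenSet[int]]:
    """Join (k-1)-itemsets into k-itemsets and prune by the Apriori property."""
    candidates = set()
    for a in prev_frequent:
        for b in prev_frequent:
            union = a | b
            # Keep a candidate only if every (k-1)-subset is itself frequent.
            if len(union) == k and all(
                frozenset(sub) in prev_frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def apriori(transactions: List[Set[int]],
            min_th_sup: float) -> Dict[FrozenSet[int], int]:
    """Return every frequent itemset with its support count.

    min_th_sup is a percentage (e.g. 20 for 20%), as in Algorithm 4.
    """
    min_sup_count = len(transactions) * min_th_sup / 100
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent: Dict[FrozenSet[int], int] = {}
    k = 1
    while current:
        # One scan of the data per level to count candidate supports.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_sup_count}
        frequent.update(survivors)
        k += 1
        current = generate_candidates(set(survivors), k)
    return frequent
```

The returned mapping combines the roles of the frequent itemset list and the support count list that LFIGA stores at each site.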
Algorithm 5 GENERATECFIL
Input: $L_{k-1}$, frequent (k−1)-itemsets
Output: $C_k$, candidate frequent k-itemsets
procedure GENERATECFIL($L_{k-1}$)
    for all itemset $l_1$ ∈ $L_{k-1}$ do
        for all itemset $l_2$ ∈ $L_{k-1}$ do
            if ($l_1$[1] = $l_2$[1]) ∧ ($l_1$[2] = $l_2$[2]) ∧ ... ∧ ($l_1$[k−2] = $l_2$[k−2]) ∧ ($l_1$[k−1] < $l_2$[k−1]) then
                c ← $l_1$ ⊗ $l_2$    ▷ join step: generate candidates
                if HASINFREQUENTSUBSET(c, $L_{k-1}$) then    ▷ prune step; see Algorithm 6
                    delete c
                else
                    add c to $C_k$
                end if
            end if
        end for
    end for
    return $C_k$
end procedure

Algorithm 6 HASINFREQUENTSUBSET
Input:
• c, candidate k-itemset
• $L_{k-1}$, frequent (k−1)-itemsets
Output: TRUE if some (k−1)-subset of c is not frequent, FALSE otherwise
procedure HASINFREQUENTSUBSET(c, $L_{k-1}$)
    for all (k−1)-subset s of c do
        if s ∉ $L_{k-1}$ then
            return TRUE
        end if
    end for
    return FALSE
end procedure

Algorithm 7 LOCAL KNOWLEDGE GENERATOR AGENT (LKGA)
Input:
• AgentProfile, a collection of agent attributes set by the AL
• min_th_conf, the given minimum threshold confidence
Output: $L_{LSAR}$, the list of locally strong association rules
procedure LKGA(AgentProfile, min_th_conf)
    $CPUTime_{start}$ ← get system time
    Briefcase ← get Briefcase from AgentProfile
    $L_{FI\&SC}$ ← load $L_{FI\&SC}$ from local file system of this site $S_i$
    T ← $L_{FI\&SC}$.get(0)    ▷ no. of records
    $L_{FI}$ ← $L_{FI\&SC}$.get(1)    ▷ frequent k-itemset list
    $L_{FISC}$ ← $L_{FI\&SC}$.get(2)    ▷ support count list
    for k ← 2, $L_{FI}$.size do
        $L_k$ ← $L_{FI}$.get(k)    ▷ get frequent k-itemset list
        for all l ∈ $L_k$ do
            $l_{subsets}$ ← generate all non-empty proper subsets of l
            $l_{spcount}$ ← get support count of l from $L_{FISC}$
            $AR_{support}$ ← ($l_{spcount}$ / T) × 100    ▷ support of the association rule
            for all subset s ∈ $l_{subsets}$ do
                $s_{spcount}$ ← get support count of s from $L_{FISC}$
                $AR_{conf}$ ← ($l_{spcount}$ / $s_{spcount}$) × 100    ▷ confidence of the association rule
                if $AR_{conf}$ ≥ min_th_conf then
                    $AR_{strong}$ ← "s ⇒ l − s [$AR_{support}$%, $AR_{conf}$%]"
                    print $AR_{strong}$
                    add l to $AR_{strong}$
                    $S^{IP}_{i}$ ← get IP address of this site $S_i$
                    add $S^{IP}_{i}$ to $AR_{strong}$
                    add $AR_{strong}$ to $L_{LSAR}$
                end if
            end for
        end for
    end for
    save $L_{LSAR}$ in the local file system of this site $S_i$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    add CPUTime to $Result\_S_i$
    add $Result\_S_i$ to Briefcase
    add updated Briefcase to AgentProfile
    $L_{NODES}$ ← get itinerary list from AgentProfile
    $L_{NODES}$ ← remove first IP address from $L_{NODES}$    ▷ visited site
    add updated $L_{NODES}$ to AgentProfile
    if $L_{NODES}$ ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LKGA(AgentProfile, min_th_conf)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
    else
        $TripTime_{end}$ ← get system time for end of agent journey
        add $TripTime_{end}$ to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at $S_{CENTRAL}$
    end if
end procedure
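The rule generation step of LKGA (Algorithm 7) can be sketched in Python as follows. The function strong_rules and the Rule tuple layout are hypothetical, and the sketch assumes that the input mapping holds the support count of every subset of every frequent itemset, which the Apriori property guarantees for the output of a complete frequent itemset miner.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# (antecedent, consequent, support %, confidence %)
Rule = Tuple[FrozenSet[int], FrozenSet[int], float, float]


def strong_rules(frequent: Dict[FrozenSet[int], int],
                 n_transactions: int,
                 min_th_conf: float) -> List[Rule]:
    """Derive rules s => (l - s) whose confidence reaches min_th_conf (%)."""
    rules: List[Rule] = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        support = count / n_transactions * 100
        # Enumerate every non-empty proper subset as a possible antecedent.
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                s = frozenset(antecedent)
                confidence = count / frequent[s] * 100
                if confidence >= min_th_conf:
                    rules.append((s, itemset - s, support, confidence))
    return rules
```

In the system described here each strong rule is additionally tagged with its itemset and the site address; those fields are left out of this sketch for brevity.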
Algorithm 8 TOTAL FREQUENT ITEMSET COLLECTOR AGENT (TFICA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: $L_{FI}$, the list of locally frequent itemsets
procedure TFICA(AgentProfile)
    $CPUTime_{start}$ ← get system time
    Briefcase ← get Briefcase from AgentProfile
    $L_{FI\&SC}$ ← load $L_{FI\&SC}$ from local file system of this site $S_i$
    $L_{FI}$ ← $L_{FI\&SC}$.get(1)    ▷ frequent k-itemset list
    add $L_{FI}$ to $Result\_S_i$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    add CPUTime to $Result\_S_i$
    add $Result\_S_i$ to Briefcase
    add updated Briefcase to AgentProfile
    $L_{NODES}$ ← get itinerary list from AgentProfile
    $L_{NODES}$ ← remove first IP address from $L_{NODES}$    ▷ visited site
    add updated $L_{NODES}$ to AgentProfile
    if $L_{NODES}$ ≠ ∅ then    ▷ itinerary not empty
        AObject ← new TFICA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
    else
        $TripTime_{end}$ ← get system time for end of agent journey
        add $TripTime_{end}$ to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at $S_{CENTRAL}$
    end if
end procedure

Algorithm 9 LOCAL KNOWLEDGE COLLECTOR AGENT (LKCA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: $L_{LSAR}$, the list of locally strong association rules
procedure LKCA(AgentProfile)
    $CPUTime_{start}$ ← get system time
    Briefcase ← get Briefcase from AgentProfile
    $L_{LSAR}$ ← load $L_{LSAR}$ from local file system of this site $S_i$
    add $L_{LSAR}$ to $Result\_S_i$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    add CPUTime to $Result\_S_i$
    add $Result\_S_i$ to Briefcase
    add updated Briefcase to AgentProfile
    $L_{NODES}$ ← get itinerary list from AgentProfile
    $L_{NODES}$ ← remove first IP address from $L_{NODES}$    ▷ visited site
    add updated $L_{NODES}$ to AgentProfile
    if $L_{NODES}$ ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LKCA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
    else
        $TripTime_{end}$ ← get system time for end of agent journey
        add $TripTime_{end}$ to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at $S_{CENTRAL}$
    end if
end procedure

Algorithm 10 GLOBAL FREQUENT ITEMSET GENERATOR AGENT (GFIGA)
Input: Briefcase, result bag of TFICA agent
Output: $L^{GFI}$, the list of global frequent itemsets
procedure GFIGA(Briefcase)
    $CPUTime_{start}$ ← get system time
    $L^{TFI}$ ← retrieve total frequent itemsets $\bigcup_{i=1}^{n} L^{FI}_{(i)}$ from Briefcase
    $L^{GFI}$ ← compute global frequent itemsets $\bigcap_{i=1}^{n} L^{FI}_{(i)}$ from the lists in Briefcase
    print $L^{GFI}$
    save $L^{GFI}$ in the local file system of site $S_{CENTRAL}$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    print CPUTime
    return $L^{GFI}$
end procedure

Algorithm 11 GLOBAL KNOWLEDGE GENERATOR AGENT (GKGA)
Input: Briefcase, result bag of LKCA agent
Output: $L^{GSAR}_{CENTRAL}$, the list of globally strong association rules
procedure GKGA(Briefcase)
    $CPUTime_{start}$ ← get system time
    $L^{TLSAR}$ ← retrieve total strong rules $\bigcup_{i=1}^{n} L^{LSAR}_{i}$ from Briefcase
    $L^{GFI}$ ← load global frequent itemsets ($L^{GFI}$) from $S_{CENTRAL}$
    for all $AR_{strong}$ ∈ $L^{TLSAR}$ do
        L ← get frequent itemset from $AR_{strong}$
        if L ∈ $L^{GFI}$ then
            print $AR_{strong}$ along with the site address ($S^{IP}_{i}$)
            add $AR_{strong}$ to $L^{GSAR}_{CENTRAL}$
        end if
    end for
    save $L^{GSAR}_{CENTRAL}$ in the local file system of site $S_{CENTRAL}$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    print CPUTime
    return $L^{GSAR}_{CENTRAL}$
end procedure

Algorithm 12 GLOBAL KNOWLEDGE DISPATCHER AGENT (GKDA)
Input: AgentProfile, a collection of agent attributes set by the AL
Output: dispatch $L^{GSAR}_{CENTRAL}$ at each distributed site $S_i$
procedure GKDA(AgentProfile)
    $CPUTime_{start}$ ← get system time
    Briefcase ← get Briefcase from AgentProfile
    $L^{GSAR}_{CENTRAL}$ ← get $L^{GSAR}_{CENTRAL}$ from Briefcase
    save $L^{GSAR}_{CENTRAL}$ in the local file system of site $S_i$
    $CPUTime_{end}$ ← get system time
    CPUTime ← $CPUTime_{end}$ − $CPUTime_{start}$
    add CPUTime to $Result\_S_i$
    add $Result\_S_i$ to Briefcase
    add updated Briefcase to AgentProfile
    $L_{NODES}$ ← get itinerary list from AgentProfile
    $L_{NODES}$ ← remove first IP address from $L_{NODES}$    ▷ visited site
    add updated $L_{NODES}$ to AgentProfile
    if $L_{NODES}$ ≠ ∅ then    ▷ itinerary not empty
        AObject ← new GKDA(AgentProfile)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in $L_{NODES}$
    else
        $TripTime_{end}$ ← get system time for end of agent journey
        add $TripTime_{end}$ to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at $S_{CENTRAL}$
    end if
end procedure
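The integration performed by the two stationary agents (Algorithms 10 and 11) reduces to a set intersection followed by a filter, as the following Python sketch suggests. Both function names and the (antecedent, consequent, site) rule layout are assumptions made for illustration; the itemset of a rule is taken to be the union of its antecedent and consequent.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

SiteRule = Tuple[FrozenSet[int], FrozenSet[int], str]  # (antecedent, consequent, site)


def global_frequent_itemsets(
        local_lists: Iterable[Iterable[FrozenSet[int]]]) -> Set[FrozenSet[int]]:
    """Intersect the per-site frequent itemset lists (the role of GFIGA)."""
    sets = [set(site_list) for site_list in local_lists]
    return set.intersection(*sets) if sets else set()


def globally_strong_rules(total_local_rules: List[SiteRule],
                          global_frequent: Set[FrozenSet[int]]) -> List[SiteRule]:
    """Keep the local rules whose itemset is globally frequent (the role of GKGA)."""
    return [rule for rule in total_local_rules
            if (rule[0] | rule[1]) in global_frequent]
```

Under this scheme an itemset that is frequent at only some of the sites never contributes a globally strong rule, which is consistent with the results reported in Section 4.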
Figure 2. Control Panel of AeMGSAR

4.IMPLEMENTATION AND PERFORMANCE STUDY

All the agents, as well as the control panel shown in Figure 2, are designed in Java. Synthetic datasets ($DB_i$), generated with the Transactional Data Set Generator (TDSG) tool [16], are stored across three distributed sites $S_1$, $S_2$ and $S_3$, with 3500, 3850 and 3900 transactions respectively and 10 items each. Binary and transactional versions of these datasets are shown in Appendix A. The configuration of the system is shown in Table 1, with DM_AEE additionally deployed at each distributed site and AL and RM deployed at $S_{CENTRAL}$. The round trip time taken by the various MAs is shown in Figure 3. The CPU time consumed by the various MAs at sites $S_1$, $S_2$ and $S_3$ is shown in Figure 4, Figure 5 and Figure 6, respectively. The CPU time for GFIGA and GKGA is 101357102 nanoseconds and 33317458 nanoseconds, respectively. $L^{FI}_{k(i)}$ and $L^{FISC}_{k(i)}$ generated at the distributed sites by LFIGA with 20% min_th_sup are shown in Appendix B.1, B.2 and B.3. $L^{LSAR}_{i}$ generated at the distributed sites by LKGA with 50% min_th_conf are shown in Appendix B.4, B.5 and B.6. The globally frequent itemsets generated by GFIGA at $S_{CENTRAL}$ are shown in Figure 7. Fifteen 2-itemsets and eight 3-itemsets in the $L^{TFI}_{k}$ list are globally frequent, whereas the 4-, 5- and 6-itemsets, which are locally frequent, are not globally frequent. The globally strong association rules ($L^{GSAR}_{CENTRAL}$) generated by GKGA at $S_{CENTRAL}$ for globally frequent 3-itemsets are shown in Figure 8, and $L^{GSAR}_{CENTRAL}$ for 2-itemsets are shown in Appendix B.7.
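As a worked check of the support threshold, the minimum support count used by LFIGA in Algorithm 4 is minsupcount = (T × min_th_sup) / 100. With the 20% min_th_sup and the three dataset sizes given above, the per-site thresholds come out as follows (the arithmetic is derived here, not quoted from the reported results):

```latex
\begin{align*}
S_1 &: \frac{3500 \times 20}{100} = 700 \\
S_2 &: \frac{3850 \times 20}{100} = 770 \\
S_3 &: \frac{3900 \times 20}{100} = 780
\end{align*}
```

An itemset is therefore locally frequent at $S_1$ only if it occurs in at least 700 of the 3500 transactions, and similarly for the other two sites.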
On comparing this system with the traditional central data warehouse (DW) based approach for ARM, where the entire data from the distributed sites is collected centrally in a DW [17], it is found that the storage cost is reduced, as data is mined locally and only the resultant knowledge is carried to the central site by mobile agents. As the size of the resultant data carried by the mobile agents is small, the network communication cost is also reduced. Data mining is performed locally by agents, so the computational cost at the central site is also minimised. AeMGSAR reflects the global knowledge because all the strong association rules generated are also strong at each distributed site. The system relies upon Java's in-built security system. As MAs are scalable in nature, performance would not be affected by adding more sites.

Table 1. Network Configuration

Site Name        Processor    OS        LAN Configuration
                                        IP (a)            Network
$S_{CENTRAL}$    Intel (b)    MS (c)    192.168.46.5      NW (d)
$S_1$            Intel (b)    MS (c)    192.168.46.212    NW (d)
$S_2$            Intel (b)    MS (c)    192.168.46.189    NW (d)
$S_3$            Intel (b)    MS (c)    192.168.46.213    NW (d)

a. IP address with Mask: 255.255.255.0 and Gateway 192.168.46.1
b. Intel Pentium Dual Core (3.40 GHz, 3.40 GHz) with 512 MB RAM
c. Microsoft Windows XP Professional ver. 2002
d. Network Speed: 100 Mbps and Network Adaptor: 82566DM-2 Gigabit NIC

Figure 3. Round Trip time taken by various MAs
Figure 4. CPU Time taken by various MAs at site $S_1$
Figure 5. CPU Time taken by various MAs at site $S_2$
Figure 6. CPU Time taken by various MAs at site $S_3$
Figure 7. Lists of global frequent k-itemsets at $S_{CENTRAL}$
Figure 8. Globally strong association rules for globally frequent 3-itemsets

5.CONCLUSION

Mobile agents strongly qualify for designing distributed applications, and the amalgamation of DDM and agent technology gives favourable results. Most of the existing agent based frameworks for the DARM task are only prototype models and lack an appropriate underlying execution environment, scalability, privacy preserving techniques, global knowledge generation and implementation using real datasets. In this study, a scalable MAS called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR) is presented, based on the serial itinerary of the mobile agents. In this system the overall task of mining the globally strong association rules is divided into subtasks which are handled by various mobile as well as stationary agents. An AEE is also designed for the implementation and performance study of the AeMGSAR system. The serial itinerary used for mobile agent migration increases the overall cost of the DARM task, so a parallel computing model could be designed in which clones of each mobile agent are dispatched in parallel to all distributed sites.

REFERENCES

[1] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth & R. Uthurusamy, (1996) Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press.
[2] J. Han & M. Kamber, (2006) Data Mining: Concepts and Techniques, 2nd ed. Morgan Kaufmann.
[3] G. S. Bhamra, R. B. Patel & A. K. Verma, (2014) “Intelligent Software Agent Technology: An Overview”, International Journal of Computer Applications (IJCA), vol. 89, no. 2, pp. 19–31.
[4] R. Agrawal, T. Imielinski & A. Swami, (1993) “Mining association rules between sets of items in large databases”, in Proceedings of the ACM-SIGMOD International Conference of Management of Data, pp. 207–216.
[5] R. Agrawal & J. C. Shafer, (1996) “Parallel mining of association rules”, IEEE Transaction on Knowledge and Data Engineering, vol. 8, no. 6, pp. 962–969.
[6] M. J. Zaki, (1999) “Parallel and distributed association mining: a survey”, IEEE Concurrency, vol. 7, no. 4, pp. 14–25.
[7] X. Wu & S. Zhang, (2003) “Synthesizing high-frequency rules from different data sources”, IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 2, pp. 353–367.
[8] Y.-L. Wang, Z.-Z. Li & H.-P. Zhu, (2003) “Mobile agent based distributed and incremental techniques for association rules”, in Proceedings of the International Conference on Machine Learning and Cybernetics (ICMLC 2003), vol. 1, pp. 266–271.
[9] C. Aflori & F. Leon, (2004) “Efficient Distributed Data Mining using Intelligent Agents”, in Proceedings of the 8th International Symposium on Automatic Control and Computer Science, pp. 1–6.
[10] U. P. Kulkarni, P. D. Desai, T. Ahmed, J. V. Vadavi & A. R. Yardi, (2007) “Mobile Agent Based Distributed Data Mining”, in Proceedings of the International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), IEEE Computer Society, pp. 18–24.
[11] G. Hu & S. Ding, (2009a) “An Agent-Based Framework for Association Rules Mining of Distributed Data”, in Software Engineering Research, Management and Applications 2009, ser. Studies in Computational Intelligence, R. Lee and N. Ishii, Eds. Springer Berlin-Heidelberg, vol. 253, pp. 13–26.
[12] G. Hu & S. Ding, (2009b) “Mining of Association Rules from Distributed Data using Mobile Agents”, in Proceedings of the International Conference on e-Business (ICE-B 2009), pp. 21–26.
[13] A. O. Ogunde, O. Folorunso, A. S. Sodiya, J. A. Oguntuase & G. O. Ogunleye, (2011) “Improved cost models for agent based association rule mining in distributed databases”, Anale. Seria Informatica, vol. 9, no. 1, pp. 231–250, Available: http://anale-informatica.tibiscus.ro/download/lucrari/9-1-20-Ogunde.pdf
[14] G. S. Bhamra, A. K. Verma & R. B. Patel, (2015) “Agent Based Frameworks for Distributed Association Rule Mining: An Analysis”, International Journal in Foundations of Computer Science & Technology (IJFCST), vol. 5, no. 1, pp. 11–22.
[15] R. Agrawal & R. Srikant, (1994) “Fast Algorithms for Mining Association Rules in Large Databases”, in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB’94). Morgan Kaufmann Publishers Inc., pp. 487–499.
[16] G. S. Bhamra, A. K. Verma & R. B. Patel, (2011) “TDSGenerator: A Tool for generating synthetic Transactional Datasets for Association Rules Mining”, International Journal of Computer Science Issues (IJCSI), vol. 8, no. 2, pp. 184–188.
[17] G. S. Bhamra, A. K. Verma & R. B. Patel, (2014) “An Investigation into the Central Data Warehouse based Association Rule Mining”, International Journal of Computer Applications (IJCA), vol. 96, no. 10, pp. 1–12.

AUTHORS

Gurpreet Singh Bhamra is currently working as Assistant Professor at the Department of Computer Science and Engineering, M. M. University, Mullana, Haryana. He received his B.Sc. (Computer Sc.) and MCA from Kurukshetra University, Kurukshetra in 1995 and 1998, respectively. He is pursuing a Ph.D. from the Department of Computer Science and Engineering, Thapar University, Patiala, Punjab. He has been teaching since 1998. He has published 13 research papers in International/National Journals and International Conferences. He has received the Best Paper Award for “An Agent enriched Distributed Data Mining on Heterogeneous Networks”, in “Challenges & Opportunities in Information Technology” (COIT-2008). He is a Life Member of the Computer Society of India. His research interests are in Distributed Computing, Distributed Data Mining, Mobile Agents and Bio-informatics.

Dr. Anil Kumar Verma is currently working as Associate Professor at the Department of Computer Science & Engineering, Thapar University, Patiala. He received his B.S., M.S. and Ph.D. in 1991, 2001 and 2008 respectively, majoring in Computer Science and Engineering. He worked as Lecturer at M.M.M. Engineering College, Gorakhpur from 1991 to 1996.
He joined Thapar Institute of Engineering & Technology in 1996 as a Systems Analyst in the Computer Centre and is presently associated with the same Institute. He has been a visiting faculty to many institutions. He has published over 100 papers in refereed journals and conferences (India and abroad). He is a MISCI (Turkey), LMCSI (Mumbai), GMAIMA (New Delhi). He is a certified software quality auditor by MoCIT, Govt. of India. His research interests include wireless networks, routing algorithms, securing ad hoc networks and data mining.
Dr. Ram Bahadur Patel is currently working as Professor and Head at Department of Computer Science & Engineering, Chandigarh College of Engineering & Technology, Chandigarh. He received PhD from IIT Roorkee in Computer Science & Engineering, PDF from Highest Institute of Education, Science & Technology (HIEST), Athens, Greece, MS (Software Systems) from BITS Pilani and B. E. in Computer Engineering from M. M. M. Engineering College, Gorakhpur, UP. Dr. Patel has been in teaching and research since 1991. He has supervised 36 M. Tech, 7 M. Phil. and 8 PhD theses. He is currently supervising 6 PhD students. He has published 130 research papers in International/National Journals and Refereed International Conferences. He has written 7 text books for engineering courses. He is a member of ISTE (New Delhi) and IEEE (USA). He is a member of various International Technical Committees and participates frequently in International Technical Committees in India and abroad. His current research interests are in Mobile & Distributed Computing, Mobile Agent Security and Fault Tolerance and Sensor Network.
APPENDIX A – SYNTHETIC DATASETS
A.1 BDS3500T10I.txt and corresponding TDS3500T10I.txt ($DB_1$) at site $S_1$
These synthetic binary and transactional datasets of 3500 records are created by the TDSG tool at site $S_1$. In the binary version each column head represents the item number and each row represents a transaction, where integer ‘1’ is used for a purchased item and ‘0’ is used if it is not purchased. The corresponding transactional version has a Transaction ID (TID) for each transaction, and Itemset is the set of all the purchased items for that particular transaction.
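The relation between the binary and the transactional version of a dataset can be illustrated with a minimal Python sketch. It assumes that column j (counted from 1) stands for item j and that TIDs are assigned in row order; the function name `binary_to_transactions` and the three sample rows are hypothetical and are not taken from the TDSG tool or from the datasets above.

```python
def binary_to_transactions(binary_rows):
    """Convert 0/1 rows into (TID, itemset) pairs.

    Assumption: column j (1-based) represents item j, and TIDs follow row order.
    """
    transactions = []
    for tid, row in enumerate(binary_rows, start=1):
        # Keep the item number of every column flagged with 1 (purchased).
        itemset = [item for item, flag in enumerate(row, start=1) if flag == 1]
        transactions.append((tid, itemset))
    return transactions


# Hypothetical binary rows over four items (not real dataset content).
rows = [
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
tds = binary_to_transactions(rows)
for tid, itemset in tds:
    print(tid, itemset)
```

Each printed line pairs a TID with the list of purchased item numbers, which is the shape described for the transactional files.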
A.2 BDS3850T10I.txt and corresponding TDS3850T10I.txt ($DB_2$) at site $S_2$
These synthetic binary and transactional datasets of 3850 records are created by the TDSG tool at site $S_2$.
A.3 BDS3900T10I.txt and corresponding TDS3900T10I.txt ($DB_3$) at site $S_3$
These synthetic binary and transactional datasets of 3900 records are created by the TDSG tool at site $S_3$.
APPENDIX B – RESULTANT KNOWLEDGE OF AEMGSAR SYSTEM
B.1 $L_k^{FI(1)}$ and $L_k^{FISC(1)}$ at site $S_1$
The list of frequent k-itemsets, i.e., $L_k^{FI(1)}$, is represented by column L, and column SC shows the support count of the corresponding frequent k-itemset, i.e., $L_k^{FISC(1)}$, at site $S_1$. These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_1$) as shown in Appendix A.1.
B.2 $L_k^{FI(2)}$ and $L_k^{FISC(2)}$ at site $S_2$
These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_2$) as shown in Appendix A.2.
B.3 $L_k^{FI(3)}$ and $L_k^{FISC(3)}$ at site $S_3$
These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_3$) as shown in Appendix A.3.
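How a list of frequent k-itemsets and their support counts can be derived from a transactional dataset is sketched below in Python. This is a simplified level-wise count and not the implementation used in the AeMGSAR system: candidates of size k are built only from items that survived level k-1, and every candidate is counted by a full scan. The function name `frequent_itemsets`, the four sample transactions, and the 60% threshold are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets.

    min_support is a fraction of the number of transactions.
    """
    n = len(transactions)
    tx_sets = [set(t) for t in transactions]
    current_items = sorted(set().union(*tx_sets))
    result = {}
    k = 1
    while current_items:
        level = {}
        for candidate in combinations(current_items, k):
            # Support count = number of transactions containing the candidate.
            count = sum(1 for t in tx_sets if t.issuperset(candidate))
            if count >= min_support * n:
                level[candidate] = count
        if not level:
            break
        result.update(level)
        # Only items that occur in some frequent k-itemset can extend to k+1.
        current_items = sorted({item for itemset in level for item in itemset})
        k += 1
    return result


# Hypothetical transactions (itemsets only, TIDs omitted).
transactions = [[1, 2, 3], [1, 2], [2, 3], [1, 2, 3]]
fi = frequent_itemsets(transactions, min_support=0.6)
for itemset, count in fi.items():
    print(itemset, count)
```

The keys of the returned dictionary play the role of column L and the values the role of column SC in the tables of this appendix.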
B.4 $L_1^{LSAR}$ at site $S_1$
Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_1^{LSAR}$, at site $S_1$. Each strong rule has its associated support and confidence factor. The minimum support threshold is taken as 20% and the minimum confidence threshold as 50% for generating the strong rules from the data shown in Appendix B.1.
B.5 $L_2^{LSAR}$ at site $S_2$
Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_2^{LSAR}$, at site $S_2$. Each strong rule has its associated support and confidence factor. The minimum support threshold is taken as 20% and the minimum confidence threshold as 50% for generating the strong rules from the data shown in Appendix B.2.
B.6 $L_3^{LSAR}$ at site $S_3$
Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_3^{LSAR}$, at site $S_3$. Each strong rule has its associated support and confidence factor. The minimum support threshold is taken as 20% and the minimum confidence threshold as 50% for generating the strong rules from the data shown in Appendix B.3.
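The step from frequent itemsets with support counts to locally strong rules can be sketched in Python as follows. The sketch uses the standard definitions, support(A ⇒ B) = count(A ∪ B) / N and confidence(A ⇒ B) = count(A ∪ B) / count(A), with the 20% and 50% thresholds quoted above. The function name `strong_rules`, the sample counts, and N = 10 are hypothetical and do not come from the appendix tables.

```python
from itertools import combinations


def strong_rules(support_counts, n_records, min_support=0.20, min_confidence=0.50):
    """Return (antecedent, consequent, support, confidence) for every strong rule.

    support_counts maps sorted item tuples to support counts. It is assumed to
    contain every non-empty subset of each listed itemset, which holds for
    frequent itemsets because subsets of a frequent itemset are also frequent.
    """
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        support = count / n_records
        if support < min_support:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                consequent = tuple(i for i in itemset if i not in antecedent)
                confidence = count / support_counts[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical support counts over N = 10 records.
counts = {(1,): 6, (2,): 5, (3,): 8, (1, 2): 3, (1, 3): 2}
rules = strong_rules(counts, n_records=10)
for antecedent, consequent, support, confidence in rules:
    print(antecedent, "=>", consequent, f"({support:.0%}, {confidence:.0%})")
```

In this sample, itemset (1, 3) meets the support threshold but neither of its rules reaches 50% confidence, so only the two rules derived from (1, 2) are reported.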
B.7 $L_{CENTRAL}^{GSAR}$ at site $S_{CENTRAL}$
Column L represents the globally frequent k-itemsets, i.e., itemsets which are locally strong at all the distributed sites, and column AR(support, confidence) shows the list of globally strong association rules, i.e., $L_{CENTRAL}^{GSAR}$, for such itemsets. Each globally strong rule has its associated support and confidence factor. The minimum support threshold is taken as 20% and the minimum confidence threshold as 50%. Site represents the IP address of the site where the rule is locally strong. IP address 192.168.46.212 is used for site $S_1$, 192.168.46.189 for site $S_2$ and 192.168.46.213 for site $S_3$.
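The bookkeeping behind the Site column can be illustrated with a short Python sketch that records, for each rule, the sites at which it is locally strong and then keeps the rules that are strong at every site. It is an assumption-laden illustration rather than the aggregation procedure of the AeMGSAR system: the rule sets are invented, only the three IP addresses are taken from the text above, and the computation of global support and confidence is deliberately left out.

```python
from collections import defaultdict

# Hypothetical locally strong rules per site, keyed by the site IP addresses
# given above. Each rule is an (antecedent, consequent) pair of item tuples.
local_rules = {
    "192.168.46.212": {((1,), (2,)), ((2,), (3,))},  # site S1
    "192.168.46.189": {((1,), (2,))},                # site S2
    "192.168.46.213": {((1,), (2,)), ((2,), (3,))},  # site S3
}


def sites_per_rule(rules_by_site):
    """Map each rule to the sorted list of sites where it is locally strong."""
    sites = defaultdict(list)
    for site in sorted(rules_by_site):
        for rule in sorted(rules_by_site[site]):
            sites[rule].append(site)
    return dict(sites)


def strong_everywhere(rules_by_site):
    """Return the rules that are locally strong at every site."""
    rule_sets = list(rules_by_site.values())
    return set.intersection(*rule_sets) if rule_sets else set()


by_rule = sites_per_rule(local_rules)
common = strong_everywhere(local_rules)
for rule, sites in by_rule.items():
    print(rule, sites, "all sites" if rule in common else "some sites")
```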