SlideShare a Scribd company logo
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
DOI : 10.5121/ijsea.2015.6301 1
AN IMPROVED GRAPH BASED METHOD
FOR EXTRACTING ASSOCIATION RULES
Wael AlZoubi
Ajloun University College, Balqa Applied University
PO Box: Al-Salt 19117, Jordan
ABSTRACT
This paper proposes an improved approach to mine strong association rules from an association graph,
called graph based association rule mining (GBAR) method, where the association for each frequent
itemset is represented by a sub-graph, then all sub-graphs are merged to determine association rules with
high confidence and eliminate weak rules, the proposed graph based technique is self-motivated since it
builds the association graph in a successive manner. These rules achieve the scalability and reduce the
time needed to extract them. GBAR has been compared with three of the main graph based rule mining
algorithms; they are, FP-Growth Graph algorithm, generalized association pattern generation
(RIOMining) and multilevel association pattern generation (GRG). All of these algorithms depend on the
construction of association graph to generate the desired association rules. On the other hand, this chapter
expresses the observation results from the implementation of GBAR method recorded through the
experiment. The detailed results are shown by different case studies in different minimum support
thresholds values ranging from 90% down to 10% and minimum confidence values range from 55% to
95%. Generally, the observations focused on the execution time, the dimensionality of rules and the number
of rules generated, because the performance of the association rule mining process affected directly of
these criteria. Generally, the GBAR method has successfully reduced the execution time required to
generate desired association rules in almost all of the dataset.
KEYWORDS:
Graph, association rules, transaction data
1. INTRODUCTION
There are several graph based algorithms for mining association rules from transaction datasets,
some of these algorithms are: FP-Growth-Graph algorithm (Tiwari et al. 2010), graph-based rule-
chain incremental online mining algorithm (RIOMining) (Ning et al. 2009), and graph based
algorithm for association rules generation (GRG) (Li et al. 2003). In this paper, an improved
method for association rules extraction is proposed, which will be called (GBAR), GBAR stands
for Graph Based Association Rule mining.
A dataset of transactions must be divided into disjoint groups or clusters before applying the
proposed algorithm. The datasets that are used in this paper are included to evaluate the strength
of the proposed technique in a broader boundary. Ten datasets have been used as case studies in
the experiments, namely Chess and Mushroom. The selected datasets are commonly used in data
mining research (Orlando et al. 2003, Cule & Goethals 2010, Margahny&Mitwaly 2005). Some
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
2
of these datasets have been cleaned on receipt and others will be in their original form.
It is so difficult to construct an association graph for the frequent itemsets of different lengths, i.e.
from the whole clusters of transactions; this is the third challenge in the research. So, a graph
based association rules mining (GBAR) has been proposed to construct an association graph for
each cluster, GBAR is considered as a solution to this challenge, and this facilitates the traversing
of graph, and accelerates the association rules generation.
The evaluation measurements in the graph based phase are focused on four different
measurements, i.e. the rule confidence, the number of rules generated, the dimensionality of the
rules, i.e. the average number of items per rule, and the time required to mine the desired
association rules from the association graph.
2. A GRAPH-BASED FRAMEWORK FOR TRANSACTION DATA
MINING
In general, GBAR is used mainly for visualization of the results and for extraction of confident
association rules among the set of frequent itemsets in a systematic way. The main advantage of
using graph data structure in this research is its ability to solve the space complexity problem
because the graph uses an item as a node exactly once rather than two or more times as was done
in the previous works (Vivek et al. 2010).
As discussed earlier in figure 3.2 that illustrates the research framework, there are three main
steps to construct an association graph for frequent itemsets that are generated and reduced in
previous phases, i.e. the preprocessing phase, the clustering phase, and dataset reduction based on
clustering phase; these three steps are:
(i) Sorting frequent itemsets in each cluster in an ascending order, i.e. if the set of frequent
itemsets in the first cluster – cluster of transactions that contain only one item – are {{1}, {5},
{3}, {29}, {7}}. The frequent 1-itemsets should be reordered to facilitate the construction of the
association graph; they should be as following {{1}, {3}, {5}, {7}, {29}}.
(ii) The second step; which is the main step in this phase; is graph construction step. The
association graph used in the research is directed, if the result after performing logical and
operation (^) between the bit vector of item i (BVi) and the bit vector of item j (BVj) exceeds the
minimum support threshold value assigned by the user, formally (BVi ^ BVj) ≥ min-support, then
a directed edge is drawn from item i to item j, where both i and j are frequent 1-itemsets and they
are in ascending order (i < j).
(iii) The third and final step is the association pattern generation step, in this step, the last item of
the k-frequent itemset is used to extend the frequent k- itemset into k+1 itemsets, where k ≥ 2.
The general framework for the graph-based approach to analyze a large amount of transaction
data consists of five phases (Yen & Chen 2001).
(i) Numbering phase: In this phase, all different items are assigned unique integer numbers, since
dealing with numbers is much easier than dealing with all other data types for calculation of
support and confidence purposes.
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
3
(ii) Large item generation phase: This phase generates large (frequent) items and records related
information. A large item is an item whose support is no less than a user specified minimum
support.
(iii) Association graph construction phase: This phase constructs an association graph to indicate
the associations between large items.
(iv) Association pattern generation phase: This phase generates all association patterns by
traversing the constructed association graph, which is much simpler and less in size than the
original dataset of items.
(v) Association rule generation phase: The association rules can be generated directly according
to the corresponding association patterns.
In this research, the number of phases or steps reduced from five to three as mentioned later in the
previous discussion, and this in turn reduce significantly the time needed to build the graph and to
extract desired association rules.
The concentration of the following example will be restricted to study the graph technique as the
data items are clustered, the frequent itemsets are already generated and the dataset is minimized.
Example 1: Consider the database in Table 1. Each record is a <TID, Itemset> pair, where TID is
the identifier of the corresponding transaction, and itemset records the items purchased in that
transaction. Assume that the minimum support is 50 % (i.e., 2 transactions in this example).
Figure 1(a)Database of Transactions (b) The Database after Sorting of Items (c) Cluster Data (d) GBAR
representation of Data
GBAR receives the transaction dataset in form of clusters, as shown in Figure 1 c. After the
numbering phase, the numbers assigned to the items A, B, C, D, and E are 1, 2, 3, 4, and 5,
respectively as shown in Figure 1 d. The frequent items found in the database are items 1, 2, 3,
and 5, the item D is infrequent because its support (0.25) is less the user defined minimum
support, so it is removed from further processing. From now on, the number of an item will be
used to represent this item. The support for the itemset {i1, i2, …, ik} is the number of 1s in BV i1
^ BV i2 ^ … ^ BV ik, where the notation ^ is the logical AND operation.
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
4
In the association graph construction phase, the Association Graph Construction algorithm (AGC)
(Yen & Chen 2001) is applied to construct an association graph. The AGC algorithm is described
as follows: For every two frequent items i and j. such that i < j, if the number of 1s in BVi ^ BVj
achieves the user-specified
minimum support, a directed edge from item i to item j is created. Also, itemset (i, j) is a frequent
2-itemset. The association graph for this example is shown in Figure 2.
Figure 2: The association graph for example 1
The frequent 2-itemsets are generated directly after the association graph construction phase. The
fourth phase, which is the association pattern generation phase, an extension should be done for
large 2-itemsets generated to generate large k-itemsets (k > 2), such that the last item of the k-
itemset is used to extend the large itemset into k+1 itemsets, i.e. in the aforementioned example,
{{1, 3}, {2, 3}, {2, 5} and {3, 5}} are the set of frequent 2-itemsets, and they will be used as a
base to generate frequent k- itemsets, where k ≥ 3.
As a rule, for a frequent itemset {i1; i2; . . . ; ik}, if there is no directed edge from item ik to an item
v, then itemset {i1; i2; . . . ; ik; v} cannot be a frequent itemset. According to this rule and
depending on the graph in figure 2, there is an edge from 1 to 3 and another edge from 3 to 5; So,
the logical AND operation should be carried out on the bit vectors for these itemsets to check if it
frequent or not, 1010 ^ 1110 ^ 0111 = 0010, as the number of 1's in the result is less than 2, there
will be no frequent itemsets and the association rule generation phase no longer be required since
the generation of the itemsets is terminated.
3. THE PROPOSED GBAR STEPS
Figure 3 presents the steps of the proposed graph based association rule mining (GBAR)
algorithm. The GBAR algorithm takes as input the local frequent itemsets generated by the
cluster based frequent itemset generation (CBFIG) algorithm. After that, a directed association
sub-graph is constructed between local frequent itemsets, such that a directed edge is drawn from
frequent itemsets i to frequent itemset j, where i < j, each edge in the sub-graph represents an
association rule and its confidence is directly computed according to equation x.
Due to the existence of large number of rules among the local frequent itemsets, the
implementation of the GBAR algorithm shows only the confident rules, i.e. those that have
confidence not less than the user defined minimum confidence threshold, this simplifies counting
the association rules and calculating the rules dimensionality. Then the sub-graphs merged
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
5
together, the confidence is updated for the similar edges and they are represented as one edge, to
eliminate any redundancy may occur. The last step is the evaluation of the association rules
generated with respect to the following measurements:
Figure 3 The proposed graph construction steps
4. COMPARISON BETWEEN THE PROPOSED GRAPHS BASED
ALGORITHM AND OTHER METHODS
The good performance of GBAR over GRG algorithm comes from the fact that GRG algorithm
passes over the database of transactions twice to represent each item as a bit vector while GBAR
requires traverses the association graph only once to generate the association rules as mentioned
earlier in this chapter.
The good performance of GBAR over RIOMining comes from the following reasons:
(i) Number of edges in the association graph constructed by RIOMining can be very large.
(ii) Frequent itemset generation by direct extension takes much more time.
The good performance of GBAR over FP-Growth graph algorithm comes from the following
reasons:
(i) The number of nodes in the FP–graph equals number of distinct items in the database D. So,
GBAR is better than FP-Growth graph algorithm as it deals with little number of nodes to mine
the required association rules.
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
6
(ii) The FP-Growth-Graph algorithm gets bad results when the minimum support threshold is
assigned small value since the data structure size rapidly increase which leads to increase the used
memory space.
5. IMPLEMENTATION OF GBAR METHOD
The GBAR algorithm improves the graph based transaction rule mining algorithms as discussed
in the previous section, it builds an association graph for each local frequent itemset generated by
applying the proposed frequent itemset generation based on clustering (CBFIG) method, GBAR
uses the data that have been gotten from the cluster matrix and frequent item matrix to draw the
relationships between the frequent items.
The graphs are constructed in advance, and so GBAR is called successive graph based method for
mining association rules, starting from frequent 2-itemsets in the 2-itemset transaction cluster,
since the first cluster contains only 1–itemset transactions, and grows to cover all subsequent
clusters. Figure 4 shows a sample association graph, which will be illustrated later, in this section.
The graphs are constructed using MATLAB programming language.
Figure 4 Association graph represents the rules associated with frequent itemset {3}
Each frequent item is represented by a node which is a circle on the graph, the graph construction
is based on the matrices concept, and it looks like a 2 dimensional matrix of rows and columns,
where the X-axis represents the frequent item number while the Y-axis doesn’t represent any
meaningful value, it is just used to separate the nodes in order to make them clear to recognize.
Each edge represents an association rule where the labels of the edges are their confidence value.
The directions were removed from the association graph to clarify the relations since all edges
originate from the lowest item number.
According to Figure 4, the frequent itemset {3} has 19 different relations with other frequent
itemsets whose positions are beyond {3}, i.e. {4} and above, these relations will be referred to as
association rules, therefore, there are 19 association rules originates from the item {3}, some of
these rules are:
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
7
If 3 then 7, with 0.62 confidence value
If 3 then 5, with 0.60 confidence value
It is not only impossible to take each graph as a unit and write each edge in it as a separate rule
but also this operation is not rational. Thus all the graphs should be taken together to clarify all
relations among different itemsets of different lengths, the goal behind drawing a graph for each
frequent itemset is to achieve simplicity and avoid any overlapping or misunderstanding may
occur due to the huge amount of relations.
6. THE GRAPH PRUNING IN GBAR METHOD
Constructing the transactions graph is combined with the generation of 2-itemsets and hence, it
does not introduce any additional computational overhead to the graph construction process.
Support pruning is next applied to the transactions graph to remove links with support below
minimum confidence threshold and divide the original graph into a set of sub-graphs.
Figure 5 shows a simple illustration of the graph construction process. The shown graph is a
representation of the following set of transactions {ACF, CF, ABDH, ABEI, DH, DGJ, DGJH,
BEI}. The result of the previous graph construction process is a set of sub graphs each
representing a subset of the transactions database. Next, the subset of the transactions database
corresponding to each sub-graph is identified and the graph construction algorithm (GCA) is
applied on each of these subsets individually.
Figure 5: Transaction Graph Division (Karam et al. 2004)
GBAR starts its work after grouping the transactions into clusters, and processing each cluster to
generate local frequent itemsets then global frequent itemsets going through dataset reduction,
makes the proposed graph method highly efficient and scalable for almost all transaction datasets
regardless to their size, this is due to the nature of market basket data where users are free to
select any items together. Therefore, constructing a graph directly for market basket data will lead
to an almost fully connected graph with very high degree of complexity. Pruning a fully
connected graph will result in a single large sub-graph which will reduce the overhead incurred in
the frequent itemsets generation step, but also, it will add an extra overhead to the process of
executing the transactions graph division algorithm.
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
8
7. THE RESULTS
The results of the experiments have been recorded to be compared against the selected
comparison graph based rule mining methods reported in the literature review chapter. This
section reports the observations from the experiments on all ten datasets according to case
studies. The description starts from the small size datasets to the medium size datasets and ends
with the large size datasets. These all datasets with different sizes are included to evaluate the
performance of the proposed graph based rule mining method (GBAR).
The scalability of the proposed GBAR method was evaluated by using different datasets with
various sizes at the same values of minimum supports, the values of minimum support chosen in
the experiments ranges from ten percent to ninety percent with 10% step, whereas the values of
minimum confidence threshold ranges from 95% down to 55%, the datasets of specific size are
grouped together and the average is computed and recorded in terms of all measurements used.
There are two datasets used in the experiments, i.e. Chess, and Mushroom datasets. In general,
the performance of GBAR is rather high comparing with the FP-Growth graph algorithm,
RIOMining, and GRG algorithms at different values of minimum support and minimum
confidence thresholds due to its efficiency in traversing the association graph and merging any
redundant edges. However, among the comparison algorithms, it has shown better results than
FP-Growth graph algorithm (Tiwari et al. 2010), RIOMining algorithm (Ning et al. 2009) and
GRG algorithm (Li et al. 2003) in these datasets. FP-Growth graph algorithm remains the best
among the comparison algorithms for small size datasets for the reasons listed earlier in section 3.
Table 1 shows the execution time in seconds of the proposed graph based technique (GBAR)
against FP-Growth graph, RIOMining, and GRG algorithms.
Table 1: The Execution Time of GBAR Method, GRG, RioMining, and FP-Growth graph
From the observations recorded in Table 1, the execution time improved while moving from
GRG to FP-Growth Graph algorithm, but the proposed GBAR algorithm still gives best results
among all these techniques since it reduces the steps of mining association rules from the
constructed association graph from five to three as mentioned earlier in section 3 and it doesn’t
take the whole items in consideration during the process of mining association rules. As noticed
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
9
from the table above, the improvement on the execution time ranges from 18.9% in case of FP-
Growth Graph to 63.3% in case of GRG. On the other hand, the time decreases while the
minimum support threshold increases and minimum confidence decreases due to the fact that the
number of frequent itemsets is going down if the minimum support value going up and the
number of rules is being lesser and lesser by increasing the value of minimum confidence
threshold.
Table 2 explains the number of association rules generated and the dimension of the generated
associated rules using the previous two datasets.
Table 2: The Number of Generated Rules and the Dimensionality of Rules using Chess and Mushroom
datasets
From Table 2, the proposed GBAR technique is the best among all other algorithms in terms of
the number of generated rules and the average number of items per rule (dimensionality of the
rules). The increase in number of generated rules in case of GRG is due to the big size of the
association graph and to the huge amount of edges in it, the average number of confident
association rules in case of GRG is 45 rules. FP-Growth Graph gives better results than
RIOMining for the following reasons: (i) the graph size is increased because of using concept
hierarchy in numbering of items in the database, and (ii) all ancestors of each item in a transaction
are added to the transaction and then the FP-Growth Graph algorithm is applied on the extended
transactions. These reasons lead to an increase of the number of edges in the association graph.
FP-Growth Graph outperforms GRG since the numbering of items in GRG occurs level by level
(from the highest concept level to the lowest concept level) and then each concept level is
processed separately to generate large itemsets. The number of confident association rules
increases as the minimum support threshold value decreases and the minimum confidence
threshold increases. On the other hand, GBAR is the best comparing with all these comparable
algorithms especially in the cases of high support values and small confidence values due the
simplicity of the association graph construction in the case of primitive rules.
With respect to the dimensionality of the confident rules generated measurement, GBAR is the
best as it simplifies the rules as possible by reduction of the items participating in them by – in
average – 43.9% comparing with the others due to absence of the concept hierarchy in case of
GBAR.
International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015
10
REFERENCES
[1] Tiwari, V., Gupta, S. & Tiwari, R. 2010. Association rule mining: A graph based approach for mining
frequent itemsets. International Conference on Networking and Information Technology (ICNIT),
2010, pp. 309 – 313.
[2] Ning, H. 2009. Rule-chain incremental mining algorithm based on directed graph. Sixth International
Conference on Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09, pp. 118 – 122.
[3] Li, L. 2003. GRG: an efficient method for association rules mining on frequent closed itemsets. IEEE
International Symposium on Intelligent Control, pp 854 – 859.
[4] Orlando, S., Palmerini, P., Perego, R., Lucchese, C. & Silvestri, F. 2003. kDCI: a multi-strategy
algorithm for mining frequent sets. In Proc. of the Int. Workshop on Frequent Itemset Mining
Implementations in conjunction with ICDM'03, pp 1 – 10.
[5] Cule, B. & Goethals, B. 2010. Mining association rules in long sequences. advances in knowledge
discovery and data mining. Vol. 6118 of Lecture Notes in Computer Science. Springer, pp 300 – 309.
[6] Margahny, M. H. & Mitwaly, A. A. 2005. Fast algorithm for mining association rules. AIML 05
Conference, 19-21 December 2005, CICC, Cairo, Egypt, pp 1 – 5.
[7] Yen, S.-J. & Chen, A. L. P. 2001. A graph-based approach for discovering various types of
association rules. IEEE transactions on knowledge and data engineering, vol. 13, no. 5,
September/October 2001, pp. 839 – 845.
[8] Vivek T., Vipin T., Shailendra G., & Renu T. 2010. Association rule mining: A graph based approach
for mining frequent itemsets. Networking and Information Technology (ICNIT), 2010 International
Conference, pp. 309 – 313.
[9] Karam, O. H., Hamad, A. & Riad, W. 2004. Fast Efficient Association Rule Mining From Web Data.
Fourth International ICSC Symposium on Engineering of Intelligent Systems (EIS 2004), Portugal
February 29 – March 2, 2004, pp.764–770.

More Related Content

PDF
A fpga implementation of data hiding using lsb matching method
PDF
A Survey Report on High Utility Itemset Mining for Frequent Pattern Mining
PDF
Financial Time Series Analysis Based On Normalized Mutual Information Functions
PPTX
Up growth an efficient algorithm for high utility itemset mining(sigkdd2010) (1)
PDF
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
PDF
A Novel Low Complexity Histogram Algorithm for High Performance Image Process...
PDF
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
PDF
Comparative analysis of various data stream mining procedures and various dim...
A fpga implementation of data hiding using lsb matching method
A Survey Report on High Utility Itemset Mining for Frequent Pattern Mining
Financial Time Series Analysis Based On Normalized Mutual Information Functions
Up growth an efficient algorithm for high utility itemset mining(sigkdd2010) (1)
IRJET- Finding Dominant Color in the Artistic Painting using Data Mining ...
A Novel Low Complexity Histogram Algorithm for High Performance Image Process...
MPSKM Algorithm to Cluster Uneven Dimensional Time Series Subspace Data
Comparative analysis of various data stream mining procedures and various dim...

What's hot (18)

PDF
IRJET- Efficient Image Encryption with Pixel Scrambling and Genetic Algorithm
PDF
Classification Techniques: A Review
PDF
Clustbigfim frequent itemset mining of
PDF
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
PDF
COLOCATION MINING IN UNCERTAIN DATA SETS: A PROBABILISTIC APPROACH
PDF
Introducing the concept of information pixels and the sipa (storing informati...
PDF
A fast search algorithm for large
PDF
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
PDF
Data mining techniques application for prediction in OLAP cube
PDF
84cc04ff77007e457df6aa2b814d2346bf1b
PDF
COMPARISON OF WAVELET NETWORK AND LOGISTIC REGRESSION IN PREDICTING ENTERPRIS...
PDF
A Novel Approach for Clustering Big Data based on MapReduce
PDF
Clustering of Big Data Using Different Data-Mining Techniques
PDF
A divisive hierarchical clustering based method for indexing image information
PDF
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
PDF
Data stream mining techniques: a review
PDF
PREDICTING PERFORMANCE OF CLASSIFICATION ALGORITHMS
PDF
Predicting performance of classification algorithms
IRJET- Efficient Image Encryption with Pixel Scrambling and Genetic Algorithm
Classification Techniques: A Review
Clustbigfim frequent itemset mining of
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
COLOCATION MINING IN UNCERTAIN DATA SETS: A PROBABILISTIC APPROACH
Introducing the concept of information pixels and the sipa (storing informati...
A fast search algorithm for large
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
Data mining techniques application for prediction in OLAP cube
84cc04ff77007e457df6aa2b814d2346bf1b
COMPARISON OF WAVELET NETWORK AND LOGISTIC REGRESSION IN PREDICTING ENTERPRIS...
A Novel Approach for Clustering Big Data based on MapReduce
Clustering of Big Data Using Different Data-Mining Techniques
A divisive hierarchical clustering based method for indexing image information
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
Data stream mining techniques: a review
PREDICTING PERFORMANCE OF CLASSIFICATION ALGORITHMS
Predicting performance of classification algorithms
Ad

Viewers also liked (20)

PDF
STRATEGIES TO REDUCE REWORK IN SOFTWARE DEVELOPMENT ON AN ORGANISATION IN MAU...
PDF
Benchmarking machine learning techniques
PDF
AGV PATH PLANNING BASED ON SMOOTHING A* ALGORITHM
PDF
A MAPPING MODEL FOR TRANSFORMING TRADITIONAL SOFTWARE DEVELOPMENT METHODS TO ...
PDF
Compositional testing for fsm based models
PDF
Suggest an intelligent framework for building business process management [ p...
PDF
Software metrics validation
PDF
High quality implementation for
PDF
Software testing strategy
PDF
An empirical evaluation of impact of refactoring on internal and external mea...
PDF
SOCIO-DEMOGRAPHIC DIFFERENCES IN THE PERCEPTIONS OF LEARNING MANAGEMENT SYSTE...
PDF
Rankingtherefactoring techniques
PDF
Design patterns for self adaptive systems
PDF
Effectiveness of test case
DOC
Quy che dai hoi
PDF
DOCX
Hương xuânđất việt.
DOC
Lăng kính thơ
DOC
Dê trắng dê đen
STRATEGIES TO REDUCE REWORK IN SOFTWARE DEVELOPMENT ON AN ORGANISATION IN MAU...
Benchmarking machine learning techniques
AGV PATH PLANNING BASED ON SMOOTHING A* ALGORITHM
A MAPPING MODEL FOR TRANSFORMING TRADITIONAL SOFTWARE DEVELOPMENT METHODS TO ...
Compositional testing for fsm based models
Suggest an intelligent framework for building business process management [ p...
Software metrics validation
High quality implementation for
Software testing strategy
An empirical evaluation of impact of refactoring on internal and external mea...
SOCIO-DEMOGRAPHIC DIFFERENCES IN THE PERCEPTIONS OF LEARNING MANAGEMENT SYSTE...
Rankingtherefactoring techniques
Design patterns for self adaptive systems
Effectiveness of test case
Quy che dai hoi
Hương xuânđất việt.
Lăng kính thơ
Dê trắng dê đen
Ad

Similar to An improved graph based method (20)

PDF
5 parallel implementation 06299286
PDF
Cf33497503
PDF
Cf33497503
PDF
Cf33497503
PDF
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
PDF
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
PDF
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
PDF
Scalable frequent itemset mining using heterogeneous computing par apriori a...
PDF
Efficient Temporal Association Rule Mining
PDF
PDF
Web Graph Clustering Using Hyperlink Structure
PDF
Web Graph Clustering Using Hyperlink Structure
PDF
An Improved Frequent Itemset Generation Algorithm Based On Correspondence
PDF
Comparative analysis of association rule generation algorithms in data streams
DOCX
Use of data mining techniques in the discovery of spatial and ...
PDF
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
PDF
Control chart pattern recognition using k mica clustering and neural networks
PDF
call for papers, research paper publishing, where to publish research paper, ...
PDF
Novel Approach to Automatically Generate Feasible Assembly Sequence
5 parallel implementation 06299286
Cf33497503
Cf33497503
Cf33497503
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
A SURVEY OF CLUSTERING ALGORITHMS IN ASSOCIATION RULES MINING
Scalable frequent itemset mining using heterogeneous computing par apriori a...
Efficient Temporal Association Rule Mining
Web Graph Clustering Using Hyperlink Structure
Web Graph Clustering Using Hyperlink Structure
An Improved Frequent Itemset Generation Algorithm Based On Correspondence
Comparative analysis of association rule generation algorithms in data streams
Use of data mining techniques in the discovery of spatial and ...
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
Control chart pattern recognition using k mica clustering and neural networks
call for papers, research paper publishing, where to publish research paper, ...
Novel Approach to Automatically Generate Feasible Assembly Sequence

Recently uploaded (20)

PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPTX
CH1 Production IntroductoryConcepts.pptx
PDF
Well-logging-methods_new................
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PDF
Digital Logic Computer Design lecture notes
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PPTX
web development for engineering and engineering
PDF
composite construction of structures.pdf
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PPTX
bas. eng. economics group 4 presentation 1.pptx
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
Welding lecture in detail for understanding
PPTX
Construction Project Organization Group 2.pptx
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Operating System & Kernel Study Guide-1 - converted.pdf
CYBER-CRIMES AND SECURITY A guide to understanding
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Foundation to blockchain - A guide to Blockchain Tech
CH1 Production IntroductoryConcepts.pptx
Well-logging-methods_new................
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
Digital Logic Computer Design lecture notes
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
web development for engineering and engineering
composite construction of structures.pdf
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
bas. eng. economics group 4 presentation 1.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
Welding lecture in detail for understanding
Construction Project Organization Group 2.pptx

An improved graph based method

  • 1. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 DOI : 10.5121/ijsea.2015.6301 1 AN IMPROVED GRAPH BASED METHOD FOR EXTRACTING ASSOCIATION RULES Wael AlZoubi Ajloun University College, Balqa Applied University PO Box: Al-Salt 19117, Jordan ABSTRACT This paper proposes an improved approach to mine strong association rules from an association graph, called graph based association rule mining (GBAR) method, where the association for each frequent itemset is represented by a sub-graph, then all sub-graphs are merged to determine association rules with high confidence and eliminate weak rules, the proposed graph based technique is self-motivated since it builds the association graph in a successive manner. These rules achieve the scalability and reduce the time needed to extract them. GBAR has been compared with three of the main graph based rule mining algorithms; they are, FP-Growth Graph algorithm, generalized association pattern generation (RIOMining) and multilevel association pattern generation (GRG). All of these algorithms depend on the construction of association graph to generate the desired association rules. On the other hand, this chapter expresses the observation results from the implementation of GBAR method recorded through the experiment. The detailed results are shown by different case studies in different minimum support thresholds values ranging from 90% down to 10% and minimum confidence values range from 55% to 95%. Generally, the observations focused on the execution time, the dimensionality of rules and the number of rules generated, because the performance of the association rule mining process affected directly of these criteria. Generally, the GBAR method has successfully reduced the execution time required to generate desired association rules in almost all of the dataset. KEYWORDS: Graph, association rules, transaction data 1. INTRODUCTION There are several graph based algorithms for mining association rules from transaction datasets, some of these algorithms are: FP-Growth-Graph algorithm (Tiwari et al. 2010), graph-based rule- chain incremental online mining algorithm (RIOMining) (Ning et al. 2009), and graph based algorithm for association rules generation (GRG) (Li et al. 2003). In this paper, an improved method for association rules extraction is proposed, which will be called (GBAR), GBAR stands for Graph Based Association Rule mining. A dataset of transactions must be divided into disjoint groups or clusters before applying the proposed algorithm. The datasets that are used in this paper are included to evaluate the strength of the proposed technique in a broader boundary. Ten datasets have been used as case studies in the experiments, namely Chess and Mushroom. The selected datasets are commonly used in data mining research (Orlando et al. 2003, Cule & Goethals 2010, Margahny&Mitwaly 2005). Some
  • 2. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 2 of these datasets have been cleaned on receipt and others will be in their original form. It is so difficult to construct an association graph for the frequent itemsets of different lengths, i.e. from the whole clusters of transactions; this is the third challenge in the research. So, a graph based association rules mining (GBAR) has been proposed to construct an association graph for each cluster, GBAR is considered as a solution to this challenge, and this facilitates the traversing of graph, and accelerates the association rules generation. The evaluation measurements in the graph based phase are focused on four different measurements, i.e. the rule confidence, the number of rules generated, the dimensionality of the rules, i.e. the average number of items per rule, and the time required to mine the desired association rules from the association graph. 2. A GRAPH-BASED FRAMEWORK FOR TRANSACTION DATA MINING In general, GBAR is used mainly for visualization of the results and for extraction of confident association rules among the set of frequent itemsets in a systematic way. The main advantage of using graph data structure in this research is its ability to solve the space complexity problem because the graph uses an item as a node exactly once rather than two or more times as was done in the previous works (Vivek et al. 2010). As discussed earlier in figure 3.2 that illustrates the research framework, there are three main steps to construct an association graph for frequent itemsets that are generated and reduced in previous phases, i.e. the preprocessing phase, the clustering phase, and dataset reduction based on clustering phase; these three steps are: (i) Sorting frequent itemsets in each cluster in an ascending order, i.e. if the set of frequent itemsets in the first cluster – cluster of transactions that contain only one item – are {{1}, {5}, {3}, {29}, {7}}. The frequent 1-itemsets should be reordered to facilitate the construction of the association graph; they should be as following {{1}, {3}, {5}, {7}, {29}}. (ii) The second step; which is the main step in this phase; is graph construction step. The association graph used in the research is directed, if the result after performing logical and operation (^) between the bit vector of item i (BVi) and the bit vector of item j (BVj) exceeds the minimum support threshold value assigned by the user, formally (BVi ^ BVj) ≥ min-support, then a directed edge is drawn from item i to item j, where both i and j are frequent 1-itemsets and they are in ascending order (i < j). (iii) The third and final step is the association pattern generation step, in this step, the last item of the k-frequent itemset is used to extend the frequent k- itemset into k+1 itemsets, where k ≥ 2. The general framework for the graph-based approach to analyze a large amount of transaction data consists of five phases (Yen & Chen 2001). (i) Numbering phase: In this phase, all different items are assigned unique integer numbers, since dealing with numbers is much easier than dealing with all other data types for calculation of support and confidence purposes.
  • 3. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 3 (ii) Large item generation phase: This phase generates large (frequent) items and records related information. A large item is an item whose support is no less than a user specified minimum support. (iii) Association graph construction phase: This phase constructs an association graph to indicate the associations between large items. (iv) Association pattern generation phase: This phase generates all association patterns by traversing the constructed association graph, which is much simpler and less in size than the original dataset of items. (v) Association rule generation phase: The association rules can be generated directly according to the corresponding association patterns. In this research, the number of phases or steps reduced from five to three as mentioned later in the previous discussion, and this in turn reduce significantly the time needed to build the graph and to extract desired association rules. The concentration of the following example will be restricted to study the graph technique as the data items are clustered, the frequent itemsets are already generated and the dataset is minimized. Example 1: Consider the database in Table 1. Each record is a <TID, Itemset> pair, where TID is the identifier of the corresponding transaction, and itemset records the items purchased in that transaction. Assume that the minimum support is 50 % (i.e., 2 transactions in this example). Figure 1(a)Database of Transactions (b) The Database after Sorting of Items (c) Cluster Data (d) GBAR representation of Data GBAR receives the transaction dataset in form of clusters, as shown in Figure 1 c. After the numbering phase, the numbers assigned to the items A, B, C, D, and E are 1, 2, 3, 4, and 5, respectively as shown in Figure 1 d. The frequent items found in the database are items 1, 2, 3, and 5, the item D is infrequent because its support (0.25) is less the user defined minimum support, so it is removed from further processing. From now on, the number of an item will be used to represent this item. The support for the itemset {i1, i2, …, ik} is the number of 1s in BV i1 ^ BV i2 ^ … ^ BV ik, where the notation ^ is the logical AND operation.
  • 4. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 4 In the association graph construction phase, the Association Graph Construction algorithm (AGC) (Yen & Chen 2001) is applied to construct an association graph. The AGC algorithm is described as follows: For every two frequent items i and j. such that i < j, if the number of 1s in BVi ^ BVj achieves the user-specified minimum support, a directed edge from item i to item j is created. Also, itemset (i, j) is a frequent 2-itemset. The association graph for this example is shown in Figure 2. Figure 2: The association graph for example 1 The frequent 2-itemsets are generated directly after the association graph construction phase. The fourth phase, which is the association pattern generation phase, an extension should be done for large 2-itemsets generated to generate large k-itemsets (k > 2), such that the last item of the k- itemset is used to extend the large itemset into k+1 itemsets, i.e. in the aforementioned example, {{1, 3}, {2, 3}, {2, 5} and {3, 5}} are the set of frequent 2-itemsets, and they will be used as a base to generate frequent k- itemsets, where k ≥ 3. As a rule, for a frequent itemset {i1; i2; . . . ; ik}, if there is no directed edge from item ik to an item v, then itemset {i1; i2; . . . ; ik; v} cannot be a frequent itemset. According to this rule and depending on the graph in figure 2, there is an edge from 1 to 3 and another edge from 3 to 5; So, the logical AND operation should be carried out on the bit vectors for these itemsets to check if it frequent or not, 1010 ^ 1110 ^ 0111 = 0010, as the number of 1's in the result is less than 2, there will be no frequent itemsets and the association rule generation phase no longer be required since the generation of the itemsets is terminated. 3. THE PROPOSED GBAR STEPS Figure 3 presents the steps of the proposed graph based association rule mining (GBAR) algorithm. The GBAR algorithm takes as input the local frequent itemsets generated by the cluster based frequent itemset generation (CBFIG) algorithm. After that, a directed association sub-graph is constructed between local frequent itemsets, such that a directed edge is drawn from frequent itemsets i to frequent itemset j, where i < j, each edge in the sub-graph represents an association rule and its confidence is directly computed according to equation x. Due to the existence of large number of rules among the local frequent itemsets, the implementation of the GBAR algorithm shows only the confident rules, i.e. those that have confidence not less than the user defined minimum confidence threshold, this simplifies counting the association rules and calculating the rules dimensionality. Then the sub-graphs merged
  • 5. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 5 together, the confidence is updated for the similar edges and they are represented as one edge, to eliminate any redundancy may occur. The last step is the evaluation of the association rules generated with respect to the following measurements: Figure 3 The proposed graph construction steps 4. COMPARISON BETWEEN THE PROPOSED GRAPHS BASED ALGORITHM AND OTHER METHODS The good performance of GBAR over GRG algorithm comes from the fact that GRG algorithm passes over the database of transactions twice to represent each item as a bit vector while GBAR requires traverses the association graph only once to generate the association rules as mentioned earlier in this chapter. The good performance of GBAR over RIOMining comes from the following reasons: (i) Number of edges in the association graph constructed by RIOMining can be very large. (ii) Frequent itemset generation by direct extension takes much more time. The good performance of GBAR over FP-Growth graph algorithm comes from the following reasons: (i) The number of nodes in the FP–graph equals number of distinct items in the database D. So, GBAR is better than FP-Growth graph algorithm as it deals with little number of nodes to mine the required association rules.
  • 6. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 6 (ii) The FP-Growth-Graph algorithm gets bad results when the minimum support threshold is assigned small value since the data structure size rapidly increase which leads to increase the used memory space. 5. IMPLEMENTATION OF GBAR METHOD The GBAR algorithm improves the graph based transaction rule mining algorithms as discussed in the previous section, it builds an association graph for each local frequent itemset generated by applying the proposed frequent itemset generation based on clustering (CBFIG) method, GBAR uses the data that have been gotten from the cluster matrix and frequent item matrix to draw the relationships between the frequent items. The graphs are constructed in advance, and so GBAR is called successive graph based method for mining association rules, starting from frequent 2-itemsets in the 2-itemset transaction cluster, since the first cluster contains only 1–itemset transactions, and grows to cover all subsequent clusters. Figure 4 shows a sample association graph, which will be illustrated later, in this section. The graphs are constructed using MATLAB programming language. Figure 4 Association graph represents the rules associated with frequent itemset {3} Each frequent item is represented by a node which is a circle on the graph, the graph construction is based on the matrices concept, and it looks like a 2 dimensional matrix of rows and columns, where the X-axis represents the frequent item number while the Y-axis doesn’t represent any meaningful value, it is just used to separate the nodes in order to make them clear to recognize. Each edge represents an association rule where the labels of the edges are their confidence value. The directions were removed from the association graph to clarify the relations since all edges originate from the lowest item number. According to Figure 4, the frequent itemset {3} has 19 different relations with other frequent itemsets whose positions are beyond {3}, i.e. {4} and above, these relations will be referred to as association rules, therefore, there are 19 association rules originates from the item {3}, some of these rules are:
  • 7. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 7 If 3 then 7, with 0.62 confidence value If 3 then 5, with 0.60 confidence value It is not only impossible to take each graph as a unit and write each edge in it as a separate rule but also this operation is not rational. Thus all the graphs should be taken together to clarify all relations among different itemsets of different lengths, the goal behind drawing a graph for each frequent itemset is to achieve simplicity and avoid any overlapping or misunderstanding may occur due to the huge amount of relations. 6. THE GRAPH PRUNING IN GBAR METHOD Constructing the transactions graph is combined with the generation of 2-itemsets and hence, it does not introduce any additional computational overhead to the graph construction process. Support pruning is next applied to the transactions graph to remove links with support below minimum confidence threshold and divide the original graph into a set of sub-graphs. Figure 5 shows a simple illustration of the graph construction process. The shown graph is a representation of the following set of transactions {ACF, CF, ABDH, ABEI, DH, DGJ, DGJH, BEI}. The result of the previous graph construction process is a set of sub graphs each representing a subset of the transactions database. Next, the subset of the transactions database corresponding to each sub-graph is identified and the graph construction algorithm (GCA) is applied on each of these subsets individually. Figure 5: Transaction Graph Division (Karam et al. 2004) GBAR starts its work after grouping the transactions into clusters, and processing each cluster to generate local frequent itemsets then global frequent itemsets going through dataset reduction, makes the proposed graph method highly efficient and scalable for almost all transaction datasets regardless to their size, this is due to the nature of market basket data where users are free to select any items together. Therefore, constructing a graph directly for market basket data will lead to an almost fully connected graph with very high degree of complexity. Pruning a fully connected graph will result in a single large sub-graph which will reduce the overhead incurred in the frequent itemsets generation step, but also, it will add an extra overhead to the process of executing the transactions graph division algorithm.
  • 8. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 8 7. THE RESULTS The results of the experiments have been recorded to be compared against the selected comparison graph based rule mining methods reported in the literature review chapter. This section reports the observations from the experiments on all ten datasets according to case studies. The description starts from the small size datasets to the medium size datasets and ends with the large size datasets. These all datasets with different sizes are included to evaluate the performance of the proposed graph based rule mining method (GBAR). The scalability of the proposed GBAR method was evaluated by using different datasets with various sizes at the same values of minimum supports, the values of minimum support chosen in the experiments ranges from ten percent to ninety percent with 10% step, whereas the values of minimum confidence threshold ranges from 95% down to 55%, the datasets of specific size are grouped together and the average is computed and recorded in terms of all measurements used. There are two datasets used in the experiments, i.e. Chess, and Mushroom datasets. In general, the performance of GBAR is rather high comparing with the FP-Growth graph algorithm, RIOMining, and GRG algorithms at different values of minimum support and minimum confidence thresholds due to its efficiency in traversing the association graph and merging any redundant edges. However, among the comparison algorithms, it has shown better results than FP-Growth graph algorithm (Tiwari et al. 2010), RIOMining algorithm (Ning et al. 2009) and GRG algorithm (Li et al. 2003) in these datasets. FP-Growth graph algorithm remains the best among the comparison algorithms for small size datasets for the reasons listed earlier in section 3. Table 1 shows the execution time in seconds of the proposed graph based technique (GBAR) against FP-Growth graph, RIOMining, and GRG algorithms. Table 1: The Execution Time of GBAR Method, GRG, RioMining, and FP-Growth graph From the observations recorded in Table 1, the execution time improved while moving from GRG to FP-Growth Graph algorithm, but the proposed GBAR algorithm still gives best results among all these techniques since it reduces the steps of mining association rules from the constructed association graph from five to three as mentioned earlier in section 3 and it doesn’t take the whole items in consideration during the process of mining association rules. As noticed
  • 9. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 9 from the table above, the improvement on the execution time ranges from 18.9% in case of FP- Growth Graph to 63.3% in case of GRG. On the other hand, the time decreases while the minimum support threshold increases and minimum confidence decreases due to the fact that the number of frequent itemsets is going down if the minimum support value going up and the number of rules is being lesser and lesser by increasing the value of minimum confidence threshold. Table 2 explains the number of association rules generated and the dimension of the generated associated rules using the previous two datasets. Table 2: The Number of Generated Rules and the Dimensionality of Rules using Chess and Mushroom datasets From Table 2, the proposed GBAR technique is the best among all other algorithms in terms of the number of generated rules and the average number of items per rule (dimensionality of the rules). The increase in number of generated rules in case of GRG is due to the big size of the association graph and to the huge amount of edges in it, the average number of confident association rules in case of GRG is 45 rules. FP-Growth Graph gives better results than RIOMining for the following reasons: (i) the graph size is increased because of using concept hierarchy in numbering of items in the database, and (ii) all ancestors of each item in a transaction are added to the transaction and then the FP-Growth Graph algorithm is applied on the extended transactions. These reasons lead to an increase of the number of edges in the association graph. FP-Growth Graph outperforms GRG since the numbering of items in GRG occurs level by level (from the highest concept level to the lowest concept level) and then each concept level is processed separately to generate large itemsets. The number of confident association rules increases as the minimum support threshold value decreases and the minimum confidence threshold increases. On the other hand, GBAR is the best comparing with all these comparable algorithms especially in the cases of high support values and small confidence values due the simplicity of the association graph construction in the case of primitive rules. With respect to the dimensionality of the confident rules generated measurement, GBAR is the best as it simplifies the rules as possible by reduction of the items participating in them by – in average – 43.9% comparing with the others due to absence of the concept hierarchy in case of GBAR.
  • 10. International Journal of Software Engineering & Applications (IJSEA), Vol.6, No.3, May 2015 10 REFERENCES [1] Tiwari, V., Gupta, S. & Tiwari, R. 2010. Association rule mining: A graph based approach for mining frequent itemsets. International Conference on Networking and Information Technology (ICNIT), 2010, pp. 309 – 313. [2] Ning, H. 2009. Rule-chain incremental mining algorithm based on directed graph. Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09, pp. 118 – 122. [3] Li, L. 2003. GRG: an efficient method for association rules mining on frequent closed itemsets. IEEE International Symposium on Intelligent Control, pp 854 – 859. [4] Orlando, S., Palmerini, P., Perego, R., Lucchese, C. & Silvestri, F. 2003. kDCI: a multi-strategy algorithm for mining frequent sets. In Proc. of the Int. Workshop on Frequent Itemset Mining Implementations in conjunction with ICDM'03, pp 1 – 10. [5] Cule, B. & Goethals, B. 2010. Mining association rules in long sequences. advances in knowledge discovery and data mining. Vol. 6118 of Lecture Notes in Computer Science. Springer, pp 300 – 309. [6] Margahny, M. H. & Mitwaly, A. A. 2005. Fast algorithm for mining association rules. AIML 05 Conference, 19-21 December 2005, CICC, Cairo, Egypt, pp 1 – 5. [7] Yen, S.-J. & Chen, A. L. P. 2001. A graph-based approach for discovering various types of association rules. IEEE transactions on knowledge and data engineering, vol. 13, no. 5, September/October 2001, pp. 839 – 845. [8] Vivek T., Vipin T., Shailendra G., & Renu T. 2010. Association rule mining: A graph based approach for mining frequent itemsets. Networking and Information Technology (ICNIT), 2010 International Conference, pp. 309 – 313. [9] Karam, O. H., Hamad, A. & Riad, W. 2004. Fast Efficient Association Rule Mining From Web Data. Fourth International ICSC Symposium on Engineering of Intelligent Systems (EIS 2004), Portugal February 29 – March 2, 2004, pp.764–770.