David C. Wyld, et al. (Eds): CCSEA, SEA, CLOUD, DKMP, CS & IT 05, pp. 343–351, 2012.
© CS & IT-CSCP 2012 DOI : 10.5121/csit.2012.2234
SubGraD- An Approach for Subgraph Detection
Akshara Pande¹, Vivekanand Pant², S. Nigam¹
¹ Human Resource Development Group, CSIR Complex, Library Avenue, Pusa, New Delhi, India.
pandeakshara@gmail.com, snigam@csirhrdg.res.in
² IBM, A-26, Sector-62, NOIDA, Gautam Buddha Nagar, Noida - 201307, India.
vickypantmnr@gmail.com
ABSTRACT
This paper introduces SubGraD, a new graph matching approach that efficiently solves the
problems of graph isomorphism and subgraph isomorphism by detecting a query graph in a
source graph. First, the model graph (query graph) is considered and the possible sets, called
model sets, are made starting from a chosen initial node, or starter. Similarly, for the source
graph (reference graph), all the possible sets, called reference sets, can be made. Our aim is to
make the reference set on the basis of the model set. If the reference set can be made, the query
graph is said to have been detected in the source graph.
KEYWORDS
model graph, reference graph, starter, model set, reference set
1. INTRODUCTION
Graphs are data structures that have proved to be an effective way of representing objects [1]. One
of the problems of interest with graphs is matching a model graph (query graph) against a
reference graph (source graph). Graph matching is the process of finding a correspondence
between the nodes and the edges of two graphs that satisfies some (more or less stringent)
constraints ensuring that similar substructures in one graph are mapped to similar substructures in
the other [2]. Many graph databases are growing rapidly in size. The growth is both in the number
of graphs and the sizes of graphs (the number of nodes and the number of edges). There is a
critical need for efficient and effective graph querying tools for querying and mining these
growing graph databases.
We introduce a new method called SubGraD, which is helpful in mining directed graphs. We first
take the model graph and make the model sets. Suppose that there are two nodes in the model
graph (query graph); the corresponding model set then has only two elements. We then break the
reference graph in such a way that each set in the reference set also has two elements. We assume
that the smallest model set is the one with two elements, corresponding to the two nodes of the
query graph. We do not consider graphs in which a self loop is present. In this paper, we try to
make the sets of the reference set by using the
344 Computer Science & Information Technology (CS & IT)
SubGraD algorithm. If the reference set can be made, the query graph is present in the reference
graph; otherwise it is not. Related work is discussed in Section 2. The SubGraD algorithm is
given in Section 3. Model sets are made in Section 4. In Section 5, we try to make the reference
sets in order to find out whether the query graph is present in the source graph.
2. RELATED WORK
Graph matching has been used intensively in applications such as social networks, road
networks, network topology, chemical structures, graph grammars, semantic networks and
pattern recognition. Some of the related work on graph mining in different fields is discussed
in this section.
Many graph matching methods have been proposed in fuzzy set theory. For example, Perchant
and Bloch [3-6] gave methods for graph matching using fuzzy set theory, and Hwan gave an
idea for subgraph matching [7]. Fuzzy attributed graph models have been suggested for very
different types of image representation, such as fingerprint verification [8].
Fernandez and Valiente proposed a method to represent attributed relational graphs, the
maximum common subgraph and the minimum common supergraph of two graphs by means of
simple structures [9]. Bunke [10] described a relationship between graph edit distance and the
maximum common subgraph: under a certain cost function, graph edit distance computation is
equivalent to solving the maximum common subgraph problem.
Cross [11] described an outline for performing relational graph matching using a genetic
algorithm. He showed that a Bayesian consistency measure can be optimized efficiently using a
hybrid genetic search process that includes a local search strategy based on a hill-climbing step.
This hill-climbing step accelerates convergence considerably. Cross extended this idea in [12],
which also gives a convergence analysis for the problem of attributed graph matching using
genetic search.
Myers and Hancock observed that attributed graph matching problems usually have more than
one valid and satisfactory solution, and they proposed a method to obtain different solutions at
the same time during the genetic search, using an appropriately modified genetic algorithm
[13-14].
Hancock and Kittler [15-16] used an iterative approach called probabilistic relaxation,
considering binary relations and assuming a Gaussian error. Later, a Bayesian perspective was
considered for both unary and binary attributes by Christmas [17], Gold and Rangarajan [18],
and Wilson and Hancock [19-20]. Williams [21] presented a comparative study of various
deterministic discrete search strategies for graph matching, based on the Bayesian consistency
measure of [19-20], in which Tabu search was proposed as a graph matching algorithm.
3. SUBGRAD ALGORITHM
Graph matching techniques are an important and very general form of pattern matching that finds
practical use in areas such as image processing, pattern recognition and computer vision, graph
grammars, graph transformation, biocomputing and search operations in chemical structural
formula databases. The SubGraD algorithm is one such graph matching method. In the SubGraD
method, an adjacency matrix [A] is drawn for the query graph (model graph). Element Aij of [A]
is 1 if an edge is present from node i to node j, and 0 otherwise. Using [A], the model set M is
prepared. The model set M consists of all sets (i, j) for which Aij is 1. On the basis of the model
set, we try to make the sets of the reference set.
SubGraD
  Make the adjacency matrix [A] of n×n, where n is the number of nodes in the query graph
  For i: 1 to n
    For j: 1 to n
      Aij = 1 if an edge is present from node i to node j, else 0
    END FOR j
  END FOR i
  Make Model Set M, which starts empty and has elements of the type (a, b) where Aab = 1
  For i: 1 to n
    For j: 1 to n
      If Aij = 1, add (i, j) to M
    END FOR j
  END FOR i
  Make the adjacency matrix [S] of m×m, where m is the number of nodes in the source graph
  For i: 1 to m
    For j: 1 to m
      Sij = 1 if an edge is present from node i to node j, else 0
    END FOR j
  END FOR i
  Try to make the reference set R corresponding to Model Set M
    Try to break adjacency matrix [S] into k sets R1, R2, …, Rk, where each set uses only
    entries Sij = 1 (row i and column j has entry 1) and has the same number of elements as M
    R = (R1, R2, …, Rk)
  If reference set R can be made, then the query graph is detected in the source graph
  Otherwise the query graph does not exist in the source graph
END SubGraD
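The first two steps of the pseudocode (building an adjacency matrix and collecting the pairs whose entry is 1) can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the function names `adjacency_matrix` and `edge_set`, and the 3-node example graph, are assumptions introduced here.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def adjacency_matrix(n: int, edges: List[Edge]) -> List[List[int]]:
    """Return the n x n 0/1 matrix with a 1 at row i, column j for each edge (i, j)."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i - 1][j - 1] = 1  # nodes are numbered from 1, as in the text
    return matrix


def edge_set(matrix: List[List[int]]) -> Set[Edge]:
    """Collect every pair (i, j) whose matrix entry is 1."""
    n = len(matrix)
    return {(i + 1, j + 1) for i in range(n) for j in range(n) if matrix[i][j] == 1}


# Hypothetical 3-node query graph with edges 1 -> 2 and 2 -> 3.
A = adjacency_matrix(3, [(1, 2), (2, 3)])
M = edge_set(A)
print(A)
print(sorted(M))
```

The same two helpers apply unchanged to the source graph, giving [S] and the pool of pairs from which the reference sets are drawn.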
4. MODEL SETS
Suppose we have the query graphs shown in Figure 1, and our purpose is to detect these query
graphs in a given source graph. The first step is to draw the adjacency matrices for the query
graphs. The first query graph (Figure 1(a)) has two nodes, a and b, so its adjacency matrix
[A] is 2×2. A single edge is present from a to b, so [A] has entry Aab equal to 1 and the
remaining elements (Aaa, Aba, Abb) equal to 0. Likewise, the adjacency matrices in Figures
2(b), 2(c) and 2(d) are drawn for the query graphs in Figures 1(b), 1(c) and 1(d) respectively.
The next step is to prepare the model sets from these adjacency matrices. The adjacency matrix of
Figure 1(a) has four elements, Aaa, Aab, Aba and Abb (Figure 2(a)). The model set corresponding to
this adjacency matrix is (a, b), since only the entry Aab is 1. Similarly, for the query graph of
Figure 1(b), the model set is (a, b) and (b, c), since two entries of its adjacency matrix
(Figure 2(b)) are 1. The model sets for the query graphs of Figure 1 are shown in Table 1.
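As a small check of the two model sets just described, the following Python sketch reads them off labelled adjacency matrices. The matrices are reconstructed from the edges stated in the text (a to b; a to b and b to c), not copied from Figure 2, and the helper name `model_set` is an assumption introduced for illustration.

```python
from typing import List, Tuple


def model_set(nodes: List[str], matrix: List[List[int]]) -> List[Tuple[str, str]]:
    """List (row label, column label) for every entry equal to 1, row by row."""
    size = len(nodes)
    return [
        (nodes[i], nodes[j])
        for i in range(size)
        for j in range(size)
        if matrix[i][j] == 1
    ]


# Query graph of Figure 1(a): nodes a, b and a single edge a -> b.
matrix_a = [
    [0, 1],
    [0, 0],
]

# Query graph of Figure 1(b): nodes a, b, c and edges a -> b, b -> c.
matrix_b = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]

model_set_1 = model_set(["a", "b"], matrix_a)
model_set_2 = model_set(["a", "b", "c"], matrix_b)
print(model_set_1)
print(model_set_2)
```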
Figure 1. Query graphs
Figure 2. Adjacency matrices for query graphs
Table 1. Model Sets for Query Graphs.
5. REFERENCE SETS
Let us consider the source graph in Figure 3 and try to make the reference set for it. First,
select the model set. We take MODEL SET I (from Table 1), in which only one edge is present,
from node a to node b. We apply the SubGraD algorithm, which selects the sets of row i and
column j for which the entry in the adjacency matrix [S] of the source graph (Figure 4) is 1.
That is, we try to make the reference set R, which consists of the sets R1, R2, …, Rn. In row 1
of [S], the entries in columns 2, 3, 4 and 6 are 1, so the reference set for row 1 is (1, 2),
(1, 3), (1, 4), (1, 6). Similarly, the reference set is (2, 4), (2, 6) for row 2; (3, 5) for
row 3; (4, 6) for row 4; (5, 1) for row 5; and (6, 5) for row 6. The reference set R I
(Table 2) corresponding to MODEL SET I has thus been found. Hence the query graph is detected
in the source graph.
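The row-by-row construction for MODEL SET I can be reproduced with a short Python sketch. The edge list below is read off the rows described in the text rather than taken from Figure 4 directly, and the names `SOURCE_EDGES`, `build_matrix` and `single_edge_reference_sets` are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

# Edges of the source graph, as listed row by row in the text.
SOURCE_EDGES: List[Edge] = [
    (1, 2), (1, 3), (1, 4), (1, 6),
    (2, 4), (2, 6),
    (3, 5),
    (4, 6),
    (5, 1),
    (6, 5),
]


def build_matrix(n: int, edges: List[Edge]) -> List[List[int]]:
    """Build the n x n adjacency matrix [S]; nodes are numbered from 1."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i - 1][j - 1] = 1
    return matrix


def single_edge_reference_sets(matrix: List[List[int]]) -> Dict[int, List[Edge]]:
    """For each row i, list every pair (i, j) whose entry is 1 (MODEL SET I)."""
    n = len(matrix)
    return {
        i + 1: [(i + 1, j + 1) for j in range(n) if matrix[i][j] == 1]
        for i in range(n)
    }


S = build_matrix(6, SOURCE_EDGES)
reference_set_1 = single_edge_reference_sets(S)
for row, pairs in reference_set_1.items():
    print(row, pairs)
```

Every row yields at least one pair, which is consistent with the single-edge query graph being detected at every node.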
Figure 3. Source graph
Figure 4. Adjacency matrix of Source graph
Now suppose we want to find out whether the query graph of Figure 1(b) is present. Take
MODEL SET II from Table 1, i.e. (a, b), (b, c). In MODEL SET II, element b appears twice. Such
an element is called a middle element: it is a node that has both an incoming edge and an
outgoing edge (here, node b has an incoming edge from node a and an outgoing edge to node c,
as in Figure 1(b)). From the source graph we have to make a reference set R = (R1, R2), where
R1 is a set (i, k) and R2 is a set (k, j). Here element k is the middle element. We start from
row 1 of the adjacency matrix of the source graph (Figure 4): S12, S13, S14 and S16 are equal
to 1, that is, the entries in row 1 for columns 2, 3, 4 and 6 are 1. We therefore take elements
2, 3, 4 and 6 as middle elements and consider rows 2, 3, 4 and 6. From the adjacency matrix [S]
(Figure 4), we look for the entries Sij that are 1 with i equal to 2, 3, 4 and 6. For row 2,
S24 and S26 are 1. Consequently S12 and S24 are both 1, so the reference set R gets the element
(1, 2)(2, 4), where 2 is the middle element. The other elements of the reference set are made
in the same way. The reference set for row 1 is (1, 2)(2, 4), (1, 2)(2, 6), (1, 3)(3, 5),
(1, 4)(4, 6), (1, 6)(6, 5). For row 2, the reference set is (2, 4)(4, 6), (2, 6)(6, 5). For
row 3, the reference set is (3, 5)(5, 1). For row 4, the reference set is (4, 6)(6, 5). For
row 5, the reference set is (5, 1)(1, 2), (5, 1)(1, 3), (5, 1)(1, 4), (5, 1)(1, 6). For row 6,
the reference set is (6, 5)(5, 1). For rows 1, 2, 3, 4, 5 and 6 we can make the reference set
for MODEL SET II, hence the query graph of Figure 1(b) exists in the source graph at all nodes.
Reference set II for MODEL SET II is shown in Table 2.
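The middle-element search for MODEL SET II amounts to listing, for each start row i, every pair of entries S[i][k] and S[k][j] that are both 1. A minimal Python sketch follows, assuming the source graph edges listed in the text; the function name `two_edge_reference_sets` and the requirement that the end node differ from the start node are assumptions of this illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

# Edges of the source graph, as listed row by row in the text.
SOURCE_EDGES: List[Edge] = [
    (1, 2), (1, 3), (1, 4), (1, 6), (2, 4), (2, 6),
    (3, 5), (4, 6), (5, 1), (6, 5),
]


def build_matrix(n: int, edges: List[Edge]) -> List[List[int]]:
    """Build the n x n adjacency matrix [S]; nodes are numbered from 1."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i - 1][j - 1] = 1
    return matrix


def two_edge_reference_sets(
    matrix: List[List[int]],
) -> Dict[int, List[Tuple[Edge, Edge]]]:
    """For each start node i, list ((i, k), (k, j)) with both entries equal to 1."""
    n = len(matrix)
    result: Dict[int, List[Tuple[Edge, Edge]]] = {}
    for i in range(n):
        found = []
        for k in range(n):  # k is the candidate middle element
            if matrix[i][k] != 1:
                continue
            for j in range(n):
                # Assumption: a, b, c must map to three distinct nodes, so j != i.
                if matrix[k][j] == 1 and j != i:
                    found.append(((i + 1, k + 1), (k + 1, j + 1)))
        result[i + 1] = found
    return result


S = build_matrix(6, SOURCE_EDGES)
reference_set_2 = two_edge_reference_sets(S)
for start, pairs in reference_set_2.items():
    print(start, pairs)
```

With these edges every start node has at least one entry, matching the row-by-row lists given in the text.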
Now suppose we want to find out whether the query graph of Figure 1(c) is present. Take
MODEL SET III from Table 1, i.e. (a, b)(b, c)(c, a). In MODEL SET III there are middle elements
b and c, and a cycle is formed. Cycle formation means that the starting node is also the ending
node. Note that the order of the sets is not important: (a, b)(b, c)(c, a) is the same as
(a, b)(c, a)(b, c). The former ordering makes the cycle explicit, however, so we write the sets
in that order. From the adjacency matrix [S], we can see that, starting from row 1, S13, S35
and S51 are 1, and so are S16, S65 and S51. The reference set for row 1 is therefore
(1, 3)(3, 5)(5, 1) and (1, 6)(6, 5)(5, 1). Hence, for node 1, the query graph has been detected
twice. In the first case, elements 1, 3 and 5 correspond to the elements a, b and c
respectively, which form a cycle. A null entry in Table 2 denotes that the query graph could
not be detected for that particular node. It can be seen from Table 2 that node 2 has no
reference set for MODEL SET III. Hence the query graph of Figure 1(c) is not present at node 2
of the source graph. Similarly, this query graph cannot be detected at node 4 of the source
graph.
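The cycle search for MODEL SET III can be sketched in the same style: for each start node a, look for distinct nodes b and c such that S[a][b], S[b][c] and S[c][a] are all 1. The Python below is a minimal illustration under the edge list given in the text; the name `three_cycle_reference_sets` is an assumption, and an empty list plays the role of the null entry in Table 2.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]

# Edges of the source graph, as listed row by row in the text.
SOURCE_EDGES: List[Edge] = [
    (1, 2), (1, 3), (1, 4), (1, 6), (2, 4), (2, 6),
    (3, 5), (4, 6), (5, 1), (6, 5),
]


def build_matrix(n: int, edges: List[Edge]) -> List[List[int]]:
    """Build the n x n adjacency matrix [S]; nodes are numbered from 1."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i - 1][j - 1] = 1
    return matrix


def three_cycle_reference_sets(
    matrix: List[List[int]],
) -> Dict[int, List[Tuple[Edge, Edge, Edge]]]:
    """For each start node a, list ((a, b), (b, c), (c, a)) with all entries 1."""
    n = len(matrix)
    result: Dict[int, List[Tuple[Edge, Edge, Edge]]] = {}
    for a in range(n):
        found = []
        for b in range(n):
            if b == a or matrix[a][b] != 1:
                continue
            for c in range(n):
                if c in (a, b):
                    continue
                # The cycle closes only if c has an edge back to the start node.
                if matrix[b][c] == 1 and matrix[c][a] == 1:
                    found.append(((a + 1, b + 1), (b + 1, c + 1), (c + 1, a + 1)))
        result[a + 1] = found
    return result


S = build_matrix(6, SOURCE_EDGES)
reference_set_3 = three_cycle_reference_sets(S)
for start, cycles in reference_set_3.items():
    print(start, cycles)
```

For node 1 the sketch reports the two cycles named in the text, and for nodes 2 and 4 it reports none.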
For the query graph of Figure 1(d), a cycle is again formed, but here the model set M consists
of four sets. We therefore try to make a reference set R with R1, R2, R3 and R4 such that they
form a cycle. For example, from the adjacency matrix [S] it can be seen that S46, S65, S51 and
S14 are 1, and they form a cycle. Hence the query graph is detected at node 4. But at node 2
and node 3 it is not present. Hence, if we are not able to make the reference set for a
particular query graph, the query graph is not detected in the source graph.
Table 2. Reference sets for source graph.
6. CONCLUSION
In this paper we have introduced SubGraD, an efficient approach for graph matching. We first
make the model set M for the query graph with the help of its adjacency matrix and then,
corresponding to that model set, we try to make the reference set R for the source graph. If the
reference set can be made, the query graph is detected in the source graph; otherwise it does not
exist in the source graph.
ACKNOWLEDGEMENTS
I am very thankful to the Human Resource Development Group, CSIR, for providing me with a
research internship to carry out this work.
REFERENCES
[1] A. Eshera and K.-S. Fu, (1986) “An image understanding using attributed symbolic representation
and inexact graph matching”, IEEE Transactions on Pattern Analysis and Machine Intelligence,
8(5):604–618.
[2] D Conte, P Foggia, C Sansone, M Vento (2004) “Thirty years of graph matching in pattern
recognition”, Int. Journal of Pattern Recognition and Artificial Intelligence 18, 265–298.
[3] Perchant and I. Bloch. (1999) “A new definition for fuzzy attributed graph homomorphism with
application to structural shape recognition in brain imaging”, In IMTC’99, 16th IEEE Instrumentation
and Measurement Technology Conference, pages 1801–1806, Venice, Italy.
[4] Perchant and I. Bloch. (2000) “Graph fuzzy homomorphism interpreted as fuzzy association graphs”,
In Proceedings of the International Conference on Pattern Recognition, ICPR 2000, volume 2, pages
1046–1049, Barcelona, Spain.
[5] Perchant and I. Bloch. (2000) “Semantic spatial fuzzy attribute design for graph modelling”, In 8th
International Conference on Information Processing and Management of Uncertainty in Knowledge
based Systems IPMU 2000, volume 3, pages 1397–1404, Madrid, Spain.
[6] Perchant and I. Bloch. (2002) “Fuzzy morphisms between graphs”, Fuzzy Sets and Systems, 128 (2)
pages 149–168.
[7] Sung Hwan. (2001) “Content-based image retrieval using fuzzy multiple attribute relational graph”,
IEEE International Symposium on Industrial Electronics Proceedings (ISIE 2001), 3:1508–1513.
[8] Fan, C. Liu, and Y. Wang. (2000) “A randomized approach with geometric constraints to fingerprint
verification”, Pattern Recognition, 33(11):1793–1803.
[9] Fernandez and G. Valiente. (2001) “A graph distance metric combining maximum common subgraph
and minimum common supergraph”, Pattern Recognition Letters, 22(6-7):753– 758.
[10] Bunke. (1997) “On a relation between graph edit distance and maximum common subgraph” Pattern
Recognition Letters, 18(8):689–694.
[11] D. J. Cross, R. C. Wilson, and E. R. Hancock. (1997) “Inexact graph matching using genetic search”,
Pattern Recognition, 30(6):953–970.
[12] D. J. Cross, R. Myers, and E. R. Hancock. (2000) “Convergence of a hill-climbing genetic algorithm
for graph matching”, Pattern Recognition, 33(11):1863–1880.
[13] R. Myers and E. R. Hancock. (2000) “Genetic algorithms for ambiguous labelling problems”, Pattern
Recognition, 33(4):685–704.
[14] R. Myers and E. R. Hancock. (2001) “Least-commitment graph matching with genetic algorithms”,
Pattern Recognition, 34(2):375–394.
[15] R. Hancock and J. Kittler. (1990) “Edge-labeling using dictionary-based relaxation”, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 12(2):165–181.
[16] Kittler, W. J. Christmas, and M. Petrou. (1993) “Probabilistic relaxation for matching problems in
computer vision”, IEEE Proceedings of the International Conference on Computer Vision (ICCV93),
pages 666–673.
[17] J. Christmas, J. Kittler, and M. Petrou. (1995) “Structural matching in computer vision using
probabilistic relaxation” IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8):749–
64.
[18] Gold and A. Rangarajan. (1996) “A graduated assignment algorithm for graph matching”, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 18(4):377–88.
[19] C. Wilson and E. R. Hancock. (1996) “Bayesian compatibility model for graph matching”, Pattern
Recognition Letters, 17:263–276.
[20] R. C. Wilson and E. R. Hancock. (1997) “Structural matching by discrete relaxation”, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 19(6):634–698.
[21] Williams, R. C. Wilson, and E. R. Hancock. (1999) “Deterministic search for relational graph
matching”, Pattern Recognition, 32(7):1255–1271.
Authors
Ms. Akshara Pande is a Research Intern in HRDG, CSIR. She has done M.Sc. in
Computer Science from J.K. Institute of Applied Physics & Technology, University of
Allahabad, Allahabad. Her research area is Design Pattern Detection and Subgraph
Detection.
Mr. Vivekanand Pant is a Senior Software Engineer at IBM Noida. He has eight years of
industry experience. He has a BTech in Information Technology from Motilal Nehru
National Institute of Technology, Allahabad. He is currently working in the software
development area.
Mr. Shailendra Nigam holds a Masters degree in Electronics Science and is currently
working as a Sr. Principal Scientist in HRDG, CSIR. His research interests include DSP,
Text to Speech Synthesis, PC Based System Design and Workflow Automation. He holds
a patent and has filed many software copyrights.

SubGraD- An Approach for Subgraph Detection

  • 1. David C. Wyld, et al. (Eds): CCSEA, SEA, CLOUD, DKMP, CS & IT 05, pp. 343–351, 2012. © CS & IT-CSCP 2012 DOI : 10.5121/csit.2012.2234 SubGraD- An Approach for Subgraph Detection Akshara Pande1 , Vivekanand Pant2, S. Nigam1 1 Human Resource Development Group, CSIR Complex, Library Avenue, Pusa, New Delhi, India. pandeakshara@gmail.com, snigam@csirhrdg.res.in 2 IBM, A-26, Sector- 62, NOIDA, Gautam Buddha Nagar, Noida - 201307, India. vickypantmnr@gmail.com ABSTRACT A new approach of graph matching is introduced in this paper, which efficiently solves the problem of graph isomorphism and subgraph isomorphism. In this paper we are introducing a new approach called SubGraD, for query graph detection in source graph. Firstly consider the model graph (query graph) and make the possible sets called model sets starting from the chosen initial node or starter. Similarly, for the source graph (reference graph), all the possible sets called reference sets could be made. Our aim is to make the reference set on the basis of the model set. If it is possible to make the reference set, then it is said that query graph has been detected in the source graph. KEYWORDS model graph, reference graph, starter, model set, reference set 1. INTRODUCTION Graphs are data structures which are proved to be effective way of representing objects [1]. One of the problems of interest, with graphs, is matching a model graph (query graph) against a reference graph (source graph). Graph matching is the process of finding a correspondence between the nodes and the edges of two graphs that satisfies some (more or less stringent) constraints ensuring that similar substructures in one graph are mapped to similar substructures in the other [2]. Many graph databases are growing rapidly in size. The growth is both in the number of graphs and the sizes of graphs (the number of nodes and the number of edges). 
There is a critical need for efficient and effective graph querying tools for querying and mining these growing graph databases. We are introducing a new method called SubGraD, which is helpful in graph (directed graph) mining. Here we take the model graph first and make the model sets. Suppose that there are two nodes in the model graph (query graph), and then corresponding model set will have only two elements. Then we break the reference graph in such a way that each set of reference set will have sets of two elements. We are assuming that the smallest model set is the set which has two elements corresponding to two nodes of the query graph. We are not considering the case where the self loop is present. In this paper, we are trying to make sets for reference set by using the
  • 2. 344 Computer Science & Information Technology (CS & IT) SubGraD algorithm. If reference set could be made, then query graph is present in reference graph otherwise not. Related works are discussed in section 2. In section 3, SubGraD algorithm is given. Model sets have been made in section 4. And in section 5, it is tried to make Reference sets so that we could find out whether the query graph is present on the source graph or not. 2. RELATED WORK Graph matching is the concept that has been intensively used in various applications such as in the field of social networks, road networks, network topology, chemical structures, graph grammars and semantic networks or pattern recognition. Some of the related works associated with different fields for graph mining is discussed in this section. In Fuzzy set theory, there are so many works proposed for graph matching, for example Perchant and I. Bloch [3-6] gave methods for graph matching using fuzzy set theory , and Hwan gave an idea for sub-graph matching [7]. Fuzzy attributed graph models are suggested for very different image type representations, such as fingerprint verification [8]. Fernandez and Valiente proposed a method to represent attributed relational graphs, the maximum common subgraph and the minimum common super graph of two graphs by means of simple structures [9]. Bunke [10] suggested the relationship between graph edit distance and the maximum common subgraph, graph edit distance computation is equivalent to solving the maximum common subgraph problem. Cross [11] described an outline for performing relational graph matching using genetic algorithm. He showed that Bayesian consistency measure could be proficiently optimized using a hybrid genetic search process that includes a local search strategy using a hill-climbing step. This hill- climbing step accelerates convergence considerably. 
Cross et al. extended this idea in [12], which also gives a convergence analysis for the problem of attributed graph matching using genetic search. Myers and Hancock observed that attributed graph matching problems usually have more than one valid and satisfactory solution, and they proposed a method to obtain different solutions at the same time during the genetic search, using an appropriately modified genetic algorithm [13-14].

Hancock and Kittler [15-16] used an iterative approach called probabilistic relaxation, considering binary relations and assuming a Gaussian error. Later, a Bayesian perspective was considered for both unary and binary attributes by Christmas [17], Gold and Rangarajan [18], and Wilson and Hancock [19-20]. Williams [21] presented a comparative study of various deterministic discrete search strategies for graph matching, based on the Bayesian consistency measure of [19-20], and proposed Tabu search as a graph matching algorithm.

3. SUBGRAD ALGORITHM

Graph matching techniques are an important and very general form of pattern matching that finds practical use in areas such as image processing, pattern recognition and computer vision, graph grammars, graph transformation, bio computing, and search operations in chemical structural formula databases. The SubGraD algorithm is one such graph matching method. In the SubGraD method, an adjacency matrix [A] is drawn for the query graph (model graph). Element Aij of [A] is 1 if an edge is present from node i to node j, else 0. Using [A], the model set M is prepared. Model set M is a set consisting of all sets (i, j) for which Aij is 1. On the basis of the model set, we try to make the sets of the reference set.

SubGraD
    For i: 1 to number of nodes (n) in query graph
        For j: 1 to number of nodes (n) in query graph
            Make the adjacency matrix [A] of n×n
        END FOR j
    END FOR i
    Make Model Set M; M has elements of the type (a1, b1), (a1, b2), ..., (an, bn) where Aab = 1
    For i: 1 to n
        For j: 1 to n
            Add (i, j) to M, where Aij = 1
        END FOR j
    END FOR i
    For i: 1 to number of nodes (m) in source graph
        For j: 1 to number of nodes (m) in source graph
            Make the adjacency matrix [S] of m×m
        END FOR j
    END FOR i
    Try to make the reference set R corresponding to Model Set M
    Try to break adjacency matrix [S] into k sets R1, R2, ..., Rk, where Sij = 1 (row i and column j has entry 1) and each Rk has the same number of elements as M has
    R = (R1, R2, ..., Rk)
    If reference set R can be made, then query graph is detected in source graph
    Otherwise query graph doesn't exist
END SubGraD

4. MODEL SETS

Suppose we have the query graphs shown in Figure 1, and our purpose is to detect these query graphs in a given source graph. The first step is to draw the adjacency matrices for the query graphs. In the first query graph (Figure 1(a)), there are two nodes a and b, hence the adjacency matrix [A] is 2×2. A single edge is present from a to b, so the adjacency matrix [A] has the entry Aab equal to 1 and the rest of the elements (Aaa, Aba, Abb) equal to 0. Likewise, the adjacency matrices of Figure 2(b), 2(c) and 2(d) are drawn for the query graphs of Figure 1(b), 1(c) and 1(d) respectively.
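As a minimal sketch of this first step (not the authors' implementation), the adjacency matrix of the two-node query graph of Figure 1(a) can be built in Python from a list of directed edges. The function name `adjacency_matrix` and the edge-list input format are assumptions made for illustration; the self-loop check reflects the restriction stated in the introduction.

```python
def adjacency_matrix(nodes, edges):
    """Return the adjacency matrix of a directed graph as a list of rows.

    Entry [i][j] is 1 if an edge runs from nodes[i] to nodes[j], else 0.
    """
    index = {node: k for k, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        if src == dst:
            # The paper does not consider self loops, so reject them here.
            raise ValueError("self loops are not considered")
        matrix[index[src]][index[dst]] = 1
    return matrix


# Query graph of Figure 1(a): a single edge from a to b.
A = adjacency_matrix(["a", "b"], [("a", "b")])
print(A)  # [[0, 1], [0, 0]]
```

Only the entry for row a, column b is 1, which matches the description of Aab above.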
The next step is to prepare the model sets using these adjacency matrices. The adjacency matrix of Figure 1(a) has four elements Aaa, Aab, Aba, Abb (Figure 2(a)). The model set corresponding to this adjacency matrix is (a, b), since only the entry Aab is 1. Similarly, for the query graph of Figure 1(b), the model set is (a, b) and (b, c), since there are two entries equal to 1 in its adjacency matrix (Figure 2(b)). The model sets for the query graphs of Figure 1 are shown in Table 1.

Figure 1. Query graphs

Figure 2. Adjacency matrices for query graphs
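The step from adjacency matrix to model set can be sketched as follows. This is an illustrative Python version under stated assumptions: the helper name `model_set` is hypothetical, and the matrices are written out from the model sets named in the text (Figure 2 itself is not reproduced here), with the matrix for Figure 1(c) inferred from MODEL SET III.

```python
def model_set(matrix, labels):
    """List every ordered pair (i, j) whose matrix entry is 1, row by row."""
    return [
        (labels[i], labels[j])
        for i, row in enumerate(matrix)
        for j, entry in enumerate(row)
        if entry == 1
    ]


# Adjacency matrices for the query graphs of Figure 1(a), 1(b) and 1(c).
A_a = [[0, 1],
       [0, 0]]
A_b = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
A_c = [[0, 1, 0],
       [0, 0, 1],
       [1, 0, 0]]

print(model_set(A_a, "ab"))   # [('a', 'b')]
print(model_set(A_b, "abc"))  # [('a', 'b'), ('b', 'c')]
print(model_set(A_c, "abc"))  # [('a', 'b'), ('b', 'c'), ('c', 'a')]
```

Scanning the matrix row by row happens to list the pairs of the cyclic query graph in cycle order, the ordering the paper prefers for cyclic model sets.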
Table 1. Model Sets for Query Graphs.

5. REFERENCE SETS

Consider the source graph in Figure 3; we now try to make the reference set for it. First, select a model set. We take MODEL SET I (from Table 1), in which a single edge is present from node a to node b. Applying the SubGraD algorithm, we select every pair of row i and column j whose entry is 1 in the adjacency matrix [S] of the source graph (Figure 4). That is, we try to make the reference set R, which consists of the sets R1, R2, ..., Rn. In the source graph, row 1 has entries equal to 1 in columns 2, 3, 4 and 6, so the reference set for row 1 is (1, 2), (1, 3), (1, 4), (1, 6). Similarly, the reference set is (2, 4), (2, 6) for row 2, (3, 5) for row 3, (4, 6) for row 4, (5, 1) for row 5 and (6, 5) for row 6. Corresponding to MODEL SET I, the Reference set R I (Table 2) has been found. Hence the query graph is detected in the source graph.

Figure 3. Source graph

Figure 4. Adjacency matrix of Source graph
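The row-by-row construction for MODEL SET I can be sketched in Python as below. The matrix `S` is transcribed from the rows listed in the text, since Figure 4 is not reproduced here; the function name `reference_set_single_edge` is a hypothetical helper, not part of the paper.

```python
# Adjacency matrix [S] of the source graph, transcribed from the text:
# row 1 -> columns 2, 3, 4, 6; row 2 -> 4, 6; row 3 -> 5;
# row 4 -> 6; row 5 -> 1; row 6 -> 5.
S = [
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]


def reference_set_single_edge(matrix):
    """Map each 1-based row number i to the pairs (i, j) with S_ij = 1."""
    return {
        i: [(i, j) for j, entry in enumerate(row, start=1) if entry == 1]
        for i, row in enumerate(matrix, start=1)
    }


R_I = reference_set_single_edge(S)
for row, pairs in R_I.items():
    print(row, pairs)

# The single-edge query graph is detected if at least one pair exists.
detected = any(R_I.values())
print("detected:", detected)  # detected: True
```

Each printed row reproduces the reference set given for that row in the paragraph above.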
Now suppose we want to find out whether the query graph of Figure 1(b) is present. Take MODEL SET II from Table 1, i.e. (a, b), (b, c). In MODEL SET II, element b appears twice. Such an element is called a middle element: it is a node that has an incoming edge as well as an outgoing edge (in our case, node b has an incoming edge from node a and an outgoing edge to node c in Figure 1(b)). From the source graph we have to make a reference set R = (R1, R2), where R1 is a set (i, k) and R2 is a set (k, j). Here element k is the middle element.

We start from row 1 of the source graph (Figure 4). S12, S13, S14 and S16 are equal to 1, which means that row 1 has entry 1 in columns 2, 3, 4 and 6. So next we take elements 2, 3, 4 and 6 as middle elements and consider rows 2, 3, 4 and 6. From the adjacency matrix [S] (Figure 4), we check which entries Sij are 1 for i equal to 2, 3, 4 and 6. For row 2, the elements S24 and S26 are 1. Consequently, S12 and S24 are both 1, so the reference set R gets the element (1, 2), (2, 4), where 2 is the middle element. The other elements of the reference set are made similarly.

So the reference set for row 1 is (1, 2)(2, 4), (1, 2)(2, 6), (1, 3)(3, 5), (1, 4)(4, 6), (1, 6)(6, 5). For row 2, the reference set is (2, 4)(4, 6), (2, 6)(6, 5). For row 3, the reference set is (3, 5)(5, 1). For row 4, the reference set is (4, 6)(6, 5). For row 5, the reference set is (5, 1)(1, 2), (5, 1)(1, 3), (5, 1)(1, 4), (5, 1)(1, 6). For row 6, the reference set is (6, 5)(5, 1). A reference set for MODEL SET II can be made for rows 1, 2, 3, 4, 5 and 6, hence the query graph of Figure 1(b) exists in the source graph at all nodes. Reference set II for MODEL SET II is shown in Table 2.
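The middle-element search for MODEL SET II amounts to chaining two edges through a shared node. The Python sketch below illustrates this under stated assumptions: the matrix `S` is transcribed from the rows listed in the text, the helper names `successors` and `two_edge_chains` are hypothetical, and the requirement that the end node differ from the start node is an added assumption (the paper does not state it; it has no effect on this source graph).

```python
# Adjacency matrix [S] of the source graph, transcribed from the text.
S = [
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]


def successors(matrix, i):
    """Return the 1-based columns j with S_ij = 1 for the 1-based row i."""
    return [j for j, entry in enumerate(matrix[i - 1], start=1) if entry == 1]


def two_edge_chains(matrix, start):
    """Return all pairs of sets ((i, k), (k, j)) with middle element k."""
    chains = []
    for k in successors(matrix, start):      # candidate middle elements
        for j in successors(matrix, k):      # outgoing edges of the middle
            if j != start:                   # assumption: a and c are distinct
                chains.append(((start, k), (k, j)))
    return chains


R_II = {i: two_edge_chains(S, i) for i in range(1, len(S) + 1)}
print(R_II[1])
# [((1, 2), (2, 4)), ((1, 2), (2, 6)), ((1, 3), (3, 5)),
#  ((1, 4), (4, 6)), ((1, 6), (6, 5))]
print(all(R_II.values()))  # True: a chain exists at every node
```

The output for row 1 lists the same five elements as the text, and every row yields at least one chain, in line with the statement that the query graph exists at all nodes.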
Now suppose we want to find out whether the query graph of Figure 1(c) is present. Take MODEL SET III from Table 1, i.e. (a, b)(b, c)(c, a). In MODEL SET III, there are middle elements b and c, and a cycle is formed. Cycle formation means that the starting node is also the ending node. Note that the order of the sets is not important: (a, b)(b, c)(c, a) is the same as (a, b)(c, a)(b, c). The former ordering, however, follows the definition of cycle formation, so we prefer to write the sets in that order.

From the adjacency matrix [S], we see that, starting from row 1, the entries S13, S35, S51 and S16, S65, S51 are 1. The reference set for row 1 is therefore (1, 3)(3, 5)(5, 1) and (1, 6)(6, 5)(5, 1). Hence, for node 1, the query graph has been detected twice. In the first of these, elements 1, 3 and 5 correspond to the elements a, b and c respectively, which form a cycle. A null entry in Table 2 denotes that the query graph could not be detected for that particular node. As Table 2 shows, there is no reference set for MODEL SET III at node 2. Hence the query graph of Figure 1(c) is not present at node 2 of the source graph. Similarly, this query graph cannot be detected at node 4 of the source graph.

For the query graph of Figure 1(d), a cycle is formed again, but here the model set M consists of four sets. So we try to make a reference set R that has R1, R2, R3 and R4 such that they form a cycle. For example, from the adjacency matrix [S], it can be seen that S46, S65, S51 and S14 are 1, and they form a cycle. Hence the query graph is detected at node 4. But at node 2 and node 3 it is not present. Hence, if we are not able to make the reference set for a particular query graph, then the query graph cannot be detected in the source graph.
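The cycle search can be sketched with a short depth-first enumeration in Python. This is only an illustration, not the authors' procedure: the matrix `S` is transcribed from the rows listed in Section 5, the helper names `successors` and `cycles_from` are hypothetical, and source nodes are assumed not to repeat within a cycle. Figure 1(d) is not reproduced here, so treating it as a simple cycle of four edges is also an assumption; results for that case may differ from Table 2 if the figure has a different shape.

```python
# Adjacency matrix [S] of the source graph, transcribed from the text.
S = [
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]


def successors(matrix, i):
    """Return the 1-based columns j with S_ij = 1 for the 1-based row i."""
    return [j for j, entry in enumerate(matrix[i - 1], start=1) if entry == 1]


def cycles_from(matrix, start, length):
    """Return every cycle with `length` edges that starts and ends at `start`.

    Each cycle is a list of sets (i, j) written in cycle order.
    """
    found = []

    def extend(path):
        current = path[-1]
        for nxt in successors(matrix, current):
            if len(path) == length:
                # All nodes are chosen; only the closing edge is missing.
                if nxt == start:
                    edges = [(path[k], path[k + 1]) for k in range(length - 1)]
                    found.append(edges + [(current, start)])
            elif nxt not in path:
                extend(path + [nxt])

    extend([start])
    return found


print(cycles_from(S, 1, 3))
# [[(1, 3), (3, 5), (5, 1)], [(1, 6), (6, 5), (5, 1)]]
print(cycles_from(S, 2, 3))  # []
print(cycles_from(S, 4, 3))  # []
print(cycles_from(S, 4, 4))  # [[(4, 6), (6, 5), (5, 1), (1, 4)]]
```

For MODEL SET III, the two cycles found at node 1 and the empty results at nodes 2 and 4 agree with the text, and the four-edge search at node 4 returns the cycle S46, S65, S51, S14 given as the example.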
Table 2. Reference sets for source graph.

6. CONCLUSION

In this paper we have introduced SubGraD, an efficient approach for graph matching. We first make the model set M for the query graph with the help of its adjacency matrix, and then, corresponding to that model set, we try to make the reference set R for the source graph. If the reference set can be made, then the query graph is detected in the source graph; otherwise it does not exist in the source graph.

ACKNOWLEDGEMENTS

I am very thankful to the Human Resource Development Group, CSIR, for providing me with a research internship to carry out this work.

REFERENCES

[1] A. Eshera and K.-S. Fu, (1986) “An image understanding using attributed symbolic representation and inexact graph matching”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(5):604–618.
[2] D. Conte, P. Foggia, C. Sansone, and M. Vento, (2004) “Thirty years of graph matching in pattern recognition”, Int. Journal of Pattern Recognition and Artificial Intelligence, 18:265–298.
[3] Perchant and I. Bloch, (1999) “A new definition for fuzzy attributed graph homomorphism with application to structural shape recognition in brain imaging”, In IMTC’99, 16th IEEE Instrumentation and Measurement Technology Conference, pages 1801–1806, Venice, Italy.
[4] Perchant and I. Bloch, (2000) “Graph fuzzy homomorphism interpreted as fuzzy association graphs”, In Proceedings of the International Conference on Pattern Recognition, ICPR 2000, volume 2, pages 1046–1049, Barcelona, Spain.
[5] Perchant and I. Bloch, (2000) “Semantic spatial fuzzy attribute design for graph modelling”, In 8th International Conference on Information Processing and Management of Uncertainty in Knowledge based Systems IPMU 2000, volume 3, pages 1397–1404, Madrid, Spain.
[6] Perchant and I. Bloch, (2002) “Fuzzy morphisms between graphs”, Fuzzy Sets and Systems, 128(2):149–168.
[7] Sung Hwan, (2001) “Content-based image retrieval using fuzzy multiple attribute relational graph”, IEEE International Symposium on Industrial Electronics Proceedings (ISIE 2001), 3:1508–1513.
[8] Fan, C. Liu, and Y. Wang, (2000) “A randomized approach with geometric constraints to fingerprint verification”, Pattern Recognition, 33(11):1793–1803.
[9] Fernandez and G. Valiente, (2001) “A graph distance metric combining maximum common subgraph and minimum common supergraph”, Pattern Recognition Letters, 22(6-7):753–758.
[10] Bunke, (1997) “On a relation between graph edit distance and maximum common subgraph”, Pattern Recognition Letters, 18(8):689–694.
[11] D. J. Cross, R. C. Wilson, and E. R. Hancock, (1997) “Inexact graph matching using genetic search”, Pattern Recognition, 30(6):953–970.
[12] D. J. Cross, R. Myers, and E. R. Hancock, (2000) “Convergence of a hill-climbing genetic algorithm for graph matching”, Pattern Recognition, 33(11):1863–1880.
[13] R. Myers and E. R. Hancock, (2000) “Genetic algorithms for ambiguous labelling problems”, Pattern Recognition, 33(4):685–704.
[14] R. Myers and E. R. Hancock, (2001) “Least-commitment graph matching with genetic algorithms”, Pattern Recognition, 34(2):375–394.
[15] E. R. Hancock and J. Kittler, (1990) “Edge-labeling using dictionary-based relaxation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(2):165–181.
[16] J. Kittler, W. J. Christmas, and M. Petrou, (1993) “Probabilistic relaxation for matching problems in computer vision”, IEEE Proceedings of the International Conference on Computer Vision (ICCV93), pages 666–673.
[17] W. J. Christmas, J. Kittler, and M. Petrou, (1995) “Structural matching in computer vision using probabilistic relaxation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8):749–64.
[18] Gold and A. Rangarajan, (1996) “A graduated assignment algorithm for graph matching”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(4):377–88.
[19] R. C. Wilson and E. R. Hancock, (1996) “Bayesian compatibility model for graph matching”, Pattern Recognition Letters, 17:263–276.
[20] R. C. Wilson and E. R. Hancock, (1997) “Structural matching by discrete relaxation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6):634–698.
[21] Williams, R. C. Wilson, and E. R. Hancock, (1999) “Deterministic search for relational graph matching”, Pattern Recognition, 32(7):1255–1271.
Authors

Ms. Akshara Pande is a Research Intern in HRDG, CSIR. She holds an M.Sc. in Computer Science from J.K. Institute of Applied Physics & Technology, University of Allahabad, Allahabad. Her research areas are Design Pattern Detection and Subgraph Detection.

Mr. Vivekanand Pant is a Senior Software Engineer at IBM Noida. He has eight years of industry experience. He holds a BTech in Information Technology from Motilal Nehru National Institute of Technology, Allahabad. He is currently working in the software development area.

Mr. Shailendra Nigam holds a Masters degree in Electronics Science and is currently working as a Sr. Principal Scientist in HRDG, CSIR. His research interests include DSP, Text to Speech Synthesis, PC Based System Design and Workflow Automation. He holds a patent and has filed many software copyrights.