ECWAY TECHNOLOGIES
IEEE PROJECTS & SOFTWARE DEVELOPMENTS
OUR OFFICES @ CHENNAI / TRICHY / KARUR / ERODE / MADURAI / SALEM / COIMBATORE
CELL: +91 98949 17187, +91 875487 2111 / 3111 / 4111 / 5111 / 6111
VISIT: www.ecwayprojects.com MAIL TO: ecwaytechnologies@gmail.com

CLUSTERING LARGE PROBABILISTIC GRAPHS

ABSTRACT:

We study the problem of clustering probabilistic graphs. Similar to the problem of clustering
standard graphs, probabilistic graph clustering has numerous applications, such as finding
complexes in probabilistic protein-protein interaction (PPI) networks and discovering groups of
users in affiliation networks.

We extend the edit-distance-based definition of graph clustering to probabilistic graphs. We
establish a connection between our objective function and correlation clustering to propose
practical approximation algorithms for our problem. A benefit of our approach is that our
objective function is parameter-free. Therefore, the number of clusters is part of the output.
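
As an illustration of the idea above, here is a minimal Python sketch. It is not the authors' implementation. It assumes that every edge exists independently with a known probability, so the expected edit distance between the probabilistic graph and a clustering is the sum of (1 - p) over pairs placed in the same cluster plus the sum of p over pairs placed in different clusters. The clustering step uses a generic random-pivot heuristic from the correlation clustering literature, swapped in here for illustration. The names expected_edit_cost and pivot_clustering and the toy probabilities are hypothetical.

```python
import random
from itertools import combinations


def expected_edit_cost(nodes, prob, clusters):
    """Expected number of edge edits needed to turn a sampled graph into the
    given clustering (a set of disjoint cliques), assuming independent edges.

    prob maps frozenset({u, v}) to the edge probability; missing pairs count as 0.
    """
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cost = 0.0
    for u, v in combinations(nodes, 2):
        p = prob.get(frozenset((u, v)), 0.0)
        # Same cluster: the edge must exist, so we pay when it is absent (1 - p).
        # Different clusters: the edge must not exist, so we pay when it is present (p).
        cost += (1.0 - p) if label[u] == label[v] else p
    return cost


def pivot_clustering(nodes, prob, seed=0):
    """Random-pivot heuristic: pick an unclustered node, group it with every
    unclustered node whose edge to it has probability above 1/2, and repeat.

    No cluster count is supplied; it emerges from the data.
    """
    rng = random.Random(seed)
    order = list(nodes)
    rng.shuffle(order)
    unclustered = set(nodes)
    clusters = []
    for pivot in order:
        if pivot not in unclustered:
            continue
        cluster = {pivot} | {
            v for v in unclustered
            if v != pivot and prob.get(frozenset((pivot, v)), 0.0) > 0.5
        }
        clusters.append(cluster)
        unclustered -= cluster
    return clusters


# Toy probabilistic graph (made-up numbers, for illustration only).
nodes = ["a", "b", "c", "d", "e"]
prob = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.8,
    frozenset(("b", "c")): 0.85,
    frozenset(("d", "e")): 0.9,
    frozenset(("c", "d")): 0.1,
}

clusters = pivot_clustering(nodes, prob, seed=1)
print(sorted(sorted(c) for c in clusters))
print(round(expected_edit_cost(nodes, prob, clusters), 2))
```

On this toy input the heuristic returns two clusters without being told how many to look for, which is the sense in which a parameter-free objective makes the number of clusters part of the output.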

We develop methods for testing the statistical significance of the output clustering and study the
case of noisy clusterings. Using a real protein-protein interaction network and ground-truth data,
we show that our methods discover the correct number of clusters and identify established
protein relationships. Finally, we show the practicality of our techniques using a large social
network of Yahoo! users consisting of one billion edges.

