Graph-based Analysis and
Opinion Mining in Social
Network
- Opinion about an entity
- Groups of entities

Khan Mostafa
Graduate Student (Computer Science)
Stony Brook University
positive
negative
objective

entity

keywords
subjective vs. objective
DT, -0.016
CD, -0.033
NNP, -0.052
FW, -0.060
USR, -0.072
SYM, -0.081
JJS, -0.085
WP, -0.098
URL, -0.103
RBS, -0.123
PDT, -0.143
WP$, -0.200
POS, -0.231

SUBJECTIVITY
WRB, 0.164
VBN, 0.140
VBD, 0.128
RB, 0.100
RP, 0.096
TO, 0.081
VBP, 0.078
PRP, 0.072
PRP$, 0.061
CC, 0.054
MD, 0.052
EX, 0.033
VBZ, 0.028
NNPS, 0.025
VBG, 0.017
WDT, 0.016
RBR, 0.012
JJ, 0.010
NNS, 0.008
IN, 0.006
JJR, 0.005
NN, 0.003
UH, 0.002
VB, 0.000
LS, 0.000
VB, 0.000
UH, -0.004
NN, -0.007
JJR, -0.010
IN, -0.012
NNS, -0.015
JJ, -0.019
RBR, -0.024
WDT, -0.031
VBG, -0.034
NNPS, -0.050
VBZ, -0.055
EX, -0.064
MD, -0.099
CC, -0.102
PRP$, -0.114
PRP, -0.135
VBP, -0.144
TO, -0.149
RP, -0.175
RB, -0.182
VBD, -0.227
VBN, -0.245
WRB, -0.282

PDT, 0.333
RBS, 0.280
URL, 0.229
WP, 0.217
JJS, 0.187
SYM, 0.176
USR, 0.155
FW, 0.127
NNP, 0.110
CD, 0.068
DT, 0.032

BIAS
POS, 0.600
WP$, 0.500

PoS distribution

Polarity Scorer
PoS based

N-gram based

n-gram                   Positive   Negative   Objective
'enjoying break'                1        328           1
'happy birthday'               22        207           1
'so happy'                    106         53           1
'follow back'                  10        132           1
'miss my'                      93         10           1
'no one notices'               97          4           1
'notices my'                   97          1           1
'good day'                      5         82           1
'follow please'                47         38           1
'my phone'                     64         18           1
'presenting emotional'         60         20           1
'please follow'                11         66           1
'follow love'                  17         60           1
'am sorry'                     71          4           1
'so sad'                       71          3           1
'miss u'                       65          7           1
'new followers'                53         17           1

Polarity score
positive vs. negative
tweet
entities

Proper nouns

keywords

Let data decide

polarity score
<TEkwPs>
<TEkwP i="699" pScore="0.460692807729435" marker="positive">
<T>? la familia RT @sheriishirlz: Lust was turnt up!!! @DeeAllova always takes care of Shirlz ? </T>
<E>Sheriishirlz,DeeAllova,Shirlz,</E>
<kw>turnt,up,always,takes,</kw>
</TEkwP>
<TEkwP i="701" pScore="-0.316666516666734" marker="negative">
<T>@IanSmall4 @acorns47 @newmelinda I'll second that, Ian. </T>
<E>IanSmall4 Acorns47 Newmelinda,Ian,</E>
<kw>second,</kw>
</TEkwP>
<TEkwP i="706" pScore="0.35" marker="positive">
<T>@ManMadeMoon is pa bear having a do ? enjoy and have a beer at my fav dive bar,doc holidays on 1st ave. it's the Star Wars bar on crack ? </T>
<E>ManMadeMoon,pa,Star Wars,</E>
<kw>do,enjoy,fav,1st,</kw>
</TEkwP>
<TEkwP i="711" pScore="-0.535463140011847" marker="negative">
<T>Photo: 90percentunrelated: I know I just included this in that last picture set. But, I like it and this is... http://t.co/E8CmT1In5L </T>
<E>Photo,</E>
<kw>know,just,included,last,like,</kw>
</TEkwP>
</TEkwPs>
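Records in this format can be loaded with a few lines of Python. The following is a minimal sketch, not the project's own loader: the `TweetRecord` class and the `parse_records` and `split_list` helpers are hypothetical names introduced here, and the sketch assumes that entities and keywords are comma-separated with a trailing comma, as in the sample above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class TweetRecord:
    """One <TEkwP> element: tweet text, entities, keywords, polarity score."""
    index: int
    p_score: float
    marker: str
    text: str
    entities: List[str]
    keywords: List[str]


def split_list(raw):
    """Split a comma-separated field, dropping empty items (trailing comma)."""
    return [item.strip() for item in (raw or "").split(",") if item.strip()]


def parse_records(xml_text):
    """Parse a <TEkwPs> document into a list of TweetRecord objects."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.iter("TEkwP"):
        records.append(TweetRecord(
            index=int(node.get("i")),
            p_score=float(node.get("pScore")),
            marker=node.get("marker"),
            text=(node.findtext("T") or "").strip(),
            entities=split_list(node.findtext("E")),
            keywords=split_list(node.findtext("kw")),
        ))
    return records


# Two records copied from the sample above.
SAMPLE = """<TEkwPs>
<TEkwP i="701" pScore="-0.316666516666734" marker="negative">
<T>@IanSmall4 @acorns47 @newmelinda I'll second that, Ian. </T>
<E>IanSmall4 Acorns47 Newmelinda,Ian,</E>
<kw>second,</kw>
</TEkwP>
<TEkwP i="706" pScore="0.35" marker="positive">
<T>@ManMadeMoon is pa bear having a do ? enjoy and have a beer at my fav dive bar,doc holidays on 1st ave. it's the Star Wars bar on crack ? </T>
<E>ManMadeMoon,pa,Star Wars,</E>
<kw>do,enjoy,fav,1st,</kw>
</TEkwP>
</TEkwPs>"""
```

Calling `parse_records(SAMPLE)` yields one object per tweet, which is a convenient input shape for the graph-building steps later in the deck.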
- Opinion about an entity

word

Overall polarity score
Keyword describing it
<opinion entity='Kyles'>
<score>0.2</score>
<analysis
post-count='500'
percent-positive='52.03'
percent-negative='24.59'/>
<keywords count="3">calls,
compelling, familiar</keywords>
</opinion>
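A summary like this could be assembled from the per-tweet records. The sketch below is an illustration under assumptions, not the author's method: it assumes the overall score is the mean pScore of the posts mentioning the entity, that the percentages are shares of posts marked `positive` or `negative`, and that the keywords are simply the most frequent ones. `summarize_entity` and `top_k` are hypothetical names.

```python
from collections import Counter


def summarize_entity(posts, top_k=3):
    """Summarize opinion about one entity.

    posts: list of (p_score, marker, keywords) tuples, one per post that
    mentions the entity. Returns None when there are no posts.
    """
    if not posts:
        return None
    n = len(posts)
    markers = Counter(marker for _, marker, _ in posts)
    kw_counts = Counter(kw for _, _, kws in posts for kw in kws)
    return {
        # Assumption: overall score is the plain mean of per-post pScores.
        "score": sum(score for score, _, _ in posts) / n,
        "post_count": n,
        "percent_positive": 100.0 * markers["positive"] / n,
        "percent_negative": 100.0 * markers["negative"] / n,
        # Assumption: the describing keywords are the most frequent ones.
        "keywords": [kw for kw, _ in kw_counts.most_common(top_k)],
    }
```

Posts marked neither positive nor negative count toward `post_count` only, which is consistent with the two percentages in the sample summing to less than 100.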
[Figure: Distribution of Polarity Score over entire entity space]
[Figure: Polarity Score over ln(Occurrence) of entities]
E×kw bigraph
[Diagram: each tweet yields entities (E), keywords (kw), and a polarity score; E nodes and kw nodes are joined by edges labeled with a weight and a pScore]

The E×kw bigraph is defined such that:

- There exists an edge between E_i and kw_j if one or more tweets contain both E_i and kw_j.
- The edge has a weight indicating the co-occurrence of E_i and kw_j, i.e.
  weight_ij = Count({T_k | E_i ∈ T_k.E ∧ kw_j ∈ T_k.kw})
- The edge has a pScore that is the average of pScore (= P) over all such occurrences, i.e.
  pScore_ij = Sum({T_k.pScore | E_i ∈ T_k.E ∧ kw_j ∈ T_k.kw}) / weight_ij
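The edge definitions translate directly into a single pass over the tweets. This is a minimal sketch assuming each tweet is given as an `(entities, keywords, p_score)` tuple; the function name `build_bigraph` and the dictionary representation of the graph are choices made for this example.

```python
from collections import defaultdict


def build_bigraph(tweets):
    """Build the E x kw bigraph.

    tweets: iterable of (entities, keywords, p_score) tuples.
    Returns {(entity, keyword): (weight, mean_p_score)}.
    """
    weight = defaultdict(int)       # co-occurrence count per edge
    score_sum = defaultdict(float)  # running sum of pScore per edge
    for entities, keywords, p_score in tweets:
        # set() so a tweet counts once per (entity, keyword) pair
        for entity in set(entities):
            for keyword in set(keywords):
                weight[(entity, keyword)] += 1
                score_sum[(entity, keyword)] += p_score
    # Edge pScore is the average over all co-occurring tweets.
    return {edge: (w, score_sum[edge] / w) for edge, w in weight.items()}
```

Storing edges in a dictionary keyed by `(entity, keyword)` keeps the sketch short; a graph library would serve equally well for larger samples.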

After this, a filter is run on the graph to eliminate links between an entity and a keyword when the keyword is not descriptive enough of the entity. This is done by calculating freq such that

freq_ij = weight_ij / Occurrence(E_i)

If freq_ij is smaller than a certain threshold, ε_freq, then keyword kw_j is filtered out for entity E_i.

[Figure: ln(Occurrence) plots]
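The frequency filter can be sketched as a dictionary comprehension over the edge weights. The sketch assumes `Occurrence(E_i)` is the number of tweets that mention the entity and that it is supplied as a precomputed mapping; `filter_edges` and `eps_freq` are names chosen for this example.

```python
def filter_edges(weights, occurrence, eps_freq):
    """Drop E x kw edges whose keyword is not descriptive enough.

    weights:    {(entity, keyword): weight_ij}
    occurrence: {entity: Occurrence(E_i)}, assumed here to be the number
                of tweets mentioning the entity
    eps_freq:   threshold; edges with freq_ij < eps_freq are removed
    """
    return {
        (entity, keyword): w
        for (entity, keyword), w in weights.items()
        if w / occurrence[entity] >= eps_freq
    }
```

For example, with `Occurrence(A) = 10` and `eps_freq = 0.2`, an edge of weight 5 survives (freq 0.5) while an edge of weight 1 is dropped (freq 0.1).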
E×E graph
[Diagram: E1 and E2 linked through shared keywords (kw), with edge labels pScore1, weight1, pScore2, weight2]

In the E×E graph, there exists an edge between E_i and E_j if:

- Occurrence(E_i) > ε_eo ∧ Occurrence(E_j) > ε_eo
- {kw(E_i) | Occurrence(kw_x) < ε_kwo} ∩ {kw(E_j) | Occurrence(kw_x) < ε_kwo} is not empty
- The polarity biases of both entities are similar

If a potential word occurs in the description of most entities, then it is not a keyword but a generic term.
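The three edge conditions can be sketched as follows. The slides do not define when two polarity biases count as "similar", so this example assumes the loosest reading, namely that both entity scores have the same sign; that test, the function name `build_entity_graph`, and the parameter names are all assumptions made for illustration.

```python
from itertools import combinations


def build_entity_graph(entity_keywords, entity_occurrence,
                       keyword_occurrence, entity_score, eps_eo, eps_kwo):
    """Return {(E_i, E_j): shared_keywords} for the E x E graph.

    entity_keywords:    {entity: set of keywords linked in the bigraph}
    entity_occurrence:  {entity: Occurrence(E)}
    keyword_occurrence: {keyword: Occurrence(kw)}
    entity_score:       {entity: overall polarity score}
    """
    # Condition 1: both entities must occur more than eps_eo times.
    significant = [e for e in entity_keywords if entity_occurrence[e] > eps_eo]
    # Generic (too frequent) keywords are ignored when comparing entities.
    specific = {
        e: {kw for kw in entity_keywords[e] if keyword_occurrence[kw] < eps_kwo}
        for e in significant
    }
    edges = {}
    for a, b in combinations(sorted(significant), 2):
        shared = specific[a] & specific[b]  # condition 2: non-empty overlap
        # Condition 3 (assumed reading): same sign of polarity score.
        same_bias = (entity_score[a] >= 0) == (entity_score[b] >= 0)
        if shared and same_bias:
            edges[(a, b)] = shared
    return edges
```

Comparing every pair of significant entities is quadratic; indexing entities by keyword first would avoid most of those comparisons on larger samples.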
Very big graph, with many nodes that have no edges!
Never built it.
Entities with neighbors,
but not even this one is built.
Filtered entities with very few neighbors
Keyword from data
Stage                 Step                  Resulting keyword set
Tweets analysis       PoS tagging           Preliminary set (JJ, RB, VB; NN ?)
E×kw bigraph          remove low freq.      Legitimate keywords
E×E graph             remove high freq.     No generic word as kw (Occurrence(kw_x) < ε_kwo)
Community detection   consolidate members   Final set of keywords

size := size of community
     := number of entities in it
Threshold := ln(size)
If (Occurrence(kw) < Threshold)
then Remove(kw)
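The consolidation rule is small enough to write out in full. This sketch assumes the per-community keyword occurrence counts are already available as a mapping; `consolidate_keywords` is a hypothetical name, and the `rare` keyword in the usage note is made up for illustration.

```python
import math


def consolidate_keywords(keyword_occurrence, community_size):
    """Keep keywords whose occurrence reaches ln(community size).

    keyword_occurrence: {keyword: occurrence count within the community}
    community_size:     number of entities in the community (>= 1)
    """
    if community_size < 1:
        raise ValueError("community_size must be at least 1")
    threshold = math.log(community_size)
    # Remove(kw) when Occurrence(kw) < Threshold; keep everything else.
    return {kw: n for kw, n in keyword_occurrence.items() if n >= threshold}
```

For a community of 26 entities the threshold is ln(26), roughly 3.26, so a keyword occurring 4 times is kept while a hypothetical keyword `rare` occurring twice would be removed.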
- Groups of entities

<Communities>
<Community id="1" size="26"
conductance="0.524193548387097"
pScore="0.30589296726828">
<trapped-keywords count="8">
turning:6,single:20,Download:12,using:56,
working:6,cut:9,acting:4,crazy:6,
</trapped-keywords>
<e>Rollsroycerizzy</e><e>ZexyZek</e><e>Minecraft Trolling</e>
<e>Mediafire</e><e>Tool</e><e>Uptime24/7</e><e>IbottaApp</e>
<e>Vlambeer RADICAL FISHING</e><e>PE</e><e>Obamacare Website</e>
<e>HDM</e><e>Asuu Strike</e><e>Stevie</e><e>UH UH</e>
<e>Waze</e><e>TemmyAFC</e><e>JESU</e><e>Yuri</e>
<e>Shaq</e><e>Yourmaintopicc</e><e>Ones</e><e>CFB</e>
<e>Yotpo</e><e>2xAwesome</e><e>Urbanaira</e>
</Community>
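Each community carries a conductance attribute. The slides do not say how it is computed; the sketch below assumes the standard graph-theoretic definition for an unweighted, undirected graph, that is, the number of edges leaving the community divided by the smaller of the two edge volumes. The function name `conductance` and the edge-list input are choices made for this example.

```python
def conductance(edges, community):
    """Conductance of a node set in an undirected, unweighted graph.

    edges:     iterable of (u, v) pairs, each undirected edge listed once
    community: set of nodes
    Returns cut(S) / min(vol(S), vol(V - S)), or 0.0 if either volume is 0.
    """
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in community, v in community
        if u_in != v_in:
            cut += 1  # edge crosses the community boundary
        # Each endpoint adds one unit of degree to its side's volume.
        vol_in += int(u_in) + int(v_in)
        vol_out += int(not u_in) + int(not v_in)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

Lower values indicate a community that is better separated from the rest of the graph.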
                         Sample 1   Large Sample   Very large Sample
Tweets                     160711         485447              847276
Time to analyze each       48.91s        148.53s             262.01s
Build Bigraph               9.29s         34.24s              66.45s
Generate EE graph           1.54s          3.49s               4.99s
Time to Find Groups        0.126s         0.310s              0.358s
Groups count                  157            334                 457
Largest Group size            136            183                 162
Significant Entities         1378           2627                3560
Legitimate Keywords         14997          25818               35005

Kw threshold                  350     350     450
Minimum nodes                   2       2       2
Common Noun as keyword      false    true   false
Potential kw                15108   31593   15108
Legitimate kw               14967   31368   14997
Entities                    97147   97147   97147
E occurring > 2              7580    7580    7580
Significant E.               1190    2012    1378
Groups                        170      92     157
Largest size                   70    1256     136

The polarity-invariant version generated 174 groups, with the largest group of size 598, for 1854 significant entities. The generated groups are also significantly different.
thanks

To see the result sets, please visit http://meaningofdata.com/mining/
