1. What is the link and text doing here: A Case Study of Cyworld Minihompies in Korea
Steven Sams and Han Woo Park
2. Background
  • This study analyses user-generated comments containing a URL that were posted to Korean politicians on the SNS Cyworld.
  • It examines the type of service each URL links to and determines how frequently each service occurs.
  • A purpose-built program captures all comments posted to a selected set of politicians within a predefined timeframe.
  • The text component of the messages is analysed using two separate machine-learning mechanisms.
3. Types of Hyperlinks
Five social functions that hyperlinks can be said to perform (Ackland et al., 2010):
  • Information Provision
  • Network Strengthening
  • Identity Building
  • Audience Sharing
  • Message Amplification
4. Online Korean Political Sphere
  • As in other countries, Korean politicians are increasingly turning to social networks as a means of engaging with their electorate.
  • In 2007 Cyworld commanded a penetration rate of one third of the total population of South Korea, and all indications since then are that this proportion has increased.
5. Sample
  • Cyworld Minihompies of 130 Korean National Assembly Members.
  • The study period was April 2008 – June 2009.
  • 153,602 comments were collected for the period chosen for the study.
  • 1,276 comments contained links.
6. Data Collection Method
  • A program was developed that performs an HTTP call to request one page of comments from the politician's visitor board.
  • The content and date of each comment are isolated and held in temporary storage.
  • The process repeats until the target date parameters have been met.
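
The paging loop described on this slide can be sketched roughly as follows. This is an illustration in Python, not the authors' program: `fetch_page` is a hypothetical stand-in for the HTTP call, the board is assumed to list comments newest first, and the exact start and end days are assumptions (the slides give only months).

```python
from datetime import date
from typing import Callable, List, Tuple

Comment = Tuple[date, str]  # (posting date, comment text)


def collect_comments(fetch_page: Callable[[int], List[Comment]],
                     start: date, end: date) -> List[Comment]:
    """Request pages one at a time and keep comments dated within [start, end]."""
    collected: List[Comment] = []  # temporary storage for isolated date/content pairs
    page = 1
    while True:
        comments = fetch_page(page)  # hypothetical: one HTTP request per page
        if not comments:
            break  # the board has no more pages
        for posted, text in comments:
            if start <= posted <= end:
                collected.append((posted, text))
        # Assumption: pages run newest first, so once a page contains a
        # comment older than the start date, later pages are out of range.
        if min(posted for posted, _ in comments) < start:
            break
        page += 1
    return collected


# Stand-in data replacing the real visitor board (entirely made up).
_FAKE_BOARD = {
    1: [(date(2009, 7, 2), "too new"), (date(2009, 6, 30), "in range A")],
    2: [(date(2008, 4, 1), "in range B"), (date(2008, 3, 31), "too old")],
    3: [(date(2008, 1, 1), "never fetched")],
}


def fake_fetch(page: int) -> List[Comment]:
    return _FAKE_BOARD.get(page, [])


result = collect_comments(fake_fetch, date(2008, 4, 1), date(2009, 6, 30))
print([text for _, text in result])
```

Passing the fetch function in as a parameter keeps the date logic testable without network access; a real collector would also need to handle request failures and rate limits, which the slides do not describe.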
7. Data Analysis Method: Links
  • The links are checked to determine the number of unique URLs and the corresponding number of unique domains.
  • These links and domains are then manually categorised by website type (portals, media, parties, homepages of politicians, petition sites, online fan clubs, and NGOs).
  • The location of each service is found using a network query tool, to determine the proportion of domestic and international websites.
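
As a rough illustration of the counting step (total links, unique URLs, unique domains), the sketch below uses only the Python standard library. It is not the tool used in the study; the URL pattern is deliberately simple, the hostname is used as an approximation of "domain", and the sample comments are invented.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Simple pattern: anything from http(s):// up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def summarise_links(comments):
    """Count total links, unique full URLs, and unique hostnames."""
    urls = []
    for text in comments:
        for match in URL_PATTERN.findall(text):
            urls.append(match.rstrip(".,)"))  # drop trailing punctuation
    hosts = Counter(urlsplit(url).hostname for url in urls)
    return {
        "total_links": len(urls),
        "unique_urls": len(set(urls)),
        "unique_domains": len(hosts),
        "domain_counts": hosts,
    }


# Invented sample comments; one comment may carry more than one link.
sample = [
    "See http://news.example.kr/a1 and http://news.example.kr/a2",
    "Sign here: http://petition.example.org/x.",
    "Again http://news.example.kr/a1",
    "no link in this comment",
]
summary = summarise_links(sample)
print(summary["total_links"], summary["unique_urls"], summary["unique_domains"])
```

The manual categorisation by website type and the geographic lookup are separate steps that this sketch does not attempt.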
8. Data Analysis Method: Text
  • Natural Language Processing (NLP) is one approach to categorising a body of text that is too large to analyse accurately by hand.
  • A rudimentary Java class was developed that wraps a small subset of the methods provided by the LingPipe API so that they can be called on the extracted comment text.
  • The Java class enables two forms of analysis: Sentiment Analysis and Collocation.
9. Sentiment Analysis
  • A polarity analyser was developed that locates significant word combinations and, using the developed corpus model as a training dataset, determines whether each combination is generally positive or negative.
  • An accessible corpus of positive and negative sentiment composed in Korean has yet to be realised.
  • A sample of 2,000 Korean text statements was coded into three categories: objective, subjective-positive, and subjective-negative.
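
To make the idea of training a polarity classifier on hand-coded statements concrete, here is a minimal sketch. It swaps in a plain word-level naive Bayes classifier with add-one smoothing, written in Python; it is not LingPipe and not necessarily the model the authors used. The English training statements are invented placeholders for the Korean corpus.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of statements
        self.vocab = set()

    def train(self, text, label):
        tokens = text.lower().split()
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            total = sum(self.word_counts[label].values())
            for token in text.lower().split():
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, hand-labelled examples standing in for the coded statements.
training = [
    ("the assembly met on monday", "objective"),
    ("the vote is scheduled for june", "objective"),
    ("great speech thank you", "positive"),
    ("i support this good policy", "positive"),
    ("terrible decision shame on you", "negative"),
    ("this policy is a bad mistake", "negative"),
]
model = NaiveBayes()
for text, label in training:
    model.train(text, label)

print(model.classify("good speech"))
print(model.classify("bad decision"))
```

Whitespace tokenisation is a simplification that would suit Korean text poorly; a real pipeline would need a tokeniser or character n-grams chosen for the language.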
10. Collocation
Collocation analysis can determine which tokens are found together more frequently than would normally be expected. In this way it can identify proper nouns (such as the names of persons, places, or events) that would be lost if the frequency of each token were analysed in isolation.
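
One common way to score "found together more often than expected" is pointwise mutual information (PMI) over adjacent token pairs. The sketch below uses PMI as an assumption for illustration; the slides do not say which statistic the LingPipe-based class applied, and the sentences are invented.

```python
import math
from collections import Counter


def collocations(sentences, min_count=2):
    """Rank adjacent token pairs by PMI, ignoring pairs seen fewer than min_count times."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scored = []
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_first = unigrams[first] / n_uni
        p_second = unigrams[second] / n_uni
        # PMI compares the observed pair rate with the rate expected
        # if the two tokens occurred independently.
        scored.append(((first, second), math.log2(p_pair / (p_first * p_second))))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented sentences for illustration only.
sentences = [
    "candlelight protest in seoul",
    "the candlelight protest grew",
    "the beef imports resumed",
    "protest against the beef imports",
    "the people marched",
]
ranked = collocations(sentences)
for pair, score in ranked:
    print(pair, round(score, 2))
```

In this toy data the pair ("the", "beef") scores lowest because "the" is frequent everywhere, which is the effect the slide describes: a pair stands out only when it co-occurs more than the individual token frequencies would predict.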
11. Results - Links
  • 153,602 comments were collected for the period chosen for the study.
  • 1,276 comments contained hyperlinks.
  • The total link count was 1,920, as it was common for an individual posting to contain more than one hyperlink.
  • 762 were unique full URLs and 259 were unique domains.
  • 1,849 of the links encountered in the sample belonged to services based in Korea and 71 to international services.
  • Message amplification and network building were prominent reasons for posting links.
12. Table 1: LexiURL Unique / Full hosts
Based on the top 10 domains (24.5%) by occurrence out of 259
13. Table 2: LexiURL Unique / Full URLs
14. Table 3: Total links to each domain (Korea)
Based on 1,078 (58.3%) of 1,849 links to Korean services
15. Table 4: Total links to each domain (Overseas)
Based on 51 (71.8%) of 71 links to overseas services
16. Table 5: Poster gender and politician background
17. Table 6: Comments categorized by link type from the six groups of gender and political affiliation
Based on 206 comments agreed on by both coders from the initial set of 300
18. Results - Text
  • May and June 2008 had high numbers of comments containing links that showed negative sentiment; this corresponds with the period of the candlelight protests.
  • May 2009 also shows large numbers of link-containing comments that indicate negative sentiment, coinciding with the suicide of ex-President Roh Moo-Hyun.
  • The name of Korean President Lee Myung-bak occurred 229 times.
  • Terms pertaining to the candlelight protests, such as Mad Cow disease, beef, American goods, and candlelight protest, occurred frequently.
  • "Gini coefficient" and a less formal term describing a similar measurement of wealth also occurred frequently.
19. Figure 1. Positive and negative sentiment from comments containing links
20. Confidence Levels
To determine the effectiveness of the classification approach, 10% of the training data was removed from the training set and used to evaluate the developed model. This allows the classification to be tested against known human-classified data. The Average Conditional Probability score provides a basis for determining the ability of the classifier to correctly identify positive and negative sentiment. Based on the training set used, the Average Conditional Probability was found to be 87%.
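
The hold-out procedure can be sketched as below. Two things here are assumptions rather than facts from the slides: the score is computed as the mean probability the classifier assigns to each held-out item's human-coded label (one plausible reading of "Average Conditional Probability", which the slides do not define), and `toy_predict_proba` is a made-up stand-in for a trained model, so the number it produces says nothing about the study's 87%.

```python
import random


def holdout_split(items, held_out_fraction=0.1, seed=0):
    """Shuffle the labelled items and set aside a fraction for evaluation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * held_out_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training set, held-out set)


def average_conditional_probability(predict_proba, test_items):
    """Mean probability assigned to the human-coded label of each held-out item."""
    probs = [predict_proba(text).get(label, 0.0) for text, label in test_items]
    return sum(probs) / len(probs)


# Placeholder corpus of 2,000 labelled statements (labels assigned arbitrarily).
corpus = [(f"statement {i}", "positive" if i % 2 == 0 else "negative")
          for i in range(2000)]
train_set, test_set = holdout_split(corpus)


def toy_predict_proba(text):
    """Made-up model: always gives 0.9 to the label implied by the item number."""
    index = int(text.split()[1])
    if index % 2 == 0:
        return {"positive": 0.9, "negative": 0.1}
    return {"positive": 0.1, "negative": 0.9}


score = average_conditional_probability(toy_predict_proba, test_set)
print(len(train_set), len(test_set), round(score, 3))
```

Fixing the random seed makes the split reproducible, which matters when the same held-out items are used to compare different models.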
21. Limitations
  • Comments containing links account for less than 1% of all comments posted to the sample of politicians. Although previous studies have shown how links can support communication in SNSs, links remain rare in the Korean online political environment.
  • Because comments may have been deleted during the study period, the data may understate the full extent of negative sentiment towards politicians.
  • The practice of deleting content in Korea has been found to be less constrained by social norms than in Western SNSs such as Facebook.
  • Legal mechanisms also exist in Korea to encourage the removal of negative content during election periods.
22. Conclusion
  • Links are almost solely targeted at Korean domestic services, and the few that point to overseas sites are usually related in some way to domestic issues in Korea.
  • Males are marginally more likely than females to comment on Cyworld Minihompies using links, and Minihompies managed by ruling-party politicians were found to be of greater prominence than those of the opposition parties.
  • Message Amplification and Network Building were found to be the dominant purposes for submitting links within user-generated comments.
  • Two machine-learning techniques, sentiment analysis and collocation of significant phrases, revealed primarily negative sentiment towards President Lee and his role in the reintroduction of American beef imports.
  • Comments surrounding the suicide of ex-President Roh suggested anger towards those who were seen to be harassing him prior to his death.
23. Acknowledgement
Research for this paper has been supported by the World Class University (WCU) program through the National Research Foundation of Korea, which is funded by the Ministry of Education, Science and Technology (No. 515-82-06574).
Thank you

Editor's Notes

  • #8: this approach is largely descriptive and does not consider the accompanying text
  • #9: LingPipe is a comprehensive NLP toolkit and the methods used in the developed Java class enabled three forms of analysis: Sentiment Analysis, Collocation, and Language Identification
  • #10: LingPipe is a comprehensive NLP toolkit and the methods used in the developed Java class enabled three forms of analysis: Sentiment Analysis, Collocation, and Language Identification
  • #22: Wall-cleaning (Raynes–Goldie, 2010) occurs when the owner of a profile page periodically or reactively evaluates comments and deletes those that cast the owner in an unfavourable light. However, whilst the occurrence of this process on Facebook is not in question, the degree to which it happens has been challenged. Walther et al. (2008) explain that deleting content, regardless of whether it is deemed negative or unflattering, is avoided because it contravenes the spirit of open content. Smith and Kidder (2010) extend this concept to other forms of user-generated content and explain that social norms deter users from deleting content once it is in the community. The practice of deleting content in Korea, however, appears not to be restrained by the same unwritten rules that govern Facebook. Yoo (2009) explains that submissions to user message boards are routinely deleted if the SNS page owner judges them to be unflattering or negative. In addition to this cultural practice of deleting content, there is also a legal motivation to remove content that is deemed incorrect or negative. The extent to which deletion occurs remains unclear, although legal frameworks exist in Korea and elsewhere to encourage the deletion of content by either the service provider or the owner of the SNS account.
  • #23: For example, the linking of a petition to call upon the governing president to be impeached combined with the name of the president occurring frequently and the negative sentiment recorded does point to….