Natural Language Processing Lab
M2020064
Danbi Cho
Published: ACM Journals; ACM Computing Surveys, Vol. 51, No. 4, 2018
Content
1. Why study Hate Speech automatic detection?
2. What is Hate Speech?
3. What has been done so far in automatic Hate Speech detection?
4. Resources for Hate Speech classification
5. Research challenges and opportunities
#Kookmin_University #Natural_Language_Processing_lab. 1
Introduction
> Describe the motivation for conducting research
- “how hate speech online has been evolving”
- “who are the main targets of it”
> Provide the detailed definitions
> Analyze previous work through a systematic literature review
- focusing on descriptive statistics about Hate Speech detection
- focusing on algorithms for Hate Speech detection
1. Why study Hate Speech automatic detection?
- European Union Commission directives
- Automatic techniques not available
- Lack of data about hate speech
- Hate speech removal
- Quality of service
2. What is Hate Speech?
> Definition from several sources
> Our definition of Hate Speech
: “jokes also must be marked as hate speech.”
2. What is Hate Speech?
> Particular cases and examples of Hate speech
- On Facebook,
hate speech = a verbal attack + a target of the attack from a “protected category”
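The two-part definition above can be read as a simple conjunction over annotation fields. The following is only a sketch of that reading: the function name, the field names, and the category list are illustrative assumptions, not taken from the slides or from any platform policy text.

```python
from typing import Optional

# Assumption: illustrative category names only, not an official or complete list.
PROTECTED_CATEGORIES = frozenset({
    "race", "ethnicity", "national origin", "religion",
    "sexual orientation", "gender", "disability",
})


def is_hate_speech(is_attack: bool, target_category: Optional[str]) -> bool:
    """Apply the rule: hate speech = verbal attack AND target in a protected category.

    Both inputs are assumed to come from a human annotator or an upstream
    classifier; this function only combines them.
    """
    return is_attack and target_category in PROTECTED_CATEGORIES


if __name__ == "__main__":
    print(is_hate_speech(True, "religion"))     # attack on a protected category
    print(is_hate_speech(True, "sports team"))  # attack, but target is not protected
    print(is_hate_speech(False, "religion"))    # protected category mentioned, no attack
```

Under this reading, neither an attack alone nor a mention of a protected category alone is enough for the label.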
2. What is Hate Speech?
> Hate Speech and other related concepts
- Hate: hatred
- Cyberbullying: online harassment
- Discrimination
- Flaming: insults
- Abusive language: swearing
- Profanity: swearing
- Toxic language or comment: malicious comments
- Extremism (inciting violence)
- Radicalization: radicalism
3. What has been done so far
in automatic Hate Speech detection?
1) Systematic Literature Review
> Method description
> Document Collection and Annotation
- A total of 127 documents (2016.09.01 ~ 2017.05.18)
- “Law and Social Sciences”: 76 / “Computer Science and Engineering”: 51
- Low number of citations
3. What has been done so far
in automatic Hate Speech detection?
> Keywords in the Document
- Related concepts (cyberbullying, cyber hate, sectarianism, …)
- Machine learning (classification, sentiment analysis, filtering systems, …)
- Social media (internet, social media, social network, …)
3. What has been done so far
in automatic Hate Speech detection?
> Social Networks & Number of Used Instances
3. What has been done so far
in automatic Hate Speech detection?
> General or Particular Hate Speech & Algorithms Used
3. What has been done so far
in automatic Hate Speech detection?
> Type of Approach in the Document
3. What has been done so far
in automatic Hate Speech detection?
2) Documents focusing on descriptive statistics about Hate Speech detection
- There are descriptive articles about racism, sexism, prejudice toward refugees,
homophobia, and general hate speech
3) Documents focusing on algorithms for Hate Speech detection
> Dataset used in the papers
> Achieved performances
- metrics: Precision, Recall, F-measure, accuracy, and AUC
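The metrics listed above can be computed directly from binary labels and classifier scores. The following is a minimal standard-library sketch, assuming label 1 means "hate speech" and 0 means "not hate speech". The function names and the toy data are illustrative and do not come from the surveyed papers.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Precision, recall, F1 (balanced F-measure) and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), fine for small data.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Accuracy alone can be misleading when hate speech is a small minority class, which is one reason precision, recall, and F-measure are usually reported alongside it.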
3. What has been done so far
in automatic Hate Speech detection?
4) Text mining approaches in automatic Hate Speech detection
: feature extraction
(1) General features used in text mining
: dictionaries, distance metrics, bag-of-words, n-grams, TF-IDF, part-of-speech, …
(2) Features specific to hate speech detection
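Three of the general features listed above (bag-of-words, n-grams, TF-IDF) can be sketched with the standard library alone. This is an illustrative sketch, not the method of any surveyed paper. It assumes whitespace tokenization, and a TF-IDF variant with tf = count / document length and idf = log(N / df). Libraries often use smoothed variants that give different numbers.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Contiguous word n-grams, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens: List[str]) -> Counter:
    """Term counts; word order is discarded."""
    return Counter(tokens)


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Per-document TF-IDF weights, using the unsmoothed variant described above."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Neutral toy sentences, invented for illustration.
    corpus = ["the match was great", "the referee was unfair", "great goal"]
    tokenized = [tokenize(text) for text in corpus]
    print(bag_of_words(tokenized[0]))
    print(ngrams(tokenized[1], 2))
    print(tf_idf(tokenized)[2])
```

Terms that occur in every document get weight 0 under this variant, which is the intended effect: they carry no information for distinguishing documents.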
4. Resources for Hate Speech classification
1) Datasets & open source projects
4. Resources for Hate Speech classification
https://paperswithcode.com/datasets?task=hate-speech-detection
4. Resources for Hate Speech classification
https://github.com/kocohub/korean-hate-speech
5. Research challenges and opportunities
> Challenges
- Lack of expertise
- Difficulty of tracking all racial and minority insults
- Evolution of language among young people
- Hate speech expressed indirectly, for example through sarcasm
> Opportunities
- Open source platforms or algorithms
- Definition of a main dataset
- Comparative studies
- Multilingual research
Thank You.
A survey on automatic detection of hate speech in text
