Naeemul Hassan1 Fatma Arslan2 Chengkai Li2 Mark Tremayne3
1Department of Computer and Information Science, University of Mississippi
2Department of Computer Science and Engineering, University of Texas at Arlington
3Department of Communication, University of Texas at Arlington
Fake-news floods social media (“filter bubbles” and “echo chambers”)
The Quest to Automate Fact-checking
Politicians make false and misleading claims
§ Facebook trending topic algorithms promoted fake-news.
§ A sample of 140,000 Twitter users in the battleground state of Michigan shared as many junk news items as professional news during the final ten days of the 2016 election. http://politicalbots.org/?p=1064
National security threats
§ The Russian government interfered with the 2016 election; fake-news websites and bots were used.
§ Pizzagate: a conspiracy theory led to a shooting.
§ 100+ active fact-checking sites in 2017 (PolitiFact.com, FullFact.org, CNN, Washington Post, …)
§ Google and Bing include fact-checks in search results.
§ Facebook lets users report false items and flags items disputed by fact-checkers.
Claim Spotting: Check-worthy Factual Claims Detection
[Figure: claim-spotting pipeline. Labels: Presidential Debate Transcripts (1960-2012), 20788 sentences; Human Annotation; Ground Truth; Feature Extraction; Feature Vectors; Learning Algorithm; Important Factual Claims; 2016 Presidential Debates.]
Classification and ranking by check-worthiness
§ Non-Factual Sentence (NFS) (opinions, beliefs, declarations): “But I think it’s time to talk about the future.”
§ Unimportant Factual Sentence (UFS): “Two days ago we ate lunch at a restaurant.”
§ Check-worthy Factual Sentence (CFS): “He voted against the first Gulf War.”
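The three categories and the idea of ranking by check-worthiness can be illustrated with a minimal Python sketch. It assumes a classifier that returns one probability per class and uses the CFS probability as the ranking score. The `ScoredSentence` type and the probabilities below are hypothetical and are not taken from the poster.

```python
from dataclasses import dataclass
from typing import Dict, List

# The three sentence categories named on the poster.
LABELS = ("NFS", "UFS", "CFS")


@dataclass
class ScoredSentence:
    text: str
    probs: Dict[str, float]  # hypothetical classifier output: label -> probability

    @property
    def label(self) -> str:
        # Classification: pick the most probable category.
        return max(LABELS, key=lambda name: self.probs[name])

    @property
    def check_worthiness(self) -> float:
        # Assumption: the CFS probability serves as the ranking score.
        return self.probs["CFS"]


def rank_by_check_worthiness(sentences: List[ScoredSentence]) -> List[ScoredSentence]:
    # Ranking: most check-worthy sentence first.
    return sorted(sentences, key=lambda s: s.check_worthiness, reverse=True)


# Made-up probabilities for the poster's three example sentences.
examples = [
    ScoredSentence("But I think it's time to talk about the future.",
                   {"NFS": 0.90, "UFS": 0.05, "CFS": 0.05}),
    ScoredSentence("Two days ago we ate lunch at a restaurant.",
                   {"NFS": 0.20, "UFS": 0.60, "CFS": 0.20}),
    ScoredSentence("He voted against the first Gulf War.",
                   {"NFS": 0.10, "UFS": 0.10, "CFS": 0.80}),
]
ranked = rank_by_check_worthiness(examples)
```

Ranking on a continuous score rather than on the hard label lets a fact-checker work down the list from the top and stop wherever time runs out.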
Feature extraction and selection
Example sentence: “I was in a state where my legislature was 87 percent Democrat.”
§ Entity Type: Quantity
§ Part-of-Speech: Noun
§ Concept: United States
§ Sentiment: 0.032
§ Words: state, legislature, 87, percent, democrat
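As a rough illustration of the word-level part of this feature vector, the sketch below tokenizes the example sentence and drops a small stopword list. The stopword list and the `has_quantity` flag are assumptions made for this example. The poster's entity, part-of-speech, concept, and sentiment features would come from NLP tools that are not reproduced here.

```python
import re

# Hypothetical, deliberately tiny stopword list for this example only.
STOPWORDS = {"i", "was", "in", "a", "where", "my", "the", "is", "of", "and", "to"}


def extract_features(sentence):
    """Return simple word-level features for one sentence."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    words = []
    for token in tokens:
        # Keep each content word once, in order of first appearance.
        if token not in STOPWORDS and token not in words:
            words.append(token)
    return {
        "words": words,
        # Crude stand-in for a "Quantity" entity: any all-digit token.
        "has_quantity": any(token.isdigit() for token in tokens),
        "length": len(tokens),
    }


features = extract_features(
    "I was in a state where my legislature was 87 percent Democrat."
)
```

With this stopword list, `features["words"]` matches the word list shown for the example sentence.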
Case Study: 2016 U.S. Presidential Election Debates
Data Labeling and Ground-Truth Collection
§ 20788 sentences, 374 coders: 76552 labels
§ 86 top-quality coders: 52333 labels
§ Majority voting: 20617 admitted sentences
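The majority-voting step can be sketched as follows. The poster does not state the exact admission rule, so this example assumes a strict majority: a sentence is admitted only when one label receives more than half of the votes from the retained coders. The function name `majority_label` is hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Return the label held by a strict majority of votes, or None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    # Assumption: "majority" means strictly more than half of the votes.
    return label if count > len(labels) / 2 else None


# Hypothetical votes from top-quality coders, keyed by sentence id.
votes = {
    "s1": ["CFS", "CFS", "NFS"],
    "s2": ["NFS", "UFS"],
    "s3": ["UFS", "UFS", "UFS", "CFS"],
}
ground_truth = {
    sid: majority_label(labels)
    for sid, labels in votes.items()
    if majority_label(labels) is not None
}
```

Sentences without a majority label, such as `s2` here, are left out of the ground truth, which is one way the admitted set can end up smaller than the full set of sentences.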
Combating falsehoods
Comparison of topic distributions of CNN, PolitiFact fact-checked sentences and sentences scored high (≥ 0.5) by ClaimBuster
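A topic distribution over high-scoring sentences could be computed along these lines. The topic names and scores are made up, and `topic_distribution` is a hypothetical helper. Only the 0.5 threshold comes from the poster.

```python
from collections import Counter


def topic_distribution(scored, threshold=0.5):
    """Share of each topic among sentences scored at or above the threshold."""
    topics = [topic for topic, score in scored if score >= threshold]
    total = len(topics)
    if total == 0:
        return {}
    return {topic: n / total for topic, n in Counter(topics).items()}


# Hypothetical (topic, check-worthiness score) pairs.
scored = [
    ("economy", 0.82),
    ("economy", 0.41),
    ("health", 0.67),
    ("economy", 0.55),
    ("foreign policy", 0.50),
]
dist = topic_distribution(scored)
```

The same function applied to the topics of sentences fact-checked by CNN or PolitiFact would give the distributions to compare against.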
Funded by
Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by ClaimBuster
Fact-checks on major party presidential nominees by PolitiFact
Lack of automated tools that assist fact-checkers
Coding website: bit.ly/claimbusters
o 20788 sentences
o 20 months, 374 coders, ~$4,000 paid
o 30 training sentences
o 1032 screening sentences (731 NFS, 63 UFS, 238 CFS) to detect spammers & low-quality coders
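One simple way to use screening sentences is to compare each coder's answers with the known labels. The sketch below assumes plain agreement rate as the quality measure and a hypothetical cutoff of 0.8. The poster does not give the actual scoring formula or threshold, so both are illustrative.

```python
def screening_accuracy(coder_answers, gold):
    """Fraction of a coder's screening answers that match the gold labels."""
    judged = [sid for sid in coder_answers if sid in gold]
    if not judged:
        return 0.0
    correct = sum(1 for sid in judged if coder_answers[sid] == gold[sid])
    return correct / len(judged)


def top_quality_coders(answers_by_coder, gold, min_accuracy=0.8):
    # min_accuracy is a hypothetical cutoff, not a value from the poster.
    return sorted(
        coder
        for coder, answers in answers_by_coder.items()
        if screening_accuracy(answers, gold) >= min_accuracy
    )


# Hypothetical gold labels for four screening sentences.
gold = {"s1": "NFS", "s2": "UFS", "s3": "CFS", "s4": "NFS"}
answers_by_coder = {
    "careful": {"s1": "NFS", "s2": "UFS", "s3": "CFS", "s4": "NFS"},
    "spammer": {"s1": "NFS", "s2": "NFS", "s3": "NFS", "s4": "NFS"},
}
kept = top_quality_coders(answers_by_coder, gold)
```

A coder who always answers NFS still scores well on a screening set dominated by NFS sentences, so a real quality measure would likely weight mistakes by class rather than count them equally.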
Coder quality
Quality assurance
Feature importance
§ “The Holy Grail”: fully automated fact-checking
End-to-End Fact-Checking System: idir.uta.edu/claimbuster
Classification and Ranking Accuracy
