NEWORDER – Science in the Online Knowledge Order
Stefan Dietze, 13.10.2023
Motivation: science online discourse vs offline society & policies
Diagram elements: Science discourse online (NEWORDER focus: news & social media); Society, Media, Politics & Policies (Offline & Online); Discourse Interactions; Algorithms/AI
Motivation: scientific online discourse is on the rise
Example: Twitter / X
▪ Percentage of tweets containing links to scientific articles (journals, publishers, science blogs etc.)
▪ Uses a list of > 17 K science web domains (URLs)
▪ Data source: 1% sample of Twitter (https://data.gesis.org/tweetskb/), > 14 bn tweets archived since 2013
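The measurement described on this slide (share of tweets linking to a known science domain) can be sketched as follows. This is a minimal illustration, not the TweetsKB pipeline: the three domains stand in for the list of > 17 K science web domains, and the function names and sample tweets are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the > 17 K science web domains mentioned on the slide.
SCIENCE_DOMAINS = {"nature.com", "arxiv.org", "sciencedirect.com"}

def normalized_host(url: str) -> str:
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def links_to_science(urls, domains=SCIENCE_DOMAINS) -> bool:
    """True if any URL's host equals a listed domain or is a subdomain of one."""
    for url in urls:
        host = normalized_host(url)
        if any(host == d or host.endswith("." + d) for d in domains):
            return True
    return False

def science_share(tweets) -> float:
    """Percentage of tweets (each given as a list of URLs) linking to a science domain."""
    if not tweets:
        return 0.0
    hits = sum(links_to_science(urls) for urls in tweets)
    return 100.0 * hits / len(tweets)

# Made-up sample: two of four tweets link to a listed domain.
sample = [
    ["https://www.nature.com/articles/x"],
    ["https://example.com/news"],
    [],
    ["https://export.arxiv.org/abs/1234.5678"],
]
print(science_share(sample))  # 50.0
```

Matching on the host rather than on the raw URL string avoids false positives such as a listed domain appearing only in a query parameter.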
NEWORDER project
Interdisciplinary approach, team and objectives
▪ Perception of roles, sources, and authority; impact on trustworthiness assessment (Cognitive Psychology)
▪ Dissolution of phases, hierarchies and contexts in the scientific process (Social Sciences, Media & Communication Studies)
▪ Computational methods for collecting, detecting and classifying scientific online discourse (Computer Science/AI & Computational Linguistics)
Cress, Utz (IWM & Uni Tübingen)
Marcinkowski, Koss (HHU)
Dietze, Boland, Jabeen (GESIS), Kallmeyer (HHU)
How can "scientific discourse" be defined?
Example: Twitter / X
Labels on the example tweets: science claim, science reference, science relevance, no science
Hafid, S., Schellhammer, S., Bringay, S., Todorov, K., Dietze, S., SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse, CIKM 2022
Training AI to detect science discourse: SciTweets dataset & classifier
▪ Manual annotation of ground truth dataset for testing models (heuristics-based sampling, annotation framework, > 1K annotated tweets)
▪ Training AI models to detect science discourse in large-scale discourse data (e.g. Web archives)
▪ Reasonable classification performance using fine-tuned language model (SciBERT) applied to TweetsKB data
https://github.com/AI-4-Sci/SciTweets
What is science discourse and how does it evolve?
Increasing number and proportion of non-peer-reviewed science works
Figure panels: absolute number of tweets sharing preprints; proportion of preprints among shared science URLs
How is public attention distributed?
Power law distribution
• 10% of studies receive > 75% of all Twitter mentions
• Long tail of studies with few mentions
• Data source: 1.67 M tweets mentioning at least one of the primary science studies in the "Altmetrics" corpus
Figure axes: top x (%) of mentioned science studies vs. share of Twitter mentions (%)
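The concentration statistic on this slide (share of all mentions received by the top x % of studies) can be computed as sketched below. The function name and the mention counts are made up for illustration and are not taken from the "Altmetrics" corpus.

```python
def top_share(mention_counts, top_fraction):
    """Share (%) of all mentions received by the top `top_fraction` of studies."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    counts = sorted(mention_counts, reverse=True)  # most-mentioned studies first
    total = sum(counts)
    if total == 0:
        return 0.0
    k = max(1, round(top_fraction * len(counts)))  # number of studies in the top group
    return 100.0 * sum(counts[:k]) / total

# Made-up mention counts for ten studies: one heavily mentioned study and a long tail.
counts = [800, 50, 40, 30, 20, 20, 15, 10, 10, 5]
print(top_share(counts, 0.1))  # 80.0
```

Evaluating `top_share` over a range of fractions yields the curve whose axes are named on the slide.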
Challenge: online science discourse is not well-informed
Links to actual scientific studies/context missing in news & social media
▪ NLP models able to predict missing primary science reference (e.g. DOI or journal paper link) for given informal reference (e.g. “Heinsberg Studie”) or secondary reference (news article)
▪ Supervised & unsupervised approaches using DL language models
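To make the task concrete, the sketch below ranks candidate studies for an informal mention by plain token overlap (Jaccard similarity). This is a deliberately simple unsupervised baseline chosen for illustration; it is not the deep learning language models referred to on the slide, and the candidate identifiers and descriptions are hypothetical placeholders.

```python
import re

def tokens(text):
    """Lower-cased set of alphanumeric word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(mention, candidates):
    """Order (identifier, description) pairs by token overlap with the mention, best first."""
    query = tokens(mention)
    scored = [(jaccard(query, tokens(desc)), ident) for ident, desc in candidates]
    return [ident for score, ident in sorted(scored, key=lambda s: -s[0])]

# Hypothetical candidate records; identifiers and descriptions are placeholders.
candidates = [
    ("study-a", "Sleep duration and memory in adults"),
    ("study-b", "Infection study in the Heinsberg district"),
    ("study-c", "Ocean temperature trends"),
]
print(rank_candidates("the Heinsberg study on infection", candidates)[0])  # study-b
```

A learned model would replace `jaccard` with a similarity over text embeddings, which also covers mentions that share no surface tokens with the study title.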
Science discourse is "different"
Examples from http://snopes.com: a non-science claim and a science claim
Computational (AI) challenge
NLP methods (e.g. for fact-checking) perform worse on science discourse
Figure: performance of state-of-the-art AI/deep learning using standard benchmark datasets for claim check-worthiness detection, fake news detection and claim verification
▪ Take-away: AI-based methods geared towards scientific discourse are required
Wrap-Up & Outlook: Interdisciplinary Work Plan
▪ WP1 Data collection & study preparation
▪ WP2 Dissolution of phases & contexts
▪ WP3 Perception of roles, credibility & trust
▪ WP4 NLP for classifying sources & roles
▪ WP5 Longitudinal online discourse analysis
Disciplines involved:
▪ Media & Communication Studies (spreading patterns & societal impact)
▪ Cognitive & Social Psychology (effects on individuals)
▪ Computer & Information Science (understanding online discourse)
http://gesis.org/en/kts
