#pLASMA M.ESTEVE.DEL.VALLE@RUG.NL - @NETMEV
Learning in the wild: Predicting the formation
of ties in ‘Ask’ subreddit communities using
ERG models
Marc Esteve del Valle, University of Groningen
Anatoliy Gruzd, Ryerson University
Caroline Haythornthwaite, Syracuse University
Priya Kumar, Ryerson University
Sarah Gilbert, University of British Columbia
Drew Paulin, University of California Berkeley
Networked Learning 2018
May 14-16, Zagreb, Croatia
How do we make sense of the vast amount of data
generated by learners, especially in the “wild”?
‘Learning analytics is concerned with
collecting data from learners’ actions,
developing techniques to analyze
these data, and making the results
useful to practitioners’ (Long &
Siemens, 2011).
Learning analytics focuses on making
sense of “big data” collected from
learning management systems (LMS),
MOOCs and social media
We wanted to know
1. How are social media sites used in informal learning?
2. How are people engaging in informal learning processes on Reddit?
3. How do network configurations and individuals’ attributes affect access
and the ability to act on informal networked learning environments?
Introducing Reddit
• “the front page of the Internet”
• Reddit ranks 17th in terms of global traffic
and 4th in the U.S.
• A diversity of niche communities called
‘subreddits’, each maintaining a distinct subculture,
thematic interest and subject focus
• Subreddit communities are user-generated
and moderated by Reddit members
• Content curation (Upvoting/downvoting)
[Figure: screenshot of the ‘AskHistorians’ subreddit (May 15, 2017)]
Sampled Subreddits
Asksocialscience
Mandate:
• The goal of AskSocialScience is to provide great
answers to social science questions, based on solid
theory, practice, and research.
Guidelines and Norms:
• All claims in the top comments must be supported
by citations.
• Questions should be novel, specific and
answerable. No “what if” questions that require
speculative answers.
• Top-level comments must be serious attempts to
answer the question, focus the question or ask
follow-up questions.
• Discussions must be based on social science
findings and research, not opinions, anecdotes or
personal politics.
Askstatistics
Mandate:
• Ask a question about statistics
Guidelines and Norms:
• If it is a homework question, send it to r/homeworkhelp.
• If you answer a question you can assign your own
flair to briefly describe your education or
professional background in statistics.
• If the question is “What statistical test should I use
for this data/hypothesis?”, then start by reading
this and ask follow-ups as necessary.
Data Collection
Subreddit Community   Year of Creation   Subscribers (2018)   Posts (2015)   Replies (2015)
Asksocialscience      2012               65,975               1,523          7,723
Askstatistics         2012               8,317                2,352          4,301
Data Collection
Redditors’ Characteristics   Karma ‘Link’ Points   Gold Membership Status   Moderator Role
Asksocialscience             X                     X                        X
Askstatistics                X                     X                        X
Methods
Exponential Random Graph Models (p* models):
Goal: test whether ties in the subreddits form because of network properties
and the nodes’ attributes rather than by chance alone.
Network properties:
Reciprocity
Transitivity
Popularity
Redditors’ attributes:
Karma ‘link’ points
Gold Membership status
Moderator role
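As a rough illustration of the network properties listed above, the sketch below counts the kind of statistics an ERG model conditions on, using a tiny made-up reply list. The function name `ergm_statistics`, the direction of a tie (from the replier to the person replied to) and the choice of in-2-stars as the popularity statistic are assumptions made for this example. The slides do not say how each term was specified or which software was used.

```python
def ergm_statistics(replies):
    """Count simple ERGM-style statistics for a directed reply network.

    `replies` is an iterable of (replier, original_poster) pairs.
    The tie direction is an assumption made for this sketch.
    """
    # Collapse repeated replies and drop self-replies: a tie is present or absent.
    ties = {(u, v) for u, v in replies if u != v}
    nodes = {n for tie in ties for n in tie}

    # Reciprocity: dyads with a tie in both directions (each dyad counted once).
    mutual = sum(1 for (u, v) in ties if (v, u) in ties) // 2

    # Popularity (assumed form): in-2-stars, i.e. pairs of ties received by the same node.
    in_degree = {n: 0 for n in nodes}
    for _, v in ties:
        in_degree[v] += 1
    in_two_stars = sum(d * (d - 1) // 2 for d in in_degree.values())

    # Transitivity: ordered triples with u -> v, v -> w and u -> w.
    out_neighbours = {n: set() for n in nodes}
    for u, v in ties:
        out_neighbours[u].add(v)
    transitive_triples = sum(
        1
        for u, v in ties
        for w in out_neighbours[v]
        if w != u and w in out_neighbours[u]
    )

    return {
        "edges": len(ties),
        "mutual": mutual,
        "in_two_stars": in_two_stars,
        "transitive_triples": transitive_triples,
    }


# Hypothetical toy data, not taken from the study.
toy_replies = [("a", "b"), ("b", "a"), ("b", "c"), ("a", "c"), ("d", "c")]
print(ergm_statistics(toy_replies))
```

An ERG model then asks whether counts like these are higher or lower in the observed network than would be expected in random networks with the same number of ties.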
Results (Descriptive Network Statistics)
                        AskStatistics   AskSocialScience
P (number of posts)     2,352           1,523
N (number of nodes)     1,951           3,689
R (number of replies)   4,301           7,723
Graph Density           0.001           0.001
Average Path Length     4.409           5.232
Average Degree          2.205           2.094
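The slides do not state how density and average degree were computed. A minimal check, assuming each reply counts as one directed tie and the usual directed-graph formulas apply, reproduces the reported values from N and R alone. The function names are illustrative.

```python
def density(n_nodes, n_ties):
    """Share of possible directed ties that are present."""
    return n_ties / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_ties):
    """Mean number of ties sent (equivalently, received) per node."""
    return n_ties / n_nodes


# (N, R) pairs as reported in the descriptive statistics table.
networks = {
    "AskStatistics": (1951, 4301),
    "AskSocialScience": (3689, 7723),
}

for name, (n, r) in networks.items():
    print(name, round(density(n, r), 3), round(average_degree(n, r), 3))
# AskStatistics 0.001 2.205
# AskSocialScience 0.001 2.094
```

Both densities round to 0.001, which is the low connectivity the results refer to: roughly one in a thousand possible ties is present.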
Results (Network Visualizations)
[Figure: network visualizations; panels: AskStatistics, AskSocialScience]
Results (ERG models)
ERG model (Model 4)                      AskStatistics         AskSocialScience
                                         EST       SE          EST       SE
Structural features
  Edges                                  -7.348    8.111       -7.174    5.830
  Reciprocity                             6.789    3.427        8.041    1.031
  Popularity                             -1.151    1.362       -2.375    8.604
  Transitivity                            6.137    3.383        3.940    1.644
Redditors’ Attributes
  Gold Member                             3.939    9.670       -1.544    5.207
  Karma                                  -1.027    3.658        4.982    3.966
  Moderator                               9.343    9.779        9.903    4.430
Akaike Information Criterion (AIC)        58,294                119,680
Bayesian Information Criterion (BIC)      58,386                119,781
Note: Coefficients in bold are significant at the 99 percent level
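ERG coefficients are conditional log-odds, so one way to read the table is to convert an estimate into a tie probability. The sketch below does this for the AskStatistics edges and reciprocity estimates only. It assumes every other change statistic is zero and uses the point estimates as printed, ignoring the standard errors, so the numbers are illustrative rather than findings of the study.

```python
import math


def tie_probability(log_odds):
    """Convert conditional log-odds of a tie into a probability (logistic function)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Point estimates from the AskStatistics column of the table.
EDGES = -7.348
RECIPROCITY = 6.789

# A tie that closes no reciprocal dyad, triangle or other modelled configuration.
baseline = tie_probability(EDGES)

# The same tie when it would reciprocate an existing tie (reciprocity change statistic = 1).
reciprocated = tie_probability(EDGES + RECIPROCITY)

print(f"baseline: {baseline:.4f}, reciprocating: {reciprocated:.4f}")
```

Under these assumptions a positive coefficient such as reciprocity raises the conditional probability of a tie, and a negative one such as popularity lowers it.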
Results (Goodness-of-Fit Diagnostics)
[Figure: goodness-of-fit diagnostics (Model 4, AskStatistics)]
Summary of Results
• Network level: network ties are formed in both subreddits, although the level
of connectivity in both networks is low.
• Meso level (network parameters): Reciprocity and transitivity increase
Redditors’ likelihood of establishing networked ties, whereas popularity
decreases their likelihood of forming ties.
• Individual level (Redditors’ attributes): Being a moderator strongly increases the
likelihood of establishing networked ties in both networks, whereas no clear
conclusions can be drawn about whether karma points and being a ‘Gold
Member’ increase or decrease these ties.
Future Work
• Determine the effects of popularity on establishing communication ties on
Reddit.
• Extend our analysis of networked learning to other subreddits (e.g.
‘AskHistorians’).
• Extend our analysis of networked learning to other social media (e.g. Twitter).

