DATA
SCIENCE
POP UP
AUSTIN
The Science of Sharing
Jason Baldridge
Co-Founder / Chief Scientist, People Pattern
jasonbaldridge
#datapopupaustin
April 13, 2016
Galvanize, Austin Campus
Data Science Popup Austin: The Science of Sharing
The Science of Sharing
Jason Baldridge
Co-founder, People Pattern
Associate Professor, The University of Texas at Austin
@jasonbaldridge
Preliminary notes
• This talk incorporates results and images from many different research papers by people working primarily in social network analysis.
• As such, this talk synthesizes that work into a narrative that introduces key abilities and results. I felt this high-level view was the best way to discuss “The Science of Sharing”, rather than relying primarily on my own work or work done at People Pattern. I was also really impressed by the work researchers are doing in social network analysis and wanted to share even a glimpse of the problems they are tackling and what they are finding.
• The high-level progression of this talk is:
  • Document analysis at scale: meme tracking combined with other variables like sentiment and bias.
  • Social network analysis at scale: information cascades and virality, and inference of social networks given meme-like information as contagions.
  • The node-level perspective and its effects on what an individual sees and shares: illusions, effort and overload, topics, personality and demographics.
  • Personas and segmentation: grouping based on demographics and interests.
• The last item is work done at People Pattern. I stress that neither I nor People Pattern was involved with the research papers cited in the other slides. My own academic research focuses on natural language processing, especially machine learning for syntactic parsing and text-based geolocation. For more on those topics, see: http://www.jasonbaldridge.com/papers
• References and links to PDFs of all cited work are at the end of this deck. They are also available in this post on my blog: https://bcomposes.wordpress.com/2015/10/23/references-for-my-izeafest-talk/
Meme tracking
Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.”
Automatic detection and tracking of memes over time.
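To make the idea of tracking a meme across wording variants concrete, here is a minimal sketch. It is a simplified stand-in and not the method of Leskovec et al.: it greedily groups phrases whose word sets overlap strongly (Jaccard similarity). The function names, the 0.5 threshold, and the sample phrases are assumptions made for illustration.

```python
def tokens(phrase):
    """Lowercased set of whitespace-separated words in a phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Greedily assign each phrase to the first cluster whose seed phrase
    is similar enough; otherwise start a new cluster."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(tokens(phrase), tokens(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


# Toy input, invented for this sketch.
sample = [
    "lipstick on a pig",
    "you can put lipstick on a pig",
    "yes we can",
]
clusters = cluster_phrases(sample)
```

Counting how many documents mention any member of a cluster per day would then give a rough volume curve for that meme over time.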
Meme tracking
Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.”
Meme oscillation heartbeat from blogs to mainstream media.
Quoting Patterns in
Political Coverage
Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”
Measuring bias is subjective and hard.
Personal estimates of bias are influenced by the availability heuristic.
57% of Americans perceive media as biased.
73% of conservatives think bias is liberal.
11% of liberals think bias is liberal.
Similarly: husbands and wives both estimate their
contributions to family activities differently.
[Lee & Waite (2005): http://www.jstor.org/stable/3600272]
Read this!
Quoting Patterns in
Political Coverage
Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”
Automated tracking of quotations from Obama’s speeches.
Red: quoted in conservative media. Blue: quoted in liberal media.
Quoting Patterns in
Political Coverage
Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”
Dimensionality reduction reveals two main bias dimensions:
(one) independent-mainstream & (two) foreign-liberal-conservative.
Quoting Patterns in
Political Coverage
Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.”
Sentiment across two bias dimensions:
being more mainstream and more conservative correlates with negative sentiment.
Structural Virality
Goel et al. (2015). “The Structural Virality of Online Diffusion”
Information cascades can propagate via broadcast and viral diffusion.
Most cascades contain both broadcast and viral spreading.
[Figure panels: Broadcast; Viral.]
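Goel et al. define structural virality as the average distance between all pairs of nodes in a cascade's diffusion tree. The sketch below computes that quantity for two toy cascades, a pure broadcast (star) and a pure chain. The function name and the edge-list input format are assumptions, and the code assumes the cascade is a connected tree.

```python
from collections import deque


def structural_virality(edges):
    """Average shortest-path distance over all ordered pairs of distinct
    nodes in a connected cascade tree given as (parent, child) edges."""
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, set()).add(child)
        adjacency.setdefault(child, set()).add(parent)
    n = len(adjacency)
    if n < 2:
        raise ValueError("need at least two nodes")
    total = 0
    for source in adjacency:
        # Breadth-first search gives hop distances from this source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
    return total / (n * (n - 1))


# Broadcast: one root reaches four adopters directly.
star = [("root", leaf) for leaf in ("a", "b", "c", "d")]
# Viral: each adopter passes the item to exactly one other person.
chain = [("root", "a"), ("a", "b"), ("b", "c"), ("c", "d")]

star_score = structural_virality(star)
chain_score = structural_virality(chain)
```

With five nodes each, the star scores 1.6 and the chain 2.0. The gap widens with size: a star's score stays below 2 however large it gets, while a chain's grows with its length.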
Structural Virality
Goel et al. (2015). “The Structural Virality of Online Diffusion”
Twitter cascades characterized by structural virality,
increasing down and to the right.
Structural Virality
Goel et al. (2015). “The Structural Virality of Online Diffusion”
Petition cascades are smallest, but have highest structural virality.
Structural Virality
Goel et al. (2015). “The Structural Virality of Online Diffusion”
99% of content adoptions terminate in a single generation.
The largest image and video cascades are low on structural virality.
Broadcast is by far the dominant mode to reach large audiences.
This means pay-to-play when you need to go big reliably.
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Contagion model: Information infects nodes, which become active.
Information spreads from active nodes along the network edges.
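A contagion model of this kind can be simulated in a few lines. The sketch below uses a discrete-time independent cascade, a simpler stand-in for the continuous-time transmission model in the cited paper. The graph, the function name, and the single transmission probability `p` are assumptions for illustration.

```python
import random


def independent_cascade(graph, seeds, p, rng):
    """Simulate one cascade: each newly active node gets one chance to
    activate each inactive out-neighbor, succeeding with probability p."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


# Hypothetical directed network: node -> nodes it can pass information to.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": [],
    "e": ["a"],
}

rng = random.Random(0)
everyone_reachable = independent_cascade(graph, ["a"], 1.0, rng)
nobody_else = independent_cascade(graph, ["a"], 0.0, rng)
```

With `p = 1.0` the cascade reaches every node reachable from the seed, and with `p = 0.0` it stops at the seed. Intermediate values give cascades whose size varies from run to run.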
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Given information cascades, infer network using contagion model.
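To give a feel for the inverse problem, the sketch below scores candidate edges from observed cascades with a naive heuristic: a pair (u, v) earns credit whenever u became active shortly before v. This is not the inference algorithm of Gomez Rodriguez et al., which fits a probabilistic transmission model. The scoring rule, function names, and toy cascades are all assumptions.

```python
import math
from collections import defaultdict


def score_edges(cascades):
    """Sum exp(-delay) over every cascade in which u activates before v.
    Each cascade maps node -> activation time."""
    scores = defaultdict(float)
    for times in cascades:
        for u, t_u in times.items():
            for v, t_v in times.items():
                if t_u < t_v:
                    scores[(u, v)] += math.exp(-(t_v - t_u))
    return dict(scores)


def top_edges(cascades, k):
    """Return the k highest-scoring directed edges (ties broken by name)."""
    scores = score_edges(cascades)
    ranked = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return ranked[:k]


# Toy cascades generated by hand from a hidden chain a -> b -> c.
cascades = [
    {"a": 0.0, "b": 1.0, "c": 2.0},
    {"a": 0.0, "b": 1.0},
    {"b": 0.0, "c": 1.0},
]
inferred = top_edges(cascades, 2)
```

On this toy input the two direct links outrank the indirect pair (a, c), because the longer delay between a and c earns less credit.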
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Inferred structure shows emerging and vanishing clusters.
Red: mainstream media. Blue: blogs.
[Figure panels: March 2011; June 2011; October 2011.]
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Evolution of network for Fukushima articles.
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Information generally flows from mainstream media to blogs.
Blogs play a crucial role in information dissemination in civil movements.
Information propagation
Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.”
Blogs and mainstream media swap influence during course of event.
Increased blog influence proportion correlates with social unrest.
Is virality/contagion
a bad metaphor?
Taylor Swift has 65 million Twitter followers who can receive her messages. One individual cannot sneeze on and infect that many people simultaneously.
The likelihood of disease infection increases independently with each exposure to a different infected individual, but the likelihood of “infection” by an idea increases greatly when one is exposed to it by multiple, independent parties.
Majority illusion
Lerman et al. (2015). “The Majority Illusion in Social Networks.”
Friendship paradox: on average, most people have fewer friends than their friends have.
This generalizes to any node attribute, which may explain why people overestimate their friends’ alcohol consumption.
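The friendship paradox is easy to check numerically. The sketch below compares each node's degree with the average degree of its friends on a toy star-shaped network. The graph and the function name are assumptions for illustration.

```python
def friendship_paradox(adjacency):
    """Return (mean degree, mean of friends' average degree, share of
    nodes with fewer friends than their friends have on average)."""
    degree = {node: len(friends) for node, friends in adjacency.items()}
    mean_degree = sum(degree.values()) / len(degree)
    # Average degree of each node's friends (isolated nodes are skipped).
    friend_mean = {
        node: sum(degree[f] for f in friends) / len(friends)
        for node, friends in adjacency.items()
        if friends
    }
    mean_friend_degree = sum(friend_mean.values()) / len(friend_mean)
    share_below = sum(
        1 for node, mean in friend_mean.items() if degree[node] < mean
    ) / len(friend_mean)
    return mean_degree, mean_friend_degree, share_below


# Toy undirected network: one hub connected to four otherwise isolated people.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
mean_degree, mean_friend_degree, share_below = friendship_paradox(star)
```

Here the average person has 1.6 friends, but the average person's friends have 3.4 friends, and four of the five people have fewer friends than their friends do.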
Majority illusion
Lerman et al. (2015). “The Majority Illusion in Social Networks.”
The connectedness of “infected” people greatly impacts the perception of others.
A minority opinion can appear extremely popular for each individual (left side).
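The effect can be reproduced on a toy network by measuring how many people see the "active" attribute in more than half of their neighbors. The graph, the function name, and the strict majority rule below are assumptions for illustration.

```python
def majority_illusion_share(adjacency, active):
    """Share of nodes (with at least one neighbor) for whom more than
    half of their neighbors are active."""
    fooled = 0
    counted = 0
    for node, neighbors in adjacency.items():
        if not neighbors:
            continue
        counted += 1
        active_neighbors = sum(1 for n in neighbors if n in active)
        if active_neighbors > len(neighbors) / 2:
            fooled += 1
    return fooled / counted


# Toy undirected network: two connected hubs, each with three followers.
network = {
    "h1": ["h2", "a", "b", "c"],
    "h2": ["h1", "d", "e", "f"],
    "a": ["h1"], "b": ["h1"], "c": ["h1"],
    "d": ["h2"], "e": ["h2"], "f": ["h2"],
}
active = {"h1", "h2"}  # only 2 of 8 nodes hold the opinion

global_share = len(active) / len(network)
perceived_majority = majority_illusion_share(network, active)
```

Only 25% of nodes are active, yet 75% of nodes see an active majority among their neighbors, because the active nodes are the well-connected ones.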
Majority illusion
Lerman et al. (2015). “The Majority Illusion in Social Networks.”
The size of majority illusion in Digg and political blogs, varying
the number and connectedness of infected nodes.
Personality classification
Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.”
Language production provides a window on personality at scale.
Personality classification
Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.”
Bigrams as indicators of high/low scorers in personality classification.
[Table: indicative bigrams for high scorers and low scorers on Neuroticism, Extroversion, Openness, Agreeableness, and Conscientiousness.]
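One simple way to surface indicative bigrams is to compare how often each bigram appears in text from high scorers versus low scorers. The sketch below ranks bigrams by a smoothed log-ratio of raw counts. It is not the classifier used by Iacobelli et al., and the function names, the add-one smoothing, and the sample sentences are invented for illustration.

```python
import math
from collections import Counter


def bigrams(text):
    """Adjacent word pairs from lowercased, whitespace-split text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def distinctive_bigrams(high_texts, low_texts, top_n=3):
    """Rank bigrams by log((high count + 1) / (low count + 1))."""
    high = Counter(b for text in high_texts for b in bigrams(text))
    low = Counter(b for text in low_texts for b in bigrams(text))
    vocabulary = set(high) | set(low)
    score = {b: math.log((high[b] + 1) / (low[b] + 1)) for b in vocabulary}
    ranked = sorted(vocabulary, key=lambda b: (-score[b], b))
    return ranked[:top_n]


# Invented toy texts standing in for blog posts from each group.
high_scorers = ["so stressed today", "so stressed again"]
low_scorers = ["feeling calm today"]
top = distinctive_bigrams(high_scorers, low_scorers, top_n=1)
```

A real analysis would normalize by corpus size and filter rare bigrams, since raw counts favor whichever group wrote more text.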
Ad Targeting
and Personality
Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.”
Twitter users whose language indicates higher openness and lower
neuroticism are more likely to respond positively to an ad.
Antisocial Behavior Online
Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.”
Comparing banned & normal users (in retrospect): banned users wrote
posts that are less relevant, harder to read, and less positive.
FBU: Future banned users
NBU: Never banned users
Race and sharing
http://www.theatlantic.com/technology/archive/2015/10/race-social-media/408889/
Frequency of sharing for topics on social media varies by race.
[Chart panels: Events or entertainment; Education or schools.]
Re “race”: please
read this book.
Tailored audiences
People Pattern and Smarty Pants Vitamins case study.
Human analysis and machine learning can be used to characterize
and identify personas using social media profiles.
Tailored audiences
People Pattern and Smarty Pants Vitamins case study.
Interest prediction and extraction of interest-specific keywords.
Promoted tweet copy informed by persona-based keywords.
Tailored audiences
People Pattern and Smarty Pants Vitamins case study.
Persona-based campaigns with audience-driven ad copy
produced higher engagement at lower cost per conversion.
[Charts: conversions and cost per conversion for the Control, Overscheduled Parent, and Grab & Go campaigns.]
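Cost per conversion is simply spend divided by conversions. The sketch below shows the arithmetic with hypothetical numbers. The spend and conversion figures and the campaign labels are invented for illustration and are not the case-study results.

```python
def cost_per_conversion(spend, conversions):
    """Spend divided by the number of conversions."""
    if conversions <= 0:
        raise ValueError("conversions must be positive")
    return spend / conversions


# Hypothetical campaigns: (spend in dollars, conversions). Not real data.
campaigns = {
    "control": (3000.0, 100),
    "persona_a": (3000.0, 200),
    "persona_b": (3000.0, 150),
}

cpc = {
    name: cost_per_conversion(spend, conversions)
    for name, (spend, conversions) in campaigns.items()
}
cheapest = min(cpc, key=cpc.get)
```

At equal spend, a campaign that doubles its conversions halves its cost per conversion, which is how higher engagement and lower cost can move together.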
Sub-micro segmentation
We have limited attention and many options.
The best, most relevant content is often created by those
with very similar passions, interests, and demographics.
Doresa Jennings
• PhD, BGSU
• Lives in the southern USA
• Mother of profoundly gifted children
• Homeschooler
• Commitment to STEM
• African-American
Cheryl Baldridge
• JD, Yale
• Lives in the southern USA
• Mother of profoundly gifted children
• Homeschooler
• Commitment to STEM
• African-American
Dr. J creates a lot of original text and video.
My busy wife makes time for it all.
Other content is less compelling for her.
http://kdacademy.blogspot.com/
https://www.youtube.com/user/DAJedu
Conclusion
Authentic, original content is the most compelling.
Audience understanding is essential: demographics, personality and microsegment relevance.
Pay-to-play to reliably get your word out.
Content consumers must constantly manage information overload.
Large scale analysis of networks and documents reveals hidden patterns.
References
• Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10508
• Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.” - http://arxiv.org/abs/1504.00680
• Friggeri et al. (2015). “Rumor Cascades.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8122
• Goel et al. (2015). “The Structural Virality of Online Diffusion.” - https://5harad.com/papers/twiral.pdf
• Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions.” - http://arxiv.org/abs/1403.6838
• Gomez Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.” - http://www.mpi-sws.org/~manuelgr/pubs/S2050124214000034a.pdf
• Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.” - http://www.research.ed.ac.uk/portal/files/12949424/Iacobelli_Gill_et_al_2011_Large_scale_personality_classification_of_bloggers.pdf
• Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10483
• Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.” - http://arxiv.org/abs/1504.00704
• Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.” - https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10595/10505
• Lerman et al. (2015). “The Majority Illusion in Social Networks.” - http://arxiv.org/abs/1506.03022
• Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” - http://www.memetracker.org/quotes-kdd09.pdf
• Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.” - http://snap.stanford.edu/quotus/
• Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.” - http://arxiv.org/abs/1403.6199
• Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.” - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2885844/
@datapopup
#datapopupaustin

Data Science Popup Austin: The Science of Sharing

  • 1. DATA SCIENCE POP UP AUSTIN The Science of Sharing Jason Baldridge Co-Founder / Chief Scientist, People Pattern jasonbaldridge
  • 4. The Science of Sharing Jason Baldridge Co-founder, People Pattern Associate Professor, The University of Texas at Austin @jasonbaldridge
  • 5. Preliminary notes • This talk incorporates results and images from many different research papers by people working primarily in social network analysis. • As such, this talk is a synthesis of that work put together into a narrative to introduce key abilities and results. I felt this high-level view was the best way to discuss “The Science of Sharing”, rather than relying primarily on my own work or work done at People Pattern. Also, I was really impressed by the work researchers are doing in social network analysis and wanted to share even a glimpse of the problems they are tackling and what they are finding. • The high-level progression of this talk is: • Document analysis at scale: meme tracking combined with other variables like sentiment and bias • Social network at scale: information cascades and virality, inference of social networks given meme-like information as contagions. • The node level perspective and its effects on what an individual sees and shares: Illusions, effort and overload, topics, personality and demographics. • Personas and segmentation: grouping based on demographics and interests. • The last item is work done at People Pattern. I stress that neither I nor People Pattern was involved with the research papers cited in the other slides. My own academic research focuses on natural language processing, especially machine learning for learning syntactic parsers and performing geolocation using text. For more on those topics, see: http:// www.jasonbaldridge.com/papers • References and links to PDF’s of all cited work are at the end of this deck. They are also available on this post on my blog: https://bcomposes.wordpress.com/2015/10/23/references-for-my-izeafest-talk/
  • 6. Meme tracking Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” Automatic detection and tracking of memes over time.
  • 7. Meme tracking Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” Meme oscillation heartbeat from blogs to mainstream media.
  • 8. Quoting Patterns in Political Coverage Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Pattterns.” Measuring bias is subjective and hard. Personal estimates of bias are influenced by the availability heuristic. 57% of Americans perceive media as biased. 73% of conservatives think bias is liberal. 11% of liberals think bias is liberal. Similarly: husbands and wives both estimate their contributions to family activities differently. [Lee & Waite (2005): http://www.jstor.org/stable/3600272] Read this!
  • 9. Quoting Patterns in Political Coverage Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Pattterns.” Automated tracking of quotations from Obama’s speeches. Red: quoted in conservative media. Blue: quoted in liberal media.
  • 10. Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Pattterns.” Dimensionality reduction reveals two main bias dimensions: (one) independent-mainstream & (two) foreign-liberal-conservative. Quoting Patterns in Political Coverage
  • 11. Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Pattterns.” Sentiment across two bias dimensions: more mainstream & conservative correlates with negative sentiment. Quoting Patterns in Political Coverage
  • 12. Structural Virality Goel et al. (2015). “The Structural Virality of Online Diffusion” Information cascades can propagate via broadcast and viral diffusion. Most cascades contain both broadcast and viral spreading. Broadcast Viral
  • 13. Structural Virality Goel et al. (2015). “The Structural Virality of Online Diffusion” Twitter cascades characterized by structural virality, increasing down and to the right.
  • 14. Structural Virality Goel et al. (2015). “The Structural Virality of Online Diffusion” Petition cascades are smallest, but have highest structural virality.
  • 15. Structural Virality Goel et al. (2015). “The Structural Virality of Online Diffusion” 99% of content adoptions terminate in a single generation The largest image and video cascades are low on structural virality. Broadcast is by far the dominant mode to reach large audiences. This means pay-to-play when you need to go big reliably.
  • 16. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Contagion model: Information infects nodes, which become active. Information spreads from active nodes along the network edges.
  • 17. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Given information cascades, infer network using contagion model.
  • 18. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Inferred structure shows emerging and vanishing clusters. Red: mainstream media. Blue: blogs. March 2011 June 2011 October 2011
  • 19. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Evolution of network for Fukushima articles.
  • 20. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Information generally flows from mainstream media to blogs. Blogs play a crucial role in information dissemination in civil movements.
  • 21. Information propagation Gomez Rodriguez et al (2014). “Uncovering the structure and temporal dynamics of information propagation.” Blogs and mainstream media swap influence during course of event. Increased blog influence proportion correlates with social unrest.
  • 22. Is virality/contagion a bad metaphor? Taylor Swift has 65 million Twitter followers who can receive her messages. One individual cannot sneeze on and infect that many people simultaneously. The likelihood of disease infection increases independently with exposure to different infected individuals, but “infection” by an idea increases greatly when exposed to it by multiple, independent parties.
  • 23. Majority illusion Lerman et al. (2015). “The Majority Illusion in Social Networks.” Friendship paradox: on average most people have fewer friends than their friends. This generalizes to any node attribute, which may explain why people overestimate their friends’ alcohol consumption.
  • 24. Majority illusion Lerman et al. (2015). “The Majority Illusion in Social Networks.” The connectedness of “infected” people greatly impacts the perception of others. A minority opinion can appear extremely popular for each individual (left side).
  • 25. Majority illusion Lerman et al. (2015). “The Majority Illusion in Social Networks.” The size of majority illusion in Digg and political blogs, varying the number and connectedness of infected nodes.
  • 26. Personality classification Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.” Language production provides a window on personality at scale.
  • 27. Personality classification Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.” Bigrams as indicators of high/low scorers in personality classification. High scorers Low scorers Neuroticism Extroversion Openness Agreeableness Conscientiousness
  • 28. Ad Targeting and Personality Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.” Twitter users whose language indicates higher openness and lower neuroticism are more likely to respond positively to an ad.
  • 29. Antisocial Behavior Online Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.” Comparing banned & normal users (in retrospect): banned users wrote posts that are less relevant, harder to read, and less positive. FBU: Future banned users NBU: Never banned users
  • 30. Race and sharing http://www.theatlantic.com/technology/archive/2015/10/race-social-media/408889/ Frequency of sharing for topics on social media varies by race. Events or entertainment Education or schools Re “race”: please read this book.
  • 31. Tailored audiences People Pattern and Smarty Pants Vitamins case study. Human analysis and machine learning can be used to characterize and identify personas using social media profiles.
  • 32. Tailored audiences People Pattern and Smarty Pants Vitamins case study. Interest prediction and extraction of interest-specific keywords. Promoted tweet copy informed by persona-based keywords.
  • 33. Tailored audiences People Pattern and Smarty Pants Vitamins case study. Persona-based campaigns with audience-driven ad copy produced higher engagement at lower cost per conversion. [Charts: conversions and cost per conversion for the Control, Overscheduled Parent, and Grab & Go campaigns.]
  • 34. Sub-micro segmentation We have limited attention and many options. The best, most relevant content is often created by those with very similar passions, interests, and demographics. Doresa Jennings: PhD, BGSU; lives in the southern USA; mother of profoundly gifted children; homeschooler; commitment to STEM; African-American. Cheryl Baldridge: JD, Yale; lives in the southern USA; mother of profoundly gifted children; homeschooler; commitment to STEM; African-American. Dr. J creates a lot of original text and video. My busy wife makes time for it all. Other content is less compelling for her. http://kdacademy.blogspot.com/ https://www.youtube.com/user/DAJedu
  • 35. Conclusion Authentic, original content is the most compelling. Audience understanding is essential: demographics, personality, and microsegment relevance. Use pay-to-play to reliably get your word out. Content consumers must constantly manage information overload. Large-scale analysis of networks and documents reveals hidden patterns.
  • 36. References
    • Chen et al. (2015). “Making Use of Derived Personality: The Case of Social Media Ad Targeting.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10508
    • Cheng et al. (2015). “Antisocial Behavior in Online Discussion Communities.” - http://arxiv.org/abs/1504.00680
    • Friggeri et al. (2015). “Rumor Cascades.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM14/paper/view/8122
    • Goel et al. (2015). “The Structural Virality of Online Diffusion.” - https://5harad.com/papers/twiral.pdf
    • Gomez-Rodriguez et al. (2014). “Quantifying Information Overload in Social Media and its Impact on Social Contagions.” - http://arxiv.org/abs/1403.6838
    • Gomez-Rodriguez et al. (2014). “Uncovering the structure and temporal dynamics of information propagation.” - http://www.mpi-sws.org/~manuelgr/pubs/S2050124214000034a.pdf
    • Iacobelli et al. (2015). “Large Scale Personality Classification of Bloggers.” - http://www.research.ed.ac.uk/portal/files/12949424/Iacobelli_Gill_et_al_2011_Large_scale_personality_classification_of_bloggers.pdf
  • 37. References
    • Kang and Lerman (2015). “User Effort and Network Structure Mediate Access to Information in Networks.” - http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10483
    • Kooti et al. (2015). “Evolution of Conversations in the Age of Email Overload.” - http://arxiv.org/abs/1504.00704
    • Kulshrestha et al. (2015). “Characterizing Information Diets of Social Media Users.” - https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/viewFile/10595/10505
    • Lerman et al. (2015). “The Majority Illusion in Social Networks.” - http://arxiv.org/abs/1506.03022
    • Leskovec et al. (2009). “Meme-tracking and the Dynamics of the News Cycle.” - http://www.memetracker.org/quotes-kdd09.pdf
    • Niculae et al. (2015). “QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns.” - http://snap.stanford.edu/quotus/
    • Weng et al. (2014). “Predicting Successful Memes using Network and Community Structure.” - http://arxiv.org/abs/1403.6199
    • Yarkoni (2010). “Personality in 100,000 Words: A large-scale analysis of personality and word use among bloggers.” - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2885844/