Dataset Repositories 
Fabio Fumarola 
Computer Science Department 
University of Bari
Stanford Large Network Dataset Collection 
https://snap.stanford.edu/data/
• Social networks: online social networks; edges represent interactions between people
• Networks with ground-truth communities: ground-truth network communities in social and information networks
• Communication networks: email communication networks with edges representing communication
• Citation networks: nodes represent papers, edges represent citations
• Collaboration networks: nodes represent scientists, edges represent collaborations (co-authoring a paper)
• Web graphs: nodes represent webpages and edges are hyperlinks
• Amazon networks: nodes represent products and edges link commonly co-purchased products
• Internet networks: nodes represent computers and edges represent communication
• Road networks: nodes represent intersections and edges represent the roads connecting them
• Autonomous systems: graphs of the internet
• Signed networks: networks with positive and negative edges (friend/foe, trust/distrust)
• Location-based online social networks: social networks with geographic check-ins
• Wikipedia networks and metadata: talk, editing and voting data from Wikipedia
• Twitter and Memetracker: Memetracker phrases, links and 467 million tweets
• Online communities: data from online communities such as Reddit and Flickr
• Online reviews: data from online review systems such as BeerAdvocate and Amazon
• Information cascades: ...
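Many of the graphs in this collection are distributed as plain-text edge lists: one "source target" pair per line, with header lines starting with "#". The following is a minimal sketch of how such a file could be read into an adjacency map. The function name read_edge_list and the inline sample data are hypothetical and only for illustration; check the format notes of each dataset before relying on it, since some files carry extra columns (for example an edge sign or a timestamp).

```python
import io


def read_edge_list(lines):
    """Build a directed adjacency map from whitespace-separated 'src dst' lines.

    Lines that are empty or start with '#' are skipped. Any columns after
    the first two are ignored (an assumption made for this sketch).
    """
    adjacency = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = (int(token) for token in line.split()[:2])
        adjacency.setdefault(src, set()).add(dst)
        # Register the target too, so nodes with no outgoing edges are kept.
        adjacency.setdefault(dst, set())
    return adjacency


# Hypothetical sample standing in for a downloaded edge-list file.
sample = io.StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n")
graph = read_edge_list(sample)
node_count = len(graph)
edge_count = sum(len(targets) for targets in graph.values())
print(node_count, edge_count)
```

For a real file, the same function can be given an open file handle instead of the in-memory sample. For an undirected dataset, one option is to add each edge in both directions.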
Twitter Dataset 
http://an.kaist.ac.kr/traces/WWW2010.html
100 Million Facebook Pages
http://it.slashdot.org/story/10/07/28/1350222/100-Million-Facebook-Pages-Leaked-On-Torrent-Site?art_pos=6
UC Irvine Datasets
http://odysseas.calit2.uci.edu/doku.php/public:online_social_networks
Additional Results 
• http://stats.stackexchange.com/questions/4451/social-network-datasets
• https://networkdata.ics.uci.edu/resources.php
• http://kevinchai.net/datasets
• https://archive.org/details/friendster-dataset-201107
• http://realitycommons.media.mit.edu/socialevolution.html
