Theses on
Human-generated Content
and Quantitative Analysis
Marco Brambilla
marco.brambilla@polimi.it
marcobrambi
Problem 1.
Knowledge
Extraction
The Answer to the Great Question...
Of Life, the Universe and Everything
Data
Information
Knowledge
Wisdom
Context independence
Understanding
Understanding relations
Understanding patterns
Understanding principles
Overview
Knowledge Enrichment Setting
[Figure: knowledge enrichment setting. Instances: High Frequency (HF) entities and Low Frequency (LF) entities, with unknown (??) relations and attribute values (Property1, Property2). Types: Type1, Type11, Type111, Type2, linked to instances by <<instanceof>>.]
Legend: Seed Entity, Seed Type, Type of interest, Expert inputs
Enrichment problems:
Relations HF - LF entities
Relations LF - LF entities
Typing of LF entities
Extraction of new LF entities
Finding attribute values
Finding attribute values
Emerging Knowledge Harvesting
Input (1): Domain Specific Types
Types selected by the expert
Relevant for the domain
Input (2): Seeds (emerging entities)
Known and selected by the domain expert
Belonging to an expert type
Thoroughly Described
Objectives
(1) Discover candidate unknown emerging entities
(2) Determine the relevance of the candidate
(3) Determine the type of the candidate
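Objectives (1) and (2) can be illustrated with a deliberately naive sketch: hashtags and mentions that co-occur with a seed in the same post become candidates, and the co-occurrence count serves as a rough relevance score. The seeds, the posts, and the function name `candidate_entities` are hypothetical, and co-occurrence counting is an assumption made for illustration, not the method behind these slides. Objective (3), typing, is not covered.

```python
import re
from collections import Counter

# Hypothetical seeds chosen by a domain expert (lower-cased handles).
SEEDS = {"#seedbrand", "@seedbrand"}

# Hypothetical posts, invented for this example.
POSTS = [
    "Loving the new #seedbrand collection with @newdesigner",
    "@newdesigner and #seedbrand at #fashionweek",
    "Nothing to do with fashion #lunch",
]

def candidate_entities(posts, seeds):
    """Count hashtags/mentions that appear in the same post as a seed."""
    counts = Counter()
    for post in posts:
        handles = {h.lower() for h in re.findall(r"[#@]\w+", post)}
        if handles & seeds:                  # the post mentions at least one seed
            counts.update(handles - seeds)   # every other handle is a candidate
    return counts

candidates = candidate_entities(POSTS, SEEDS)
print(candidates.most_common())
# [('@newdesigner', 2), ('#fashionweek', 1)]
```

Posts without a seed (the third one here) contribute no candidates, which keeps the search anchored to the expert's domain.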
Building Triples (Subject-Predicate-Object)
Steps: relation extraction, subject and object extraction, triples composition
Example: "Beautiful #AngelinaJolie on the cover of @THR wears Amen silk top"
Subject: #AngelinaJolie
Relation (verb): wear
Object: Amen silk top
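The three steps on this slide (relation extraction, subject and object extraction, triple composition) can be walked through on the example post with a minimal sketch. It assumes a tiny hand-written verb lexicon and simple token heuristics: the first hashtag before the verb is the subject, and the text after the verb is the object. The `VERBS` lexicon and the function name `extract_triple` are hypothetical, and this is not the pipeline used in the work presented here.

```python
# Hypothetical mini-lexicon mapping inflected verbs to a relation lemma.
VERBS = {"wears": "wear", "wore": "wear", "wearing": "wear"}

def extract_triple(text):
    """Return (subject, relation, object), or None if the heuristics fail."""
    tokens = text.split()

    # Step 1, relation extraction: first token found in the verb lexicon.
    verb_idx = next(
        (i for i, tok in enumerate(tokens) if tok.lower() in VERBS), None
    )
    if verb_idx is None:
        return None
    relation = VERBS[tokens[verb_idx].lower()]

    # Step 2, subject and object extraction.
    hashtags = [tok for tok in tokens[:verb_idx] if tok.startswith("#")]
    obj = " ".join(tokens[verb_idx + 1:]).rstrip(".,!?")
    if not hashtags or not obj:
        return None

    # Step 3, triple composition.
    return (hashtags[0], relation, obj)

tweet = "Beautiful #AngelinaJolie on the cover of @THR wears Amen silk top"
print(extract_triple(tweet))
# ('#AngelinaJolie', 'wear', 'Amen silk top')
```

A realistic system would likely rely on part-of-speech tagging or dependency parsing rather than a fixed lexicon; the sketch only makes the subject-predicate-object decomposition concrete.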
Problem 2.
Social Spaces:
Volume, Consumption,
Presence, Flows
Foursquare
Checkins
Copyright © Milano-Hub project @ Politecnico di Milano
Not only space…
Model of social media and reality sensing
Flickr
Cities into cities, by language
http://urbanscope.polimi.it
Foursquare
• Check-ins explicitly performed in venues all around the world
• Data set: geo-localized Foursquare venues, collected through one query every 50 m with radius >50 m over:
• Milan area: 20 km x 17.5 km
• Some numbers
• Total n° of venues: 90K (dirty)
• Total n° of valid venues: 43K
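The sampling scheme above implies a fixed number of queries, which a short sketch can make explicit. It assumes a flat grid measured in metres with one query at the centre of each 50 m cell; the function name `grid_points` and the cell-centre placement are assumptions, since the slides give only the step, the radius bound, and the area size.

```python
import math

AREA_WIDTH_M = 20_000    # 20 km, from the slide
AREA_HEIGHT_M = 17_500   # 17.5 km, from the slide
STEP_M = 50              # one query every 50 m, from the slide

def grid_points(width_m, height_m, step_m):
    """Yield (x, y) cell-centre offsets in metres from the south-west corner."""
    for row in range(height_m // step_m):
        for col in range(width_m // step_m):
            yield (col * step_m + step_m / 2, row * step_m + step_m / 2)

# Number of queries needed to sweep the whole area.
n_queries = (AREA_WIDTH_M // STEP_M) * (AREA_HEIGHT_M // STEP_M)

# Smallest radius that still covers a full step x step cell from its centre.
min_cover_radius = STEP_M * math.sqrt(2) / 2

print(n_queries)                   # 140000
print(round(min_cover_radius, 1))  # 35.4
```

Because the stated radius (>50 m) exceeds the roughly 35.4 m needed to cover one cell, neighbouring queries overlap, so the same venue can be returned more than once and results would presumably need deduplication by venue identifier.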
Google Places
Only in the UI (scraping)
Via API
Correlation Google Place - Foursquare
Dataset   # obs   Min       1Q       Median   3Q       Max
Grid      230     -0.6406   0.0744   0.3536   0.5529   0.8796
Place     283     -0.6406   0.0654   0.3569   0.5829   0.8796
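A summary of this shape (number of observations, minimum, quartiles, maximum) can be produced by computing one correlation coefficient per spatial unit and then summarising the coefficients. The sketch below assumes Pearson correlation and uses invented toy series; the slides state neither the correlation measure nor how the per-unit series were built, and the names `pearson` and `five_number_summary` are hypothetical.

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def five_number_summary(values):
    """Observation count, min, quartiles and max of a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"n": len(values), "min": min(values), "q1": q1,
            "median": median, "q3": q3, "max": max(values)}

# Toy data (invented): per unit, one series per source.
units = {
    "unit_a": ([1, 2, 3, 4], [2, 4, 6, 8]),  # same trend in both sources
    "unit_b": ([1, 2, 3, 4], [8, 6, 4, 2]),  # opposite trend
    "unit_c": ([1, 2, 3, 4], [1, 3, 2, 4]),  # partial agreement
}

correlations = [pearson(a, b) for a, b in units.values()]
summary = five_number_summary(correlations)
print(summary)
```

Reporting quartiles rather than a single average shows how much agreement between the two sources varies from one unit to another, which is what the wide Min to Max range in the table suggests.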
Electricity Consumption data
• Electrical hubs + mobile phone calls
• Grid-based analysis
• Which locations do people visit from where?
Statistics about nationality
Event location per cluster of users
Approach
City-scale: mobile telephone and (coarse-grained geo-located) social media data
Street/square: people counting & profiling IoT sensors
Point of Interest: people counting sensors, WiFi log analysis, beacons and (fine-grained geo-located) social media
Descriptive, predictive, privacy-preserving and, when needed, real-time analysis of a variety of (fused) data sources
Problem 3.
Social Aspects of
Sw. Development
Collaborative activities on sw. development
• Development repositories
• GitHub
• Developer communities
• Interactions and contributions
• Networks (social?)
Machine learning on network data, representation learning
In collaboration with UOC, Barcelona
• Roles of developers
Collaboration networks
• Cross-project collaborations
• Networking
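One simple way to picture a collaboration network is to link two developers whenever they contributed to the same project and to weight the link by the number of projects they share. The sketch below follows that assumption on invented contribution records; the developer and project names and the function name `collaboration_network` are hypothetical, and the slides do not specify how the networks are actually built.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (developer, project) contribution records.
contributions = [
    ("alice", "proj-a"), ("bob", "proj-a"),
    ("alice", "proj-b"), ("bob", "proj-b"), ("carol", "proj-b"),
    ("carol", "proj-c"),
]

def collaboration_network(records):
    """Map each developer pair to the number of projects they share."""
    devs_by_project = defaultdict(set)
    for dev, project in records:
        devs_by_project[project].add(dev)

    edges = defaultdict(int)
    for devs in devs_by_project.values():
        # Sort so that each pair always appears in the same order.
        for a, b in combinations(sorted(devs), 2):
            edges[(a, b)] += 1
    return dict(edges)

network = collaboration_network(contributions)

# Pairs that work together on more than one project.
cross_project = {pair: w for pair, w in network.items() if w > 1}

print(network)
# {('alice', 'bob'): 2, ('alice', 'carol'): 1, ('bob', 'carol'): 1}
print(cross_project)
# {('alice', 'bob'): 2}
```

The resulting weighted edge list is the kind of input that graph machine learning or representation learning methods typically start from.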
Problem 4.
Computational
Social Science
Politics, Debates and Other Societal Issues
• BREXIT
• US Political Observer
• Other cases
US Midterm
• Antonio Lopardo
Brexit
• Emre Calisir, now @ MIT Media Lab
Brexit
Radio Shows & Public Debates
• Real-time radio transcripts from 60 stations
• Twitter data in some US states
• Collaboration with MIT & cortico.ai
News and News Sharing
• Understanding how and when people share pieces of news on social networks
• Profiling users against possible risks (fake news, superficial behaviour)
Problem 5.
Content
Understanding
KB and Text
• How can I use KBs for improving…
• Topic analysis
• Other general NLP tasks
• Pre-trained models available (language models, …)
• https://www.aaai-make.info/
Problem 6.
Digital
Humanities for
the Future
Engagement for Future Visions
• Mission-oriented policies
• Gamification and user engagement for policy directions
Perspective
THESIS
Engage
SHARED VISION
New way of engaging
citizens
The evolution of the PE spectrum
Images are the new Esperanto
Gamification
• The use of game thinking and game mechanics to engage users and solve problems
• Turning user experience into a game can produce behavior change
KB and Text
• Possible futures
• The KB of science fiction
• Asimov, …
Alexa, Tell me a NEW Story
• NN-based approaches for generating new content
• Stories for children
• Jokes !?
Problem 7.
Data For
Moving
Mobility
Data Models for Gita
Further Problems
…
• ANN for solving differential equations (with Harvard IACS)
• Conditional GANs for generating data for specific contexts (with Harvard IACS)
THANKS!
QUESTIONS?
Marco Brambilla @marcobrambi marco.brambilla@polimi.it
http://datascience.deib.polimi.it http://home.deib.polimi.it/marcobrambi

Available Data Science M.Sc. Thesis Proposals