data & content design
Frieda Brioschi - frieda.brioschi@gmail.com
Emma Tracanella - emma.tracanella@gmail.com
AROUND DATA SCIENCE
LESSON 5 - 2020
THE COURSE
1. Introduction. What are data and information, why they matter
2. How to collect and organize data
3. Information classification
4. Data lingo
5. Around Data Science
6. Computer & Humans: how we perceive information
7. Visual communication of numerical data
8. Visual communication of non numerical data
9. Content type and effectiveness
10. Storytelling with data
11. Tools for analysis and data visualization
12. Artificial Intelligence demythologized
LET’S START WITH YOUR DATA PROJECT
DESCRIBE YOUR PROJECT
A COUPLE OF DIGRESSIONS
▸ storage issues
▸ http://blog.odsi.co.uk/wp-content/uploads/2013/08/History-of-computer-data-storage.png.jpg
▸ the rise of data centers
▸ computational power
▸ the Internet
MARGARET HAMILTON
DATA CENTER CLOUD (4,563 IN 2019)
https://www.digitalic.it/tecnologia/data-center-cloud-numeri-e-diffusione-nel-mondo-litalia-tra-i-paesi-europei-che-ne-ospita-di-piu
WHAT ARE BIG DATA
DEFINITION
The term “big data” refers to data that is so large, fast or complex that it’s difficult or impossible to
process using traditional methods. The concept of big data gained momentum in the early 2000s
when industry analyst Doug Laney articulated the definition of big data as the three V’s:
▸ Volume: Organizations collect data from a variety of sources, including business transactions,
smart (IoT) devices, industrial equipment, videos, social media and more. In the past, storing it
would have been a problem.
▸ Velocity: With the growth in the Internet of Things, data streams into businesses at an
unprecedented speed and must be handled in a timely manner, in near-real time.
▸ Variety: Data comes in all types of formats – from structured, numeric data in traditional
databases to unstructured text documents, emails, videos, audio, stock ticker data and financial
transactions.
(ACCORDING TO SAS)
https://www.visualcapitalist.com/big-data-keeps-getting-bigger/
CORRELATION
When two sets of data are strongly linked together we say they have a High Correlation.
▸ Correlation is Positive when the values increase together, and
▸ Correlation is Negative when one value decreases as the other increases
Correlation can have a value between -1 and 1:
▸ 1 is a perfect positive correlation
▸ 0 is no correlation (the values don't seem linked at all)
▸ -1 is a perfect negative correlation
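The three reference values above can be reproduced with a small sketch. It assumes the slide means the Pearson correlation coefficient (the slide does not name a specific coefficient), and the data series are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, chosen only to hit the three reference cases.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases together with hours
falling = [10, 8, 6, 4, 2]   # decreases as hours increases
unrelated = [2, 1, 4, 1, 2]  # no linear link with hours

r_pos = pearson(hours, rising)
r_neg = pearson(hours, falling)
r_none = pearson(hours, unrelated)

print(r_pos, r_neg, r_none)
```

Real data rarely lands exactly on 1, 0 or -1; values in between indicate a weaker or stronger linear link.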
CORRELATION
Correlation is one of the most widely used statistical concepts.
Since the term "correlation" refers to a mutual relationship or association between
quantities, why is it a useful metric?
▸ Correlation can help in predicting one quantity from another
▸ Correlation can (but often does not) indicate the presence of a causal
relationship
▸ Correlation is used as a basic quantity and foundation for many other
modeling techniques
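As an illustration of the first point, here is a minimal sketch of predicting one quantity from another with a least-squares line, whose slope equals the correlation multiplied by the ratio of the standard deviations. The variable names and numbers (ad_spend, sales) are hypothetical and not taken from the lesson.

```python
import statistics


def fit_line(xs, ys):
    """Return (r, slope, intercept) of the least-squares line y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance, using the same n - 1 denominator as statistics.stdev.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    r = cov / (sd_x * sd_y)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical observations: advertising spend vs. sales, in arbitrary units.
ad_spend = [1, 2, 3, 4, 5]
sales = [2.9, 5.2, 6.8, 9.1, 11.0]

r, slope, intercept = fit_line(ad_spend, sales)


def predict(x):
    """Predicted sales for a given spend, using the fitted line."""
    return slope * x + intercept


print(round(r, 3), round(slope, 2), round(intercept, 2), round(predict(6), 2))
```

A good prediction like this one still says nothing about cause: the second point above applies, and a high correlation alone does not show that spending drives sales.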
https://thenextweb.com/growth-quarters/2020/01/30/digital-trends-2020-every-single-stat-you-need-to-know-about-the-internet/
LINKED DATA
AN EXAMPLE OF ONTOLOGY
http://mappings.dbpedia.org/server/ontology/classes/
LINKED DATA / LOD
Linked data is structured data which is interlinked with other data so it becomes
more useful through semantic queries. It builds upon standard Web technologies,
but rather than using them to serve web pages only for human readers, it extends
them to share information in a way that can be read automatically by computers.
Part of the vision of linked data is for the Internet to become a global database.
Linked data may also be open data, in which case it is usually described as linked
open data (LOD).
▸ https://en.wikipedia.org/wiki/Linked_data
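To make "read automatically by computers" concrete, here is a minimal sketch of one common serialization, JSON-LD, built with Python's standard json module. It uses the schema.org vocabulary that appears later in this lesson; the example.org identifiers are placeholders invented for illustration, not real records.

```python
import json

# A description of a person as JSON-LD.
# "@context" tells a machine which vocabulary the keys come from;
# "@id" gives the entity a URL other datasets can link to.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.org/people/margaret-hamilton",  # hypothetical URL
    "name": "Margaret Hamilton",
    "jobTitle": "Software engineer",
    # "sameAs" is the interlinking step: it points to the same entity
    # described in another dataset (hypothetical URL).
    "sameAs": ["https://example.org/other-dataset/margaret-hamilton"],
}

# Serialize for publishing, then parse it back as a consuming program would.
document = json.dumps(person, indent=2)
loaded = json.loads(document)

print(document)
```

The same facts could be written in other linked data formats; JSON-LD is used here only because it needs no extra library.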
SCHEMA.ORG
http://schema.org/docs/full.html
GOOGLE KNOWLEDGE GRAPH
https://www.youtube.com/watch?v=mmQl6VGvX-c
WHY LINKED DATA MATTERS
Linked data is a method for publishing structured data using vocabularies like
schema.org that can be connected together and interpreted by machines. Using
linked data, statements encoded in triples can be spread across different
websites.
This enables data from different sources to be connected and queried.
▸ https://wordlift.io/blog/en/entity/linked-data/
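The idea of triples spread across websites can be sketched in plain Python. Each statement is a (subject, predicate, object) tuple; the two "sites", the ex: identifiers and the match helper are hypothetical, and a real project would more likely use an RDF library and a query language such as SPARQL.

```python
# Statements published by two different (hypothetical) websites.
site_a = {
    ("ex:MargaretHamilton", "rdf:type", "schema:Person"),
    ("ex:MargaretHamilton", "schema:name", "Margaret Hamilton"),
}
site_b = {
    ("ex:MargaretHamilton", "schema:jobTitle", "Software engineer"),
    ("ex:MargaretHamilton", "schema:knowsAbout", "Software engineering"),
}

# Because both sites use the same identifier for the subject,
# their statements can simply be merged into one graph.
graph = site_a | site_b


def match(graph, s=None, p=None, o=None):
    """Return the triples matching the given pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything known about one subject, regardless of which site said it.
about = match(graph, s="ex:MargaretHamilton")
# All names in the merged graph.
names = [o for _, _, o in match(graph, p="schema:name")]

for triple in about:
    print(triple)
```

The merge works only because both sources agree on the subject identifier and on the vocabulary, which is the role of shared vocabularies like schema.org.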
Data-Informed Decision Making
To Making Better Data-Informed Decisions
DATA-INFORMED DECISION MAKING PROCESS
▸ ASK: Formulate a focused question
▸ ACQUIRE: Search for the best available data
▸ ANALYZE: Critically appraise and analyze the data
▸ APPLY: Integrate the data with your professional expertise and be conscious about your mental models
▸ ANNOUNCE: Decide and communicate
▸ ASSESS: Monitor the outcome
ASK
Turn the business questions into analytical question(s).
ACQUIRE
Find and source all relevant data. Remember to think
about the question systemically and include any
interrelated data that could be relevant. This includes
not only internal but external data and information too.
Ensure the sourced data is available, trusted, and in
the right form (extracted, profiled, tagged, cataloged,
standardized, treated for sensitivity, etc.).
ANALYZE
Create a measurement framework
to describe your data with KPIs.
Use exploratory analytics to find patterns,
trends, and relationships that may exist but
not be obvious, then start to drill into root causes.
APPLY
Review and orientate yourself to the
information and data so far and apply your
personal experiences to it.
Challenge the data and look for information
and data to disprove it.
Review with a cognitively diverse team (or if
you are alone, be aware of your bias and
play devil’s advocate and reframe).
If applicable, leverage predictive analytics
to run simulations or similar to test
potential decisions and solutions.
ANNOUNCE
Announce your decision at the right level to ALL stakeholders
(direct, indirect, upstream, and downstream) by leveraging methodologies
like the ‘Rule of 3’ and the ‘Pyramid Principle’ in your storytelling.
© 2019 QlikTech International AB. All rights reserved. Qlik®, Qlik Sense®, QlikView®, QlikTech®, Qlik Cloud®, Qlik DataMarket®, Qlik Analytics Platform®, Qlik NPrinting®, Qlik
Connectors®, Qlik GeoAnalytics®, Qlik Core®, Associative Difference®, Qlik Data Catalyst™, Qlik Associative Big Data Index™ and the QlikTech logos are trademarks of QlikTech
International AB which have been registered in multiple countries. Other marks and logos mentioned herein are trademarks or registered trademarks of their respective owners.
ASSESS
Set up a review mechanism to monitor the
impacts of the decision after it is made and
acted upon.
Leverage that review mechanism to fail, fix,
and learn fast, including improvements to
data, measurement frameworks, accountability,
decisions, and anything else relevant.
To learn more about Data-Informed Decision Making and explore our free courses and resources, visit
qlik.com/GetDataLiterate.
