Previously known as
Think Big. Move Fast.
SolidQ
• Founded in 2002 in the USA and Spain
• Established in 2007 in Italy
• More than 1000 customers and more than 200 consultants worldwide
• Dedicated to Data Management on the Microsoft Platform
• Book Authors, Conference Speakers, SQL Server MVPs and Regional Directors
• www.solidq.com
Davide Mauri
• 18 Years of experience on the SQL Server Platform
• Specialized in Data Solution Architecture, Database Design, Performance
Tuning, Business Intelligence
• Microsoft SQL Server MVP
• President of UGISS (Italian SQL Server UG)
• Mentor @ SolidQ
• Video, Book & Article Author
• Regular Speaker @ SQL Server events
• Projects, Consulting, Mentoring & Training
Data Science
Renaissance 2.0
“Companies are collecting
mountains of information about
you, to predict how
likely you are to buy a product,
and using that knowledge to
craft a marketing message
precisely calibrated to get you to
do so”
Data Science
• Extraction of knowledge from data
• So, what’s new?
• Nothing. Except that it’s now economical and fast.
• It’s now applicable to everything. And we have a lot of data produced every day
that can be used to extract knowledge
Data Science
• Data → Information → Knowledge → Decisions
Data Science
• A Sum Of
• Statistics
• Mathematics
• Machine Learning
• Data Mining
• Computer Programming
• Data Engineering
• Visualization
• Data Warehousing
• High Performance Computing
• To support (Informed) Decision Making
• Data-Driven Decisions
Data Scientist
• IBM
• A data scientist represents an evolution from the business or data analyst role.
• The formal training is similar, with a solid foundation typically in computer science and
applications, modeling, statistics, analytics and math.
• What sets the data scientist apart is strong business acumen, coupled with the ability to
communicate findings to both business and IT leaders in a way that can influence how
an organization approaches a business challenge.
• It's almost like a Renaissance individual who really wants to learn and bring change to
an organization.
Algorithms
• Algorithms are the new gatekeepers
• http://www.slideshare.net/socialisten/algorithms-are-the-new-gatekeepers
• There is simply too much data for a human to analyze!
• They decide
• What we find
• What we see
• What we buy
• Data is the foundation upon which algorithms work
• Better Data leads to Better Results
• Data-Driven Decisions will be a MUST in the coming years!
• Data Scientists will help companies to leverage their most valuable asset: Data
Modern Data Environment
• Structured Data
• Unstructured Data
• Master Data
• EDW
• Data Mart
• Big Data
• BI Environment
• Analytics Environment
Big Data
The 3 Vs
No, the 4 Vs!!!
No, no, the 5 Vs!!!!!
http://www.ibmbigdatahub.com/infographic/four-vs-big-data
Big Data
• Volume, Velocity, Variety, Veracity… V<your-v-here>
• Data sets with sizes beyond the ability of commonly used software tools
to capture, curate, manage, and process the data within a tolerable elapsed
time
• Grid Computing, Parallel Computing needed
• keep processing time reasonable
• provide scalability
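The split-and-merge idea behind parallel processing can be sketched in a few lines. This is only a toy illustration, not how Hadoop or any specific grid product works: the function names and sample lines are made up, and a thread pool on one machine stands in for the nodes of a cluster. Each chunk is counted independently (the "map" step), then the partial counts are merged (the "reduce" step).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, workers=4):
    """Split the input, count each chunk independently, then merge the results."""
    size = max(1, len(lines) // workers)
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Reduce step: merge the partial counts produced by each worker.
        for partial in pool.map(count_words, chunks):
            total.update(partial)
    return total


# Made-up sample input, for illustration only.
sample_lines = [
    "big data needs parallel computing",
    "parallel computing keeps time reasonable",
    "big data big results",
]
word_totals = parallel_word_count(sample_lines, workers=2)
```

Because each chunk is processed without knowledge of the others, adding workers (or machines) is what provides the scalability the slide mentions. For CPU-bound work in Python, a process pool would likely be a better fit than threads.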
Big Data Data
• Paradigm: “Store Now, Figure Out Later”
• Data is the new resource. Never throw it away!
• Unstructured Data
• Text Files
• Images
• Sounds
• Structured/Semi-Structured Data
• Sensors
• Transactions
• Logs
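"Store Now, Figure Out Later" is often described as schema-on-read: records are kept exactly as they arrive, and structure is applied only when a question is asked. The sketch below is a minimal illustration under assumptions: the in-memory list stands in for a file system or data lake, and the `sensor` and `temp` field names and the sample records are invented for the example.

```python
import json

raw_store = []  # stands in for a distributed file system (assumption)


def store(raw_line):
    """Keep the record exactly as it arrived; no schema is enforced on write."""
    raw_store.append(raw_line)


def read_temperatures(lines):
    """Apply a schema at read time: keep records with a numeric 'temp' field."""
    readings = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unstructured text is skipped here, not thrown away
        if isinstance(record, dict) and isinstance(record.get("temp"), (int, float)):
            readings.append(record["temp"])
    return readings


# Made-up records mixing structured and unstructured data.
store('{"sensor": "s1", "temp": 21.5}')
store('{"sensor": "s2", "humidity": 40}')
store("free-text log line: pump restarted")
store('{"sensor": "s1", "temp": 22.5}')

temps = read_temperatures(raw_store)
average_temp = sum(temps) / len(temps)
```

Nothing is discarded at write time, so a later question (for example about humidity, or about the free-text log lines) can still be answered from the same raw store.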
Data Storage
• RDBMS
• SQL Server
• Hadoop
• HDInsight
• Hortonworks Data Platform
• Distributed File (Eco)System
• CSV
• JSON
• *.*
Data Storage
• Hadoop Ecosystem
http://hortonworks.com/hadoop-modern-data-architecture/
Data Science & Big Data
• Data Science != Big Data
• Data Science is not only about Big Data
• Data Science can be applied to Big Data
• Data Science starts from Small Data
• 1) Find the algorithm that extracts knowledge
• 2) Measure the algorithm's results in terms of probability
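The two steps above can be tried on a handful of records before any Big Data infrastructure is involved. The following is a hedged sketch: the "3 or more visits" rule and the labeled examples are invented for illustration, and accuracy with a simple binomial standard error is just one possible way to express results "in terms of probability".

```python
import math


def evaluate(predict, labeled_examples):
    """Estimate the probability of a correct prediction and its standard error."""
    n = len(labeled_examples)
    correct = sum(1 for features, label in labeled_examples if predict(features) == label)
    p = correct / n
    std_err = math.sqrt(p * (1 - p) / n)
    return p, std_err


def predict_will_buy(visits):
    """Hypothetical candidate rule: customers with 3 or more visits will buy."""
    return visits >= 3


# Made-up small data set: (number of visits, actually bought).
holdout = [
    (1, False), (2, False), (5, True), (4, True), (3, False),
    (0, False), (6, True), (2, True), (7, True), (1, False),
]
accuracy, std_err = evaluate(predict_will_buy, holdout)
```

On this toy sample the rule is right 8 times out of 10. The large standard error on so few records is a reminder that the estimate only becomes trustworthy as more data is evaluated.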
Machine Learning
• Machine learning, a branch of artificial intelligence, concerns the construction
and study of systems that can learn from data. (Wikipedia)
• For example, a machine learning system could be trained on email messages to learn to
distinguish between spam and non-spam messages. After learning, it can then be used
to classify new email messages into spam and non-spam folders.
• Flavors
• Supervised
• Unsupervised
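The spam example is a case of supervised learning: the system is given messages that are already labeled and learns to label new ones. Below is a minimal sketch using a naive Bayes word-count model written in plain Python. The technique is one common choice, not something the slides prescribe, and the four training messages are made up.

```python
import math
from collections import Counter


def train(messages):
    """Learn word frequencies per label from (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up labeled training data ("ham" means non-spam).
training = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(training)
label_1 = classify("free prize money", model)
label_2 = classify("agenda for the team meeting", model)
```

An unsupervised method would instead receive the messages without labels and look for structure, such as groups of similar messages, on its own.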
Data Analysis
• Common Data Scientist Tools
• R
• Weka
• Octave
• Scikit-Learn
• Common Data Scientist Languages
• Python
• Scala
• F#
Data Science Overview
Resources
• https://www.coursera.org/
• Data Scientist Specialization
• https://www.khanacademy.org/
• Math
• http://www.osservatori.net/business_intelligence
• Italian Big Data Market Analysis Resources
• http://www.solidq.com/consulting/
• Data Science Services
• Big Data / Business Intelligence / Data Warehousing



Editor's Notes

  • #2: Last Changes: 2014-04-25 – DM – v1
  • #7: http://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/
  • #11: http://www-01.ibm.com/software/data/infosphere/data-scientist/
  • #15: http://www.ibmbigdatahub.com/infographic/four-vs-big-data
  • #23: http://nirvacana.com/thoughts/becoming-a-data-scientist/
  • #25: Last Changes: 2012-07-30 DM