Data Science
Submitted by: ANUJ KANWAR
17041873971
BTECH ECE (7TH SEM)
Data All Around!
• Lots of data is being collected and warehoused:
1. Scientific experiments
2. Internet of Things
3. Web data
4. E-commerce
5. Financial transactions
6. Bank/credit transactions
7. Online trading and purchasing
8. Social networks
... and many more!
What To Do With These Data?
1. Aggregation and statistics
• Data warehousing and OLAP
2. Indexing, searching, and querying
• Keyword-based search
• Pattern matching (XML/RDF)
3. Knowledge discovery
• Data mining
• Statistical modeling
4. Data-driven approaches
• Predictive analytics
• Deep learning
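As a rough illustration of the first two items above (aggregation and keyword-based search), here is a minimal sketch in plain Python. The records, field names, and helper functions (`aggregate`, `keyword_search`) are hypothetical and exist only for this example; a real warehouse or OLAP system would do this at far larger scale with dedicated tooling.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction records (illustrative values, not real data).
transactions = [
    {"region": "north", "category": "books", "amount": 12.0},
    {"region": "north", "category": "music", "amount": 8.0},
    {"region": "south", "category": "books", "amount": 20.0},
    {"region": "south", "category": "books", "amount": 10.0},
]


def aggregate(rows, key, value):
    """Group rows by `key` and report count, total, and mean of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        k: {"count": len(v), "total": sum(v), "mean": mean(v)}
        for k, v in groups.items()
    }


def keyword_search(documents, keyword):
    """Return indices of documents that contain `keyword` as a whole word."""
    needle = keyword.lower()
    return [i for i, doc in enumerate(documents) if needle in doc.lower().split()]


# Aggregation: summary statistics per region.
by_region = aggregate(transactions, "region", "amount")

# Keyword search over a few short text snippets.
documents = [
    "Data warehousing and OLAP",
    "Keyword based search",
    "Predictive analytics and deep learning",
]
olap_hits = keyword_search(documents, "olap")
```

Grouping by a different key (for example `"category"`) gives a different slice of the same data, which is the basic idea behind OLAP-style roll-ups.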
What is Data Science?
• An area that manages, manipulates, extracts, and interprets knowledge from tremendous amounts of data.
• Data science (DS) is a multidisciplinary field of study whose goal is to address the challenges of big data.
• Data science principles apply to all data, big and small.
What is Data Science?
• Theories and techniques from many fields and disciplines are used to investigate and analyze large amounts of data, helping decision makers in many areas such as science, engineering, economics, politics, finance, and education.
– Computer Science
• Pattern recognition, visualization, data warehousing, high-performance computing, databases, AI
– Mathematics
• Mathematical modeling
– Statistics
• Statistical and stochastic modeling, probability
Component | Traditional Analysis | Traditional Software Delivery | Data Science
Tools | SAS, R, Excel, SQL, in-house tools | Java, source control, Linux, continuous integration, unit testing, bug reports, and project management | R, Java, scientific Python libraries, Excel, SQL, Hadoop, Hive, machine learning libraries, GitHub for source control and issue management
Analytical Methods | Regressions, classifications, measuring prediction accuracy and coverage/error, sampling | N/A | Classification, clustering, similarity detection, recommenders, unsupervised and supervised learning, and others
Team Structure | Statisticians, Mathematicians, Scientists | Developers, Project Managers, Systems Engineers | Mathematicians, Statisticians, Scientists, Developers, Systems Engineers
Time Frame | Either: usually ongoing research and discovery within a team in the organization. Or: a specific project | Regular software release cycle, continuous delivery, etc. | Either: discovery/learning phase leading to product delivery
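To make two of the analytical methods named in the table concrete (classification and measuring prediction accuracy), the following is a small sketch of a 1-nearest-neighbor classifier evaluated on held-out points. The points, labels, and function names are invented for illustration, and 1-nearest-neighbor is just one simple classifier chosen for brevity; the slides do not prescribe a specific method.

```python
import math


def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    distances = [math.dist(point, query) for point in train_points]
    return train_labels[distances.index(min(distances))]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical labeled training data (two features per point).
train_points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
train_labels = ["low", "low", "high", "high"]

# Hypothetical held-out data used only to measure prediction accuracy.
test_points = [(1.2, 1.4), (5.5, 5.0), (3.0, 3.0)]
test_labels = ["low", "high", "high"]

predictions = [
    nearest_neighbor_predict(train_points, train_labels, q) for q in test_points
]
test_accuracy = accuracy(predictions, test_labels)
```

The third test point sits between the two groups and is misclassified, so the measured accuracy is below 1. Measuring on points the classifier never saw is what makes the number a fair estimate of prediction error.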
Contrast: Machine Learning
Machine Learning | Data Science
Develop new (individual) models | Explore many models, build and tune hybrids
Prove mathematical properties of models | Understand empirical properties of models
Improve/validate on a few relatively clean, small datasets | Develop/use tools that can handle massive datasets
Publish a paper | Take action!
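The data science side of this contrast ("explore many models" and "understand empirical properties of models") can be sketched as fitting several candidate models and comparing them on held-out data. The example below is a minimal illustration under stated assumptions: the two candidate models (a constant-mean baseline and a least-squares line), the numbers, and all function names are chosen for this sketch and do not come from the slides.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    y_bar = mean(ys)
    return lambda x: y_bar


def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    """Average squared difference between predictions and true values."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical training and held-out data (illustrative values only).
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
held_x, held_y = [5, 6], [10.1, 11.9]

# Fit every candidate on the training data, then score on held-out data.
candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {
    name: mean_squared_error(fit(train_x, train_y), held_x, held_y)
    for name, fit in candidates.items()
}
best_model = min(scores, key=scores.get)
```

Adding another candidate only requires another entry in `candidates`, which keeps the comparison empirical: the model that wins is the one with the lowest held-out error, not the one with the nicest theory.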
Contrast: Data Engineering
Aspect | Data Science | Data Engineering
Approach | Scientific (exploration) | Engineering (development)
Problems | Unbounded | Bounded
Path to Solution | Iterative, exploratory, nonlinear | Mostly linear
Education | More is better (PhDs common) | BS and/or self-trained
Presentation Skills | Important | Not as important
Research Experience | Important | Not as important
Programming Skills | Not as important | Important
Data Skills | Important | Important
Key Comparisons:
Applications:
1. Business
2. Healthcare
3. E-Commerce
4. Banking
5. Transport
Data Science: Case Study
Cancer Research
• Cancer is an incredibly complex disease; a single tumor can have more than 100 billion cells, and each cell can acquire mutations individually. The disease is always changing, evolving, and adapting.
• Employ the power of big data analytics and high-performance computing.
• Leverage sophisticated pattern-recognition and machine learning algorithms to identify patterns that are potentially linked to cancer.
• This requires a huge amount of data processing and pattern recognition.
Data Science: Case Study
Retail Analytics
Advantages and Disadvantages of Data Science
Thank You
