Data Science
Exploratory Data Analysis (EDA)
(Histogram)
Part-I
Exploratory Data Analysis (EDA)
Exploratory Data Analysis (EDA) is the process of examining and
visualizing data to understand its main features, uncover patterns, and
identify relationships between variables.
The main goal of Exploratory Data Analysis (EDA) is to gain insights into
the data, understand its underlying structure, and identify patterns, trends,
and anomalies. It helps in formulating hypotheses, guiding further analysis,
and making informed decisions about data preprocessing and modeling
strategies.
Aim (Importance) of EDA
Data Understanding: EDA helps in getting familiar with the data,
including its structure, distributions, and characteristics. This
understanding is essential for determining the appropriate analytical
approach and interpreting the results accurately.
Identifying Patterns and Relationships: EDA allows analysts to uncover
patterns, trends, and relationships between variables in the dataset. This
helps in generating hypotheses and guiding further analysis.
Detecting Anomalies and Outliers: EDA helps in identifying anomalies,
outliers, and errors in the data. Detecting and addressing these issues
early on can improve the quality and reliability of the analysis results.
Guiding Feature Selection: In machine learning and predictive modeling
tasks, EDA helps in selecting relevant features and understanding their
importance in predicting the target variable.
Improving Data Quality: Through visualization and summary statistics,
EDA highlights data quality issues such as missing values,
inconsistencies, or data entry errors. Addressing these issues early on can
lead to more reliable analysis results.
Assessing Assumptions: By examining the data visually, analysts can
validate whether the data meets the assumptions required for specific
analyses.
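The data quality and outlier checks described above can be sketched with a few summary statistics. This is a minimal illustration, not a prescribed method: the `ages` array is hypothetical sample data, and the 1.5 × IQR rule is one common rule of thumb for flagging potential outliers, chosen here as an assumption.

```python
import numpy as np

# Hypothetical sample of ages: np.nan marks a missing entry,
# and 190 stands in for a likely data entry error.
ages = np.array([23, 25, 31, 35, 29, np.nan, 41, 38, 27, 190], dtype=float)

# Count missing values, then work with the observed values only.
missing_count = int(np.isnan(ages).sum())
observed = ages[~np.isnan(ages)]
print("missing values:", missing_count)
print("mean:", observed.mean(), "median:", np.median(observed))

# Flag potential outliers with the 1.5 * IQR rule of thumb.
q1, q3 = np.percentile(observed, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = observed[(observed < lower) | (observed > upper)]
print("potential outliers:", outliers)
```

A large gap between the mean and the median, as in this sample, is often an early hint that extreme values are present and worth inspecting before modeling.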
Data Visualization Techniques in EDA
Histogram: shows the distribution of a continuous numerical variable.
Box Plot: summarizes the distribution of a numerical variable (median,
quartiles, and potential outliers).
Scatter Plot: shows the relationship between two continuous numerical
variables.
Bar Plot: compares values across categorical or discrete data.
Line Plot: shows changes in a continuous numerical variable over time.
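As a rough sketch, the five techniques listed above map onto Matplotlib calls as follows. All data here is synthetic and hypothetical (the variable names `ages`, `income`, `groups`, `counts`, `months`, and `sales` are illustrative assumptions), and the non-interactive "Agg" backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Synthetic, hypothetical data for illustration only.
rng = np.random.default_rng(0)
ages = rng.normal(35, 10, 200)                       # continuous numerical
income = 1000 + 50 * ages + rng.normal(0, 300, 200)  # related to ages
groups = ["A", "B", "C"]                             # categorical labels
counts = [40, 95, 65]                                # count per category
months = np.arange(1, 13)                            # time axis
sales = 100 + 5 * months + rng.normal(0, 4, 12)      # value over time

fig, axes = plt.subplots(1, 5, figsize=(20, 3.5))

axes[0].hist(ages, bins=15)              # distribution of one variable
axes[0].set_title("Histogram")

axes[1].boxplot(ages)                    # median, quartiles, outliers
axes[1].set_title("Box Plot")

axes[2].scatter(ages, income, s=8)       # relationship between two variables
axes[2].set_title("Scatter Plot")

axes[3].bar(groups, counts)              # categorical comparison
axes[3].set_title("Bar Plot")

axes[4].plot(months, sales, marker="o")  # change over time
axes[4].set_title("Line Plot")

fig.tight_layout()
```

To view the result, call `fig.savefig("eda_plots.png")` or switch to an interactive backend and use `plt.show()`.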
Histogram
• A histogram is a graphical representation of the frequency distribution of
a continuous variable using adjacent rectangles. The x-axis of the graph
represents the class intervals (bins), and the y-axis shows the frequency
corresponding to each class interval.
• A histogram is a two-dimensional diagram in which the width of each
rectangle shows the width of its class interval, and the height of the
rectangle depicts the corresponding frequency. Histograms provide insights
into the central tendency, spread, and shape of the data.
• The hist() function in Matplotlib is used to create a histogram.
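A minimal sketch of `hist()` in use, assuming a small hypothetical list of ages and hand-picked class intervals of width 10. The function returns the frequency per class interval, the interval edges, and the drawn rectangles, which makes the link between the definition above and the plot explicit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical ages, for illustration only.
ages = np.array([22, 25, 27, 31, 33, 34, 36, 38, 41, 45, 47, 52, 58, 63, 71])

fig, ax = plt.subplots()

# bins gives the class interval edges: [20, 30), [30, 40), ..., [70, 80].
counts, bin_edges, patches = ax.hist(
    ages, bins=[20, 30, 40, 50, 60, 70, 80], edgecolor="black"
)
ax.set_xlabel("Age (class interval)")
ax.set_ylabel("Frequency")
ax.set_title("Histogram of ages")

# Print the frequency table that the rectangles represent.
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:.0f}-{right:.0f}: {int(count)}")
```

Passing an integer such as `bins=6` instead of explicit edges lets Matplotlib choose equal-width intervals over the data range; the choice of bin width can noticeably change the apparent shape of the distribution.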
Interpreting the Histogram:
• A symmetric histogram has a prominent mound in the center and similar
tapering to the left and right. If a histogram of ages is symmetric, it
suggests that ages are balanced around the center, with about as many
individuals below the typical age as above it.
• Skewed histograms indicate that ages are more concentrated towards one
end of the range. A distribution is said to be positively skewed when the
tail on the right side of the histogram is longer than the tail on the left
side (only a few very high values).
• For example, a histogram skewed to the right (positive skew) suggests a
larger proportion of younger individuals, with a small number of much
older ones.
• Outliers may represent unusual cases, such as very young or very old
individuals, or data entry errors.
Skewness
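Skewness can also be quantified rather than judged by eye. The sketch below uses the moment coefficient of skewness, $g_1 = m_3 / m_2^{3/2}$, where $m_2$ and $m_3$ are the second and third central moments of the sample. The helper name `skewness` and both data sets are hypothetical, chosen only to illustrate the symmetric and right-skewed cases described above.

```python
import numpy as np

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # second central moment (variance)
    m3 = np.mean(centered ** 3)  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical ages: one symmetric sample, one with a long right tail.
symmetric = [20, 30, 30, 40, 40, 40, 50, 50, 60]
right_skewed = [20, 21, 22, 22, 23, 24, 25, 30, 45, 70]

for name, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    print(
        f"{name}: skewness={skewness(data):.3f}, "
        f"mean={np.mean(data):.1f}, median={np.median(data):.1f}"
    )
```

A value near zero indicates a roughly symmetric distribution, and a positive value indicates a longer right tail. In the right-skewed sample the mean is also pulled above the median, which is a quick complementary check.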