Saikiruthika .G
Visualizing using Matplotlib
UNIT - 2
TOPICS
➢ Importing Matplotlib
➢ Simple line plots
➢ Simple scatter plots
➢ Visualizing errors
➢ Density and contour plots
➢ Histograms, legends, colours, subplots, text and annotation, customization
➢ Three dimensional plotting
➢ Geographic Data with Basemap
Importing Matplotlib
matplotlib.pyplot is a module within the Matplotlib library for Python that
provides a state-based interface for creating many types of plots and
visualizations. It mimics the plotting workflow of MATLAB, offering a
user-friendly way to generate figures with minimal code. It is conventionally
imported as plt. A typical workflow has four steps:
● Preparing Data: Define or load the data to be visualized.
● Creating the Plot: Use plt functions to generate the desired plot type.
● Customizing the Plot (Optional): Add labels, titles, legends, and adjust
visual properties.
● Displaying or Saving the Plot: Use plt.show() or plt.savefig().
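The four steps above can be sketched as follows. This is a minimal illustration, not a prescribed recipe: the labels, the title and the in-memory buffer are assumptions made for the example, and the Agg backend is selected only so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# 1. Prepare the data (small hand-written example values).
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# 2. Create the plot; plt.plot returns a list of Line2D objects.
plt.figure()
lines = plt.plot(x, y, label="y = x**2")

# 3. Customize: axis labels, title, legend.
plt.xlabel("x")
plt.ylabel("y")
plt.title("Squares")
plt.legend()

# 4. Save the figure. An in-memory buffer is used here so no file is
#    written; plt.savefig("squares.png") would write to disk instead,
#    and plt.show() would open a window on an interactive backend.
png_buffer = io.BytesIO()
plt.savefig(png_buffer, format="png")
plt.close()
```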
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
For every x, y pair of arguments, there is an optional third argument: the
format string, which sets the color and line type of the plot. The letters and
symbols of the format string come from MATLAB; you concatenate a color
character with a marker or line-style string (for example, 'ro' draws red
circles and 'b-' draws a solid blue line).
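A short sketch of the format string in use. The particular strings "ro" and "g--" and the shifted y-values are chosen only for illustration; the Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

plt.figure()
# No format string: Matplotlib draws a solid line in its default color.
default_line = plt.plot(x, y)[0]
# "ro" = color "r" (red) + marker "o" (circles), with no connecting line.
red_circles = plt.plot(x, y, "ro")[0]
# "g--" = color "g" (green) + line style "--" (dashed).
green_dashed = plt.plot(x, [v + 2 for v in y], "g--")[0]
plt.close()
```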
Simple line plot
Creating a simple line plot in Matplotlib involves importing the
library, defining your data, and then using the plot() function.
● Import the Matplotlib pyplot module: This module provides a
convenient interface for creating plots and is usually aliased as plt.
import matplotlib.pyplot as plt
● Define your data: Create lists or arrays for your x-axis and
y-axis values. These should have the same number of elements.
● Plot the data: Pass the x and y values to plt.plot().
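A minimal line-plot sketch under these steps. The sine curve is an arbitrary example dataset, and the second call is included only to show what happens when the x and y sequences differ in length.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Define the data: 100 x-values and one y-value per x-value.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

plt.figure()
line = plt.plot(x, y)[0]
plt.xlabel("x")
plt.ylabel("sin(x)")

# Sequences of different lengths are rejected with a ValueError.
try:
    plt.plot([1, 2, 3], [1, 2])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

plt.close()
```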
Simple Scatter plot
A scatter plot is a type of graph that displays the relationship
between two numerical variables. It uses dots to represent
individual data points, with the position of each dot determined
by its corresponding values on the x and y axes.
Key characteristics of a scatter plot:
Two variables:
Scatter plots are designed to show the relationship between two
numerical variables.
Independent and dependent variables:
Usually, the independent variable is plotted on the x-axis, and
the dependent variable on the y-axis.
Data points as dots:
Each individual data point is represented by a dot or symbol on
the plot.
No correlation:
If the dots appear randomly scattered with no discernible
pattern, there is little or no correlation between the variables.
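A scatter plot with these characteristics might be drawn as below. The data are synthetic and the variable names (hours, score) are hypothetical, chosen only to give the independent and dependent axes a meaning.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Synthetic example data: 'score' depends roughly linearly on 'hours'.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, 50)               # independent variable (x-axis)
score = 5 * hours + rng.normal(0, 5, 50)     # dependent variable (y-axis)

plt.figure()
# Each (hours, score) pair becomes one dot.
points = plt.scatter(hours, score, marker="o")
plt.xlabel("hours (independent)")
plt.ylabel("score (dependent)")
plt.close()
```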
APPLICATION OF SCATTER PLOT
1. Identifying Relationships and Correlations
Correlation - Scatter plots show whether a relationship exists between two
variables and the nature of that relationship (positive, negative,
linear, non-linear, strong, or weak).
Trends - They reveal trends in data, showing how one variable
changes in relation to another.
Outlier Detection - Scatter plots help in spotting unusual data points that
deviate significantly from the overall pattern.
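What the eye reads off a scatter plot can be backed by numbers. The sketch below is one possible approach, not the only one: it uses the Pearson coefficient from np.corrcoef for correlation, a straight-line fit from np.polyfit for the trend, and a three-standard-deviation residual threshold (an arbitrary choice) for outliers. The data are synthetic, with one point deliberately displaced.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data on the line y = 2x + 1, with one point pushed off it.
x = np.arange(20.0)
y = 2 * x + 1
y[5] += 30

# Correlation: Pearson coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

# Trend: least-squares straight line through the points.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Outliers: points whose residual exceeds 3 standard deviations.
outlier_idx = np.flatnonzero(np.abs(residuals) > 3 * residuals.std())

plt.figure()
plt.scatter(x, y, label="data")
plt.scatter(x[outlier_idx], y[outlier_idx], color="red", label="flagged outlier")
plt.plot(x, slope * x + intercept, "k--", label="fitted trend")
plt.legend()
plt.close()
```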
2. Data Exploration and Pattern Recognition:
Visualizing Data: They provide a visual representation of
data, making it easier to grasp patterns and relationships that
might not be apparent from raw data or tables.
Hypothesis Generation:
By visualizing data, scatter plots can help in generating
hypotheses about potential relationships between variables,
which can be further investigated.
Visualizing errors, density and contour plots
Density and Contour plot
It is useful to display three-dimensional data in two dimensions
using contours or color-coded regions. Three Matplotlib
functions are used for this purpose:
a) plt.contour for contour plots,
b) plt.contourf for filled contour plots,
c) plt.imshow for showing images.
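import numpy as np
# 3 x-values and 4 y-values, evenly spaced on [-3, 3]
xlist = np.linspace(-3.0, 3.0, 3)
ylist = np.linspace(-3.0, 3.0, 4)
# meshgrid returns two 4x3 arrays: each row of X is xlist,
# each column of Y is ylist
X, Y = np.meshgrid(xlist, ylist)
print(xlist)
print(ylist)
print(X)
print(Y)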
1. Contour plot:
A contour line or isoline of a function of two variables is a curve
along which the function has a constant value. It is a cross-section
of the three-dimensional graph of the function f(x, y) parallel to the
(x, y)-plane.
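The three functions can be compared side by side on one grid. In this sketch the function f(x, y) = exp(-(x^2 + y^2)), the grid sizes, the number of levels and the colormap are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Evaluate an example function f(x, y) on a 40 x 50 grid.
x = np.linspace(-3.0, 3.0, 50)
y = np.linspace(-3.0, 3.0, 40)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

fig, (ax_lines, ax_filled, ax_image) = plt.subplots(1, 3, figsize=(12, 4))

# Contour lines: curves along which Z is constant.
ax_lines.contour(X, Y, Z, levels=8, colors="black")
ax_lines.set_title("plt.contour")

# Filled contours: regions between levels are color-coded.
filled = ax_filled.contourf(X, Y, Z, levels=8, cmap="viridis")
fig.colorbar(filled, ax=ax_filled)
ax_filled.set_title("plt.contourf")

# Image: every grid cell is drawn as a colored pixel.
ax_image.imshow(Z, extent=[-3, 3, -3, 3], origin="lower", cmap="viridis")
ax_image.set_title("plt.imshow")

plt.close(fig)
```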
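HISTOGRAMS AND LEGENDS
A histogram is an aggregated bar chart: values are grouped into bins and
each bar shows an aggregate of its bin, most commonly the count, though
other aggregation functions (e.g. sum, average) are possible. It can be
used to visualize data on categorical and date axes as well as
linear axes. In Matplotlib, plt.hist() draws the count per bin.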
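A Matplotlib histogram with a legend might look like the sketch below. The two synthetic samples, the bin range, the colors and the labels "group A" and "group B" are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Two synthetic samples to compare.
rng = np.random.default_rng(0)
sample_a = rng.normal(0.0, 1.0, 500)
sample_b = rng.normal(2.0, 1.0, 500)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges and the bar patches.
counts_a, edges, _ = ax.hist(sample_a, bins=20, range=(-4, 6),
                             color="tab:blue", alpha=0.6, label="group A")
# Reusing the same edges keeps the two histograms comparable.
counts_b, _, _ = ax.hist(sample_b, bins=edges,
                         color="tab:orange", alpha=0.6, label="group B")
ax.set_xlabel("value")
ax.set_ylabel("count")
# The legend is built from the label= arguments above.
legend = ax.legend()
plt.close(fig)
```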
COLOUR, SUBPLOTS, TEXT AND ANNOTATION
Text can be placed on a figure either as a standalone text annotation or as a
text trace. Standalone text annotations can be added using fig.add_annotation()
(a method of Plotly figures; in Matplotlib the corresponding calls are
ax.annotate() and ax.text()), with or without arrows. They can be positioned
absolutely within the figure, or relative to the axes of 2D or 3D Cartesian
subplots, i.e. in data coordinates.
The differences between these two approaches are:
1. Text annotations can be positioned absolutely, or relative to data coordinates
in 2D/3D Cartesian subplots only.
2. Traces cannot be positioned absolutely, but can be positioned relative to
data coordinates in a subplot.
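In Matplotlib itself, colours, subplots and the two positioning modes (data coordinates versus absolute figure or axes coordinates) can be sketched as below. The curves, label texts and positions are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

# Two subplots side by side, each with its own line colour.
fig, (ax_sin, ax_cos) = plt.subplots(1, 2, figsize=(10, 4))
ax_sin.plot(x, np.sin(x), color="tab:blue")
ax_cos.plot(x, np.cos(x), color="tab:orange")

# Annotation with an arrow; xy and xytext are in data coordinates.
note = ax_sin.annotate("maximum", xy=(np.pi / 2, 1.0), xytext=(3.0, 0.8),
                       arrowprops=dict(arrowstyle="->"))

# Plain text in data coordinates (no arrow).
label = ax_cos.text(np.pi, -1.0, "minimum", ha="center", va="bottom")

# Text relative to the axes box: (0, 0) is bottom-left, (1, 1) top-right.
corner = ax_cos.text(0.02, 0.95, "axes-relative", transform=ax_cos.transAxes,
                     va="top")

# Text positioned absolutely within the whole figure.
banner = fig.text(0.5, 0.97, "figure-relative title", ha="center", va="top")

plt.close(fig)
```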
CUSTOMIZATION
To customize the histogram, use the Histogram Settings on
the right.
Y Axis: Change the axis metric to Probability, Frequency
or Density.
Probability: The percentage of values contained within the bar.
Frequency: The number of values contained within the bar.
Density: The bar height is scaled so that the total area of the bars equals 1.
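Matplotlib has no settings panel, but the three y-axis metrics can be reproduced in code. The sketch below assumes that Frequency corresponds to raw counts, Probability to counts divided by the sample size (via the weights argument), and Density to density=True; the sample and bin count are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=200)

# The same three metrics computed numerically with NumPy.
frequency, edges = np.histogram(data, bins=10)
probability = frequency / frequency.sum()
density, _ = np.histogram(data, bins=10, density=True)

fig, (ax_f, ax_p, ax_d) = plt.subplots(1, 3, figsize=(12, 3.5))
# Frequency: number of values in each bar (the default).
ax_f.hist(data, bins=10)
ax_f.set_ylabel("frequency")
# Probability: each value contributes 1/N, so bar heights sum to 1.
ax_p.hist(data, bins=10, weights=np.full(len(data), 1 / len(data)))
ax_p.set_ylabel("probability")
# Density: heights are scaled so the total bar area is 1.
ax_d.hist(data, bins=10, density=True)
ax_d.set_ylabel("density")
plt.close(fig)
```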
Three dimensional plotting
3D scatter plots are used to plot data points on three axes to show the
relationship between three variables. Each row in the data table is represented by a marker
whose position depends on its values in the columns set on the X, Y, and Z axes.
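A 3D scatter plot of this kind can be sketched with Matplotlib's built-in 3D projection. The table here is a synthetic NumPy array whose three columns stand in for the X, Y and Z columns; colouring the markers by the Z value is an optional extra.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data table: 100 rows, columns 0/1/2 map to X/Y/Z.
rng = np.random.default_rng(0)
table = rng.normal(size=(100, 3))

fig = plt.figure()
# projection="3d" creates a three-dimensional axes object.
ax = fig.add_subplot(projection="3d")
# One marker per row; the colour additionally encodes the Z value.
ax.scatter(table[:, 0], table[:, 1], table[:, 2], c=table[:, 2], cmap="viridis")
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
plt.close(fig)
```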
Geographic data with Basemap
Basemap is a toolkit for the Python visualization library Matplotlib. Its main
function is to draw 2D maps, which are important for visualizing spatial data.
Basemap transforms coordinates to a map projection; Matplotlib is then used to
plot contours, images, vectors, lines or points in the transformed coordinates.
Basemap includes the GSHHG (formerly GSHHS) coastline dataset, as well
as datasets from GMT for rivers, state and national boundaries.
These datasets can be used to plot coastlines, rivers and political boundaries on a
map at several different resolutions.
For example, if we wanted to show all the different types of endangered plants within a
region, we would use a base map showing roads, provincial and state boundaries,
waterways and elevation. One added layer could be trees, another layer could be mosses
and lichens, another layer could be grasses.
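A base map with one added data layer might be sketched as follows. Basemap is a separate install and may not be present, so the import is guarded and the map is only drawn when the toolkit is available. The projection, the resolution and the single marked point (longitude 80.27, latitude 13.08) are arbitrary illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a non-interactive run
import matplotlib.pyplot as plt

# Basemap is optional; record whether it can be imported.
try:
    from mpl_toolkits.basemap import Basemap
    HAVE_BASEMAP = True
except ImportError:
    HAVE_BASEMAP = False


def draw_world_map():
    """Draw a world base map and overlay one example point."""
    fig, ax = plt.subplots(figsize=(8, 4))
    # Cylindrical projection of the whole globe, crude ('c') resolution.
    m = Basemap(projection="cyl", resolution="c", ax=ax)
    # Base layers drawn from the bundled boundary datasets.
    m.drawcoastlines()
    m.drawcountries()
    # Data layer: convert (lon, lat) to map coordinates, then plot.
    x, y = m(80.27, 13.08)
    m.plot(x, y, "ro")
    return fig


if HAVE_BASEMAP:
    world_fig = draw_world_map()
    plt.close(world_fig)
```

Further layers (for example, one per plant type) would be added the same way: convert each dataset's longitudes and latitudes with m(lon, lat) and plot them on top of the base map.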
UNIT-2.data exploration and visualization
Visualization with seaborn
Seaborn is a widely used Python library for creating statistical data
visualizations. It is built on top of Matplotlib and designed to work
with pandas, and it helps in making complex plots with fewer
lines of code.
Creating plots with seaborn
1. Line plot
It is used to display the relationship between two numeric
variables, typically showing how one variable changes over an interval of time.
2. Scatter plot
It is used to visualize the relationship between two numerical variables
on a two-dimensional graph.
3. Box plot
It is a visual representation of groups of numerical data through
their quartiles, plotted against continuous or categorical data.
4. Bar plot
It represents an estimate of central tendency for a numeric variable
with the height of each rectangle, and provides some indication of the
uncertainty around that estimate using error bars.
5. Count plot
It displays the number of occurrences of each category using bars, to visualize
the distribution of a categorical variable.
6. KDE plot
A kernel density estimate (KDE) plot is used for visualizing the probability
density of a continuous variable.
7. Swarm plot
It displays individual data points without overlap along a categorical axis,
providing a clear view of the distribution density.
