2. TOPICS
➢ Importing Matplotlib
➢ Simple line plots
➢ Simple scatter plots
➢ Visualizing errors – density and contour plots
➢ Histograms – legends – colours – subplots – text and annotation – customization
➢ Three dimensional plotting
➢ Geographic Data with Basemap
3. Importing Matplotlib
matplotlib.pyplot - matplotlib.pyplot is a module within the Matplotlib library
in Python that provides a state-based interface for creating plots and
visualizations. It aims to mimic the plotting functionality found in MATLAB,
offering a way to generate figures with minimal code.
A typical workflow has four steps:
● Preparing Data: Define or load the data to be visualized.
● Creating the Plot: Use plt functions to generate the desired plot type.
● Customizing the Plot (Optional): Add labels, titles, legends, and adjust
visual properties.
● Displaying or Saving the Plot: Use plt.show() or plt.savefig().
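The four steps above can be sketched as follows. This is a minimal illustration, not a prescribed recipe: the data values, labels, and the in-memory buffer are assumptions chosen so that the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show() in a window
import matplotlib.pyplot as plt

# 1. Prepare data (illustrative values)
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# 2. Create the plot with the state-based plt interface
plt.figure()
lines = plt.plot(x, y, label="y = x squared")

# 3. Customize the plot (optional)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Simple line plot")
plt.legend()

# 4. Save the plot; a filename such as "line_plot.png" can be passed instead of a buffer
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In an interactive session, plt.show() would replace the last step.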
4. plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
For every x, y pair of arguments, there is an optional third argument: a
format string that sets the color and line type of the plot. The letters and
symbols of the format string come from MATLAB; you concatenate a color string
with a line style string. For example, "ro" plots red circles and "b--" a
dashed blue line.
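A short sketch of the format string in use. The data is illustrative, and the two format strings ("ro" and "b--") are just examples of the color-plus-style convention.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.figure()

# "ro": 'r' selects red, 'o' selects circle markers (no connecting line)
red_circles = plt.plot([1, 2, 3, 4], [1, 4, 9, 16], "ro")

# "b--": 'b' selects blue, '--' selects a dashed line
dashed_blue = plt.plot([1, 2, 3, 4], [1, 2, 3, 4], "b--")
```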
5. Simple line plot
Creating a simple line plot in Matplotlib involves importing the
library, defining your data, and then using the plot() function.
● Import Matplotlib Pyplot module: This module provides a
convenient interface for creating plots, often aliased as plt.
import matplotlib.pyplot as plt
● Define your data: Create lists or arrays for your x-axis and
y-axis values. These should have the same number of elements.
6. Simple Scatter plot
A scatter plot is a type of graph that displays the relationship
between two numerical variables. It uses dots to represent
individual data points, with the position of each dot determined
by its corresponding values on the x and y axes.
Key characteristics of a scatter plot:
Two variables:
Scatter plots are designed to show the relationship between two
numerical variables.
7. Independent and dependent variables:
Usually, the independent variable is plotted on the x-axis, and
the dependent variable on the y-axis.
Data points as dots:
Each individual data point is represented by a dot or symbol on
the plot.
No correlation:
If the dots appear randomly scattered with no discernible
pattern, there is little or no correlation between the variables.
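The characteristics above can be illustrated with plt-style scatter plots. The sketch below uses synthetic data (an assumption for illustration only): one dependent variable that follows x and one that is unrelated to it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)

# Independent variable on the x-axis
x = rng.uniform(0, 10, 50)

# One dependent variable that follows x, one that is unrelated to x
y_related = 2 * x + rng.normal(0, 1, 50)
y_random = rng.uniform(0, 20, 50)

fig, (ax_related, ax_random) = plt.subplots(1, 2, figsize=(8, 3))
ax_related.scatter(x, y_related)
ax_related.set_title("Positive correlation")
ax_random.scatter(x, y_random)
ax_random.set_title("No clear pattern")

# Pearson correlation coefficients, to back up the visual impression
r_related = np.corrcoef(x, y_related)[0, 1]
r_random = np.corrcoef(x, y_random)[0, 1]
```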
8. APPLICATION OF SCATTER PLOT
1. Identifying Relationships and Correlations
Correlation - Scatter plots show whether a relationship exists between two
variables and the nature of that relationship (positive, negative,
linear, non-linear, strong, or weak).
Trends - They reveal trends in data, showing how one variable
changes in relation to another.
Outlier Detection -
Scatter plots help in spotting unusual data points that
deviate significantly from the overall pattern.
9. 2. Data Exploration and Pattern Recognition:
Visualizing Data: They provide a visual representation of
data, making it easier to grasp patterns and relationships that
might not be apparent from raw data or tables.
Hypothesis Generation:
By visualizing data, scatter plots can help in generating
hypotheses about potential relationships between variables,
which can be further investigated.
10. Visualizing error density and contour plot
Density and Contour plot
It is useful to display three-dimensional data in two dimensions
using contours or color-coded regions. Three Matplotlib
functions are used for this purpose:
a) plt.contour for contour plots,
b) plt.contourf for filled contour plots,
c) plt.imshow for showing images.
12. 1. Contour plot :
A contour line or isoline of a function of two variables is a curve
along which the function has a constant value. It is a cross-section
of the three-dimensional graph of the function f(x, y) parallel to the
x, y plane.
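A minimal sketch of the three functions applied to one function f(x, y). The Gaussian-shaped function, grid size, and level counts are assumptions chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Evaluate f(x, y) on a regular grid
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))  # illustrative function of two variables

fig, (ax_lines, ax_filled, ax_image) = plt.subplots(1, 3, figsize=(12, 3))

# Contour lines: each curve joins points where f has the same value
line_set = ax_lines.contour(X, Y, Z, levels=5, colors="black")

# Filled contours: regions between levels are color-coded
filled_set = ax_filled.contourf(X, Y, Z, levels=20, cmap="viridis")
fig.colorbar(filled_set, ax=ax_filled)

# Image view of the same grid; extent and origin align it with the x, y axes
image = ax_image.imshow(Z, extent=(-3, 3, -3, 3), origin="lower", cmap="viridis")
```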
13. HISTOGRAM LEGENDS
A histogram is an aggregated bar chart: values are grouped into bins, and
each bar shows an aggregate of the values in its bin (e.g. sum, average,
count). It can be used to visualize data on categorical and date axes as
well as linear axes.
14. COLOR SUBPLOT TEXT ANNOTATION
Standalone text annotations can be added to figures using fig.add_annotation()
(a Plotly method; the Matplotlib counterparts are plt.text() and plt.annotate()),
with or without arrows. They can be positioned absolutely within the figure, or
relative to the axes of 2D or 3D Cartesian subplots, i.e. in data coordinates.
The differences between text annotations and text attached to traces are:
1. Text annotations can be positioned absolutely, or relative to data coordinates
in 2D/3D Cartesian subplots only.
2. Traces cannot be positioned absolutely but can be positioned relative to
data coordinates in a subplot.
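fig.add_annotation() belongs to Plotly, so the sketch below shows the same two positioning modes using Matplotlib calls instead (ax.annotate, ax.text and fig.text). The data, label text, and coordinates are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 9, 16])

# Annotation with an arrow, positioned in data coordinates
arrow_note = ax.annotate(
    "largest value",
    xy=(4, 16),          # point being annotated
    xytext=(2, 14),      # where the text is placed
    arrowprops=dict(arrowstyle="->"),
)

# Plain text without an arrow, also in data coordinates
data_note = ax.text(1, 1, "start")

# Text positioned absolutely within the figure (fractions of figure width and height)
figure_note = fig.text(0.5, 0.95, "Figure-level note", ha="center")
```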
16. CUSTOMIZATION
To customize the histogram, use the Histogram Settings to
the right.
Y Axis: Change the axis metric to either Probability, Frequency
or Density.
Probability: The percentage of values contained within the bar.
Frequency: The number of values contained within the bar.
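The same three Y-axis metrics can be approximated in Matplotlib with plt-style hist calls. This is a sketch under assumptions: the data is synthetic, "Probability" is taken to mean the fraction of values per bar, and "Density" is taken to mean Matplotlib's density=True normalization, where the bar areas sum to 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)
values = rng.normal(size=1000)  # illustrative data

fig, (ax_freq, ax_prob, ax_dens) = plt.subplots(1, 3, figsize=(12, 3))

# Frequency: number of values in each bar (the default)
counts, edges, _ = ax_freq.hist(values, bins=20)
ax_freq.set_title("Frequency")

# Probability: fraction of values in each bar, via per-value weights of 1/N
weights = np.ones_like(values) / len(values)
probabilities, _, _ = ax_prob.hist(values, bins=20, weights=weights)
ax_prob.set_title("Probability")

# Density: bar heights scaled so that the total bar area equals 1
densities, density_edges, _ = ax_dens.hist(values, bins=20, density=True)
ax_dens.set_title("Density")
```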
17. Three dimensional plotting
3D scatter plots place data points on three axes to show the relationship between
three variables. Each row in the data table is represented by a marker whose position
depends on its values in the columns assigned to the X, Y, and Z axes.
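A minimal 3D scatter sketch in Matplotlib, assuming synthetic data in place of a real data table; each (x, y, z) triple plays the role of one row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)

# Three illustrative columns; each index is one row of the data table
x = rng.uniform(0, 1, 100)
y = rng.uniform(0, 1, 100)
z = x + y + rng.normal(0, 0.1, 100)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # request a three-dimensional axes

# One marker per row; color also encodes the z value
points = ax.scatter(x, y, z, c=z, cmap="viridis")

ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
```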
18. Geographic data with basemap
Basemap is a toolkit for the Python visualization library Matplotlib. Its main
function is to draw 2D maps, which are important for visualizing spatial data.
Basemap does not plot anything itself; it transforms coordinates to a map
projection, and Matplotlib is then used to plot contours, images, vectors, lines
or points in the transformed coordinates. Basemap includes the GSHHG (formerly
GSHHS) coastline dataset, as well as datasets from GMT for rivers, state and
national boundaries.
These datasets can be used to plot coastlines, rivers and political boundaries on a
map at several different resolutions.
For example, if we wanted to show all the different types of endangered plants within a
region, we would use a base map showing roads, provincial and state boundaries,
waterways and elevation. One added layer could be trees, another layer could be mosses
and lichens, another layer could be grasses.
21. Visualization with seaborn
Seaborn is a widely used Python library for creating statistical data
visualizations. It is built on top of Matplotlib and designed to work
with pandas, and it helps make complex plots with fewer
lines of code.
Creating plots with seaborn
1. Line Plot - It is used to display the relationship between two numeric
variables, typically showing how one variable changes over an interval of time.
22. 2. Scatter plot
It is used to visualize the relationship between two numerical variables as
points on a two-dimensional graph.
3. Box plot
It is a visual representation of groups of numerical data through their
quartiles, plotted against continuous or categorical data.
4. Bar plot
It represents an estimate of central tendency for a numeric variable
with the height of each rectangle, and provides some indication of the
uncertainty around that estimate using error bars.
23. 5 . Count plot
It displays the number of occurrences of each category using bars, to visualize
the distribution of categorical variables.
6. KDE Plot
A kernel density estimate is used for visualizing the probability density of a
continuous variable.
7. Swarm plot
It displays individual data points without overlap along a categorical axis,
providing a clear view of the distribution density.
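Two of the plot types listed above, sketched with seaborn on a small hand-made pandas DataFrame. The column names ("group", "value") and the data are hypothetical; the other plot types follow the same call pattern (for example sns.lineplot, sns.scatterplot, sns.barplot, sns.kdeplot, sns.swarmplot).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: one categorical column and one numeric column
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b", "b", "c"],
        "value": [1.0, 2.0, 3.0, 2.5, 3.5, 4.5, 5.0, 6.0],
    }
)

fig, (ax_box, ax_count) = plt.subplots(1, 2, figsize=(8, 3))

# Box plot: quartiles of "value" for each category in "group"
sns.boxplot(data=df, x="group", y="value", ax=ax_box)

# Count plot: number of rows in each category
sns.countplot(data=df, x="group", ax=ax_count)

# The same counts computed directly with pandas, for comparison
group_counts = df["group"].value_counts()
```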