12/09/2024 22UIT303-Data science 1
UNIT-V
DATA VISUALIZATION
Syllabus-UNIT-V
Importing Matplotlib – Simple Line Plots – Simple Scatter Plots –
Visualizing Errors – Density and Contour Plots – Histograms – Legends –
Colors – Subplots – Text and Annotation – Customization – Three-Dimensional
Plotting – Geographic Data with Basemap – Visualization with Seaborn.
Importing Matplotlib
• Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and its
numerical extension NumPy.
• Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations
in Python.
• Matplotlib makes it possible to produce quality charts in a few lines of code. Several other Python
plotting libraries, such as Seaborn, are built on top of Matplotlib.
• The library is primarily designed for 2D output, although it also provides a toolkit for 3D plots, and it
gives you the means to express data patterns graphically.
Visualizing Information: Starting with Graph
•Data visualization is the presentation of quantitative information in a graphical form. In other
words, data visualizations turn large and small datasets into visuals that are easier for the human brain
to understand and process.
•Good data visualizations are created when communication, data science, and design collide. Data
visualizations done right offer key insights into complicated datasets in ways that are meaningful and
intuitive.
•A graph is simply a visual representation of numeric data. Matplotlib supports a large number of
graph and chart types.
•Matplotlib is a popular Python package used to build plots. Matplotlib can also be used to make 3D
plots and animations.
•Line plots can be created in Python with Matplotlib's pyplot module. To build a line plot, first
import Matplotlib. It is a standard convention to import the pyplot module as plt.
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()
Simple Line Plots
•More than one line can be drawn in the same plot. To add another line, call plt.plot(x, y) again.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1, 1, 50)
y1 = 2*x + 1
y2 = 2**x + 1
plt.figure(num=3, figsize=(8, 5))
plt.plot(x, y2)
plt.plot(x, y1, linewidth=1.0, linestyle='--')
plt.show()
Example 5.1.1: Write a simple Python program that draws a line graph where x = [1, 2, 3, 4] and
y = [1, 4, 9, 16] and labels the axes "X-axis" and "Y-axis".
import matplotlib.pyplot as plt
import numpy as np
# define data values
x = np.array([1, 2, 3, 4])  # X-axis points
y = x**2  # Y-axis points: [1, 4, 9, 16]
print("Values of X:")
print(x)
print("Values of Y:")
print(y)
plt.plot(x, y)
# Set the x axis label of the current axis.
plt.xlabel('X-axis')
# Set the y axis label of the current axis.
plt.ylabel('Y-axis')
# Set a title
plt.title('Draw a line.')
# Display the figure.
plt.show()
Setting the Axis, Ticks, Grids
• The axes define the x and y plane of the graphic. The x axis runs horizontally, and the y axis runs
vertically.
• An Axes object is added to a figure. An Axes can be thought of as a pair of x and y axes on which lines
and bars are drawn.
• An Axes has child attributes such as axis labels, tick labels, and line thickness.
• The following code shows how to obtain access to the axes for a plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5, 20)  # example data
y = x**2
fig = plt.figure()
# [left, bottom, width, height] as fractions of the figure (range 0 to 1)
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title')
plt.show()
# import required modules
import matplotlib.pyplot as plt
import numpy as np
import math
# assign coordinates
x = np.arange(0, math.pi*2, 0.05)
y = np.sin(x)
ax = plt.axes()
plt.xlabel("x-axis")
plt.ylabel("y-axis")
# depict illustration
plt.plot(x, y, color="lime")
# setting ticks for x-axis
ax.set_xticks([0, 2, 4, 6])
# setting ticks for y-axis
ax.set_yticks([-1, 0, 1])
# setting label for y tick
ax.set_yticklabels(["sin(-90deg)", "sin(0deg)", "sin(90deg)"])
plt.show()
Defining the Line Appearance and Working with Line Style
• Line styles help differentiate graphs by drawing the lines in various ways. Matplotlib provides solid ('-'),
dashed ('--'), dash-dot ('-.') and dotted (':') line styles.
• plt.plot() accepts an additional format-string argument that controls the colour and style of the line:
plt.plot(xa, ya, 'g')    # solid green line
plt.plot(xa, ya, 'r--')  # dashed red line
from matplotlib import pyplot as plt
import numpy as np
xa = np.linspace(0, 5, 20)
ya = xa**2
plt.plot(xa, ya, 'g')
ya = 3*xa
plt.plot(xa, ya, 'r--')
plt.show()
Adding Markers
• Markers add a special symbol to each data point in a line graph. Unlike line style and color,
markers tend to be a little less susceptible to accessibility and printing issues.
• Matplotlib tries to use marker identifiers that look similar to the marker they draw:
1. Triangle-shaped: v, <, >, ^
2. Cross-like: *, +, 1, 2, 3, 4
3. Circle-like: o, ., h, p, H, 8
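The marker codes listed above can be tried out with a minimal sketch like the one below. The data, the vertical offsets, and the output file name markers.png are illustrative assumptions, not part of the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)  # illustrative data, not from the slides

# One line per marker code; each legend entry shows the code that produced it.
marker_codes = ["v", "^", "<", ">", "*", "+", "o", ".", "h", "p", "H", "8"]

fig, ax = plt.subplots(figsize=(6, 6))
for offset, code in enumerate(marker_codes):
    # Shift each line upwards so the markers do not overlap.
    ax.plot(x, x + 2 * offset, marker=code, linestyle="-", label=repr(code))

ax.set_xlabel("x")
ax.set_ylabel("y (shifted per marker)")
ax.legend(ncol=3, fontsize="small")
fig.savefig("markers.png")  # or plt.show() with an interactive backend
```

Passing the code through the marker keyword keeps it separate from the line style and colour, which can still be set with a format string.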
Using Labels, Annotations and Legends
• To fully document your graph, you usually have to resort to labels, annotations, and legends. Each
of these elements has a different purpose, as follows:
1. Label: Make it easy for the viewer to know the name or kind of data illustrated
2. Annotation: Help extend the viewer's knowledge of the data, rather than simply identify it.
3. Legend: Provides cues to make identification of the data group easier.
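The following example shows how to add labels to a graph:
import matplotlib.pyplot as plt
values = [1, 5, 8, 9, 2, 0, 3, 10, 4, 7]  # example data: one value per entry
plt.xlabel('Entries')
plt.ylabel('Values')
plt.plot(range(1, 11), values)
plt.show()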
The following example shows how to add an annotation to a graph:
import matplotlib.pyplot as plt
w = 4
h = 3
d = 70
plt.figure(figsize=(w, h), dpi=d)
plt.axis([0, 5, 0, 5])
x = [0, 3, 5]
y = [1, 4, 3.5]
label_x = 1
label_y = 4
arrow_x = 3
arrow_y = 4
arrow_properties = dict(
    facecolor="black", width=0.5,
    headwidth=4, shrink=0.1)
plt.annotate("maximum", xy=(arrow_x, arrow_y),
             xytext=(label_x, label_y),
             arrowprops=arrow_properties)
plt.plot(x, y)
plt.savefig("out.png")
Creating a legend
•A legend documents the individual elements of a plot.
•Each line is presented in a table that contains a label for it so that people can differentiate
between each line.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-10, 9, 20)
y = x ** 3
z = x ** 2
figure = plt.figure()
axes = figure.add_axes([0,0,1,1])
axes.plot(x, z, label="Square Function")
axes.plot(x, y, label="Cube Function")
axes.legend()
plt.show()
Scatter Plots
• A scatter plot is a visual representation of how two variables relate to each other.
• We can use scatter plots to explore the relationship between two variables.
import matplotlib.pyplot as plt
#X axis values:
x = [2,3,7,29,8,5,13,11,22,33]
# Y axis values:
y = [4,7,55,43,2,4,11,22,33,44]
# Create scatter plot:
plt.scatter(x, y)
plt.show()
Example: We can create a simple scatter plot in Python by passing x
and y values to plt.scatter():
# scatter_plotting.py
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
x = [2, 4, 6, 6, 9, 2, 7, 2, 6, 1, 8, 4, 5, 9, 1, 2, 3, 7, 5, 8, 1, 3]
y = [7, 8, 2, 4, 6, 4, 9, 5, 9, 3, 6, 7, 2, 4, 6, 7, 1, 9, 4, 3, 6, 9]
plt.scatter(x, y)
plt.show()
Creating Advanced Scatterplots
•Scatterplots are especially important for data science because they can show data patterns that aren't obvious
when viewed in other ways.
import matplotlib.pyplot as plt
x_axis1 =[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis1 =[5, 16, 34, 56, 32, 56, 32, 12, 76, 89]
x_axis2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis2 = [53, 6, 46, 36, 15, 64, 73, 25, 82, 9]
plt.title("Prices over 10 years")
plt.scatter(x_axis1, y_axis1, color = 'darkblue', marker='x', label="item 1")
plt.scatter(x_axis2, y_axis2, color='darkred', marker='x', label="item 2")
plt.xlabel("Time (years)")
plt.ylabel("Price (dollars)")
plt.grid(True)
plt.legend()
plt.show()
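A scatter plot can also encode a third and fourth variable through marker colour and size, which is one way hidden patterns become visible. The sketch below assumes randomly generated data and hypothetical variable names (value, weight); none of it comes from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the figure is reproducible
n = 100
x = rng.normal(size=n)
y = x + rng.normal(scale=0.5, size=n)   # y loosely follows x
value = rng.uniform(0, 1, size=n)       # third variable, shown as colour
weight = rng.uniform(20, 200, size=n)   # fourth variable, shown as marker area

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=value, s=weight, cmap="viridis", alpha=0.6)
fig.colorbar(points, ax=ax, label="value")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Scatter plot with colour and size encoding")
fig.savefig("scatter_encoded.png")  # or plt.show() with an interactive backend
```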
Visualizing Errors
• Matplotlib line plots and graphs can include error bars.
• Error is the difference between the calculated value and the
actual value.
• Without error bars, bar graphs give the impression that a
measured or calculated number is known to a high level of
precision.
• The method matplotlib.pyplot.errorbar() draws y versus x as
lines and/or markers with error bars attached.
• Adding an error bar in Matplotlib:
plt.errorbar(x, y, yerr=2, capsize=3)
Where:
x = The data of the X axis.
y = The data of the Y axis.
yerr = The error value of the Y axis: a single number applied to
every point, or a sequence giving each point its own error value.
xerr = The error value of the X axis.
capsize = The size of the lower and
upper lines of the error bar.
import matplotlib.pyplot as plt
x = 1
y = 20
y_error = 20*0.10  # 10% error
plt.errorbar(x,y, yerr = y_error, capsize=3)
plt.show()
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(1,8)
y = np.array([20,10,45,32,38,21,27])
y_error = y * 0.10  # 10% error
plt.errorbar(x, y, yerr=y_error,
             linestyle="None", fmt="ob", capsize=3, ecolor="k")
plt.show()
• Parameters of errorbar():
a) yerr is the error value at each point.
b) linestyle="None" indicates that no line is drawn between the points.
c) fmt is the format string for the data points; here "ob" means circle markers ("o") in blue ("b").
d) capsize is the size of the lower and upper lines (caps) of the error bar.
e) ecolor is the color of the error bar. The default color is the marker color.
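The xerr parameter described earlier is not exercised by the programs above. A minimal sketch that combines horizontal and vertical error bars might look as follows; the constant horizontal uncertainty of 0.2 and the file name errorbars_xy.png are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 8)
y = np.array([20, 10, 45, 32, 38, 21, 27])
y_error = 0.10 * y                # 10% of each y value
x_error = np.full(x.shape, 0.2)   # assumed constant horizontal uncertainty

fig, ax = plt.subplots()
# errorbar() returns a container holding the data line, the caps and the bars.
container = ax.errorbar(x, y, xerr=x_error, yerr=y_error,
                        fmt="ob", capsize=3, ecolor="k")
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.savefig("errorbars_xy.png")  # or plt.show() with an interactive backend
```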
Density and Contour Plots
• It is useful to display three-dimensional data in two dimensions using contours or color-
coded regions.
Three Matplotlib functions are used for this purpose. They are :
a) plt.contour for contour plots,
b) plt.contourf for filled contour plots,
c) plt.imshow for showing images.
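The three functions can be compared side by side on the same data. The sketch below uses the Axes-method equivalents (ax.contour, ax.contourf, ax.imshow) and assumes an example surface, the distance from the origin, which is not taken from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3.0, 3.0, 100)
y = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(x, y)
Z = np.sqrt(X**2 + Y**2)  # assumed example surface: distance from the origin

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))

# a) contour lines with inline labels
cs = ax1.contour(X, Y, Z, colors="black")
ax1.clabel(cs, inline=True, fontsize=8)
ax1.set_title("contour")

# b) filled contours
cf = ax2.contourf(X, Y, Z, levels=10, cmap="viridis")
fig.colorbar(cf, ax=ax2)
ax2.set_title("contourf")

# c) the same array rendered as an image; origin="lower" puts y = -3 at the bottom
im = ax3.imshow(Z, extent=[-3, 3, -3, 3], origin="lower", cmap="viridis")
fig.colorbar(im, ax=ax3)
ax3.set_title("imshow")

fig.savefig("contour_comparison.png")  # or plt.show() with an interactive backend
```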
1. Contour plot
• A contour line or isoline of a function of two variables is a curve along which the function
has a constant value.
• It is a cross-section of the three-dimensional graph of the function f(x, y) parallel to the
xy-plane.
• Contour lines are used in Geography and Meteorology.
•In cartography, a contour line joins points of equal height above a given level, such as
mean sea level.
• A contour line of a function with two variables is a curve which connects points with the
same values.
import numpy as np
xlist = np.linspace(-3.0, 3.0, 3)
ylist = np.linspace(-3.0, 3.0, 4)
X, Y = np.meshgrid(xlist, ylist)
print(xlist)
print(ylist)
print(X)
print(Y)
Changing the colours and the line style
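import matplotlib.pyplot as plt
# Z is not defined on the slide; the distance from the origin is assumed here as an example
Z = np.sqrt(X**2 + Y**2)
plt.figure()
cp = plt.contour(X, Y, Z, colors='black', linestyles='dashed')
plt.clabel(cp, inline=True, fontsize=10)
plt.title('Contour Plot')
plt.xlabel('x (cm)')
plt.ylabel('y (cm)')
plt.show()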
• When creating a contour plot, we can also specify the color map. There are different classes of color maps.
Matplotlib gives the following guidance :
a) Sequential: Change in lightness and often saturation of color incrementally, often using a single hue;
should be used for representing information that has ordering.
b) Diverging: Change in lightness and possibly saturation of two different colors that meet in the middle at
an unsaturated color; should be used when the information being plotted has a critical middle value, such as
topography or when the data deviates around zero.
c) Cyclic : Change in lightness of two different colors that meet in the middle and beginning/end at an
unsaturated color; should be used for values that wrap around at the endpoints, such as phase angle, wind
direction, or time of day.
d) Qualitative: Often are miscellaneous colors; should be used to represent information which does not have
ordering or relationships.
• Data such as a wave function has both positive and negative values, with zero representing a node, so a
diverging color map suits it. There are three important display options for contour plots: the undisplaced
shape key, the scale factor, and the contour scale.
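As an illustration of the diverging class, the sketch below fills contours of a function that takes both signs and marks its zero level. The function sin(x)·cos(y), the "RdBu" colour map and the symmetric levels are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y = np.linspace(0, 2 * np.pi, 200)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)  # assumed example data with values between -1 and 1

fig, ax = plt.subplots()
# Levels symmetric around zero so the unsaturated middle colour sits at Z = 0.
levels = np.linspace(-1, 1, 21)
filled = ax.contourf(X, Y, Z, levels=levels, cmap="RdBu")
# Draw the zero level (the nodal lines) on top in black.
ax.contour(X, Y, Z, levels=[0], colors="black", linewidths=1)
fig.colorbar(filled, ax=ax, label="Z")
ax.set_title("Diverging colour map centred on zero")
fig.savefig("contour_diverging.png")  # or plt.show() with an interactive backend
```

Keeping the levels symmetric matters: with asymmetric limits the neutral colour would no longer coincide with zero.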
Histogram
•In a histogram, the data are grouped into ranges (e.g. 10 - 19, 20 - 29) and then plotted as
connected bars.
•Each bar represents a range of data.
•The width of each bar is proportional to the width of each category, and the height is
proportional to the frequency or percentage of that category.
•It provides a visual interpretation of numerical data by showing the number of data points
that fall within a specified range of values called "bins".
• Histograms can display a large amount of data
and the frequency of the data values.
• The median and the shape of the distribution can be
estimated from a histogram.
• In addition, it can show any outliers or gaps in
the data.
• Matplotlib provides a dedicated function to
compute and display histograms: plt.hist()
Code for Histogram
import numpy as np
import matplotlib.pyplot as plt
x = 40* np.random.randn(50000)
plt.hist(x, 20, range=(-50, 50), histtype='stepfilled',
         align='mid', color='g', label='Test Data')
plt.legend()
plt.title(' Histogram')
plt.show()
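The hist function also returns the numbers behind the picture, which makes the idea of bins concrete. The sketch below assumes a seeded random sample similar to the one above and simply inspects the returned counts and bin edges.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)          # fixed seed for reproducibility
data = 40 * rng.standard_normal(50000)   # same kind of sample as in the slide

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges, and the drawn bars.
counts, bin_edges, patches = ax.hist(data, bins=20, range=(-50, 50),
                                     color="g", label="Test Data")
ax.legend()
ax.set_title("Histogram")

print("number of bins:", len(counts))            # 20 bins need 21 edges
print("bin width:", bin_edges[1] - bin_edges[0])
print("points inside the range:", int(counts.sum()))
fig.savefig("histogram.png")  # or plt.show() with an interactive backend
```

Values outside the range (-50, 50) are ignored, so the counts add up to fewer than the 50000 generated points.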
Legend
• Plot legends give meaning to a visualization, assigning labels to the various plot
elements.
• Legends are also found on maps, where they describe the pictorial language or symbology of the map.
•Legends are used in line graphs to explain the function or the values underlying the
different lines of the graph.
• Matplotlib has native support for legends.
• Legends can be placed in various positions: inside or outside the chart, and the position can be moved.
•The legend() method adds the legend to the plot.
import matplotlib.pyplot as plt
import numpy as np
y = [2,4,6,8,10,12,14,16,18,20]
y2 = [10,11,12,13,14,15,16,17,18,19]
x = np.arange(10)
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(x, y, label='y = numbers')
ax.plot(x, y2, label='y2 = other numbers')
plt.title('Legend inside')
ax.legend()
plt.show()
• If we pass a label to the plot function, that value is used as the entry for the line in the legend.
• The legend function accepts a further argument, "loc", which defines the location of the legend inside
the axes, for example loc='upper left'.
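# 'polynomials' is not a standard module; it is assumed to be a local file that
# defines a Polynomial class with a derivative() method
from polynomials import Polynomial
import numpy as np
import matplotlib.pyplot as plt
p = Polynomial(-0.8, 2.3, 0.5, 1, 0.2)
p_der = p.derivative()
fig, ax = plt.subplots()
X = np.linspace(-2, 3, 50, endpoint=True)
F = p(X)
F_derivative = p_der(X)
ax.plot(X, F, label="p")
ax.plot(X, F_derivative, label="derivative of p")
ax.legend(loc='upper left')
plt.show()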
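Because the polynomials module above is not part of NumPy or Matplotlib, a variant that runs without it can be sketched with numpy.polynomial.Polynomial instead. This is a swapped-in class, not the one used on the slide, and the coefficient order is an assumption: NumPy expects coefficients from the lowest degree upwards, so the slide's values are reversed here on the guess that they were listed from the highest degree down.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
from numpy.polynomial import Polynomial

# Assumed polynomial: 0.2 + 1x + 0.5x^2 + 2.3x^3 - 0.8x^4 (lowest degree first).
p = Polynomial([0.2, 1, 0.5, 2.3, -0.8])
p_der = p.deriv()  # first derivative, also a Polynomial

X = np.linspace(-2, 3, 50, endpoint=True)

fig, ax = plt.subplots()
ax.plot(X, p(X), label="p")
ax.plot(X, p_der(X), label="derivative of p")
ax.legend(loc="upper left")  # place the legend in the upper-left corner of the axes
fig.savefig("legend_upper_left.png")  # or plt.show() with an interactive backend
```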
Matplotlib legend on bottom
import matplotlib.pyplot as plt
import numpy as np
y1 = [2,4,6,8,10,12,14,16,18,20]
y2 = [10,11,12,13,14,15,16,17,18,19]
x = np.arange(10)
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(x, y1, label='y1 = numbers')
ax.plot(x, y2, label='y2 = other numbers')
plt.title('Legend on bottom')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), shadow=True, ncol=2)
plt.show()
Subplots
• Subplots mean groups of axes that can exist in a single matplotlib figure.
• The subplots() function in the Matplotlib library helps in creating multiple layouts of
subplots.
•It provides control over all the individual plots that are created.
• subplots() without arguments returns a Figure and a single Axes.
•This is actually the simplest and recommended way of creating a single Figure and Axes.
fig, ax = plt.subplots()
ax.plot(x,y)
ax.set_title('A single plot')
• There are (at least) 3 different ways to create plots (called axes) in Matplotlib.
They are:
a) plt.axes(),
b) figure.add_axes() and
c) plt.subplots()
• plt.axes(): The most basic method of creating an axes is to use the plt.axes function. It takes an optional list of four numbers
in the figure coordinate system. These numbers represent [left, bottom, width, height], where the figure coordinate system ranges
from 0 at the bottom left of the figure to 1 at the top right of the figure.
• To plot just one figure with (x, y) coordinates, call plt.plot(x, y).
• By calling subplot(n, m, k), we subdivide the figure into n rows and m columns and specify that plotting should be done on
subplot number k. Subplots are numbered row by row, from left to right.
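The three approaches can be sketched next to each other as follows. The rectangle values, the sine and cosine data, and the variable names are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# 1. plt.axes(): rectangle given as [left, bottom, width, height] in figure coordinates
fig1 = plt.figure()
ax_main = plt.axes([0.1, 0.1, 0.8, 0.8])
ax_inset = plt.axes([0.6, 0.6, 0.25, 0.25])  # small axes placed inside the main one
ax_main.plot(x, np.sin(x))
ax_inset.plot(x, np.cos(x))

# 2. fig.add_axes(): the same rectangle convention, called on a Figure object
fig2 = plt.figure()
ax_top = fig2.add_axes([0.1, 0.55, 0.8, 0.35])
ax_bottom = fig2.add_axes([0.1, 0.1, 0.8, 0.35])
ax_top.plot(x, np.sin(x))
ax_bottom.plot(x, np.cos(x))

# 3. plt.subplots(): a regular grid, here 2 rows and 3 columns
fig3, axs = plt.subplots(2, 3, figsize=(8, 4))
for k, ax in enumerate(axs.flat, start=1):  # axs.flat walks the grid row by row
    ax.plot(x, np.sin(k * x))
    ax.set_title(f"subplot {k}")
fig3.tight_layout()
fig3.savefig("subplots_grid.png")  # or plt.show() with an interactive backend
```

The first two give free placement, which suits insets; plt.subplots() is usually the more convenient choice for regular grids.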
import matplotlib.pyplot as plt
import numpy as np
from math import pi
plt.figure(figsize=(8, 4))  # set dimensions of the figure
x = np.linspace(0, 2*pi, 100)
for i in range(1, 7):
    plt.subplot(2, 3, i)  # create subplots on a grid with 2 rows and 3 columns
    plt.xticks([])  # set no ticks on x-axis
    plt.yticks([])  # set no ticks on y-axis
    plt.plot(np.sin(x), np.cos(i*x))
    plt.title('subplot' + '(2,3,' + str(i) + ')')
plt.show()
Text and Annotation
• When drawing large and complex plots in Matplotlib, we need a way of labelling certain portion or
points of interest on the graph.
•To do so, Matplotlib provides us with the "Annotation" feature which allows us to plot arrows and
text labels on the graphs to give them more meaning.
• There are four important parameters of annotate(); text and xy are required, while xytext and arrowprops are optional.
a) text: This defines the text label. Takes a string as a value.
b) xy: The place where you want your arrowhead to point to. In other words, the place you want to
annotate. This is a tuple containing two values, x and y.
c) xytext: The coordinates where you want the text to display. If omitted, the text is placed at xy.
d) arrowprops: A dictionary of key-value pairs which define various properties for the arrow, such as
color, size and arrowhead type.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.arange(0.0, 5.0, 0.01)
y =np.sin(2* np.pi *x)
# Annotation
ax.annotate('Local Max',
            xy=(3.3, 1),
            xytext=(3, 1.8),
            arrowprops=dict(facecolor='green'))
ax.set_ylim(-2, 2)
plt.plot(x, y)
plt.show()
import plotly.graph_objects as go
fig=go.Figure()
fig.add_trace(go.Scatter(x=[0,1,2,3,4,5,6,7,8],
y=[0,1,3,2,4,3,4,6,5]))
fig.add_trace(go.Scatter(x=[0,1,2,3,4,5,6,7,8],
y=[0,4,5,1,2,2,3,4,2]))
fig.add_annotation(x=2, y=5, text="Text annotation with arrow",
                   showarrow=True, arrowhead=1)
fig.add_annotation(x=4, y=4, text="Text annotation without arrow",
                   showarrow=False, yshift=10)
fig.update_layout(showlegend=False)
fig.show()
Customization
• A tick is a short line on an axis.
•For category axes, ticks separate each category.
•For value axes, ticks mark the major divisions and show the exact point on an axis that
the axis label defines.
• By default, ticks have the same color and line style as the axis; in Matplotlib they can be restyled with tick_params().
• Ticks are the markers denoting data points on axes.
•Matplotlib's default tick locators and formatters are designed to be generally sufficient in
many common situations.
•Position and labels of ticks can be explicitly mentioned to suit specific requirements.
• Ticks come in two types: major and minor.
a) Major ticks separate the axis into major units. On
category axes, major ticks are the only ticks available. On
value axes, one major tick appears for every major axis
division.
b) Minor ticks subdivide the major tick units. They can
only appear on value axes. One minor tick appears for
every minor axis division.
• By default, major ticks appear on value axes. xticks() is a function that can be
used to get or set the current tick locations and labels.
• A plot can show both major and minor tick marks,
customized to be longer and thicker than the default, with the major tick marks pointing
into and out of the plot area.
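A minimal sketch of such a plot is given below. The sine data, the tick spacings, and the tick lengths and widths are assumptions chosen for illustration; MultipleLocator, AutoMinorLocator and tick_params are standard Matplotlib tools.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import AutoMinorLocator, MultipleLocator

x = np.linspace(0, 10, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Major ticks every 2 units on x and every 0.5 units on y.
ax.xaxis.set_major_locator(MultipleLocator(2))
ax.yaxis.set_major_locator(MultipleLocator(0.5))
# Split each major interval into 4 (x) or 5 (y) minor intervals.
ax.xaxis.set_minor_locator(AutoMinorLocator(4))
ax.yaxis.set_minor_locator(AutoMinorLocator(5))

# Major ticks: long, thick, crossing the axis line (pointing in and out).
ax.tick_params(which="major", length=10, width=2, direction="inout")
# Minor ticks: shorter and thinner, pointing into the plot area.
ax.tick_params(which="minor", length=5, width=1.5, direction="in")

fig.savefig("ticks_major_minor.png")  # or plt.show() with an interactive backend
```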
# ggplot-style plot with major and minor gridlines
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import AutoMinorLocator
# sinplot() is not defined on the slide; a plain sine curve is assumed in its place
fig, ax = plt.subplots()
x = np.linspace(0, 14, 100)
ax.plot(x, np.sin(x))
# Give plot a gray background like ggplot.
ax.set_facecolor('#EBEBEB')
# Remove border around plot.
for side in ax.spines:
    ax.spines[side].set_visible(False)
# Style the grid.
ax.grid(which='major', color='white', linewidth=1.2)
ax.grid(which='minor', color='white', linewidth=0.6)
# Show the minor ticks and grid.
ax.minorticks_on()
# Now hide the minor ticks (but leave the gridlines).
ax.tick_params(which='minor', bottom=False, left=False)
# Only show minor gridlines once in between major gridlines.
ax.xaxis.set_minor_locator(AutoMinorLocator(2))
ax.yaxis.set_minor_locator(AutoMinorLocator(2))
plt.show()
Three Dimensional Plotting
• Matplotlib is one of the most popular choices for data visualization in Python.
• While initially developed for plotting 2-D charts like histograms, bar charts, scatter plots,
and line plots, Matplotlib has extended its capabilities to offer 3D plotting modules as well.
First import the library:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
• The second import, of the Axes3D class, registers the 3D projection. Matplotlib versions before 3.2 require it;
newer versions register the projection automatically.
• It is, otherwise, not used anywhere else.
• Create the figure and axes:
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, projection='3d')
Example :
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
fig=plt.figure(figsize=(8,8))
ax=plt.axes(projection='3d')
ax.grid()
t=np.arange(0,10*np.pi, np.pi/50)
x=np.sin(t)
y=np.cos(t)
ax.plot3D(x,y,t)
ax.set_title('3D Parametric Plot')
# Set axes label
ax.set_xlabel('x',labelpad=20)
ax.set_ylabel('y', labelpad=20)
ax.set_zlabel('t', labelpad=20)
plt.show()
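Besides parametric lines, the same 3D axes can draw surfaces. The sketch below assumes an example function, sin of the distance from the origin, and uses plot_surface on a meshgrid; the data and file name are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, only needed on older Matplotlib

x = np.linspace(-3, 3, 50)
y = np.linspace(-3, 3, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))  # assumed example surface

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111, projection="3d")
surface = ax.plot_surface(X, Y, Z, cmap="viridis")
fig.colorbar(surface, ax=ax, shrink=0.6)
ax.set_title("3D Surface Plot")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
fig.savefig("surface3d.png")  # or plt.show() with an interactive backend
```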
Geographic Data with Basemap
• Basemap is a toolkit for the Python visualization library Matplotlib. Its main function is to draw 2D
maps, which are important for visualizing spatial data. Basemap itself does not do any plotting, but provides
the ability to transform coordinates into one of 25 different map projections.
• Matplotlib is then used to plot contours, images, vectors, lines or points in the transformed coordinates.
Basemap includes the GSHHS coastline dataset, as well as datasets from GMT for rivers, states and national
boundaries.
• These datasets can be used to plot coastlines, rivers and political boundaries on a map at several different
resolutions. Basemap uses the Geometry Engine-Open Source (GEOS) library under the hood to clip
coastline and boundary features to the desired map projection area. In addition, Basemap provides the ability
to read shapefiles.
• Depending on the version, Basemap may not install with pip install basemap. If Anaconda is installed, you can
install Basemap using conda install basemap.
Basemap instance methods for plotting on the map:
a) contour(): Draw contour lines.
b) contourf(): Draw filled contours.
c) imshow(): Draw an image.
d) pcolor(): Draw a pseudocolor plot.
e) pcolormesh(): Draw a pseudocolor plot (faster version for regular meshes).
f) plot(): Draw lines and/or markers.
g) scatter(): Draw points with markers.
h) quiver(): Draw vectors.
i) barbs(): Draw wind barbs.
j) drawgreatcircle(): Draw a great circle route.
Basemap basic usage
import warnings
warnings.filterwarnings('ignore')
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
map = Basemap()
map.drawcoastlines()
# plt.show()
plt.savefig('test.png')
Visualization with Seaborn
• Seaborn is an open-source Python data visualization library based on Matplotlib. It provides a
high-level interface for drawing attractive and informative statistical graphics.
• Seaborn helps you explore and understand your data. Its plotting functions operate on
dataframes and arrays containing whole datasets and internally perform the necessary
semantic mapping and statistical aggregation to produce informative plots.
• Its dataset-oriented, declarative API lets you focus on what the different elements of
your plots mean, rather than on the details of how to draw them.
Key features:
a) Seaborn is a statistical plotting library.
b) It has beautiful default styles.
c) It is designed to work very well with Pandas dataframe objects, and the graphs it creates can be
customized easily.
Functionality that seaborn offers:
a) A dataset-oriented API for examining relationships between multiple variables
b) Convenient views onto the overall structure of complex datasets
c) Specialized support for using categorical variables to show observations or aggregate statistics
d) Options for visualizing univariate or bivariate distributions and for comparing them between subsets of
data
e) Automatic estimation and plotting of linear regression models for different kinds of dependent variables
f) High-level abstractions for structuring multi-plot grids that let you easily build complex visualizations
g) Concise control over matplotlib figure styling with several built-in themes
h) Tools for choosing color palettes that faithfully reveal patterns in your data.
Plot a Scatter Plot in Seaborn :
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.read_csv('worldHappiness2016.csv')
sns.scatterplot(data=df, x="Economy (GDP per Capita)", y="Happiness Score")
plt.show()
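The program above needs the CSV file on disk. To try the same call without it, a small in-memory DataFrame can stand in for the data, as in the sketch below. The numbers and the Region column are made up for illustration and are not taken from the World Happiness dataset; hue is a standard scatterplot parameter that colours points by a categorical column.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in data with the same column names as in the slide's example.
df = pd.DataFrame({
    "Economy (GDP per Capita)": [0.4, 0.7, 0.9, 1.1, 1.3, 1.5],
    "Happiness Score": [3.9, 4.6, 5.2, 5.9, 6.6, 7.3],
    "Region": ["A", "A", "B", "B", "C", "C"],
})

# seaborn maps DataFrame columns to plot roles by name.
ax = sns.scatterplot(data=df, x="Economy (GDP per Capita)",
                     y="Happiness Score", hue="Region")
ax.set_title("Scatter plot with Seaborn")
ax.figure.savefig("seaborn_scatter.png")  # or plt.show() with an interactive backend
```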
12/09/2024 22UIT303-Data science 66

More Related Content

PDF
711118749-FDS-UNIT-5-PPT.pdf is used to the engineering students
PPTX
Introduction to Pylab and Matploitlib.
PPTX
Matplotlib.pptx for data analysis and visualization
PPTX
Python Visualization API Primersubplots
PPTX
matplotlib.pptxdsfdsfdsfdsdsfdsdfdsfsdf cvvf
PDF
Chapter3_Visualizations2.pdf
PPTX
Unit III for data science engineering.pptx
PPTX
UNIT_4_data visualization.pptx
711118749-FDS-UNIT-5-PPT.pdf is used to the engineering students
Introduction to Pylab and Matploitlib.
Matplotlib.pptx for data analysis and visualization
Python Visualization API Primersubplots
matplotlib.pptxdsfdsfdsfdsdsfdsdfdsfsdf cvvf
Chapter3_Visualizations2.pdf
Unit III for data science engineering.pptx
UNIT_4_data visualization.pptx

Similar to UNIT-5-II IT-DATA VISUALIZATION TECHNIQUES (20)

PPTX
Matplotlib yayyyyyyyyyyyyyin Python.pptx
PDF
12-IP.pdf
PDF
Introduction to Data Visualization,Matplotlib.pdf
PPTX
MatplotLib.pptx
PPTX
Matplotlib_Presentation jk jdjklskncncsjkk
PPTX
Introduction to matplotlib
DOCX
Data visualization using py plot part i
PPTX
Matplot Lib Practicals artificial intelligence.pptx
PPTX
Unit3-v1-Plotting and Visualization.pptx
PPTX
PYTHON-Chapter 4-Plotting and Data Science PyLab - MAULIK BORSANIYA
PPTX
Python chart plotting using Matplotlib.pptx
PPTX
Python for Data Science
PPTX
Visualization and Matplotlib using Python.pptx
PDF
Data Visualization using matplotlib
PPTX
data analytics and visualization CO4_18_Data Types for Plotting.pptx
PPTX
a9bf73_Introduction to Matplotlib01.pptx
PPTX
Introduction to Matplotlib Library in Python.pptx
PDF
Lecture 34 & 35 -Data Visualizationand itd.pdf
PDF
Python matplotlib cheat_sheet
PDF
Matplotlib Review 2021
Matplotlib yayyyyyyyyyyyyyin Python.pptx
12-IP.pdf
Introduction to Data Visualization,Matplotlib.pdf
MatplotLib.pptx
Matplotlib_Presentation jk jdjklskncncsjkk
Introduction to matplotlib
Data visualization using py plot part i
Matplot Lib Practicals artificial intelligence.pptx
Unit3-v1-Plotting and Visualization.pptx
PYTHON-Chapter 4-Plotting and Data Science PyLab - MAULIK BORSANIYA
Python chart plotting using Matplotlib.pptx
Python for Data Science
Visualization and Matplotlib using Python.pptx
Data Visualization using matplotlib
data analytics and visualization CO4_18_Data Types for Plotting.pptx
a9bf73_Introduction to Matplotlib01.pptx
Introduction to Matplotlib Library in Python.pptx
Lecture 34 & 35 -Data Visualizationand itd.pdf
Python matplotlib cheat_sheet
Matplotlib Review 2021
Ad

Recently uploaded (20)

PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PDF
Well-logging-methods_new................
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PDF
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
PDF
composite construction of structures.pdf
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PDF
Digital Logic Computer Design lecture notes
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PDF
PPT on Performance Review to get promotions
PPTX
Construction Project Organization Group 2.pptx
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
web development for engineering and engineering
DOCX
573137875-Attendance-Management-System-original
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Well-logging-methods_new................
UNIT-1 - COAL BASED THERMAL POWER PLANTS
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
composite construction of structures.pdf
Automation-in-Manufacturing-Chapter-Introduction.pdf
CYBER-CRIMES AND SECURITY A guide to understanding
Digital Logic Computer Design lecture notes
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPT on Performance Review to get promotions
Construction Project Organization Group 2.pptx
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
web development for engineering and engineering
573137875-Attendance-Management-System-original
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Ad

UNIT-5-II IT-DATA VISUALIZATION TECHNIQUES

  • 1. 12/09/2024 22UIT303-Data science 1 UNIT-V DATA VISUALIZATION
  • 2. 12/09/2024 22UIT303-Data science 2 Syllabus-UNIT-V Importing Matplotlib – Simple Line Plots – Simple Scatter Plots – Visualizing Errors – Density and Contour Plots – Histograms – Legends – Colors – Subplots – Text and Annotation – Customization – Three- Dimensional Plotting - Geographic Data with Base map - Visualization with Seaborn.
  • 3. 12/09/2024 22UIT303-Data science 3 Importing Matplotlib • Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and its numerical extension NumPy. • Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. • Matplotlib is a plotting library for the Python programming language. It allows to make quality charts in few lines of code. Most of the other python plotting library are build on top of Matplotlib. • The library is currently limited to 2D output, but it still provides you with the means to express graphically the data patterns.
  • 4. Visualizing Information: Starting with a Graph
      • Data visualization is the presentation of quantitative information in a graphical form. In other words, data visualizations turn large and small datasets into visuals that are easier for the human brain to understand and process.
      • Good data visualizations are created when communication, data science, and design meet. Done right, they offer key insights into complicated datasets in ways that are meaningful and intuitive.
      • A graph is simply a visual representation of numeric data. Matplotlib supports a large number of graph and chart types.
      • Matplotlib is a popular Python package used to build plots. It can also be used to make 3D plots and animations.
      • Line plots can be created in Python with Matplotlib's pyplot module. To build a line plot, first import Matplotlib. It is a standard convention to import Matplotlib's pyplot module as plt.
  • 5.
        import matplotlib.pyplot as plt
        plt.plot([1, 2, 3], [5, 7, 4])
        plt.show()
  • 6. Simple Line Plots
  • 7. Simple Line Plots
      • More than one line can be drawn in the same plot. To add another line, call the plot(x, y) function again.
        import matplotlib.pyplot as plt
        import numpy as np
        x = np.linspace(-1, 1, 50)
        y1 = 2*x + 1
        y2 = 2**x + 1
        plt.figure(num=3, figsize=(8, 5))
        plt.plot(x, y2)
        plt.plot(x, y1, linewidth=1.0, linestyle='--')
        plt.show()
  • 8. Example 5.1.1: Write a simple Python program that draws a line graph where x = [1,2,3,4] and y = [1,4,9,16] and labels the axes "X-axis" and "Y-axis".
        import matplotlib.pyplot as plt
        import numpy as np
        # define data values
        x = np.array([1, 2, 3, 4])  # X-axis points
        y = x**2                    # Y-axis points: [1, 4, 9, 16]
        print("Values of X:")
        print(x)
        print("Values of Y:")
        print(y)
        plt.plot(x, y)
        # Set the x axis label of the current axis.
        plt.xlabel('X-axis')
        # Set the y axis label of the current axis.
        plt.ylabel('Y-axis')
        # Set a title
        plt.title('Draw a line.')
        # Display the figure.
        plt.show()
  • 9. Setting the Axis, Ticks, Grids
      • The axes define the x and y plane of the graphic. The x axis runs horizontally, and the y axis runs vertically.
      • Axes are added to a figure. An Axes object can be thought of as the set of x and y axes on which lines and bars are drawn.
      • An Axes object contains child attributes such as axis labels, tick labels, and line thickness.
      • The following code shows how to obtain access to the axes for a plot:
        import matplotlib.pyplot as plt
        # Example data; x and y are not defined on the slide.
        x = [1, 2, 3, 4]
        y = [1, 4, 9, 16]
        fig = plt.figure()
        axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # left, bottom, width, height (range 0 to 1)
        axes.plot(x, y, 'r')
        axes.set_xlabel('x')
        axes.set_ylabel('y')
        axes.set_title('title')
        plt.show()
  • 10.
        # import required modules
        import matplotlib.pyplot as plt
        import numpy as np
        import math
        # assign coordinates
        x = np.arange(0, math.pi*2, 0.05)
        y = np.sin(x)
        ax = plt.axes()
        plt.xlabel("x-axis")
        plt.ylabel("y-axis")
        # draw the curve
        plt.plot(x, y, color="lime")
        # setting ticks for x-axis
        ax.set_xticks([0, 2, 4, 6])
        # setting ticks for y-axis
        ax.set_yticks([-1, 0, 1])
        # setting labels for the y ticks
        ax.set_yticklabels(["sin(-90deg)", "sin(0deg)", "sin(90deg)"])
        plt.show()
  • 11. Defining the Line Appearance and Working with Line Style
      • Line styles help differentiate graphs by drawing the lines in various ways, for example solid ('-'), dashed ('--'), dash-dot ('-.'), and dotted (':').
      • plt.plot() accepts an additional format-string argument that controls the colour and style of the line:
        plt.plot(xa, ya, 'g')    # green, solid line
        plt.plot(xa, ya, 'r--')  # red, dashed line
  • 12.
        from matplotlib import pyplot as plt
        import numpy as np
        xa = np.linspace(0, 5, 20)
        ya = xa**2
        plt.plot(xa, ya, 'g')
        ya = 3*xa
        plt.plot(xa, ya, 'r--')
        plt.show()
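The format strings above have keyword equivalents. The following is a minimal sketch, assuming the same xa data as the slide; the labels and the linewidth value are illustrative choices, not taken from the deck.

```python
import matplotlib.pyplot as plt
import numpy as np

xa = np.linspace(0, 5, 20)

fig, ax = plt.subplots()
# Format string: 'g' means green with the default solid line.
(line1,) = ax.plot(xa, xa**2, "g", label="format string 'g'")
# Keyword form of 'r--': red colour, dashed line, plus an explicit width.
(line2,) = ax.plot(xa, 3 * xa, color="red", linestyle="--", linewidth=2.0,
                   label="color='red', linestyle='--'")
ax.legend()
plt.show()
```

The keyword form is longer but tends to be easier to read once colour, style, and width are all being set.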
  • 13. Adding Markers
      • Markers add a special symbol to each data point in a line graph. Unlike line style and color, markers tend to be a little less susceptible to accessibility and printing issues.
      • Matplotlib tries to use marker identifiers that look similar to the marker itself:
          1. Triangle-shaped: v, <, >, ^
          2. Cross-like: *, +, 1, 2, 3, 4
          3. Circle-like: o, ., h, p, H, 8
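A short sketch of how these marker identifiers are passed to plot(), using one marker from each group above. The data and the vertical offsets are made up purely so that the lines do not overlap.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
markers = ["^", "+", "o", "h"]  # triangle, cross, circle, hexagon

fig, ax = plt.subplots()
lines = []
for offset, m in enumerate(markers):
    # Shift each line upward so the markers are easy to compare.
    y = [v + offset for v in x]
    (line,) = ax.plot(x, y, marker=m, label=f"marker='{m}'")
    lines.append(line)
ax.legend()
plt.show()
```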
  • 14. Using Labels, Annotations and Legends
      • To fully document a graph, you usually have to resort to labels, annotations, and legends. Each of these elements has a different purpose:
          1. Label: makes it easy for the viewer to know the name or kind of data illustrated.
          2. Annotation: helps extend the viewer's knowledge of the data, rather than simply identifying it.
          3. Legend: provides cues that make identification of each data group easier.
  • 15.
        import matplotlib.pyplot as plt
        # Example data; `values` is not defined on the slide.
        values = [1, 5, 8, 9, 2, 0, 3, 10, 4, 7]
        plt.xlabel('Entries')
        plt.ylabel('Values')
        plt.plot(range(1, 11), values)
        plt.show()
      The following example shows how to add an annotation to a graph:
        import matplotlib.pyplot as plt
        w = 4
        h = 3
        d = 70
        plt.figure(figsize=(w, h), dpi=d)
        plt.axis([0, 5, 0, 5])
        x = [0, 3, 5]
        y = [1, 4, 3.5]
        label_x = 1
        label_y = 4
        arrow_x = 3
        arrow_y = 4
        arrow_properties = dict(facecolor="black", width=0.5, headwidth=4, shrink=0.1)
        plt.annotate("maximum", xy=(arrow_x, arrow_y), xytext=(label_x, label_y), arrowprops=arrow_properties)
        plt.plot(x, y)
        plt.savefig("out.png")
  • 16. Creating a legend
      • A legend documents the individual elements of a plot.
      • Each line is listed in a box together with its label so that viewers can tell the lines apart.
        import matplotlib.pyplot as plt
        import numpy as np
        x = np.linspace(-10, 9, 20)
        y = x ** 3
        z = x ** 2
        figure = plt.figure()
        axes = figure.add_axes([0, 0, 1, 1])
        axes.plot(x, z, label="Square Function")
        axes.plot(x, y, label="Cube Function")
        axes.legend()
        plt.show()
  • 17. Scatter Plots
      • A scatter plot is a visual representation of how two variables relate to each other.
      • We can use scatter plots to explore the relationship between two variables.
        import matplotlib.pyplot as plt
        # X axis values:
        x = [2, 3, 7, 29, 8, 5, 13, 11, 22, 33]
        # Y axis values:
        y = [4, 7, 55, 43, 2, 4, 11, 22, 33, 44]
        # Create scatter plot:
        plt.scatter(x, y)
        plt.show()
  • 18. Example: We can create a simple scatter plot in Python by passing x and y values to plt.scatter():
        # scatter_plotting.py
        import matplotlib.pyplot as plt
        plt.style.use('fivethirtyeight')
        x = [2, 4, 6, 6, 9, 2, 7, 2, 6, 1, 8, 4, 5, 9, 1, 2, 3, 7, 5, 8, 1, 3]
        y = [7, 8, 2, 4, 6, 4, 9, 5, 9, 3, 6, 7, 2, 4, 6, 7, 1, 9, 4, 3, 6, 9]
        plt.scatter(x, y)
        plt.show()
  • 19. Creating Advanced Scatterplots
      • Scatterplots are especially important for data science because they can show data patterns that aren't obvious when viewed in other ways.
        import matplotlib.pyplot as plt
        x_axis1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        y_axis1 = [5, 16, 34, 56, 32, 56, 32, 12, 76, 89]
        x_axis2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        y_axis2 = [53, 6, 46, 36, 15, 64, 73, 25, 82, 9]
        plt.title("Prices over 10 years")
        plt.scatter(x_axis1, y_axis1, color='darkblue', marker='x', label="item 1")
        plt.scatter(x_axis2, y_axis2, color='darkred', marker='x', label="item 2")
        plt.xlabel("Time (years)")
        plt.ylabel("Price (dollars)")
        plt.grid(True)
        plt.legend()
        plt.show()
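A scatter plot can also encode a third and fourth variable through marker size and colour. The sketch below uses randomly generated data (an assumption made only for illustration) and the s, c, and cmap arguments of scatter().

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative random data; not taken from the slides.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = rng.uniform(0, 10, 50)
sizes = rng.uniform(20, 200, 50)  # marker area in points^2
values = x + y                    # quantity mapped to colour

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes, c=values, cmap="viridis", alpha=0.6)
fig.colorbar(points, ax=ax, label="x + y")
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
```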
  • 20. Visualizing Errors
  • 21. Visualizing Errors
      • Error bars can be added to Matplotlib line plots and graphs.
      • Error is the difference between the calculated value and the actual value.
      • Without error bars, a graph gives the impression that a measured or calculated value is known to a high level of precision.
      • The method matplotlib.pyplot.errorbar() plots y versus x as lines and/or markers with attached error bars.
      • Adding an error bar in Matplotlib:
        plt.errorbar(x, y, yerr=2, capsize=3)
      Where:
        x = the data of the X axis.
        y = the data of the Y axis.
        yerr = the error value of the Y axis. It can be a single value or one value per point.
        xerr = the error value of the X axis.
        capsize = the size of the lower and upper cap lines of the error bar.
  • 22.
        import matplotlib.pyplot as plt
        x = 1
        y = 20
        y_error = 20 * 0.10  # 10% error
        plt.errorbar(x, y, yerr=y_error, capsize=3)
        plt.show()
  • 23.
        import matplotlib.pyplot as plt
        import numpy as np
        x = np.arange(1, 8)
        y = np.array([20, 10, 45, 32, 38, 21, 27])
        y_error = y * 0.10  # 10% error
        plt.errorbar(x, y, yerr=y_error, linestyle="None", fmt="ob", capsize=3, ecolor="k")
        plt.show()
      • Parameters of errorbar():
          a) yerr is the error value at each point.
          b) linestyle="None" indicates that no connecting line is plotted.
          c) fmt is the marker format, in this case a circle ("o") in blue ("b").
          d) capsize is the size of the lower and upper cap lines of the error bar.
          e) ecolor is the color of the error bar. The default is the marker color.
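The xerr parameter listed earlier is not exercised in the examples above. As a small sketch, assuming a constant horizontal uncertainty of 0.2 (a made-up value) alongside the 10% vertical error:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = np.array([20, 10, 45, 32, 38])
y_error = 0.10 * y  # one vertical error value per point (10%)
x_error = 0.2       # a single horizontal error applied to every point (assumed)

fig, ax = plt.subplots()
container = ax.errorbar(x, y, xerr=x_error, yerr=y_error,
                        fmt="ob", capsize=3, ecolor="k")
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
```

The object returned by errorbar() records which error directions were drawn, which can be handy when building a plot programmatically.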
  • 24. Density and Contour Plots
  • 25. Density and Contour Plots
      • It is useful to display three-dimensional data in two dimensions using contours or color-coded regions. Three Matplotlib functions are used for this purpose:
          a) plt.contour for contour plots,
          b) plt.contourf for filled contour plots,
          c) plt.imshow for showing images.
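The three functions can be compared side by side on the same data. This is a minimal sketch; the function sin(x)·cos(y) and the grid size are arbitrary choices made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)  # example function of two variables

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))
ax1.contour(X, Y, Z, levels=10, colors="black")
ax1.set_title("contour")
ax2.contourf(X, Y, Z, levels=10, cmap="viridis")
ax2.set_title("contourf")
# imshow draws the array as an image; extent and origin align it with x and y.
ax3.imshow(Z, extent=[-3, 3, -3, 3], origin="lower", cmap="viridis")
ax3.set_title("imshow")
plt.show()
```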
  • 26. 1. Contour plot
      • A contour line or isoline of a function of two variables is a curve along which the function has a constant value.
      • It is a cross-section of the three-dimensional graph of the function f(x, y) parallel to the x, y plane.
      • Contour lines are used in geography and meteorology.
      • In cartography, a contour line joins points of equal height above a given level, such as mean sea level.
      • In short, a contour line of a function of two variables is a curve that connects points with the same value.
  • 27.
        import numpy as np
        xlist = np.linspace(-3.0, 3.0, 3)
        ylist = np.linspace(-3.0, 3.0, 4)
        X, Y = np.meshgrid(xlist, ylist)
        print(xlist)
        print(ylist)
        print(X)
        print(Y)
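  • 28. Changing the colours and the line style
        import numpy as np
        import matplotlib.pyplot as plt
        # X and Y come from np.meshgrid as on the previous slide. Z is not defined
        # on the slide, so an example function is assumed here.
        xlist = np.linspace(-3.0, 3.0, 100)
        ylist = np.linspace(-3.0, 3.0, 100)
        X, Y = np.meshgrid(xlist, ylist)
        Z = np.sqrt(X**2 + Y**2)
        plt.figure()
        cp = plt.contour(X, Y, Z, colors='black', linestyles='dashed')
        plt.clabel(cp, inline=True, fontsize=10)
        plt.title('Contour Plot')
        plt.xlabel('x (cm)')
        plt.ylabel('y (cm)')
        plt.show()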
  • 29.
      • When creating a contour plot, we can also specify the color map. There are different classes of color maps. Matplotlib gives the following guidance:
          a) Sequential: change in lightness and often saturation of color incrementally, often using a single hue; should be used for representing information that has ordering.
          b) Diverging: change in lightness and possibly saturation of two different colors that meet in the middle at an unsaturated color; should be used when the information being plotted has a critical middle value, such as topography, or when the data deviates around zero.
          c) Cyclic: change in lightness of two different colors that meet in the middle and at the beginning/end at an unsaturated color; should be used for values that wrap around at the endpoints, such as phase angle, wind direction, or time of day.
          d) Qualitative: often miscellaneous colors; should be used to represent information that does not have ordering or relationships.
      • This data has both positive and negative values, with zero representing a node of the wave function. There are three important display options for contour plots: the undisplaced shape key, the scale factor, and the contour scale.
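For data that deviates around zero, a diverging color map keeps the neutral color at the middle value. A minimal sketch, assuming the example function x·exp(-x² - y²) and Matplotlib's built-in "RdBu" diverging map:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2, 2, 100)
y = np.linspace(-2, 2, 100)
X, Y = np.meshgrid(x, y)
Z = X * np.exp(-X**2 - Y**2)  # example data with positive and negative values

fig, ax = plt.subplots()
# "RdBu" is a diverging map: red and blue meet at a pale, unsaturated middle.
cf = ax.contourf(X, Y, Z, levels=20, cmap="RdBu")
fig.colorbar(cf, ax=ax, label="Z")
ax.set_title("Filled contours with a diverging color map")
plt.show()
```

Because this particular Z is antisymmetric in x, the automatically chosen levels are centred on zero; for asymmetric data the level range would need to be set explicitly to keep zero in the middle.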
  • 31. Histogram
      • In a histogram, the data are grouped into ranges (e.g. 10 - 19, 20 - 29) and then plotted as connected bars.
      • Each bar represents a range of data.
      • The width of each bar is proportional to the width of each category, and the height is proportional to the frequency or percentage of that category.
      • It provides a visual interpretation of numerical data by showing the number of data points that fall within each specified range of values, called a "bin".
  • 32.
      • Histograms can display a large amount of data and the frequency of the data values. The median and the shape of the distribution can be estimated from a histogram. In addition, it can reveal outliers or gaps in the data.
      • Matplotlib provides a dedicated function to compute and display histograms: plt.hist()
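To make the idea of bins concrete, the sketch below passes explicit bin edges to hist() and inspects the counts it returns. The ten data values are made up for illustration.

```python
import matplotlib.pyplot as plt

# Illustrative data, grouped into the ranges 10-20, 20-30, and 30-40.
data = [12, 15, 18, 21, 22, 25, 27, 29, 31, 38]
bin_edges = [10, 20, 30, 40]

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges, and the drawn bar patches.
counts, edges, patches = ax.hist(data, bins=bin_edges, edgecolor="black")
print(counts)  # [3. 5. 2.]
ax.set_xlabel("Value")
ax.set_ylabel("Frequency")
plt.show()
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.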
  • 33. Code for Histogram
        import numpy as np
        import matplotlib.pyplot as plt
        x = 40 * np.random.randn(50000)
        plt.hist(x, 20, range=(-50, 50), histtype='stepfilled', align='mid', color='g', label='Test Data')
        plt.legend()
        plt.title('Histogram')
        plt.show()
  • 35. Legend
      • Plot legends give meaning to a visualization, assigning labels to the various plot elements.
      • Legends are found in maps, where they describe the pictorial language or symbology of the map.
      • Legends are used in line graphs to explain the function or the values underlying the different lines of the graph.
      • Matplotlib has native support for legends.
      • A legend can be placed inside or outside the chart, and its position can be changed.
      • The legend() method adds the legend to the plot.
  • 36.
        import matplotlib.pyplot as plt
        import numpy as np
        y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
        y2 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
        x = np.arange(10)
        fig = plt.figure()
        ax = plt.subplot(111)
        ax.plot(x, y, label='y = numbers')
        ax.plot(x, y2, label='y2 = other numbers')
        plt.title('Legend inside')
        ax.legend()
        plt.show()
  • 37.
      • If a label is passed to the plot function, it is used as that line's entry when legend() is called.
      • The legend function accepts a further argument, "loc", which sets the location of the legend inside the axes:
  • 38.
        # `polynomials` is a custom module that is not shown on the slide;
        # it is not part of NumPy or Matplotlib.
        from polynomials import Polynomial
        import numpy as np
        import matplotlib.pyplot as plt
        p = Polynomial(-0.8, 2.3, 0.5, 1, 0.2)
        p_der = p.derivative()
        fig, ax = plt.subplots()
        X = np.linspace(-2, 3, 50, endpoint=True)
        F = p(X)
        F_derivative = p_der(X)
        ax.plot(X, F, label="p")
        ax.plot(X, F_derivative, label="derivative of p")
        ax.legend(loc='upper left')
        plt.show()
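Since the polynomials module used above is not shown in the deck, here is a self-contained variant that swaps in NumPy's own numpy.polynomial.Polynomial class. It assumes the slide's coefficients are listed from highest degree to lowest, which is a guess; NumPy expects them from lowest to highest, so the order is reversed here.

```python
import matplotlib.pyplot as plt
import numpy as np
from numpy.polynomial import Polynomial

# Coefficients in increasing order of degree (assumed reading of the slide):
# p(x) = 0.2 + 1*x + 0.5*x**2 + 2.3*x**3 - 0.8*x**4
p = Polynomial([0.2, 1, 0.5, 2.3, -0.8])
p_der = p.deriv()

X = np.linspace(-2, 3, 50, endpoint=True)
fig, ax = plt.subplots()
ax.plot(X, p(X), label="p")
ax.plot(X, p_der(X), label="derivative of p")
# loc places the legend inside the axes; other values include 'lower right' and 'best'.
legend = ax.legend(loc="upper left")
plt.show()
```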
  • 39. Matplotlib legend on bottom
        import matplotlib.pyplot as plt
        import numpy as np
        y1 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
        y2 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
        x = np.arange(10)
        fig = plt.figure()
        ax = plt.subplot(111)
        ax.plot(x, y1, label='y1 = numbers')
        ax.plot(x, y2, label='y2 = other numbers')
        plt.title('Legend at the bottom')
        # Anchor the legend's upper centre just below the axes.
        ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), shadow=True, ncol=2)
        plt.show()
  • 41. Subplots
      • Subplots are groups of axes that can exist in a single Matplotlib figure.
      • The subplots() function in the Matplotlib library helps in creating multiple layouts of subplots.
      • It provides control over all the individual plots that are created.
      • subplots() without arguments returns a Figure and a single Axes.
      • This is the simplest and recommended way of creating a single Figure and Axes.
  • 42.
        fig, ax = plt.subplots()
        ax.plot(x, y)
        ax.set_title('A single plot')
      • There are at least three different ways to create plots (called axes) in Matplotlib:
          • plt.axes(),
          • figure.add_axes() and
          • plt.subplots()
      • plt.axes(): The most basic method of creating an axes is to use the plt.axes function. It takes an optional argument of four numbers, [left, bottom, width, height], in the figure coordinate system, which ranges from 0 at the bottom left of the figure to 1 at the top right of the figure.
      • Plot just one figure with (x, y) coordinates: plt.plot(x, y).
      • By calling subplot(n, m, k), we subdivide the figure into n rows and m columns and specify that plotting should be done on subplot number k. Subplots are numbered row by row, from left to right.
  • 43.
        import matplotlib.pyplot as plt
        import numpy as np
        from math import pi
        plt.figure(figsize=(8, 4))  # set dimensions of the figure
        x = np.linspace(0, 2*pi, 100)
        for i in range(1, 7):
            plt.subplot(2, 3, i)  # create subplots on a grid with 2 rows and 3 columns
            plt.xticks([])        # set no ticks on x-axis
            plt.yticks([])        # set no ticks on y-axis
            plt.plot(np.sin(x), np.cos(i*x))
            plt.title('subplot' + '(2,3,' + str(i) + ')')
        plt.show()
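The other ways of creating axes mentioned earlier can be sketched in the same manner. The example below uses add_axes() to place an inset by [left, bottom, width, height] and plt.subplots() to build a 2×2 grid; the curves and sizes are arbitrary illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# add_axes: position each axes by [left, bottom, width, height] in figure coordinates.
fig = plt.figure(figsize=(6, 4))
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inset_ax = fig.add_axes([0.6, 0.6, 0.25, 0.25])
main_ax.plot(x, np.sin(x))
inset_ax.plot(x, np.cos(x))

# plt.subplots: create a whole grid at once and index it like an array.
fig2, axs = plt.subplots(2, 2, sharex=True)
for k, ax in enumerate(axs.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f"sin({k}x)")
fig2.tight_layout()
plt.show()
```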
  • 44. Text and Annotation
  • 45. Text and Annotation
      • When drawing large and complex plots in Matplotlib, we need a way of labelling certain portions or points of interest on the graph.
      • To do so, Matplotlib provides the annotation feature, which allows us to plot arrows and text labels on graphs to give them more meaning.
      • There are four important parameters commonly used with annotate():
          a) text: defines the text label. Takes a string as a value.
          b) xy: the place the arrowhead points to, in other words the point being annotated. This is a tuple containing two values, x and y.
          c) xytext: the coordinates where the text is displayed.
          d) arrowprops: a dictionary of key-value pairs that define properties of the arrow, such as color, size and arrowhead type.
  • 46.
        import matplotlib.pyplot as plt
        import numpy as np
        fig, ax = plt.subplots()
        x = np.arange(0.0, 5.0, 0.01)
        y = np.sin(2 * np.pi * x)
        # Annotation
        ax.annotate('Local Max', xy=(3.3, 1), xytext=(3, 1.8), arrowprops=dict(facecolor='green'))
        ax.set_ylim(-2, 2)
        plt.plot(x, y)
        plt.show()
  • 47.
        import plotly.graph_objects as go
        fig = go.Figure()
        fig.add_trace(go.Scatter(x=[0, 1, 2, 3, 4, 5, 6, 7, 8], y=[0, 1, 3, 2, 4, 3, 4, 6, 5]))
        fig.add_trace(go.Scatter(x=[0, 1, 2, 3, 4, 5, 6, 7, 8], y=[0, 4, 5, 1, 2, 2, 3, 4, 2]))
        fig.add_annotation(x=2, y=5, text="Text annotation with arrow", showarrow=True, arrowhead=1)
        fig.add_annotation(x=4, y=4, text="Text annotation without arrow", showarrow=False, yshift=10)
        fig.update_layout(showlegend=False)
        fig.show()
  • 49. Customization
      • A tick is a short line on an axis.
      • For category axes, ticks separate each category.
      • For value axes, ticks mark the major divisions and show the exact point on an axis that the axis label defines.
      • Ticks are always the same color and line style as the axis.
      • Ticks are the markers denoting data points on axes.
      • Matplotlib's default tick locators and formatters are designed to be sufficient in many common situations.
      • The positions and labels of ticks can be set explicitly to suit specific requirements.
  • 50.
      • Ticks come in two types: major and minor.
          a) Major ticks separate the axis into major units. On category axes, major ticks are the only ticks available. On value axes, one major tick appears for every major axis division.
          b) Minor ticks subdivide the major tick units. They can only appear on value axes. One minor tick appears for every minor axis division.
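As a small sketch of the major/minor distinction, the example below sets both kinds of ticks with MultipleLocator from matplotlib.ticker. The spacings (2 and 0.5) and the tick sizes are illustrative values only.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

x = np.linspace(0, 10, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Major ticks every 2 units, minor ticks every 0.5 units along x.
ax.xaxis.set_major_locator(MultipleLocator(2))
ax.xaxis.set_minor_locator(MultipleLocator(0.5))

# Style the two kinds of tick separately.
ax.tick_params(which="major", length=8, width=1.5)
ax.tick_params(which="minor", length=4)
plt.show()
```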
  • 51.
      • By default, major ticks appear for value axes. xticks() is a function that can be used to get or set the current tick locations and labels.
      • The following program creates a plot with both major and minor gridlines on a gray background, with the minor tick marks hidden.
  • 52.
        import numpy as np
        import matplotlib.pyplot as plt
        from matplotlib.ticker import AutoMinorLocator
        # sinplot() is not defined on the slide; a sine curve on a new Axes stands in for it.
        x = np.linspace(0, 14, 100)
        fig, ax = plt.subplots()
        ax.plot(x, np.sin(x))
        # Give plot a gray background like ggplot.
        ax.set_facecolor('#EBEBEB')
        # Remove border around plot.
        for side in ax.spines:
            ax.spines[side].set_visible(False)
        # Style the grid.
        ax.grid(which='major', color='white', linewidth=1.2)
        ax.grid(which='minor', color='white', linewidth=0.6)
        # Show the minor ticks and grid.
        ax.minorticks_on()
        # Now hide the minor ticks (but leave the gridlines).
        ax.tick_params(which='minor', bottom=False, left=False)
        # Divide each major interval into 5 minor intervals.
        ax.xaxis.set_minor_locator(AutoMinorLocator(5))
        ax.yaxis.set_minor_locator(AutoMinorLocator(5))
        plt.show()
  • 53. Three Dimensional Plotting
  • 54. Three Dimensional Plotting
      • Matplotlib is one of the most popular choices for data visualization.
      • While it was initially developed for plotting 2D charts such as histograms, bar charts, scatter plots, and line plots, Matplotlib has extended its capabilities to offer 3D plotting modules as well.
      First import the library:
        import matplotlib.pyplot as plt
        from mpl_toolkits.mplot3d import Axes3D
  • 55.
      • The second import, of the Axes3D class, enables 3D projections. It is needed in Matplotlib versions before 3.2; newer versions register the '3d' projection automatically.
      • It is otherwise not used anywhere else.
      • Create the figure and axes:
        fig = plt.figure(figsize=(4, 4))
        ax = fig.add_subplot(111, projection='3d')
  • 56. Example:
        import matplotlib.pyplot as plt
        from mpl_toolkits.mplot3d import Axes3D
        import numpy as np
        fig = plt.figure(figsize=(8, 8))
        ax = plt.axes(projection='3d')
        ax.grid()
        t = np.arange(0, 10*np.pi, np.pi/50)
        x = np.sin(t)
        y = np.cos(t)
        ax.plot3D(x, y, t)
        ax.set_title('3D Parametric Plot')
        # Set axes labels
        ax.set_xlabel('x', labelpad=20)
        ax.set_ylabel('y', labelpad=20)
        ax.set_zlabel('t', labelpad=20)
        plt.show()
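Besides parametric curves, a 3D axes can draw surfaces. The following is a minimal sketch using plot_surface(); the function sin(√(x² + y²)) and the grid resolution are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Build a grid and evaluate an example function z = f(x, y) on it.
X, Y = np.meshgrid(np.linspace(-3, 3, 50), np.linspace(-3, 3, 50))
Z = np.sin(np.sqrt(X**2 + Y**2))

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111, projection="3d")
surf = ax.plot_surface(X, Y, Z, cmap="viridis")
fig.colorbar(surf, ax=ax, shrink=0.6)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
ax.set_title("3D Surface Plot")
plt.show()
```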
  • 57. Geographic Data with Basemap
  • 58. Geographic Data with Basemap
      • Basemap is a toolkit for the Python visualization library Matplotlib. Its main function is to draw 2D maps, which are important for visualizing spatial data. Basemap itself does not do any plotting, but provides the ability to transform coordinates into one of 25 different map projections.
      • Matplotlib is then used to plot contours, images, vectors, lines or points in the transformed coordinates. Basemap includes the GSHHS coastline dataset, as well as datasets from GMT for rivers, states and national boundaries.
      • These datasets can be used to plot coastlines, rivers and political boundaries on a map at several different resolutions. Under the hood, Basemap uses the Geometry Engine - Open Source (GEOS) library to clip coastline and boundary features to the desired map projection area. In addition, Basemap provides the ability to read shapefiles.
      • Historically, Basemap could not be installed using pip install basemap. If Anaconda is installed, you can install Basemap using conda install basemap.
  • 59. Basemap instance methods for plotting data on a map:
          a) contour(): Draw contour lines.
          b) contourf(): Draw filled contours.
          c) imshow(): Draw an image.
          d) pcolor(): Draw a pseudocolor plot.
          e) pcolormesh(): Draw a pseudocolor plot (faster version for regular meshes).
          f) plot(): Draw lines and/or markers.
          g) scatter(): Draw points with markers.
          h) quiver(): Draw vectors (a vector field map).
          i) barbs(): Draw wind barbs.
          j) drawgreatcircle(): Draw a great circle route.
  • 60. Basemap basic usage
        import warnings
        warnings.filterwarnings('ignore')
        from mpl_toolkits.basemap import Basemap
        import matplotlib.pyplot as plt
        m = Basemap()  # named m to avoid shadowing Python's built-in map()
        m.drawcoastlines()
        # plt.show()
        plt.savefig('test.png')
  • 61. Visualization with Seaborn
  • 62. Visualization with Seaborn
      • Seaborn is an open-source Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
      • Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots.
      • Its dataset-oriented, declarative API lets you focus on what the different elements of your plots mean, rather than on the details of how to draw them.
  • 63. Key features:
          a) Seaborn is a statistical plotting library.
          b) It has beautiful default styles.
          c) It is designed to work very well with Pandas dataframe objects.
      Seaborn works easily with dataframes and the Pandas library. The graphs created can also be customized easily.
      Functionality that Seaborn offers:
          a) A dataset-oriented API for examining relationships between multiple variables
          b) Convenient views onto the overall structure of complex datasets
          c) Specialized support for using categorical variables to show observations or aggregate statistics
          d) Options for visualizing univariate or bivariate distributions and for comparing them between subsets of data
          e) Automatic estimation and plotting of linear regression models for different kinds of dependent variables
          f) High-level abstractions for structuring multi-plot grids that let you easily build complex visualizations
          g) Concise control over Matplotlib figure styling with several built-in themes
          h) Tools for choosing color palettes that faithfully reveal patterns in your data.
  • 64. Plot a Scatter Plot in Seaborn:
        import matplotlib.pyplot as plt
        import seaborn as sns
        import pandas as pd
        df = pd.read_csv('worldHappiness2016.csv')
        sns.scatterplot(data=df, x="Economy (GDP per Capita)", y="Happiness Score")
        plt.show()
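The example above depends on a CSV file that is not included with the slides. As a self-contained sketch of the same dataset-oriented API, the code below builds a small DataFrame inline; the column names and all values are made up for illustration and are not taken from the World Happiness data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up illustrative values; not real World Happiness figures.
df = pd.DataFrame({
    "gdp": [0.5, 0.8, 1.0, 1.2, 1.4, 1.5],
    "score": [4.1, 4.9, 5.6, 6.0, 6.9, 7.2],
    "region": ["A", "A", "B", "B", "C", "C"],
})

sns.set_theme()  # apply Seaborn's default styling
# Columns are referred to by name; hue maps a categorical column to colour.
ax = sns.scatterplot(data=df, x="gdp", y="score", hue="region")
ax.set_title("Score versus GDP (illustrative data)")
plt.show()
```

The axis labels and legend come directly from the column names, which is the main convenience of passing a whole DataFrame rather than separate arrays.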