Multimedia – An Introduction
Lesson1
Assistant Prof. Dr. Ayad A. Salam
College of Education for women
University of Baghdad
2
To discuss…
 Course Issues
 Multimedia – Definitions
 Multimedia System
 Data Stream & continuous media
 Streaming Media
 Multimedia - Applications
3
Course Issues
 Text Books
1) Ze-Nian Li & Mark S. Drew, "Fundamentals of Multimedia",
Pearson Education, 2004
2) Susanne Weixel & Jennifer Fulton,
“Multimedia BASICS”, 2nd Edition, 2010
In addition to the text books: reference books, additional readings, and the class notes.
4
Course Issues
 Coverage
Topics (number of lectures):
– Introduction to Multimedia & Media Basics: 2
– Digital Images representation and processing, basic relationships between pixels: 3
– Colors, color science, Human visual system (HVS), color models in image: 3
– Spatial Filtering: 3
– Video, Types of video signals, Analog video, Digital video: 2
– Audio, Sound, Types of Audio: 3
– Data compression, some basic methods: 4
– TOTAL: 20
5
Course Issues
 Evaluation
– Test-1: 13%
– Test-2: 13%
– Test-3: Optional
– Quizzes: 9%
– Lab / Assignments / Project: 15%
– In-class presentation: 5% extra (Optional)
– Final Exam: 50%
CLASSROOM ETIQUETTE
 At all times be considerate to your classmates and to your
instructor.
 Come to class on time, ready to ask questions about previous
lessons/assignments.
 Ask pertinent questions; contribute to discussions; avoid
"private" conversations that distract the instructor and other
students.
 Any student who disrupts the class will lose the lecture and/or be
asked to leave the room.
 Remember that the instructor is the one to end the class– do not
prepare to leave early.
6
KEYS TO SUCCESS
 Have a positive attitude about learning and the class.
 Attend all class sessions and be punctual.
 Complete reading assignments and handouts before beginning
lab.
 Do your own work. Work with your assigned partner. Ask for
help when needed.
 Don’t expect to understand every topic the first time it is
presented; review often; spend as much time as necessary to
master the material.
 Be flexible.
 Enjoy the class!
7
8
Multimedia- Definitions
– Multi - many; much; multiple
– Medium - a substance regarded as the means of
transmission of a force or effect; a channel or system of
communication, information, or entertainment
(Webster Dictionary )
 So, Multimedia???
– The terms Multi & Medium don't seem to fit together well
– The term Medium needs more clarification!!!
9
Multimedia- Definitions
 Medium
– Means for distribution and
presentation of
information
 Classification based on
perception (text, audio,
video) is appropriate for
defining multimedia
Criteria for the classification of a medium:
– Perception
– Representation
– Presentation
– Storage
– Transmission
– Information Exchange
10
Multimedia- Definitions
 Time always takes a separate dimension in the media
representation
 Based on time-dimension in the representation space, media
can be
– Time-independent (Discrete)
• Text, Graphics
– Time dependent (Continuous)
• Audio, Video
• Video, sequence of frames (images) presented to the user periodically.
• Time-dependent aperiodic media are not continuous!!
– Discrete & Continuous here have no connection with the internal
representation !! (they relate to the viewer's impression…)
11
Multimedia- Definitions
 Multimedia is any combination of digitally manipulated
text, art, sound, animation and video.
 A stricter version of the definition of multimedia
does not allow just any combination of media.
 It requires
– Both continuous & discrete media to be utilized
– Significant level of independence between media being
used
 The less strict version of the definition is used in
practice.
12
Multimedia- Definitions
 Multimedia elements are composed into a project
using authoring tools.
 Multimedia Authoring tools are programs that
provide the capability for creating a complete
multimedia presentation by linking together objects
such as a paragraph of text, a song, an illustration, or an
audio clip, with appropriate interactive user control.
13
Multimedia- Definitions
 By defining the objects' relationships to each other,
and by sequencing them in an appropriate order,
authors (those who use authoring tools) can produce
attractive and useful graphical applications.
 To name a few authoring tools
– Photoshop
– Macromedia Flash
– Macromedia Director
– Authorware
 The hardware and the software that govern the
limits of what can happen are called the multimedia platform or
environment
14
Multimedia- Definitions
 Multimedia is interactive when the end-user is
allowed to control what and when the elements are
delivered.
 Interactive Multimedia is Hypermedia, when the
end-user is provided with the structure of linked
elements through which he/she can navigate.
15
Multimedia- Definitions
 Multimedia is linear, when it is not interactive and
the users just sit and watch as if it is a movie.
 Multimedia is nonlinear, when the users are given
the navigational control and can browse the
contents at will.
16
Multimedia System
 Following the dictionary definitions, a multimedia
system is any system that supports more than a single kind of
media
– This implies any system processing text and images would be a
multimedia system!!!
– Note, the definition is quantitative. A qualitative
definition would be more appropriate.
– The kind of media supported should be considered,
rather than the number of media
17
Multimedia System
 A multimedia system is characterized by computer-
controlled, integrated production, manipulation,
storage and communication of independent
information, which is encoded at least through a
continuous (time-dependent) and a discrete (time-
independent) medium.
18
Data streams
 Data Stream is any sequence of individual packets
transmitted in a time-dependent fashion
– Packets can carry information of either continuous or
discrete media
 Transmission modes
– Asynchronous
• Packets can reach receiver as fast as possible.
• Suited for discrete media
• Additional time constraints must be imposed for continuous
media
19
Data streams
– Synchronous
• Defines maximum end-to-end delay
• Packets can be received arbitrarily early
• For retrieving uncompressed video at a data rate of
140 Mbit/s with a maximal end-to-end delay of 1 second, the
receiver needs 17.5 Mbytes of temporary storage
– Isochronous
• Defines maximum & minimum end-to-end delay
• Storage requirements at the receiver are reduced
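The 17.5 Mbyte figure on the previous slide follows directly: with a maximum end-to-end delay of D seconds, packets may arrive up to D seconds early, so the receiver must buffer up to rate x D bits. A quick check:

```python
def buffer_bytes(rate_bits_per_sec: float, max_delay_sec: float) -> float:
    """Worst-case temporary storage (in bytes) at the receiver:
    packets may arrive up to max_delay_sec early, so up to
    rate * delay bits must be held back before playback."""
    return rate_bits_per_sec * max_delay_sec / 8

# Uncompressed video at 140 Mbit/s, 1 second maximal end-to-end delay
print(buffer_bytes(140e6, 1.0))  # 17500000.0 bytes = 17.5 Mbytes
```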
20
Streaming Media
 Popular approach to continuous media over the
internet
 Playback at the user's computer is done while the media
is being transferred (no waiting for a complete
download!!!)
 You can find streaming in
– Internet radio stations
– Distance learning
– Movie Trailers
21
Streaming Media
22
Multimedia- Applications
Multimedia plays a major role in the following areas
– Instruction
– Business
– Advertisements
– Training materials
– Presentations
– Customer support services
– Entertainment
– Interactive Games
23
Multimedia- Applications
– Enabling Technology
– Accessibility to web based materials
– Teaching learning-disabled children & adults
– Fine Arts & Humanities
– Museum tours
– Art exhibitions
– Presentations of literature
24
Multimedia- Applications
In Medicine
Source:
Cardiac Imaging,
YALE centre for
advanced cardiac
imaging
25
Multimedia- Applications
In training
26
Multimedia- Applications
Public awareness
campaign
Source
Interactive Multimedia Project
Department of Food Science & Nutrition, Colorado State Univ
Digital Image Representation & Processing
Lesson-2
28
To discuss…
 Image fundamentals
 Image Formation
 1-bit & 8 bit image
 Color image
 Color Lookup Table
29
Digital Image
 Image can be defined as a 2-D function f(x,y), where x and y
are spatial coordinates and the amplitude of f at any pair of
coordinates (x,y) is called the intensity/gray level of the image at
that point
– When the image is gray scale, intensity values represent the range of
shades from black to white.
– For a color image the intensity values are represented as a
combination of R, G, B
 An image can be considered as comprising X x Y elements
(picture elements, image elements, pels, pixels), each of which has a
location and a value.
30
Image Formation
 Pixel values are proportional to the energy/ electromagnetic
waves radiated from the source
– It implies this value cannot be negative; it ranges from 0 to positive infinity
 Function f(x,y) characterized by components
– Illumination i(x,y), value ranging from 0 to infinity
– Reflectance r(x,y), value ranging from 0 to 1
– f(x,y)= i(x,y) x r(x,y)
 f(x,y) lies between Lmin and Lmax, scaled to [0, L-1], where 0
represents black and L-1 represents white; the
intermediate values are the shades of gray from black to
white
31
Image Formation [2]
32
Image Formation [3]
33
 256 gray levels (8 bits/pixel) 32 gray levels (5 bits/pixel) 16 gray levels (4 bits/pixel)
 8 gray levels (3 bits/pixel) 4 gray levels (2 bits/pixel) 2 gray levels (1 bit/pixel)
Image Formation [4]
34
Image Formation[5]
 The output of image sensors is a continuous voltage
waveform; digitization is necessary for further
processing
 Digitizing the coordinate positions is called sampling
 Digitizing the amplitude values is called quantization
– The number of gray levels is an integer power of 2
– L = 2^k, gray levels [0 … L-1]
– The number of bits needed to store an M x N image with k bits/pixel is b = M x N x k
 An image is a k-bit image if it has 2^k gray levels
– An 8-bit image has 256 gray levels
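The storage relation b = M x N x k can be checked directly, for example:

```python
def image_bits(M: int, N: int, k: int) -> int:
    """Bits needed to store an M x N image with k bits/pixel
    (i.e. 2**k gray levels or colors)."""
    return M * N * k

print(2 ** 8)                         # 256 gray levels for k = 8
print(image_bits(512, 512, 8) // 8)   # 262144 bytes for a 512 x 512 8-bit image
print(image_bits(640, 480, 24) // 8)  # 921600 bytes for a 640 x 480 24-bit image
```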
35
1-bit image
 Simplest type of image
 Each pixel consists of only ON / OFF information
 Called 1-bit monochrome (since no color) image
 Suitable for simple graphics & text
– JBIG (Joint Bi-level Image experts Group ), A
compression standard for binary image
36
8-bit image
 Gray levels between 0 to 255
(black to white)
 Image resolution refers to the number
of pixels in an image. The
higher the resolution, the more
pixels in the image. Higher
resolution allows for more detail
and subtle color transitions in an
image
 Shown is a 512 x 512 image (one byte per pixel)
37
Color image
 24- bit color image
– Each pixel is represented by 3 bytes, RGB
– Each R, G, B are in the range 0-255
– 256 x 256 x 256 possible colors
– If space is a concern, a reasonably accurate color image can
be obtained by quantizing the color information
 8-bit color image
– Carefully chosen 256 colors represent the image
– The colors can be chosen using information from the color
histogram
38
Color image [2]
 For a 640 x 480 image represented with
– 24 bits/pixel: requires 921.6 kbytes
– 8 bits/pixel: requires 307.2 kbytes
 The 8-bit color image stores only the index of the
color; the file header contains the mapping
information.
 The table storing the color information for all
256 indices is called the color lookup table (LUT)
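A minimal pure-Python sketch of indexed color with a LUT (the LUT entries below are arbitrary illustrative values, not from the slides):

```python
# 256-entry LUT of RGB triples; each image pixel stores one byte,
# an index into this table, cutting storage to roughly one third.
lut = [(0, 0, 0)] * 256
lut[17] = (200, 30, 30)        # arbitrary example entry
lut[255] = (255, 255, 255)

indexed = [[0, 17], [17, 255]]                       # 2x2 image of byte indices
rgb = [[lut[i] for i in row] for row in indexed]     # expand back to 24-bit color
print(rgb[0][1])               # (200, 30, 30)

# Storage for 640 x 480: 24-bit direct color vs 8-bit indexed (LUT included)
print(640 * 480 * 3)           # 921600 bytes
print(640 * 480 + 256 * 3)     # 307968 bytes
```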
Basic Relationships Between
Pixels
Lesson-3
40
To discuss…
 Neighborhood
 Adjacency
 Connectivity
 Paths
 Regions and boundaries
41
Neighbors of a Pixel
 Any pixel p(x, y) has two vertical and two
horizontal neighbors, given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
 This set of pixels is called the 4-neighbors
of P, and is denoted by N4(P).
 Each of them is at a unit distance from P.
42
Neighbors of a Pixel [2]
 The four diagonal neighbors of p(x,y) are
given by,
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1 ,y-1)
This set is denoted by ND(P).
 Each of them is at a Euclidean distance of
sqrt(2) ≈ 1.414 from P.
Neighbors of a Pixel [3]
 The points ND(P) and N4(P) are together
known as 8-neighbors of the point P,
denoted by N8(P).
 Some of the points in the N4, ND and N8
may fall outside image when P lies on
the border of image.
43
44
Neighbors of a Pixel [4]
 Neighbors of a pixel
a. 4-neighbors of a pixel
p are its vertical and
horizontal neighbors
denoted by N4(p)
b. 8-neighbors of a pixel
p are its vertical
horizontal and 4
diagonal neighbors
denoted by N8(p)
45
Neighbors of a Pixel [5]
 •N4 - 4-neighbors
 •ND - diagonal neighbors
 •N8 - 8-neighbors (N4 U ND)
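These three neighbor sets can be written down directly, for example:

```python
def n4(p):
    """4-neighbors of pixel p = (x, y): vertical and horizontal."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """Diagonal neighbors of p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbors: N4 union ND."""
    return n4(p) | nd(p)

print(len(n8((5, 5))))        # 8
print((6, 5) in n4((5, 5)))   # True
```

Note that for a border pixel some of these coordinates fall outside the image, as the slide points out; real code must clip them to the image bounds.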
46
Adjacency
 Two pixels are connected if they are
neighbors and their gray levels
satisfy some specified criterion of
similarity.
 For example, in a binary image two
pixels are connected if they are 4-
neighbors and have same value (0/1).
47
Adjacency
 Let V be set of gray levels values used to define
adjacency.
 4-adjacency: Two pixels p and q with values from V
are 4- adjacent if q is in the set N4(p).
 8-adjacency: Two pixels p and q with values from V
are 8- adjacent if q is in the set N8(p).
 m-adjacency: Two pixels p and q with values from V
are m-adjacent if,
– q is in N4(p), or
– q is in ND(p) and the set [ N4(p) ∩ N4(q) ] is empty
(has no pixels whose values are from V).
48
Connectivity :
 To determine whether the pixels are
adjacent in some sense.
 Let V be the set of gray-level values
used to define connectivity; then
Two pixels p, q that have values
from the set V are:
a. 4-connected, if q is in the set N4(p)
b. 8-connected, if q is in the set N8(p)
c. m-connected, iff
i. q is in N4(p) or
ii. q is in ND(p) and the set
[ N4(p) ∩ N4(q) ] is empty
49
Adjacency/Connectivity
50
Adjacency/Connectivity
 Pixel p is adjacent to pixel q if they are
connected.
 Two image subsets S1 and S2 are adjacent if
some pixel in S1 is adjacent to some pixel in
S2
Paths & Path lengths
 A path from pixel p with coordinates (x, y) to pixel
q with coordinates (s, t) is a sequence of distinct
pixels with coordinates:
(x0, y0), (x1, y1), (x2, y2) … (xn, yn),
where (x0, y0)=(x, y) and (xn, yn)=(s, t);
(xi, yi) is adjacent to (xi-1, yi-1)
Here n is the length of the path.
 We can define 4-, 8-, and m-paths based on type
of adjacency used.
51
Connected Components
 If p and q are pixels of an image subset S then p
is connected to q in S if there is a path from p to q
consisting entirely of pixels in S.
 For every pixel p in S, the set of pixels in S that
are connected to p is called a connected
component of S.
 If S has only one connected component then S is
called Connected Set.
52
Regions and Boundaries
 A subset R of pixels in an image is called
a Region of the image if R is a connected
set.
 The boundary of the region R is the set of
pixels in the region that have one or more
neighbors that are not in R.
53
Distance Measures
 Given pixels p, q and z with
coordinates (x, y), (s, t), (u, v)
respectively, the distance function D
has following properties:
a. D(p, q) ≥0 , [D(p, q) = 0, iff p = q]
b. D(p, q) = D(q, p)
c. D(p, z) ≤ D(p, q) + D(q, z)
54
 The following are the
different Distance
measures:
a. Euclidean Distance :
De(p, q) = SQRT[ (x-s)² + (y-t)² ]
b. City Block Distance:
D4(p, q) = |x-s| + |y-t|
c. Chess Board Distance:
D8(p, q) = max(|x-s|, |y-t|)
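The three measures can be computed directly, for example:

```python
import math

def d_e(p, q):
    """Euclidean distance."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def d4(p, q):
    """City-block distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    """Chessboard distance."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(d_e(p, q))   # 5.0
print(d4(p, q))    # 7
print(d8(p, q))    # 4
```

Note d8 ≤ d_e ≤ d4 always holds, which the example illustrates.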
55
Relationship between pixels
 Arithmetic/Logic Operations:
- Addition : p + q
– Subtraction: p – q
– Multiplication: p*q
– Division: p/q
– AND: p AND q
– OR : p OR q
– Complement: NOT(q)
56
Neighborhood based arithmetic/Logic
The value assigned to the pixel at position 'e' (the window
center) is a function of its neighbors and a set of window functions.
57
Arithmetic/Logic Operations
 Tasks done using neighborhood
processing:
– Smoothing / averaging
– Noise removal / filtering
– Edge detection
– Contrast enhancement
58
Color
Lesson-4
60
To discuss…
 Color Science
 Human Visual Perception
 Color Models in image
61
Color Science – Light & Spectra
 Light is an electromagnetic wave
– Its color is characterized by its wavelength
 Most light sources produce contributions over many
wavelengths; only contributions that fall in the visible
wavelength range can be seen
 The curve of spectral power versus λ is called the spectral power
distribution E(λ)
 Visible light spans from about 400 to 700 nanometers (1 nm = 10^-9 meter)
62
Color Science – Light & Spectra
 Red light has the longest wavelength in the visible range
& blue the shortest
 The shorter the wavelength, the higher the frequency &
energy
 Red photons carry around 1.8 eV & blue around 3.1 eV (1
electron volt = 1.60217646 x 10^-19 joules)
 The RGB values in image files are converted to analog
signals that drive the electron guns of a CRT (Cathode Ray
Tube)
63
Color Science – Vision & Sensitivity [2]
 Eye is most sensitive to
the middle of the
visible spectrum
 Let us denote the
spectral sensitivity of R,
G, B cones as a vector
q(λ)
 q(λ)=[qR(λ), qG(λ), qB(λ)]T
64
Color Science – Vision & Sensitivity [3]
 The sensitivity of each cone can be specified as
– R = ∫ E(λ) qR(λ) dλ ---------- (1)
– G = ∫ E(λ) qG(λ) dλ ---------- (2)
– B = ∫ E(λ) qB(λ) dλ ---------- (3)
 Equations 1, 2, 3 quantify the signals transmitted to
the brain
Human Visual Perception
 Human perception encompasses both the
physiological and psychological aspects.
 We will focus more on physiological aspects,
which are more easily quantifiable and hence,
analyzed.
Human Visual Perception
 Why study visual perception?
– Image processing algorithms are designed
based on how our visual system works.
– In image compression, we need to know
what information is not perceptually
important and can be ignored.
– In image enhancement, we need to know
what types of operations that are likely to
improve an image visually.
The Human Visual System
 The human visual system consists of two
primary components – the eye and the brain,
which are connected by the optic nerve.
– Eye – receiving sensor (camera, scanner).
– Brain – information processing unit
(computer system).
– Optic nerve – connection cable (physical
wire).
The Human Visual System
The Human Visual System
 This is how human visual system works:
– Light energy is focused by the lens of the
eye into sensors and retina.
– The sensors respond to the light by an
electrochemical reaction that sends an
electrical signal to the brain (through the
optic nerve).
– The brain uses the signals to create
neurological patterns that we perceive as
images.
The Human Visual System
 The visible light is an electromagnetic wave
with wavelength range of about 380 to 825
nanometers.
– However, response above 700 nanometers
is minimal.
 We cannot “see” many parts of the
electromagnetic spectrum.
The Human Visual System
The Human Visual System
 The visible spectrum can be divided into
three bands:
– Blue (400 to 500 nm).
– Green (500 to 600 nm).
– Red (600 to 700 nm).
 The sensors are distributed across retina.
 3 kinds of cones are more sensitive to R, G & B
present in the ratios 40:20:1
The Human Visual System
The Human Visual System
 There are two types of sensors: rods and
cones.
 Rods:
– For night vision.
– See only brightness (gray level) and not
color.
– Distributed across retina.
– Medium and low level resolution.
The Human Visual System
 Cones:
– For daylight vision.
– Sensitive to color.
– Concentrated in the central region of eye.
– High resolution capability (differentiate
small changes).
The Human Visual System
 Blind spot:
– No sensors.
– Place for optic nerve.
– We do not perceive it as a blind spot
because the brain fills in the missing visual
information.
 Why must an object be in the center of the field
of vision in order to be perceived in fine detail?
– This is where the cones are concentrated.
The Human Visual System
 Cones have higher resolution than rods
because they have individual nerves tied to
each sensor.
 Rods have multiple sensors tied to each
nerve.
 Rods react even in low light but see only a
single spectral band. They cannot distinguish
color.
The Human Visual System
The Human Visual System
 There are three types of cones. Each
responding to different wavelengths of light
energy.
 The colors that we perceive are the combined
result of the response of the three cones.
The Human Visual System
81
Color Science – Other Color Coordinate Systems
 CMY (Cyan-Magenta-Yellow)
 HSL(Hue-Saturation-Lightness)
 HSV(Hue-Saturation-Value)
 HSI(Hue-Saturation-Intensity)
 HCI(Hue-Chroma-Intensity)
 HVC (Hue-Value-Chroma)
 HSD(Hue-Saturation-Darkness)
82
Color Models for Image – RGB Vs CMY
 Additive Vs Subtractive Models
 Additive model
– Used in computer displays, Uses light to display color,
Colors result from transmitted light
– Red+Green+Blue=White
 Subtractive Models
– Used in printed materials, Uses ink to display color,
Colors result from reflected light
– Cyan+Magenta+Yellow=Black
83
Color Models for Image – RGB Vs CMY [2]
84
Color Models for Image – RGB Vs CMY [3]
RGB & CMY Cubes
85
Color Models for Image – RGB Vs CMY [4]
 Conversion From RGB to CMY
 Conversion From CMY to RGB
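The conversion formulas on this slide are shown as figures; with channels normalized to [0, 1], the standard relation is C = 1 - R, M = 1 - G, Y = 1 - B (and its inverse):

```python
def rgb_to_cmy(r, g, b):
    """Standard RGB -> CMY, channels normalized to [0, 1]."""
    return (1 - r, 1 - g, 1 - b)

def cmy_to_rgb(c, m, y):
    """Inverse conversion, CMY -> RGB."""
    return (1 - c, 1 - m, 1 - y)

print(rgb_to_cmy(1.0, 0.0, 0.0))   # (0.0, 1.0, 1.0): pure red needs no cyan ink
print(cmy_to_rgb(*rgb_to_cmy(0.25, 0.5, 0.75)))   # round-trips to (0.25, 0.5, 0.75)
```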
86
Color Models for Image – CMYK
 Eliminating amounts of yellow, magenta, and cyan
that would have added to a dark neutral (black) and
replacing them with black ink
 Four-color printing uses black ink(K) in addition to
the subtractive primaries yellow, magenta, and cyan.
 Reasons for Black addition includes
– CMY Mixture rarely produces pure black
– Text is typically printed in black and includes fine detail
– Cost saving : Unit amount of black ink rather than three
unit amounts of CMY
87
Color Models for Image – CMYK[2]
 Used especially in the printing of images
Spatial Filtering
Lesson 5
Background
 In digital image processing, the term filter
refers to a subimage
 There are other terms for this subimage,
such as mask, kernel, template, or
window
 The values in a filter subimage are
referred to as coefficients, rather than
pixels.
89
Basics of Spatial Filtering
 The concept of filtering has its roots in
the use of the Fourier Transform for
signal processing in the so-called
frequency domain.
 The term spatial filtering refers to filtering
operations that are performed directly
on the pixels of an image
90
Mechanics of spatial filtering
 The process consists simply of moving
the filter mask from point to point in an
image.
 At each point (x,y) the response of the
filter at that point is calculated using a
predefined relationship
91
Linear spatial filtering

Pixels of image:
f(x-1,y-1) f(x-1,y) f(x-1,y+1)
f(x,y-1)   f(x,y)   f(x,y+1)
f(x+1,y-1) f(x+1,y) f(x+1,y+1)

Mask coefficients:
w(-1,-1) w(-1,0) w(-1,1)
w(0,-1)  w(0,0)  w(0,1)
w(1,-1)  w(1,0)  w(1,1)

The result is the sum of products of the mask coefficients with the
corresponding pixels directly under the mask:

g(x,y) = w(-1,-1) f(x-1,y-1) + w(-1,0) f(x-1,y) + w(-1,1) f(x-1,y+1)
       + w(0,-1) f(x,y-1) + w(0,0) f(x,y) + w(0,1) f(x,y+1)
       + w(1,-1) f(x+1,y-1) + w(1,0) f(x+1,y) + w(1,1) f(x+1,y+1)
92
Note: Linear filtering
 The coefficient w(0,0) coincides with image
value f(x,y), indicating that the mask is
centered at (x,y) when the computation of
sum of products takes place.
 For a mask of size m x n, we assume that m = 2a+1 and n = 2b+1,
where a and b are nonnegative integers. Then m and n are odd.
93
Linear filtering
 In general, linear filtering of an image f
of size MxN with a filter mask of size
mxn is given by the expression:
g(x,y) = Σ (s = -a to a) Σ (t = -b to b) w(s,t) f(x+s, y+t)
94
Discussion
 The process of linear filtering is similar to a
frequency-domain concept called
"convolution"

In simplified notation, for a 3x3 mask

w1 w2 w3
w4 w5 w6
w7 w8 w9

the response at a point is

R = w1 z1 + w2 z2 + … + w9 z9 = Σ (i = 1 to 9) wi zi

and, in general, for a mask of mn coefficients,

R = w1 z1 + w2 z2 + … + wmn zmn = Σ (i = 1 to mn) wi zi

where the w's are the mask coefficients and the z's are the values
of the image gray levels corresponding to those coefficients
95
Nonlinear spatial filtering
 Nonlinear spatial filters also operate on
neighborhoods, and the mechanics of
sliding a mask past an image are the
same as was just outlined.
 The filtering operation is based
conditionally on the values of the pixels
in the neighborhood under
consideration
96
Smoothing Spatial Filters
 Smoothing filters are used for blurring
and for noise reduction.
– Blurring is used in preprocessing steps,
such as removal of small details from an
image prior to object extraction, and
bridging of small gaps in lines or curves
– Noise reduction can be accomplished by
blurring
97
Type of smoothing filtering
 There are 2 ways of smoothing spatial
filters
– Smoothing Linear Filters
– Order-Statistics Filters
98
Smoothing Linear Filters
 Linear spatial filter is simply the average
of the pixels contained in the
neighborhood of the filter mask.
 Sometimes called “averaging filters”.
 The idea is replacing the value of every
pixel in an image by the average of the
gray levels in the neighborhood defined
by the filter mask.
99
Two 3x3 Smoothing Linear Filters
1 1 1
1 1 1
1 1 1
1 2 1
2 4 2
1 2 1

9
1 
16
1
Standard average Weighted average
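The standard 1/9 average can be sketched directly (pure Python; border pixels are skipped for brevity, and the weighted 1/16 mask would be handled analogously):

```python
def average_3x3(img):
    """3x3 box (averaging) filter on a 2-D list of gray levels.
    Border pixels are left unchanged for simplicity."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            s = sum(img[x + dx][y + dy]
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1))
            out[x][y] = s // 9          # sum of 9 neighbors, 1/9 weight each
    return out

img = [[0, 0, 0, 0],
       [0, 9, 9, 0],
       [0, 9, 9, 0],
       [0, 0, 0, 0]]
print(average_3x3(img)[1][1])   # (4 * 9) // 9 = 4: the bright block is blurred
```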
100
5x5 Smoothing Linear Filters

Standard average (x 1/25):
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
101
Smoothing Linear Filters
 The general implementation for filtering
an MxN image with a weighted
averaging filter of size mxn is given by
the expression
g(x,y) = [ Σ (s = -a to a) Σ (t = -b to b) w(s,t) f(x+s, y+t) ] / [ Σ (s = -a to a) Σ (t = -b to b) w(s,t) ]
102
Result of Smoothing Linear Filters
Original image and results with [3x3], [5x5], and [7x7] masks
103
Order-Statistics Filters
 Order-statistics filters are nonlinear spatial
filters whose response is based on ordering
(ranking) the pixels contained in the image
area encompassed by the filter, and then
replacing the value of the center pixel with the
value determined by the ranking result.
 Best-known “median filter”
104
Process of Median filter
 Corp region of
neighborhood
 Sort the values of
the pixel in our
region
 In the MxN mask
the median is MxN
div 2 +1
10 15 20
20 100 20
20 20 25
10, 15, 20, 20, 20, 20, 20, 25, 100
5th
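These steps can be sketched in Python, using the example neighborhood:

```python
def median_3x3_at(img, x, y):
    """Median of the 3x3 neighborhood centered at (x, y):
    crop the window, sort it, take the 5th of 9 sorted values."""
    window = [img[x + dx][y + dy]
              for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    window.sort()
    return window[len(window) // 2]   # index 4 = 5th value of 9

img = [[10, 15, 20],
       [20, 100, 20],
       [20, 20, 25]]
print(median_3x3_at(img, 1, 1))   # 20: the outlier 100 is discarded
```

This is why the median filter removes impulse ("salt-and-pepper") noise so well: an extreme value never ends up in the middle of the sorted window.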
105
Result of median filter
Noise from glass effect (left); noise removed by the median filter (right)
106
Sharpening Spatial Filters
 The principal objective of sharpening is
to highlight fine detail in an image or to
enhance detail that has been blurred,
either in error or as a natural effect of a
particular method of image acquisition.
107
Introduction
 Image blurring is accomplished in
the spatial domain by pixel averaging in
a neighborhood.
 Since averaging is analogous to
integration, sharpening can be
accomplished by spatial differentiation.
108
Foundation
 We are interested in the behavior of
these derivatives in areas of constant
gray level(flat segments), at the onset
and end of discontinuities(step and
ramp discontinuities), and along gray-
level ramps.
 These types of discontinuities can be
noise points, lines, and edges.
109
Definition for a first derivative
 Must be zero in flat segments
 Must be nonzero at the onset of a gray-
level step or ramp; and
 Must be nonzero along ramps.
110
Definition for a second derivative
 Must be zero in flat areas;
 Must be nonzero at the onset and end
of a gray-level step or ramp;
 Must be zero along ramps of constant
slope
111
Definition of the 1st-order
derivative
 A basic definition of the first-order derivative
of a one-dimensional function f(x) is
∂f/∂x = f(x+1) - f(x)
112
Definition of the 2nd-order
derivative
 We define a second-order derivative as the
difference
∂²f/∂x² = f(x+1) + f(x-1) - 2 f(x)
113
Gray-level profile
[Figure: gray-level profile along a scan line (values 0 to 7), showing flat segments, ramps, an isolated point, and a step]
114
Derivative of image profile

profile: 0 0 0 1 2 3 2 0 0 2 2 6 3 3 2 2 3 3 0 0 0 0 0 0 7 7 6 5 5 3
first:   0 0 1 1 1 -1 -2 0 2 0 4 -3 0 -1 0 1 0 -3 0 0 0 0 0 7 0 -1 -1 0 -2
second:  0 1 0 0 -2 -1 2 2 -2 4 -7 3 -1 1 1 -1 -3 3 0 0 0 0 7 -7 -1 0 1 -2
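The derivative rows can be recomputed from the profile using the definitions on the previous slides, f'(x) = f(x+1) - f(x) and f''(x) = f(x+1) + f(x-1) - 2 f(x):

```python
profile = [0, 0, 0, 1, 2, 3, 2, 0, 0, 2, 2, 6, 3, 3, 2, 2, 3, 3,
           0, 0, 0, 0, 0, 0, 7, 7, 6, 5, 5, 3]

# First derivative: f(x+1) - f(x)
first = [profile[i + 1] - profile[i] for i in range(len(profile) - 1)]

# Second derivative: f(x+1) + f(x-1) - 2 f(x)
second = [profile[i + 1] + profile[i - 1] - 2 * profile[i]
          for i in range(1, len(profile) - 1)]

print(first[:6])    # [0, 0, 1, 1, 1, -1]
print(second[:6])   # [0, 1, 0, 0, -2, -1]
```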
115
Analyze
 The 1st-order derivative is nonzero
along the entire ramp, while the 2nd-
order derivative is nonzero only at the
onset and end of the ramp.
 The response at and around the point is
much stronger for the 2nd- than for the
1st-order derivative
The 1st-order derivative produces thick edges; the 2nd-order derivative produces thin edges
116
The Laplacian (2nd order derivative)
 It was shown by Rosenfeld and Kak [1982] that the
simplest isotropic derivative operator is the
Laplacian, defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²
117
Discrete form of derivative
∂²f/∂x² = f(x+1,y) + f(x-1,y) - 2 f(x,y)

∂²f/∂y² = f(x,y+1) + f(x,y-1) - 2 f(x,y)
118
2-Dimensional Laplacian
 The digital implementation of the 2-Dimensional
Laplacian is obtained by summing the 2 components:

∇²f = ∂²f/∂x² + ∂²f/∂y²

∇²f = f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)

The corresponding mask:
0  1  0
1 -4  1
0  1  0
119
Laplacian

0  1  0        1  1  1
1 -4  1        1 -8  1
0  1  0        1  1  1
120
Laplacian

 0 -1  0       -1 -1 -1
-1  4 -1       -1  8 -1
 0 -1  0       -1 -1 -1
121
Implementation

g(x,y) = f(x,y) - ∇²f(x,y)   if the center coefficient of the Laplacian mask is negative
g(x,y) = f(x,y) + ∇²f(x,y)   if the center coefficient of the Laplacian mask is positive

where f(x,y) is the original image, ∇²f(x,y) is the Laplacian-filtered image,
and g(x,y) is the sharpened image
122
Implementation
123
Implementation
Filtered = Conv(image,mask)
124
Implementation
filtered = filtered - Min(filtered)
filtered = filtered * (255.0/Max(filtered))
125
Implementation
sharpened = image + filtered
sharpened = sharpened - Min(sharpened )
sharpened = sharpened * (255.0/Max(sharpened ))
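These steps can be sketched in Python; this is a minimal version of the negative-center-mask case (g = f - ∇²f), skipping the rescaling to [0, 255] shown in the pseudocode above:

```python
def laplacian(img):
    """Laplacian with the 0/1/-4 mask; borders left as zero."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            out[x][y] = (img[x + 1][y] + img[x - 1][y] +
                         img[x][y + 1] + img[x][y - 1] - 4 * img[x][y])
    return out

def sharpen(img):
    """Center coefficient of the mask is negative, so subtract."""
    lap = laplacian(img)
    return [[img[x][y] - lap[x][y] for y in range(len(img[0]))]
            for x in range(len(img))]

img = [[10, 10, 10],
       [10, 50, 10],
       [10, 10, 10]]
print(laplacian(img)[1][1])   # 40 - 200 = -160
print(sharpen(img)[1][1])     # 50 - (-160) = 210: the bright detail is boosted
```

Real code would rescale or clip the result back to [0, 255], as the pseudocode above does.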
126
Algorithm
 Apply the Laplacian filter to the original image
 Then add the image resulting from step
1 to the original image
127
Simplification
 We can combine the two steps into one mask:

g(x,y) = f(x,y) - [ f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y) ]

g(x,y) = 5 f(x,y) - [ f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) ]

 0 -1  0       -1 -1 -1
-1  5 -1       -1  9 -1
 0 -1  0       -1 -1 -1
128
Unsharp masking
 A process to sharpen images consists of
subtracting a blurred version of an image from
the image itself. This process, called unsharp
masking, is expressed as
fs(x,y) = f(x,y) - f̄(x,y)

where fs(x,y) denotes the sharpened image
obtained by unsharp masking, and f̄(x,y) is a
blurred version of f(x,y)
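A 1-D sketch of the idea, using a 3-point average as the blur (real unsharp masking blurs the 2-D image, but the principle is the same):

```python
def blur3(f):
    """3-point moving average; endpoints passed through unchanged."""
    return [f[0]] + [(f[i - 1] + f[i] + f[i + 1]) / 3
                     for i in range(1, len(f) - 1)] + [f[-1]]

f = [0, 0, 0, 9, 9, 9, 0, 0, 0]      # a bright pulse on a dark background
fbar = blur3(f)                       # blurred version of the signal
fs = [a - b for a, b in zip(f, fbar)]  # unsharp mask: f - fbar
print(fs)   # nonzero only around the edges of the pulse
```

The mask fs is zero in flat regions and largest at the transitions, which is exactly the edge information that sharpening adds back.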
129
High-boost filtering
 A high-boost filtered image, fhb, is defined at any
point (x,y) as

fhb(x,y) = A f(x,y) - f̄(x,y),   where A ≥ 1

fhb(x,y) = (A - 1) f(x,y) + f(x,y) - f̄(x,y)

fhb(x,y) = (A - 1) f(x,y) + fs(x,y)

This equation is applicable in general and does not
state explicitly how the sharp image is obtained
130
High-boost filtering and
Laplacian
 If we choose to use the Laplacian, then we
know fs(x,y):

fhb(x,y) = A f(x,y) - ∇²f(x,y)   if the center coefficient of the mask is negative
fhb(x,y) = A f(x,y) + ∇²f(x,y)   if the center coefficient of the mask is positive

 0  -1   0        -1  -1  -1
-1  A+4 -1        -1  A+8 -1
 0  -1   0        -1  -1  -1
131
The Gradient (1st order
derivative)
 First Derivatives in image processing are
implemented using the magnitude of the
gradient.
 The gradient of function f(x,y) is
∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T
132
Gradient
 The magnitude of this vector is given by

∇f = mag(∇f) = [ Gx² + Gy² ]^(1/2)

Simple masks:
Gx: [ -1  1 ]        Gy: [ -1 ]
                         [  1 ]

These masks are simple but not isotropic; they respond
only to horizontal and vertical edges.
133
Robert’s Method
 The simplest approximations to a first-order
derivative that satisfy the conditions stated in
that section, for the region

z1 z2 z3
z4 z5 z6
z7 z8 z9

are Gx = (z9 - z5) and Gy = (z8 - z6):

∇f = [ (z9 - z5)² + (z8 - z6)² ]^(1/2)

∇f ≈ |z9 - z5| + |z8 - z6|
134
Robert’s Method
 These masks are referred to as the
Roberts cross-gradient operators:

-1  0        0 -1
 0  1        1  0
135
Sobel’s Method
 Masks of even size are awkward to apply.
 The smallest filter mask should be 3x3.
 The difference between the third and first
rows of the 3x3 image region
approximates the derivative in the x-direction, and
the difference between the third and first
columns approximates the derivative in the y-direction.
136
Sobel’s Method
 Using this equation
∇f ≈ |(z7 + 2z8 + z9) − (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) − (z1 + 2z4 + z7)|
-1  -2  -1         -1   0   1
 0   0   0         -2   0   2
 1   2   1         -1   0   1
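The two Sobel masks can be applied directly. The sketch below (a hypothetical `sobel_magnitude` helper, assuming a 2-D grayscale array) computes the |Gx| + |Gy| approximation over the interior pixels:

```python
import numpy as np

# Sobel masks as on the slide: third row minus first row (x-direction),
# third column minus first column (y-direction).
SOBEL_X = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
SOBEL_Y = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

def sobel_magnitude(image):
    # Approximate gradient magnitude |Gx| + |Gy| at each interior pixel.
    img = image.astype(float)
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            out[i, j] = abs(np.sum(win * SOBEL_X)) + abs(np.sum(win * SOBEL_Y))
    return out
```

On a constant image the response is zero everywhere; a horizontal step edge produces a strong response along the edge only.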
137
Video
Lesson-6
Assistant Prof. Dr. Ayad A. Salam
College of Education for women
University of Baghdad
139
To discuss…
 Types of video signals
 Analog Video
 Digital Video
140
Types of Video Signals
Video Signals can be classified as
1. Composite Video
2. S-Video
3. Component Video
141
Types - Composite Video
 Used in broadcast TVs
 Compatible with B/W TV
 Chrominance ( I & Q or U & V)
& Luminance signals are
mixed into a single carrier
wave, which can be separated
at the receiver end
 Mixing the signals leads to
interference & creates dot crawl
Male F-Connector, Connecting
co-axial cable with the device
Dot Crawl, due to interference
in composite video
142
Types - S-Video
 S stands for Super / Separated
 Uses 2 wires, one for luminance & the other for chrominance
signals
 Humans are able to differentiate spatial resolution in gray-
scale images with a much higher acuity than for the color
part of color images.
 As a result, color information can be sent with less
accuracy than intensity information
143
Types - Component Video
 Each primary is sent as a
separate video signal.
– The primaries can either be
RGB or a luminance-
chrominance transformation of
them (e.g., YIQ, YUV).
– Best color reproduction
– Requires more bandwidth and
good synchronization of the
three components
144
Analog Video
 Represented as a continuous (time varying) signal
145
Analog Video [2]
Interlaced Scan
With interlaced scan, the
odd and even lines are
displaced in time from each
other.
146
Analog Video [3]
147
NTSC (National Television System Committee)
 It uses the familiar 4:3 aspect ratio (i.e., the ratio of picture
width to its height)
 Uses 525 scan lines per frame at 30 frames per second (fps).
 NTSC follows the interlaced scanning system, and each frame
is divided into two fields, with 262.5 lines/field.
 Thus the horizontal sweep frequency is 525 × 29.97 ≈ 15,734
lines/sec, so that each line is swept out in 63.6 µsec
(1/(15.734 × 10³) sec)
 63.6 µsec = 10.9 µsec for horizontal retrace + 52.7 µsec for the
active line signal, during which image data is
displayed
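The line-rate arithmetic above can be checked directly:

```python
# NTSC line timing from the slide's figures
lines_per_frame = 525
frames_per_sec = 29.97                        # NTSC frame rate

line_rate = lines_per_frame * frames_per_sec  # horizontal sweep frequency
line_time_us = 1e6 / line_rate                # time per line, in microseconds

print(round(line_rate))                       # ~15734 lines/sec
print(round(line_time_us, 1))                 # ~63.6 usec per line
print(round(line_time_us - 10.9, 1))          # ~52.7 usec of active line signal
```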
148
NTSC (National Television System Committee) [2]
• 20 lines at the beginning of
every field are for vertical retrace
control information, leaving 485
lines per frame
• 1/6 of the raster at the left side is
blanked for horizontal retrace and
sync. The non-blanking pixels are
called active pixels.
•Pixels often fall in-between the
scan lines. NTSC TV is only
capable of showing about 340
(visually distinct) lines
149
NTSC (National Television System Committee) [3]
 NTSC video is an analog signal with no fixed
horizontal resolution
 Pixel clock is used to divide each horizontal line of
video into samples. Different video formats
provide different numbers of samples per line
 Uses YIQ Color Model
 Quadrature Modulation is used to combine I & Q
to produce a single chroma signal
150
NTSC (National Television System Committee) [4]
 Fsc, the color subcarrier frequency, is 3.58 MHz
 The composite signal is formed (via quadrature
modulation) as Y + I·cos(Fsc·t) + Q·sin(Fsc·t)
 The available bandwidth is 6 MHz, in which the
audio signal is centered at 5.75 MHz and the lower
spectrum carries picture information
151
PAL (Phase Alternating Line)
 Widely used in Western Europe, China, India, and
many other parts of the world.
 Uses 625 scan lines per frame, at 25 frames/second,
with a 4:3 aspect ratio and interlaced fields
 Uses the YUV color model
 Uses an 8 MHz channel and allocates a bandwidth
of 5.5 MHz to Y, and 1.8 MHz each to U and V.
152
Digital Video
 Advantages over analog:
– Direct random access --> good for nonlinear video
editing
– No problem for repeated recording
– No need for blanking and sync pulse
 Almost all digital video uses component video
153
Chroma Subsampling
 The human eye responds more
precisely to brightness information
than it does to color; chroma
subsampling (decimation) takes
advantage of this.
– In a 4:4:4 scheme, each 8×8 matrix of
RGB pixels converts to three YCrCb
8×8 matrices: one for luminance (Y)
and one for each of the two
chrominance bands (Cr and Cb) 8x8 : 8x8 : 8x8
4 : 4 : 4
154
Chroma Subsampling [2]
– A 4:2:2 scheme also creates
one 8×8 luminance matrix
but decimates every two
horizontal pixels to create
each chrominance-matrix
entry. Thus reducing the
amount of data to 2/3rds of a
4:4:4 scheme. 4 : 2 : 2
155
Chroma Subsampling [3]
– Ratios of 4:2:0 decimate
chrominance both
horizontally and vertically,
resulting in four Y, one
Cr, and one Cb 8×8
matrix for every four 8×8
pixel-matrix sources. This
conversion creates half the
data required in a 4:4:4
chroma ratio
4 : 1 : 1
4 : 2 : 0
156
Chroma Subsampling [4]
Luma sample
Chroma sample
4:2:0
157
Chroma Subsampling [5]
 The 4:1:1 and 4:2:0 are used in JPEG and MPEG
 256-level gray-scale JPEG images aren't usually
much smaller than their 24-bit color counterparts,
because most JPEG implementations aggressively
subsample the color information. Color data
therefore represents a small percentage of the total
file size
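A minimal sketch of the decimation step itself (a hypothetical `subsample_420` helper, assuming a chroma plane with even dimensions; real codecs differ in filtering and sample-site conventions):

```python
import numpy as np

def subsample_420(chroma):
    # Average each 2x2 block, keeping one chroma sample per block:
    # a quarter of the original chroma data, as in 4:2:0.
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

An 8×8 Cb plane becomes 4×4, so the two chroma planes together shrink from 128 samples to 32; with the unchanged Y plane included, the frame carries half the data of 4:4:4, matching the ratio quoted on the slide.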
158
High Definition TV (HDTV)
 The main thrust of HDTV (High Definition TV) is
not to increase the definition in each unit area, but
rather to increase the visual field especially in its
width.
– The first generation of HDTV was based on an analog
technology developed by Sony and NHK in Japan in the
late 1970s.
– Uncompressed HDTV will demand more than 20 MHz
bandwidth, which will not fit in the current 6 MHz or 8
MHz channels
– Even after compression, more than one channel is needed.
159
High Definition TV (HDTV) [3]
 The salient difference between conventional TV and
HDTV:
– HDTV has a much wider aspect ratio of 16:9 instead of
4:3.
– HDTV moves toward progressive (non-interlaced) scan.
The rationale is that interlacing introduces serrated edges
to moving objects and flickers along horizontal edges.
AUDIO
Lesson-7
Assistant Prof. Dr. Ayad A. Salam
College of Education for women
University of Baghdad
161
To discuss…
 What is sound?
– Waveforms and attributes of
sound
 Capturing digital audio
– Sampling
 MIDI (Musical Instrument Digital Interface)
What is Sound?
 Sound comprises the spoken word, voices,
music and even noise.
 Sound is a pressure wave, taking
continuous values
 It is a complex relationship involving a
vibrating object (sound source), a
transmission medium (usually air), a
receiver (ear) and a perceptor (brain).
Example: banging a drum.
162
 Increase / decrease in pressure can be
measured in amplitude, which can be
digitized
 Measure the amplitude at equally
spaced time intervals (sampling) and
represent it with one of finite digital
values (quantization)
 Sampling frequency refers to the rate at
which the sampling is performed
163
Waveforms
 Sound waves are manifested as waveforms
 A waveform that repeats itself at regular intervals is
called a periodic waveform
 Waveforms that do not exhibit regularity are called
noise
 The unit of regularity is called a cycle
 The number of cycles per second is known as Hertz
(or Hz) after Heinrich Hertz
 One cycle per second = 1 Hz
 Sometimes written as kHz or kilohertz (1 kHz =
1000 Hz)
164
Waveforms
165
[Waveform figure: distance along wave; one cycle;
time for one cycle]
The characteristics of sound waves
 Sound is described in terms of two characteristics:
 Frequency
 Amplitude (or loudness)
 Frequency
 the rate at which the sound wave vibrates (repeats)
 Number of cycles per second or Hertz (Hz)
 Determines the pitch of the sound as
heard by our ears
 The higher the frequency, the clearer and
sharper the sound → the higher the pitch of
the sound
166
The characteristics of sound waves
 Amplitude
 Sound’s intensity or loudness
 The louder the sound, the larger the amplitude.
167
The characteristics of sound waves
168
[Waveform figure: distance along wave; one cycle;
time for one cycle; amplitude; pitch]
Example waveforms
169
Piano
Pan flute
Snare drum
Capture and playback of digital audio
170
[Diagram] Air pressure variations → captured via a
microphone → ADC (Analogue to Digital Converter)
converts the signal into binary, discrete form
(e.g. 0101001101 0110101111) → DAC (Digital to
Analogue Converter) converts it back into a voltage
→ air pressure variations
The Analogue to Digital Converter
(ADC)
 An ADC is a device that converts analogue signals
into digital signals
 An analogue signal is a continuous value
 It can have any single value on an infinite scale
 A digital signal is a discrete value
 It has a finite value (usually an integer)
 An ADC is synchronised to some clock
171
The Analogue to Digital Converter
(ADC)
 It will monitor the continuous analogue signal at a set
rate and convert what it sees into a discrete value at
that specific moment in time
 The process of converting analogue sound to
digital is called Sampling, using PCM (Pulse Code
Modulation)
172
Digital sampling
Sampling frequency
173
Digital sampling
Sampling frequency
174
Sampling
 Two parameters:
Sampling Rate
 Frequency of sampling
 Measure in Hertz
 The higher the sampling rate, the higher the sound
quality, but the larger the storage required.
 Standard sampling rates:
- 44.1 kHz for CD audio
- 22.05 kHz
- 11.025 kHz for spoken voice
- 5.5125 kHz for audio effects
175
Sampling
Sample Size
The resolution of a sample is the number of bits it
uses to store a given amplitude value, e.g.
 8 bits (256 different values)
 16 bits (65536 different values)
 A higher resolution will give higher quality but will require
more memory (or disk storage)
176
Quantisation
177
 Samples are usually represented as integers
(discrete values): each sample point is quantised
to the nearest of a finite set of levels
(e.g. 0 to 15 for 4-bit resolution)
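A sketch of uniform quantisation (a hypothetical `quantise` helper, assuming amplitudes normalised to [-1, 1]; real ADCs differ in rounding and range handling):

```python
def quantise(samples, bits=4):
    # Map each amplitude in [-1.0, 1.0] to an integer level 0 .. 2**bits - 1.
    levels = 2 ** bits
    out = []
    for s in samples:
        idx = int((s + 1.0) / 2.0 * (levels - 1) + 0.5)  # round to nearest level
        out.append(max(0, min(levels - 1, idx)))        # clamp to valid range
    return out

print(quantise([-1.0, 0.0, 1.0]))  # [0, 8, 15]
```

With 4 bits there are only 16 levels; moving to 8 or 16 bits gives the finer amplitude resolution discussed on the next slides.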
Calculating the size of digital audio
178
 The formula is as follows:

size (bytes) = sampling rate × duration × resolution × number of channels / 8

 The answer will be in bytes
 Where:
 sampling rate is in Hz
 duration/time is in seconds
 resolution is in bits (e.g. 8 or 16)
 number of channels = 1 for mono, 2 for stereo,
etc.
Calculating the size of digital audio
179
 Example:
Calculate the file size for 1 minute, 44.1 KHz, 16 bits,
stereo sound
 Where:
 sampling rate is 44,100 Hz
 Duration/time is 60 seconds
 resolution is 16 bits
 number of channels for stereo is 2
size = sampling rate × duration × resolution × number of channels / 8
     = 44100 × 60 × 16 × 2 / 8
     = 10,584,000 bytes
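The same calculation in code (a hypothetical `audio_size_bytes` helper following the formula above):

```python
def audio_size_bytes(rate_hz, seconds, bits, channels):
    # size = sampling rate x duration x resolution x channels / 8, in bytes
    return rate_hz * seconds * bits * channels // 8

size = audio_size_bytes(44100, 60, 16, 2)
print(size)                          # 10584000 bytes
print(round(size / 1024 / 1024, 1))  # about 10.1 MB
```

One minute of CD-quality stereo audio is therefore a little over 10 MB uncompressed, which is why compression matters for distribution.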
Mono Recording
• Mono simply indicates the use of a single channel.
Mono includes the use of a single microphone used
to record a sound, which is then played through a
single channel through a speaker.
• The easiest way to check if a sound is a mono
recording is through a set of headphones: with a
mono recording, the identical signal plays through
both earpieces.
• Mono recording was typically used before the
development of stereo recording.
180
Mono Recording
Advantages
• Mono file sizes are around half the size of their
stereo counterparts
• High resolution files can be recorded with
relatively low file sizes
• High resolution files can be recorded around
the same size file as a low resolution stereo file
• It is much easier to mix mono sounds than it is
with stereo
• Mono sounds are much easier to manipulate in
editing programmes
• Everyone hears the exact same signal
• Mono systems are suitable for reinforcing a
sound
Disadvantages
 There is no sound perspective
 It is impossible to tell whether or not a sound
has been recorded from a distance
 Films that use mono sound do not provide as
much impact as films recorded
using stereo sound
Stereo Recording
 Whereas mono recording has one independent audio channel, stereo has two channels.
 Signals that are reproduced through stereo recording have an exact correlation with each
other, so when the sound is played back through either speakers or headphones, the sound
is a mirrored representation of the original recording.
 Stereo recording would be useful in situations that require the use of sound perspective, for
instance the clear location of instruments on a stage.
 The stereo systems must have an equal cover over the two audio channels.
Stereo Recording
Advantages
 Provides sound perspective
 Gives an idea of the direction the sound is
coming from, or how it has been recorded
 Provides better experience when listening to
songs or films
 It is possible to tell whether or not the sound
has been recorded from a distance
 Offers the possibility of multi-track recordings
Disadvantages
• Since stereo files use two audio channels
instead of one, the files sizes are going to be a
lot bigger
• High resolution stereo files are relatively big
files
• Mono sound files can be recorded at high
resolutions for half the file sizes of stereo files
• Stereo files are harder to edit than mono files
as there are two channels to work with
• The sound is split over two channels,
so if one channel is broken the audio
plays through only one speaker: the level
is halved and some of the audio may be missed
out altogether
• Stereo is a lot more expensive to set up
Audio formats
 Depends on the O/S. For example:
 AIFF (Audio Interchange File Format)
 SOU
 For Macintosh
 .WAV
 Waveform file format. For Windows/Microsoft
 .VOC
 Sound Blaster Card
184
What is WAV?
 WAV is an audio file format that was developed by
Microsoft.
 It is so widespread today that it is called a standard
PC audio file format.
 A Wave file is identified by a file name extension of
WAV (.wav).
 Used primarily in PCs, the Wave file format has been
accepted as a viable interchange medium for other
computer platforms, such as Macintosh.
185
What is WAV?
 This allows content developers to freely move audio
files between platforms for processing.
 For example, the Wave file format stores information
about:
– the file's number of tracks (mono or stereo),
– sample rate
– bit depth
186
MIDI (Musical Instrument Digital
Interface)
 MIDI is a standard for specifying a musical
performance
 Rather than send raw digital audio, it sends
instructions to musical instruments telling them what
note to play, at what volume, using what sound, etc.
 The synthesiser that receives the MIDI events is
responsible for generating the actual sounds.
Example: Keyboard Piano
187
MIDI Versus Wav
 MIDI recording quality depends on the playback tools
 .wav audio is easier to create than MIDI
 MIDI advantages
 Small file size
 Small storage requirement
188
Advantages and Disadvantages of
using audio
Sound adds life to any multimedia application and plays an
important role in effective marketing presentations.
 Advantages
 Ensure important information is noticed
 Add interest
 Can communicate more directly than other media
 Disadvantages
 Easily overused
 Requires special equipment for quality production
 Not as memorable as visual media
189
Multimedia Lectures.pdf

  • 1. Multimedia – An Introduction Lesson1 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 2. 2 To discuss…  Course Issues  Multimedia – Definitions  Multimedia System  Data Stream & continuous media  Streaming Media  Multimedia - Applications
  • 3. 3 Course Issues  Text Books 1) Ze-Nian Li & Mark S. Drew, "Fundamentals of Multimedia", Pearson Education, 2004 2) Susanne Weixel & Jennifer Fulton, “Multimedia BASICS”, 2nd Edition, 2010 In Addition to reference books, Additional readings & the class notes.
  • 4. 4 Course Issues  Coverage Topics Number of Lectures Introduction to Multimedia & Media Basics 2 Digital Images representation and processing, basic relationships between pixels 3 Colors , color science, Human visual system HVS, color models in image 3 Spatial Filtering 3 Video, Types of video signals, Analog video, Digital video 2 Audio, Sound, Types of Audio 3 Data compression, some basic methods 4 TOTAL 20
  • 5. 5 Course Issues  Evaluation – Test-1 - 13% – Test-2 - 13% – Test-3 - Optional. – Quizzes - 9% – Lab / Assignments /Project – - 15% – In Class representation - 5% Extra (Optional) – Final Exam - 50%
  • 6. CLASSROOM ETIQUETTE  At all times be considerate to your classmates and to your instructor.  Come to class on time, ready to ask questions about previous lessons/assignments.  Ask pertinent questions; contribute to discussions; avoid "private" conversations that distract the instructor and other students.  Any student that disrupts the class will lose the lecture and/or be asked to leave the room.  Remember that the instructor is the one to end the class– do not prepare to leave early. 6
  • 7. KEYS TO SUCCESS  Have a positive attitude about learning and the class.  Attend all class sessions and be punctual.  Complete reading assignments and handouts before beginning lab.  Do your own work. Work with your assigned partner. Ask for help when needed.  Don’t expect to understand every topic the first time it is presented; review often; spend as much time as necessary to master the material.  Be flexible.  Enjoy the class! 7
  • 8. 8 Multimedia- Definitions – Multi - many; much; multiple – Medium - a substance regarded as the means of transmission of a force or effect; a channel or system of communication, information, or entertainment (Webster Dictionary )  So, Multimedia??? – The terms Multi & Medium don’t seem fit well – Term Medium needs more clarification !!!
  • 9. 9 Multimedia- Definitions  Medium – Means for distribution and presentation of information  Classification based on perception (text, audio, video) is appropriate for defining multimedia Criteria for the classification of medium Storage Presentation Representation Perception Information Exchange Transmission
  • 10. 10 Multimedia- Definitions  Time always takes separate dimension in the media representation  Based on time-dimension in the representation space, media can be – Time-independent (Discrete) • Text, Graphics – Time dependent (Continuous) • Audio, Video • Video, sequence of frames (images) presented to the user periodically. • Time dependent aperiodic media is not continuous!! – Discrete & Continuous here has no connection with internal representation !! (relates to the viewers impression…)
  • 11. 11 Multimedia- Definitions  Multimedia is any combination of digitally manipulated text, art, sound, animation and video.  A more strict version of the definition of multimedia do not allow just any combination of media.  It requires – Both continuous & discrete media to be utilized – Significant level of independence between media being used  The less stricter version of the definition is used in practice.
  • 12. 12 Multimedia- Definitions  Multimedia elements are composed into a project using authoring tools.  Multimedia Authoring tools are those programs that provide the capability for creating a complete multimedia presentations by linking together objects such as a paragraph of text (song), an illustration, an audio, with appropriate interactive user control.
  • 13. 13 Multimedia- Definitions  By defining the objects' relationships to each other, and by sequencing them in an appropriate order, authors (those who use authoring tools) can produce attractive and useful graphical applications.  To name a few authoring tools – Photoshop – Macromedia Flash – Macromedia Director – Authorware  The hardware and the software that govern the limits of what can happen are multimedia platform or environment
  • 14. 14 Multimedia- Definitions  Multimedia is interactive when the end-user is allowed to control what and when the elements are delivered.  Interactive Multimedia is Hypermedia, when the end-user is provided with the structure of linked elements through which he/she can navigate.
  • 15. 15 Multimedia- Definitions  Multimedia is linear, when it is not interactive and the users just sit and watch as if it is a movie.  Multimedia is nonlinear, when the users are given the navigational control and can browse the contents at will.
  • 16. 16 Multimedia System  Following the dictionary definitions, Multimedia system is any system that supports more than a single kind of media – Implies any system processing text and image will be a multimedia system!!! – Note, the definition is quantitative. A qualitative definition would be more appropriate. – The kind of media supported should be considered, rather the number of media
  • 17. 17 Multimedia System  A multimedia system is characterized by computer- controlled, integrated production, manipulation, storage and communication of independent information, which is encoded at least through a continuous (time-dependent) and a discrete (time- independent) medium.
  • 18. 18 Data streams  Data Stream is any sequence of individual packets transmitted in a time-dependent fashion – Packets can carry information of either continuous or discrete media  Transmission modes – Asynchronous • Packets can reach receiver as fast as possible. • Suited for discrete media • Additional time constraints must be imposed for continuous media
  • 19. 19 Data streams – Synchronous • Defines maximum end-to-end delay • Packets can be received at an arbitrarily earlier time • For retrieving uncompressed video at data rate 140Mbits/s & maximal end-to-end delay 1 second the receiver should have temporary storage 17.5 Mbytes – Isochronous • Defines maximum & minimum end-to-end delay • Storage requirements at the receiver reduces
  • 20. 20 Streaming Media  Popular approach to continuous media over the internet  Playback at users computer is done while the media is being transferred (no waiting till complete download!!!)  You can find streaming in – Internet radio stations – Distance learning – Movie Trailers
  • 22. 22 Multimedia- Applications Multimedia plays major role in following areas – Instruction – Business – Advertisements – Training materials – Presentations – Customer support services – Entertainment – Interactive Games
  • 23. 23 Multimedia- Applications – Enabling Technology – Accessibility to web based materials – Teaching-learning disabled children & adults – Fine Arts & Humanities – Museum tours – Art exhibitions – Presentations of literature
  • 24. 24 Multimedia- Applications In Medicine Source: Cardiac Imaging, YALE centre for advanced cardiac imaging
  • 26. 26 Multimedia- Applications Public awareness campaign Source Interactive Multimedia Project Department of food science& nutrition, Colorado State Univ
  • 27. Digital Image Representation & Processing Lesson-2 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 28. 28 To discuss…  Image fundamentals  Image Formation  1-bit & 8 bit image  Color image  Color Lookup Table
  • 29. 29 Digital Image  Image can be defined as a 2-D function f(x,y), where x and y are spatial coordinates and the amplitude of f at any pair of coordinates (x,y) is called the intensity/gray level of the image at that point – When the image is gray scale, intensity values represent the range of shades from black to white. – For a color image the intensity values are represented as a combination of R, G, B  Can be considered as comprising x x y number of elements (picture elements, image elements, pels, pixels), each of which has a location and value.
  • 30. 30 Image Formation  Pixel values are proportional to the energy/ electromagnetic waves radiated from the source – It implies this value cannot be negative, ranges from 0 to +ve infinity  Function f(x,y) characterized by components – Illumination i(x,y), value ranging from 0 to infinity – Reflectance r(x,y), value ranging from 0 to 1 – f(x,y)= i(x,y) x r(x,y)  f(x,y) lies between Lmin to Lmax scaled to [0,L-1], where 0 representing black and L-1 representing white, the intermediate values are the shades of gray from black to white
  • 33. 11-Jan-07 Multimedia Computing (EA C473) 33  256 gray levels (8bits/pixel) 32 gray levels (5 bits/pixel) 16 gray levels (4 bits/pixel)  8 gray levels (3 bits/pixel) 4 gray levels (2 bits/pixel) 2 gray levels (1 bit/pixel) Image Formation [4]
  • 34. 34 Image Formation[5]  Output of image sensors are continuous voltage waveform, digitization is necessary for further processing  Digitizing the coordinate positions are called sampling  Digitizing the amplitude values are called quantization – Number of gray levels will be in an integer power of 2 – L=2k , [0…L-1] – Number of bits needed to store an image b=M x N  Image is k bit image if it has 2k gray levels – 8 bit image has 256 gray levels
  • 35. 35 1-bit image  Simplest type of image  Each pixel consist of only ON / OFF information  Called 1-bit monochrome (since no color) image  Suitable for simple graphics & text – JBIG (Joint Bi-level Image experts Group ), A compression standard for binary image
  • 36. 36 8-bit image  Gray levels between 0 to 255 (black to white)  Image resolution refers the number of pixels in an image. The higher the resolution, the more pixels in the image. Higher resolution allows for more detail and subtle color transitions in an image  Shown is 512x512 byte image
  • 37. 37 Color image  24- bit color image – Each pixel is represented by 3 bytes, RGB – Each R, G, B are in the range 0-255 – 256 x 256 x 256 possible colors – If space is a concern, reasonably accurate color image can be obtained by quantizing the color information  8- bit color image – Carefully chosen 256 colors represent the image – We get information can be received from the color histogram
  • 38. 38 Color image [2]  For 640 x 480 image represented with – 24 bits requires 921.6 kbytes – 8 bit requires 300 kbytes  The 8-bit color image stores only the index of the color, the file header will contain the mapping information.  The table where the color information for all the 256 indices is called color lookup table (LUT)
  • 39. Basic Relationships Between Pixels Lesson-3 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 40. 40 To discuss…  Neighborhood  Adjacency  Connectivity  Paths  Regions and boundaries
  • 41. 41 Neighbors of a Pixel  Any pixel p(x, y) has two vertical and two horizontal neighbors, given by (x+1, y), (x-1, y), (x, y+1), (x, y-1)  This set of pixels are called the 4-neighbors of P, and is denoted by N4(P).  Each of them are at a unit distance from P.
  • 42. 42 Neighbors of a Pixel [2]  The four diagonal neighbors of p(x,y) are given by, (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1 ,y-1) This set is denoted by ND(P).  Each of them are at Euclidean distance of 1.414 from P.
  • 43. Neighbors of a Pixel [3]  The points ND(P) and N4(P) are together known as 8-neighbors of the point P, denoted by N8(P).  Some of the points in the N4, ND and N8 may fall outside image when P lies on the border of image. 43
  • 44. 44 Neighbors of a Pixel [4]  Neighbors of a pixel a. 4-neighbors of a pixel p are its vertical and horizontal neighbors denoted by N4(p) b. 8-neighbors of a pixel p are its vertical horizontal and 4 diagonal neighbors denoted by N8(p)
  • 45. 45 Neighbors of a Pixel [5]  •N4 - 4-neighbors  •ND - diagonal neighbors  •N8 - 8-neighbors (N4 U ND)
  • 46. 46 Adjacency  Two pixels are connected if they are neighbors and their gray levels satisfy some specified criterion of similarity.  For example, in a binary image two pixels are connected if they are 4-neighbors and have the same value (0/1).
  • 47. 47 Adjacency  Let V be the set of gray-level values used to define adjacency.  4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).  8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).  m-adjacency: Two pixels p and q with values from V are m-adjacent if – q is in N4(p), or – q is in ND(p) and the set N4(p) ∩ N4(q) is empty (has no pixels whose values are from V).
  • 48. 48 Connectivity  Connectivity is used to determine whether pixels are adjacent in some sense.  Let V be the set of gray-level values used to define connectivity; then two pixels p, q that have values from the set V are: a. 4-connected, if q is in the set N4(p) b. 8-connected, if q is in the set N8(p) c. m-connected, iff i. q is in N4(p), or ii. q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V
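The adjacency tests can be sketched as follows; here `img` is assumed to be a dict mapping (x, y) to a gray level, and all function names are illustrative, not from the text:

```python
def n4(p):
    """4-neighbors of pixel p."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """Diagonal neighbors of pixel p."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def m_adjacent(p, q, img, V):
    """m-adjacency: q in N4(p), or q in ND(p) with no shared 4-neighbor in V."""
    if img.get(p) not in V or img.get(q) not in V:
        return False
    if q in n4(p):
        return True
    shared = {r for r in n4(p) & n4(q) if img.get(r) in V}
    return q in nd(p) and not shared

# Tiny binary image: m-adjacency suppresses the diagonal link when a 4-path exists
img = {(0, 0): 1, (0, 1): 1, (1, 1): 1}
print(m_adjacent((0, 0), (1, 1), img, {1}))  # False: they share the 4-neighbor (0, 1)
print(m_adjacent((0, 0), (0, 1), img, {1}))  # True: directly 4-adjacent
```

This is why m-adjacency is used: it removes the ambiguous multiple paths that plain 8-adjacency allows.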
  • 50. 50 Adjacency/Connectivity  Pixel p is adjacent to pixel q if they are connected.  Two image subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some pixel in S2
  • 51. Paths & Path lengths  A path from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates: (x0, y0), (x1, y1), (x2, y2) … (xn, yn), where (x0, y0)=(x, y) and (xn, yn)=(s, t); (xi, yi) is adjacent to (xi-1, yi-1) Here n is the length of the path.  We can define 4-, 8-, and m-paths based on type of adjacency used. 51
  • 52. Connected Components  If p and q are pixels of an image subset S then p is connected to q in S if there is a path from p to q consisting entirely of pixels in S.  For every pixel p in S, the set of pixels in S that are connected to p is called a connected component of S.  If S has only one connected component then S is called Connected Set. 52
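A connected component can be extracted with a breadth-first flood fill; this is a minimal sketch using 4-connectivity on a binary image stored as a list of lists:

```python
from collections import deque

def connected_component(img, start):
    """Return the set of pixels 4-connected to `start` that have value 1."""
    rows, cols = len(img), len(img[0])
    seen, queue = {start}, deque([start])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < rows and 0 <= ny < cols
                    and (nx, ny) not in seen and img[nx][ny] == 1):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return seen

img = [[1, 1, 0],
       [0, 1, 0],
       [0, 0, 1]]
print(sorted(connected_component(img, (0, 0))))  # [(0, 0), (0, 1), (1, 1)]
```

Note that pixel (2, 2) is a separate component: it is 8-adjacent but not 4-adjacent to the first set, so the image as a whole is not a connected set under 4-connectivity.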
  • 53. Regions and Boundaries  A subset R of pixels in an image is called a Region of the image if R is a connected set.  The boundary of the region R is the set of pixels in the region that have one or more neighbors that are not in R. 53
  • 54. Distance Measures  Given pixels p, q and z with coordinates (x, y), (s, t), (u, v) respectively, the distance function D has the following properties: a. D(p, q) ≥ 0 [D(p, q) = 0 iff p = q] b. D(p, q) = D(q, p) c. D(p, z) ≤ D(p, q) + D(q, z) 54
  • 55.  The following are the different distance measures: a. Euclidean Distance: De(p, q) = √[(x-s)² + (y-t)²] b. City Block Distance: D4(p, q) = |x-s| + |y-t| c. Chess Board Distance: D8(p, q) = max(|x-s|, |y-t|) 55
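The three measures can be sketched directly:

```python
import math

def distances(p, q):
    """Euclidean (De), city-block (D4), and chessboard (D8) distances."""
    (x, y), (s, t) = p, q
    de = math.sqrt((x - s) ** 2 + (y - t) ** 2)
    d4 = abs(x - s) + abs(y - t)
    d8 = max(abs(x - s), abs(y - t))
    return de, d4, d8

print(distances((0, 0), (3, 4)))  # (5.0, 7, 4)
```

The three values differ because each metric counts a "step" differently: D4 allows only horizontal/vertical moves, D8 counts a diagonal move as one step.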
  • 56. Relationship between pixels  Arithmetic/Logic Operations: - Addition : p + q – Subtraction: p – q – Multiplication: p*q – Division: p/q – AND: p AND q – OR : p OR q – Complement: NOT(q) 56
  • 57. Neighborhood based arithmetic/Logic Value assigned to a pixel at position ‘e’ is a function of its neighbors and a set of window functions. 57
  • 58. Arithmetic/Logic Operations  Tasks done using neighborhood processing: – Smoothing / averaging – Noise removal / filtering – Edge detection – Contrast enhancement 58
  • 59. Color Lesson-4 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 60. 60 To discuss…  Color Science  Human Visual Perception  Color Models in image
  • 61. 61 Color Science – Light & Spectra  Light is an electromagnetic wave – Its color is characterized by its wavelength  Most light sources produce contributions over many wavelengths; only contributions falling in the visible range can be seen  The curve of spectral power versus wavelength λ is called the spectral power distribution E(λ)  Visible light spans roughly 400 to 700 nanometers (1 nm = 10⁻⁹ m)
  • 62. 62 Color Science – Light & Spectra  Red light has the longest wavelength in the visible range and blue the shortest  The shorter the wavelength, the higher the frequency and the photon energy  Red photons carry around 1.8 eV and blue around 3.1 eV (1 electron volt = 1.602 × 10⁻¹⁹ joules, a unit of energy)  The RGB values in image files are converted to analog signals that drive the electron guns of a CRT (Cathode Ray Tube)
  • 63. 63 Color Science – Vision & Sensitivity [2]  Eye is most sensitive to the middle of the visible spectrum  Let us denote the spectral sensitivity of R, G, B cones as a vector q(λ)  q(λ)=[qR(λ), qG(λ), qB(λ)]T
  • 64. 64 Color Science – Vision & Sensitivity [3]  The response of each cone type can be specified as – R = ∫ E(λ) qR(λ) dλ ---------- (1) – G = ∫ E(λ) qG(λ) dλ ---------- (2) – B = ∫ E(λ) qB(λ) dλ ---------- (3)  Equations 1, 2, 3 quantify the signals transmitted to the brain
  • 65. Human Visual Perception  Human perception encompasses both the physiological and psychological aspects.  We will focus more on physiological aspects, which are more easily quantifiable and hence, analyzed.
  • 66. Human Visual Perception  Why study visual perception? – Image processing algorithms are designed based on how our visual system works. – In image compression, we need to know what information is not perceptually important and can be ignored. – In image enhancement, we need to know what types of operations that are likely to improve an image visually.
  • 67. The Human Visual System  The human visual system consists of two primary components – the eye and the brain, which are connected by the optic nerve. – Eye – receiving sensor (camera, scanner). – Brain – information processing unit (computer system). – Optic nerve – connection cable (physical wire).
  • 69. The Human Visual System  This is how human visual system works: – Light energy is focused by the lens of the eye into sensors and retina. – The sensors respond to the light by an electrochemical reaction that sends an electrical signal to the brain (through the optic nerve). – The brain uses the signals to create neurological patterns that we perceive as images.
  • 70. The Human Visual System  The visible light is an electromagnetic wave with wavelength range of about 380 to 825 nanometers. – However, response above 700 nanometers is minimal.  We cannot “see” many parts of the electromagnetic spectrum.
  • 72. The Human Visual System  The visible spectrum can be divided into three bands: – Blue (400 to 500 nm). – Green (500 to 600 nm). – Red (600 to 700 nm).  The sensors are distributed across the retina.  Three kinds of cones, most sensitive to R, G and B respectively, are present in the ratio 40:20:1
  • 74. The Human Visual System  There are two types of sensors: rods and cones.  Rods: – For night vision. – See only brightness (gray level) and not color. – Distributed across retina. – Medium and low level resolution.
  • 75. The Human Visual System  Cones: – For daylight vision. – Sensitive to color. – Concentrated in the central region of eye. – High resolution capability (differentiate small changes).
  • 76. The Human Visual System  Blind spot: – No sensors. – Place for the optic nerve. – We do not perceive it as a blind spot because the brain fills in the missing visual information.  Why must an object be in the center of the field of vision to be perceived in fine detail? – This is where the cones are concentrated.
  • 77. The Human Visual System  Cones have higher resolution than rods because they have individual nerves tied to each sensor.  Rods have multiple sensors tied to each nerve.  Rods react even in low light but see only a single spectral band. They cannot distinguish color.
  • 79. The Human Visual System  There are three types of cones. Each responding to different wavelengths of light energy.  The colors that we perceive are the combined result of the response of the three cones.
  • 81. 81 Color Science – Other Color Coordinate Systems  CMY (Cyan-Magenta-Yellow)  HSL (Hue-Saturation-Lightness)  HSV (Hue-Saturation-Value)  HSI (Hue-Saturation-Intensity)  HCI (Hue-Chroma-Intensity)  HVC (Hue-Value-Chroma)  HSD (Hue-Saturation-Darkness)
  • 82. 82 Color Models for Image – RGB Vs CMY  Additive Vs Subtractive Models  Additive model – Used in computer displays, Uses light to display color, Colors result from transmitted light – Red+Green+Blue=White  Subtractive Models – Used in printed materials, Uses ink to display color, Colors result from reflected light – Cyan+Magenta+Yellow=Black
  • 83. 83 Color Models for Image – RGB Vs CMY [2]
  • 84. 84 Color Models for Image – RGB Vs CMY [3] RGB & CMY Cubes
  • 85. 85 Color Models for Image – RGB Vs CMY [4]  Conversion From RGB to CMY  Conversion From CMY to RGB
  • 86. 86 Color Models for Image – CMYK  Eliminating the amounts of yellow, magenta, and cyan that would have combined to a dark neutral (black) and replacing them with black ink  Four-color printing uses black ink (K) in addition to the subtractive primaries yellow, magenta, and cyan.  Reasons for adding black include – A CMY mixture rarely produces pure black – Text is typically printed in black and includes fine detail – Cost saving: one unit amount of black ink rather than three unit amounts of CMY
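The black replacement described above can be sketched with the simplest convention, K = min(C, M, Y) subtracted from each channel; note this is only one of several conventions, and commercial printing uses more elaborate black-generation curves:

```python
def cmy_to_cmyk(c, m, y):
    """Simple undercolor removal: move the common dark component into K.

    Assumption: the plain "subtract the minimum" rule, not a calibrated
    black-generation curve."""
    k = min(c, m, y)
    return c - k, m - k, y - k, k

c, m, y, k = cmy_to_cmyk(0.9, 0.7, 0.6)
print(round(c, 2), round(m, 2), round(y, 2), k)  # 0.3 0.1 0.0 0.6
```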
  • 87. 87 Color Models for Image – CMYK [2]  Used especially in the printing of images
  • 88. Spatial Filtering Lesson 5 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 89. Background  In digital image processing, the term "filter" refers to a subimage  Other terms for the subimage include mask, kernel, template, or window  The values in a filter subimage are referred to as coefficients, rather than pixels. 89
  • 90. Basics of Spatial Filtering  The concept of filtering has its roots in the use of the Fourier Transform for signal processing in the so-called frequency domain.  Spatial filtering term is the filtering operations that are performed directly on the pixels of an image 90
  • 91. Mechanics of spatial filtering  The process consists simply of moving the filter mask from point to point in an image.  At each point (x,y) the response of the filter at that point is calculated using a predefined relationship 91
  • 92. Linear spatial filtering Pixels of image: f(x-1,y-1) f(x-1,y) f(x-1,y+1) | f(x,y-1) f(x,y) f(x,y+1) | f(x+1,y-1) f(x+1,y) f(x+1,y+1) Mask coefficients: w(-1,-1) w(-1,0) w(-1,1) | w(0,-1) w(0,0) w(0,1) | w(1,-1) w(1,0) w(1,1) The result is the sum of products of the mask coefficients with the corresponding pixels directly under the mask: g(x,y) = w(-1,-1)f(x-1,y-1) + w(-1,0)f(x-1,y) + w(-1,1)f(x-1,y+1) + w(0,-1)f(x,y-1) + w(0,0)f(x,y) + w(0,1)f(x,y+1) + w(1,-1)f(x+1,y-1) + w(1,0)f(x+1,y) + w(1,1)f(x+1,y+1) 92
  • 93. Note: Linear filtering  The coefficient w(0,0) coincides with the image value f(x,y), indicating that the mask is centered at (x,y) when the computation of the sum of products takes place.  For a mask of size m×n, we assume that m = 2a+1 and n = 2b+1, where a and b are nonnegative integers. Then m and n are odd. 93
  • 94. Linear filtering  In general, linear filtering of an image f of size M×N with a filter mask of size m×n is given by the expression: g(x,y) = Σ(s=-a to a) Σ(t=-b to b) w(s,t) f(x+s, y+t) 94
  • 95. Discussion  The process of linear filtering is similar to a frequency-domain concept called "convolution"  Writing the mask coefficients as w1 … w9 and the corresponding image gray levels as z1 … z9, the response simplifies to R = w1z1 + w2z2 + … + w9z9 = Σ(i=1 to 9) wi zi and, in general, for an m×n mask, R = Σ(i=1 to mn) wi zi 95
  • 96. Nonlinear spatial filtering  Nonlinear spatial filters also operate on neighborhoods, and the mechanics of sliding a mask past an image are the same as was just outlined.  The filtering operation is based conditionally on the values of the pixels in the neighborhood under consideration 96
  • 97. Smoothing Spatial Filters  Smoothing filters are used for blurring and for noise reduction. – Blurring is used in preprocessing steps, such as removal of small details from an image prior to object extraction, and bridging of small gaps in lines or curves – Noise reduction can be accomplished by blurring 97
  • 98. Type of smoothing filtering  There are 2 ways of smoothing spatial filters – Smoothing Linear Filters – Order-Statistics Filters 98
  • 99. Smoothing Linear Filters  Linear spatial filter is simply the average of the pixels contained in the neighborhood of the filter mask.  Sometimes called “averaging filters”.  The idea is replacing the value of every pixel in an image by the average of the gray levels in the neighborhood defined by the filter mask. 99
  • 100. Two 3x3 Smoothing Linear Filters Standard average: (1/9) × [1 1 1; 1 1 1; 1 1 1] Weighted average: (1/16) × [1 2 1; 2 4 2; 1 2 1] 100
  • 101. 5x5 Smoothing Linear Filter Standard average: (1/25) × a 5×5 mask of all 1s 101
  • 102. Smoothing Linear Filters  The general implementation for filtering an M×N image with a weighted averaging filter of size m×n is given by the expression: g(x,y) = [Σ(s=-a to a) Σ(t=-b to b) w(s,t) f(x+s, y+t)] / [Σ(s=-a to a) Σ(t=-b to b) w(s,t)] 102
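The weighted-average filtering expression can be sketched in plain Python; this sketch skips border pixels rather than padding them, and the function name is illustrative:

```python
def weighted_average(img, mask):
    """Linear smoothing: sum of mask*pixel products, normalised by the mask sum."""
    m, n = len(mask), len(mask[0])
    a, b = m // 2, n // 2
    M, N = len(img), len(img[0])
    total = sum(sum(row) for row in mask)
    out = [row[:] for row in img]          # border pixels left unchanged
    for x in range(a, M - a):
        for y in range(b, N - b):
            acc = sum(mask[s + a][t + b] * img[x + s][y + t]
                      for s in range(-a, a + 1) for t in range(-b, b + 1))
            out[x][y] = acc // total
    return out

img = [[10, 10, 10],
       [10, 100, 10],
       [10, 10, 10]]
avg = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]    # the (1/9) standard average mask
print(weighted_average(img, avg)[1][1])    # 20: the bright spike is smoothed out
```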
  • 103. Result of Smoothing Linear Filters [3x3] [5x5] [7x7] Original Image 103
  • 104. Order-Statistics Filters  Order-statistics filters are nonlinear spatial filters whose response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter, and then replacing the value of the center pixel with the value determined by the ranking result.  The best-known example is the median filter 104
  • 105. Process of Median filter  Crop the region of the neighborhood  Sort the values of the pixels in the region  For an M×N mask, the median is the value at position (M×N) div 2 + 1 of the sorted list Example window: 10 15 20 / 20 100 20 / 20 20 25 → sorted: 10, 15, 20, 20, 20, 20, 20, 25, 100 → 5th value = 20 105
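The three steps can be sketched for a 3×3 window; the slide's example window appears below, and the function name is illustrative:

```python
def median3x3(img):
    """Replace each interior pixel with the median (5th of 9) of its 3x3 window."""
    M, N = len(img), len(img[0])
    out = [row[:] for row in img]
    for x in range(1, M - 1):
        for y in range(1, N - 1):
            window = sorted(img[x + s][y + t]
                            for s in (-1, 0, 1) for t in (-1, 0, 1))
            out[x][y] = window[4]          # the 5th value of the sorted 9
    return out

# The slide's example: the noisy value 100 is replaced by the median 20
img = [[10, 15, 20],
       [20, 100, 20],
       [20, 20, 25]]
print(median3x3(img)[1][1])  # 20
```

Unlike averaging, the outlier 100 does not drag the result upward at all, which is why the median filter is preferred for impulse ("salt-and-pepper") noise.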
  • 106. Result of median filter Noise from Glass effect Remove noise by median filter 106
  • 107. Sharpening Spatial Filters  The principal objective of sharpening is to highlight fine detail in an image or to enhance detail that has been blurred, either in error or as a natural effect of a particular method of image acquisition. 107
  • 108. Introduction  Image blurring is accomplished in the spatial domain by pixel averaging in a neighborhood.  Since averaging is analogous to integration, sharpening can be accomplished by spatial differentiation. 108
  • 109. Foundation  We are interested in the behavior of these derivatives in areas of constant gray level(flat segments), at the onset and end of discontinuities(step and ramp discontinuities), and along gray- level ramps.  These types of discontinuities can be noise points, lines, and edges. 109
  • 110. Definition for a first derivative  Must be zero in flat segments  Must be nonzero at the onset of a gray- level step or ramp; and  Must be nonzero along ramps. 110
  • 111. Definition for a second derivative  Must be zero in flat areas;  Must be nonzero at the onset and end of a gray-level step or ramp;  Must be zero along ramps of constant slope 111
  • 112. Definition of the 1st-order derivative  A basic definition of the first-order derivative of a one-dimensional function f(x) is ∂f/∂x = f(x+1) − f(x) 112
  • 113. Definition of the 2nd-order derivative  We define a second-order derivative as the difference ∂²f/∂x² = f(x+1) + f(x−1) − 2f(x) 113
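Both definitions can be checked on a small 1-D profile; along a ramp of constant slope the first derivative is nonzero and the second is zero, and an isolated point produces the strong double response of the second derivative:

```python
def first_deriv(f):
    """f'(x) = f(x+1) - f(x)"""
    return [f[x + 1] - f[x] for x in range(len(f) - 1)]

def second_deriv(f):
    """f''(x) = f(x+1) + f(x-1) - 2*f(x)"""
    return [f[x + 1] + f[x - 1] - 2 * f[x] for x in range(1, len(f) - 1)]

# flat segment, downward ramp, flat segment, isolated bright point
profile = [6, 6, 6, 5, 4, 3, 2, 2, 2, 7, 2, 2]
print(first_deriv(profile))   # [0, 0, -1, -1, -1, -1, 0, 0, 5, -5, 0]
print(second_deriv(profile))  # [0, -1, 0, 0, 0, 1, 0, 5, -10, 5]
```

Note the second derivative is nonzero only at the onset (-1) and end (+1) of the ramp, and responds much more strongly (5, -10, 5) at the isolated point.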
  • 114. Gray-level profile (Figure: a one-dimensional gray-level profile containing flat segments, ramps, steps and an isolated point.) 114
  • 115. Derivative of image profile (Figure: the first and second derivatives computed along the profile of the previous slide.) 115
  • 116. Analyze  The 1st-order derivative is nonzero along the entire ramp, while the 2nd-order derivative is nonzero only at the onset and end of the ramp.  The response at and around an isolated point is much stronger for the 2nd-order than for the 1st-order derivative.  The 1st-order derivative produces thick edges; the 2nd-order derivative produces thin edges. 116
  • 117. The Laplacian (2nd order derivative)  Rosenfeld and Kak [1982] showed that the simplest isotropic derivative operator is the Laplacian, defined as ∇²f = ∂²f/∂x² + ∂²f/∂y² 117
  • 118. Discrete form of derivative ∂²f/∂x² = f(x+1,y) + f(x−1,y) − 2f(x,y) ∂²f/∂y² = f(x,y+1) + f(x,y−1) − 2f(x,y) 118
  • 119. 2-Dimensional Laplacian  The digital implementation of the 2-dimensional Laplacian is obtained by summing the 2 components: ∇²f = ∂²f/∂x² + ∂²f/∂y² ∇²f = f(x+1,y) + f(x−1,y) + f(x,y+1) + f(x,y−1) − 4f(x,y) Corresponding mask: [0 1 0; 1 −4 1; 0 1 0] 119
  • 120. Laplacian Basic mask: [0 1 0; 1 −4 1; 0 1 0] Including the diagonal directions: [1 1 1; 1 −8 1; 1 1 1] 120
  • 121. Laplacian Negated versions: [0 −1 0; −1 4 −1; 0 −1 0] and [−1 −1 −1; −1 8 −1; −1 −1 −1] 121
  • 122. Implementation g(x,y) = f(x,y) − ∇²f(x,y) if the center coefficient of the Laplacian mask is negative g(x,y) = f(x,y) + ∇²f(x,y) if the center coefficient is positive Where f(x,y) is the original image, ∇²f(x,y) is the Laplacian-filtered image, and g(x,y) is the sharpened image 122
  • 125. Implementation filtered = filtered - Min(filtered) filtered = filtered * (255.0/Max(filtered)) 125
  • 126. Implementation sharpened = image + filtered sharpened = sharpened - Min(sharpened ) sharpened = sharpened * (255.0/Max(sharpened )) 126
  • 127. Algorithm  Using Laplacian filter to original image  And then add the image result from step 1 and the original image 127
  • 128. Simplification  We can combine the two steps into one mask: g(x,y) = f(x,y) + 4f(x,y) − f(x+1,y) − f(x−1,y) − f(x,y+1) − f(x,y−1) g(x,y) = 5f(x,y) − f(x+1,y) − f(x−1,y) − f(x,y+1) − f(x,y−1) Masks: [0 −1 0; −1 5 −1; 0 −1 0] and, including diagonals, [−1 −1 −1; −1 9 −1; −1 −1 −1] 128
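The combined 5-point sharpening mask can be applied directly; a minimal sketch (borders skipped, and in practice the result would be clamped to [0, 255]):

```python
def sharpen(img):
    """g(x,y) = 5*f(x,y) - f(x+1,y) - f(x-1,y) - f(x,y+1) - f(x,y-1)."""
    M, N = len(img), len(img[0])
    out = [row[:] for row in img]
    for x in range(1, M - 1):
        for y in range(1, N - 1):
            out[x][y] = (5 * img[x][y] - img[x + 1][y] - img[x - 1][y]
                         - img[x][y + 1] - img[x][y - 1])
    return out

flat = [[7] * 3 for _ in range(3)]
print(sharpen(flat)[1][1])   # 7: flat regions are unchanged

detail = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(sharpen(detail)[1][1])  # 45: fine detail is strongly boosted
```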
  • 129. Unsharp masking  A process to sharpen images consists of subtracting a blurred version of an image from the image itself. This process, called unsharp masking, is expressed as fs(x,y) = f(x,y) − f̄(x,y) where fs(x,y) denotes the sharpened image obtained by unsharp masking, and f̄(x,y) is a blurred version of f(x,y) 129
  • 130. High-boost filtering  A high-boost filtered image, fhb, is defined at any point (x,y) as fhb(x,y) = A·f(x,y) − f̄(x,y), where A ≥ 1 fhb(x,y) = (A−1)·f(x,y) + f(x,y) − f̄(x,y) fhb(x,y) = (A−1)·f(x,y) + fs(x,y) This equation is generally applicable and does not state explicitly how the sharp image is obtained 130
  • 131. High-boost filtering and Laplacian  If we choose to use the Laplacian, then we know fs(x,y): fhb = A·f(x,y) − ∇²f(x,y) if the center coefficient is negative fhb = A·f(x,y) + ∇²f(x,y) if the center coefficient is positive Masks: [0 −1 0; −1 A+4 −1; 0 −1 0] and [−1 −1 −1; −1 A+8 −1; −1 −1 −1] 131
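High-boost filtering needs only the original image and a blurred copy; a minimal sketch, with an assumed boost factor A (the function name and default are illustrative):

```python
def high_boost(img, blurred, A=1.2):
    """f_hb(x,y) = A*f(x,y) - f_blur(x,y); A = 1 gives plain unsharp masking."""
    return [[A * f - fb for f, fb in zip(row_f, row_b)]
            for row_f, row_b in zip(img, blurred)]

img = [[100, 100], [100, 100]]
blurred = [[90, 90], [90, 90]]
print(high_boost(img, blurred, A=1.0))  # [[10.0, 10.0], [10.0, 10.0]]
```

With A = 1 the result is just the unsharp mask fs = f − f̄; raising A above 1 adds back a scaled copy of the original, brightening the overall image while still emphasizing detail.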
  • 132. The Gradient (1st order derivative)  First derivatives in image processing are implemented using the magnitude of the gradient.  The gradient of function f(x,y) is ∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ 132
  • 133. Gradient  The magnitude of this vector is given by ∇f = mag(∇f) = √(Gx² + Gy²) ≈ |Gx| + |Gy| Simplest masks: Gx = [−1 1], Gy = [−1; 1] These masks are simple but not isotropic; they respond only to horizontal and vertical edges. 133
  • 134. Robert's Method  The simplest approximations to a first-order derivative that satisfy the stated conditions use cross differences. Labeling the 3×3 region z1 … z9: Gx = (z9 − z5) and Gy = (z8 − z6) ∇f = [(z9 − z5)² + (z8 − z6)²]^(1/2) ≈ |z9 − z5| + |z8 − z6| 134
  • 135. Robert’s Method  These mask are referred to as the Roberts cross-gradient operators. -1 0 0 1 -1 0 0 1 135
  • 136. Sobel's Method  Masks of even size are awkward to apply.  The smallest filter mask should be 3x3.  The difference between the third and first rows of the 3x3 image region approximates the derivative in the x-direction, and the difference between the third and first columns approximates the derivative in the y-direction. 136
  • 137. Sobel's Method  Using this equation: ∇f ≈ |(z7 + 2z8 + z9) − (z1 + 2z2 + z3)| + |(z3 + 2z6 + z9) − (z1 + 2z4 + z7)| Masks: Gx = [−1 −2 −1; 0 0 0; 1 2 1] and Gy = [−1 0 1; −2 0 2; −1 0 1] 137
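The Sobel masks and the |Gx| + |Gy| magnitude approximation can be sketched for a single interior pixel; the function name is illustrative:

```python
GX = [[-1, -2, -1],
      [ 0,  0,  0],
      [ 1,  2,  1]]
GY = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]

def sobel_at(img, x, y):
    """Approximate gradient magnitude |Gx| + |Gy| at interior pixel (x, y)."""
    gx = sum(GX[s + 1][t + 1] * img[x + s][y + t]
             for s in (-1, 0, 1) for t in (-1, 0, 1))
    gy = sum(GY[s + 1][t + 1] * img[x + s][y + t]
             for s in (-1, 0, 1) for t in (-1, 0, 1))
    return abs(gx) + abs(gy)

# A vertical edge: only the Gy mask responds
img = [[0, 0, 9],
       [0, 0, 9],
       [0, 0, 9]]
print(sobel_at(img, 1, 1))  # 36
```

The center weight of 2 in each mask is what distinguishes Sobel from a plain difference: it smooths slightly along the edge direction while differentiating across it.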
  • 138. Video Lesson-6 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 139. 139 To discuss…  Types of video signals  Analog Video  Digital Video
  • 140. 140 Types of Video Signals Video Signals can be classified as 1. Composite Video 2. S-Video 3. Component Video
  • 141. 141 Types - Composite Video  Used in broadcast TVs  Compatible with B/W TV  Chrominance (I & Q or U & V) and luminance signals are mixed into a single carrier wave, which can be separated at the receiver end  Mixing the signals leads to interference and creates dot crawl Male F-Connector, connecting co-axial cable with the device Dot crawl, due to interference in composite video
  • 142. 142 Types - S-Video  S stands for Super / Separated  Uses 2 wires, one for the luminance and the other for the chrominance signal  Humans are able to differentiate spatial resolution in gray-scale images with a much higher acuity than for the color part of color images.  As a result, we can send less accurate color information than must be sent for intensity information
  • 143. 143 Types - Component Video  Each primary is sent as a separate video signal. – The primaries can either be RGB or a luminance- chrominance transformation of them (e.g., YIQ, YUV). – Best color reproduction – Requires more bandwidth and good synchronization of the three components
  • 144. 144 Analog Video  Represented as a continuous (time varying) signal
  • 145. 145 Analog Video [2] Interlaced Scan With interlaced scan, the odd and even lines are displaced in time from each other.
  • 147. 147 NTSC (National Television System Committee)  It uses the familiar 4:3 aspect ratio (the ratio of picture width to height)  Uses 525 scan lines per frame at about 30 (exactly 29.97) frames per second (fps).  NTSC follows the interlaced scanning system; each frame is divided into two fields, with 262.5 lines/field.  Thus the horizontal sweep frequency is 525 x 29.97 = 15,734 lines/sec, so each line is swept out in 63.6 µsec (1/(15.734 x 10³) sec)  63.6 µsec = 10.9 µsec for horizontal retrace + 52.7 µsec for the active line signal, during which image data is displayed
  • 148. 148 NTSC (National Television System Committee) [2] • 20 lines at the beginning of every field is for Vertical retrace control information leaving 485 lines per frame • 1/6 of the raster at the left side is blanked for horizontal retrace and sync. The non-blanking pixels are called active pixels. •Pixels often fall in-between the scan lines. NTSC TV is only capable of showing about 340 (visually distinct) lines
  • 149. 149 NTSC (National Television System Committee) [3]  NTSC video is an analog signal with no fixed horizontal resolution  Pixel clock is used to divide each horizontal line of video into samples. Different video formats provide different numbers of samples per line  Uses YIQ Color Model  Quadrature Modulation is used to combine I & Q to produce a single chroma signal
  • 150. 150 NTSC (National Television System Committee) [4]  The color subcarrier frequency Fsc is 3.58 MHz  The composite signal is formed by adding the modulated chroma signal to the luminance signal  The available bandwidth is 6 MHz; the audio signal is centered at 5.75 MHz and the lower part of the spectrum carries the picture information
  • 151. 151 PAL (Phase Alternating Line)  Widely used in Western Europe, China, India, and many other parts of the world.  Uses 625 scan lines per frame, at 25 frames/second, with a 4:3 aspect ratio and interlaced fields  Uses the YUV color model  Uses an 8 MHz channel and allocates a bandwidth of 5.5 MHz to Y, and 1.8 MHz each to U and V.
  • 152. 152 Digital Video  Advantages over analog: – Direct random access --> good for nonlinear video editing – No problem for repeated recording – No need for blanking and sync pulse  Almost all digital video uses component video
  • 153. 153 Chroma Subsampling  The human eye responds more precisely to brightness information than it does to color; chroma subsampling (decimation) takes advantage of this. – In a 4:4:4 scheme, each 8×8 matrix of RGB pixels converts to three YCrCb 8×8 matrices: one for luminance (Y) and one for each of the two chrominance bands (Cr and Cb) 8x8 : 8x8 : 8x8 4 : 4 : 4
  • 154. 154 Chroma Subsampling [2] – A 4:2:2 scheme also creates one 8×8 luminance matrix but decimates every two horizontal pixels to create each chrominance-matrix entry. Thus reducing the amount of data to 2/3rds of a 4:4:4 scheme. 4 : 2 : 2
  • 155. 155 Chroma Subsampling [3] – Ratios of 4:2:0 decimate chrominance both horizontally and vertically, resulting in four Y, one Cr, and one Cb 8×8 matrix for every four 8×8 pixel-matrix sources. This conversion creates half the data required in a 4:4:4 chroma ratio 4 : 1 : 1 4 : 2 : 0
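A 4:2:0 chroma plane can be produced by averaging each 2×2 block, which is one common convention (real codecs may use different siting and filtering); the function name is illustrative:

```python
def subsample_chroma_420(plane):
    """Average each 2x2 block of a chroma plane: one Cr/Cb sample per 4 pixels."""
    return [[(plane[2 * i][2 * j] + plane[2 * i][2 * j + 1]
              + plane[2 * i + 1][2 * j] + plane[2 * i + 1][2 * j + 1]) // 4
             for j in range(len(plane[0]) // 2)]
            for i in range(len(plane) // 2)]

cb = [[100, 104, 20, 24],
      [96, 100, 16, 20],
      [50, 50, 50, 50],
      [50, 50, 50, 50]]
print(subsample_chroma_420(cb))  # [[100, 20], [50, 50]]
```

The luminance plane is kept at full resolution, so the total data becomes Y + Cb/4 + Cr/4 = half of the 4:4:4 amount, as stated above.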
  • 156. 156 Chroma Subsampling [4] Luma sample Chroma sample 4:2:0
  • 157. 157 Chroma Subsampling [5]  The 4:1:1 and 4:2:0 are used in JPEG and MPEG  256-level gray-scale JPEG images aren't usually much smaller than their 24-bit color counterparts, because most JPEG implementations aggressively subsample the color information. Color data therefore represents a small percentage of the total file size
  • 158. 158 High Definition TV (HDTV)  The main thrust of HDTV (High Definition TV) is not to increase the definition in each unit area, but rather to increase the visual field, especially in its width. – The first generation of HDTV was based on an analog technology developed by Sony and NHK in Japan in the late 1970s. – Uncompressed HDTV demands more than 20 MHz bandwidth, which will not fit in the current 6 MHz or 8 MHz channels – It would occupy more than one channel even after compression.
  • 159. 159 High Definition TV (HDTV) [3]  The salient difference between conventional TV and HDTV: – HDTV has a much wider aspect ratio of 16:9 instead of 4:3. – HDTV moves toward progressive (non-interlaced) scan. The rationale is that interlacing introduces serrated edges to moving objects and flickers along horizontal edges.
  • 160. AUDIO Lesson-7 Assistant Prof. Dr. Ayad A. Salam College of Education for women University of Baghdad
  • 161. 161 To discuss…  What is sound? – Waveforms and attributes of sound  Capturing digital audio – Sampling  MIDI (Musical Instrument Digital Interface)
  • 162. What is Sound?  Sound comprises the spoken word, voices, music and even noise.  Sound is a pressure wave, taking continuous values  It is a complex relationship involving a vibrating object (sound source), a transmission medium (usually air), a receiver (ear) and a perceptor (brain). Example: a banging drum. 162
  • 163.  An increase / decrease in pressure can be measured as amplitude, which can be digitized  Measure the amplitude at equally spaced time intervals (sampling) and represent it with one of a finite set of digital values (quantization)  Sampling frequency refers to the rate at which the sampling is performed 163
  • 164. Waveforms  Sound waves are manifest as waveforms  A waveform that repeats itself at regular intervals is called a periodic waveform  Waveforms that do not exhibit regularity are called noise  The unit of regularity is called a cycle  The rate of cycles is measured in Hertz (or Hz) after Heinrich Hertz  One cycle per second = 1 Hz  Sometimes written as kHz or kiloHertz (1 kHz = 1000 Hz) 164
  • 166. The characteristics of sound waves  Sound is described in terms of two characteristics:  Frequency  Amplitude (or loudness)  Frequency  the number of cycles per second, measured in Hertz (Hz)  Determines the pitch of the sound as heard by our ears  The higher the frequency, the clearer and sharper the sound  the higher the pitch of the sound 166
  • 167. The characteristics of sound waves  Amplitude  Sound's intensity or loudness  The louder the sound, the larger the amplitude. 167
  • 168. The characteristics of sound waves 168 (Figure: a waveform annotated with one cycle, the time for one cycle, the amplitude, and the pitch.)
  • 170. Capture and playback of digital audio 170 Air pressure variations are captured via a microphone → the ADC (Analogue to Digital Converter) converts the signal into binary (discrete) form, e.g. 0101001101 0110101111 → the DAC (Digital to Analogue Converter) converts it back into a voltage → air pressure variations again
  • 171. The Analogue to Digital Converter (ADC)  An ADC is a device that converts analogue signals into digital signals  An analogue signal is a continuous value  It can have any single value on an infinite scale  A digital signal is a discrete value  It has a finite value (usually an integer)  An ADC is synchronised to some clock 171
  • 172. The Analogue to Digital Converter (ADC)  It will monitor the continuous analogue signal at a set rate and convert what it sees into a discrete value at that specific moment in time  The process of converting analogue sound to digital is called sampling; PCM (Pulse Code Modulation) is commonly used 172
  • 175. Sampling  Two parameters: Sampling Rate  Frequency of sampling  Measured in Hertz  The higher the sampling rate, the higher the quality, but the larger the storage required.  Standard sampling rates: - 44.1 KHz for CD audio - 22.05 KHz - 11.025 KHz for spoken voice - 5.1025 KHz for audio effects 175
  • 176. Sampling Sample Size The resolution of a sample is the number of bits it uses to store a given amplitude value, e.g.  8 bits (256 different values)  16 bits (65536 different values)  A higher resolution will give higher quality but will require more memory (or disk storage) 176
  • 177. Quantisation 177  Samples are usually represented as integers (discrete values); the figure shows sample points quantised to levels 0 through 15
  • 178. Calculating the size of digital audio 178  The formula is as follows: size = (sampling rate × duration × resolution × number of channels) / 8  The answer will be in bytes  Where:  sampling rate is in Hz  duration/time is in seconds  resolution is in bits (8 or 16)  number of channels = 1 for mono, 2 for stereo, etc.
  • 179. Calculating the size of digital audio 179  Example: Calculate the file size for 1 minute, 44.1 KHz, 16-bit, stereo sound  Where:  sampling rate is 44,100 Hz  duration is 60 seconds  resolution is 16 bits  number of channels for stereo is 2 size = 44100 × 60 × 16 × 2 / 8 = 10,584,000 bytes
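The file-size formula can be sketched directly (the function name is illustrative):

```python
def audio_file_bytes(rate_hz, seconds, bits, channels):
    """size in bytes = sampling rate * duration * resolution * channels / 8"""
    return rate_hz * seconds * bits * channels // 8

# The slide's example: 1 minute of 44.1 KHz, 16-bit, stereo sound
size = audio_file_bytes(44100, 60, 16, 2)
print(size)  # 10584000 bytes, i.e. about 10.1 MB

# A mono, 8-bit recording at 11.025 KHz for the same minute is far smaller
print(audio_file_bytes(11025, 60, 8, 1))  # 661500 bytes
```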
  • 180. Mono Recording • Mono simply indicates the use of a single channel. Mono includes the use of a single microphone to record a sound, which is then played through a single channel through a speaker. • The easiest way to check whether a recording is mono is through a set of headphones: the same signal plays through both earpieces. • Mono recording was typically used before the development of stereo recording. 180
  • 181. Mono Recording Advantages • Mono file sizes are around half the size of their stereo counterparts • High resolution files can be recorded with relatively low file sizes • High resolution files can be recorded around the same size file as a low resolution stereo file • It is much easier to mix mono sounds than it is with stereo • Mono sounds are much easier to manipulate in editing programmes • Everyone hears the exact same signal • Mono systems are suitable for reinforcing a sound Disadvantages  There is no sound perspective  It is impossible to tell whether or not a sound has been recorded from a distance  Films that use mono sounds do not provide as much as an impact than if the film was recorded using stereo sounds
  • 182. Stereo Recording  Whereas mono recording has one independent audio channel, stereo has two channels.  Signals that are reproduced through stereo recording have an exact correlation with each other, so when the sound is played back through either speakers or headphones, the sound is a mirrored representation of the original recording.  Stereo recording would be useful in situations that require the use of sound perspective, for instance the clear location of instruments on a stage.  The stereo systems must have an equal cover over the two audio channels.
  • 183. Stereo Recording Advantages  Provides sound perspective  Gives an idea of the direction the sound is coming from, and of how it was recorded  Provides a better experience when listening to songs or watching films  Makes it possible to tell whether a sound was recorded from a distance  Offers the possibility of multi-track recording Disadvantages • Since stereo files use two audio channels instead of one, file sizes are roughly double • High-resolution stereo files are relatively large • Mono files can be recorded at high resolution for half the size of stereo files • Stereo files are harder to edit than mono files because there are two channels to work with • The sound is spread across two channels, so if one channel fails, only one speaker plays and part of the audio is lost altogether • Stereo is considerably more expensive to set up
  • 184. Audio formats  File formats depend on the operating system. For example:  AIFF (Audio Interchange File Format) and SOU  for Macintosh  .WAV  Waveform file format, for Windows/Microsoft  .VOC  for the Sound Blaster card
  • 185. What is WAV?  WAV is an audio file format that was developed by Microsoft.  It is so widespread today that it is considered a standard PC audio file format.  A Wave file is identified by the file name extension .wav.  Used primarily on PCs, the Wave file format has also been accepted as a viable interchange medium for other computer platforms, such as Macintosh.
  • 186. What is WAV?  This allows content developers to move audio files freely between platforms for processing.  For example, the Wave file format stores information about: – the file's number of channels (mono or stereo) – sample rate – bit depth
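The header fields listed above (channels, sample rate, bit depth) can be inspected directly with Python's standard-library `wave` module. A minimal sketch, which first writes a short silent clip so the example is self-contained (the file name `example.wav` is illustrative, not from the slides):

```python
import wave

# Write one second of silent 44.1 kHz, 16-bit, stereo audio.
with wave.open("example.wav", "wb") as out:
    out.setnchannels(2)      # stereo
    out.setsampwidth(2)      # 2 bytes per sample = 16-bit
    out.setframerate(44100)  # 44.1 kHz
    out.writeframes(b"\x00\x00" * 2 * 44100)  # 44100 frames x 2 channels

# Read the header information back from the file.
with wave.open("example.wav", "rb") as wav:
    channels = wav.getnchannels()
    rate = wav.getframerate()
    bits = wav.getsampwidth() * 8
    print(channels, rate, bits)  # 2 44100 16
```

The same three fields are exactly the inputs to the file-size formula from slide 179.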
  • 187. MIDI (Musical Instrument Digital Interface)  MIDI is a standard for specifying a musical performance  Rather than sending raw digital audio, it sends instructions to musical instruments telling them what note to play, at what volume, using what sound, etc.  The synthesiser that receives the MIDI events is responsible for generating the actual sounds. Example: a keyboard piano
  • 188. MIDI versus WAV  MIDI playback quality depends on the synthesiser that renders it  WAV audio is easier to create than MIDI  MIDI advantages:  Very small file size  Low storage requirements
  • 189. Advantages and Disadvantages of Using Audio Sound adds life to any multimedia application and plays an important role in effective marketing presentations.  Advantages  Ensures important information is noticed  Adds interest  Can communicate more directly than other media  Disadvantages  Easily overused  Requires special equipment for quality production  Not as memorable as visual media