Probability and Random Processes
Chapter 3: Multiple Random Variables
Eng. Abdirahman Farah Ali
C.raxmaanfc@gmail.com
Introduction
Joint Distributions: Two Random Variables
• In probability theory, a random variable is a variable whose value is subject to randomness, meaning it can take
on different values with different probabilities. When dealing with multiple random variables, we are interested
in the probability distribution of the joint outcomes of those variables.
• For example, consider rolling two dice. Let X be the value on the first die and Y be the value on the second die.
Both X and Y are random variables since they can take on different values with different probabilities. The joint
distribution of X and Y is given by the probabilities of all possible pairs of outcomes, such as (1,1), (1,2), (1,3),
and so on.
• Another example is the height and weight of a population. Let X be the height of a randomly selected person and
Y be their weight. Both X and Y are random variables, and the joint distribution of X and Y can provide
information about the relationship between height and weight in the population.
• In general, the joint distribution of multiple random variables can be used to analyze the relationship between
those variables and to make predictions about future outcomes based on past data.
• We will first discuss the probability mass function and joint distributions of discrete random
variables, and then extend the results to continuous random variables.
PROBABILITY MASS FUNCTION
A probability mass function is a function that gives the probability that a discrete random variable
is exactly equal to some value. It is sometimes also known as the discrete density function. The
probability mass function is often the primary means of defining a discrete probability
distribution, and such functions exist for both scalar and multivariate random variables whose
domain is discrete.
A probability mass function differs from a probability density
function (PDF) in that the latter is associated with continuous rather than
discrete random variables. A PDF must be integrated over an interval to
yield a probability.
Important
• Joint probability is the probability of two events occurring simultaneously.
• Marginal probability is the probability of an event irrespective of the outcome of another
variable.
• Conditional probability is the probability of one event occurring in the presence of a
second event.
Marginal probability
Example 3.1
• To explain what marginal probability is, we need a contingency table. A contingency table is a table that shows the frequencies for two variables.
• One variable is used to categorize rows and the other is used to categorize columns.
• Suppose a company specializes in training students to pass a test.
• The company had 200 students last year.
• The following contingency table shows their pass rates broken down by sex.
• The marginal probabilities are shown along the right side and along the bottom of the table below.
[The contingency table and its marginal probabilities appear as an image on the original slide and are not reproduced here.]
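Since the slide's table is unavailable, the Python sketch below builds a hypothetical contingency table (the counts are assumed for illustration, not taken from the original slide) for 200 students and computes the marginal probabilities as row and column sums of the joint probabilities.

```python
import numpy as np

# Hypothetical counts (the original slide's numbers are not available):
# rows = sex (male, female), columns = outcome (pass, fail), total = 200.
counts = np.array([[70, 30],   # male:   70 passed, 30 failed
                   [80, 20]])  # female: 80 passed, 20 failed

joint = counts / counts.sum()         # joint probabilities P(sex, outcome)

marginal_sex = joint.sum(axis=1)      # row sums:    P(male), P(female)
marginal_outcome = joint.sum(axis=0)  # column sums: P(pass), P(fail)

print(joint)             # [[0.35 0.15], [0.40 0.10]]
print(marginal_sex)      # [0.5 0.5]
print(marginal_outcome)  # [0.75 0.25]
```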
Joint Probability Mass Function (PMF)
• Remember that for a discrete random variable X, we define the PMF as PX(x)=P(X=x). Now, if
we have two random variables X and Y, and we would like to study them jointly, we define the
joint probability mass function as follows:
The joint probability mass function of two discrete random
variables X and Y is defined as
PXY(x,y) = P(X = x, Y = y).
Example 3.2
Consider two random variables X and Y with joint PMF given in Table 3.1.

Table 3.1 Joint PMF of X and Y:

      Y=0   Y=1   Y=2
X=0   1/6   1/4   1/8
X=1   1/8   1/6   1/6

A. Find P(X=0, Y ≤ 1).
B. Find the marginal PMFs of X and Y.
C. Find P(Y=1|X=0).

[The figure showing the joint PMF PXY(x,y) appears as an image on the original slide and is not reproduced here.]
Solution
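The worked solution on the original slide is an image; as a stand-in, here is a minimal Python sketch that computes all three answers exactly from Table 3.1 using rational arithmetic.

```python
from fractions import Fraction as F

# Joint PMF from Table 3.1: pmf[(x, y)] = P(X = x, Y = y)
pmf = {(0, 0): F(1, 6), (0, 1): F(1, 4), (0, 2): F(1, 8),
       (1, 0): F(1, 8), (1, 1): F(1, 6), (1, 2): F(1, 6)}

# A. P(X = 0, Y <= 1): sum the joint PMF over the matching cells.
p_a = sum(p for (x, y), p in pmf.items() if x == 0 and y <= 1)

# B. Marginal PMFs: sum out the other variable.
px = {x: sum(p for (xx, _), p in pmf.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (_, yy), p in pmf.items() if yy == y) for y in (0, 1, 2)}

# C. Conditional probability: P(Y = 1 | X = 0) = P(X=0, Y=1) / P(X=0).
p_c = pmf[(0, 1)] / px[0]

print(p_a)  # 5/12
print(px)   # {0: Fraction(13, 24), 1: Fraction(11, 24)}
print(py)   # {0: Fraction(7, 24), 1: Fraction(5, 12), 2: Fraction(7, 24)}
print(p_c)  # 6/13
```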
Activity 3.1
Solution
[The activity statement and worked solution appear as images on the original slides and are not reproduced here.]
Functions of Two Continuous Random Variables
• Let X and Y be two continuous random variables, and let S denote the
two-dimensional support of X and Y. Then, the function f(x,y) is a
joint probability density function (abbreviated p.d.f.) if it satisfies the
following three conditions:
1. f(x,y) ≥ 0
2. ∫∫ f(x,y) dx dy = 1, with the integral taken over the support S
3. P[(X,Y) ∈ A] = ∫∫A f(x,y) dx dy
• where {(X,Y) ∈ A} is an event in the xy-plane
• So far, our attention in this lesson has been directed towards the joint
probability distribution of two or more discrete random variables.
Now, we'll turn our attention to continuous random variables.
• Along the way, always in the context of continuous random variables,
we'll look at formal definitions of joint probability density functions,
marginal probability density functions, expectation and independence.
• We'll also apply each definition to a particular example.
Example 3.3
Evaluate ∫₀¹ ∫₀¹ 4xy dx dy.
Solution
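The slide's worked solution is an image; as a quick symbolic check (a sketch using sympy, not the original working), the integral evaluates to 1, which together with 4xy ≥ 0 on the unit square makes f(x,y) = 4xy a valid joint PDF there.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Inner integral over x on [0, 1], then outer integral over y on [0, 1].
result = sp.integrate(sp.integrate(4 * x * y, (x, 0, 1)), (y, 0, 1))
print(result)  # 1
```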
Activity 3.2
• Let X and Y be two jointly continuous random variables with the joint PDF given on the original slide (shown there as an image, not reproduced here). Find E[XY²].
• Solution
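Since the activity's joint PDF is not available in this transcript, the sketch below illustrates the general recipe E[XY²] = ∫∫ x y² f(x,y) dx dy, using the density f(x,y) = 4xy from Example 3.3 as a stand-in (an assumption, not the activity's actual PDF).

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 4 * x * y  # stand-in joint PDF on the unit square (from Example 3.3)

# E[X Y^2] = double integral of x * y**2 * f(x, y) over the support.
exy2 = sp.integrate(sp.integrate(x * y**2 * f, (x, 0, 1)), (y, 0, 1))
print(exy2)  # 1/3
```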
Covariance and Correlation
Introduction
• Covariance is a measure of how two variables change together. The terms covariance and
correlation are closely related in probability theory and statistics.
• Both terms describe the extent to which two random variables vary together relative to their
expected values.
• Covariance and correlation are two mathematical concepts used in statistics. Both terms
are used to describe how two variables relate to each other.
• Covariance can be positive, negative, or zero. A positive covariance means that the two
variables tend to increase or decrease together.
• A negative covariance means that the two variables tend to move in opposite directions.
• A covariance of zero means that the two variables have no linear relationship (they may still be related in a nonlinear way).
• In statistics, we frequently come across these two terms, and they are often used
interchangeably. The two ideas are similar, but not the same: both are used to assess the linear
relationship and measure the dependency between two random variables.
• Covariance indicates the direction in which two variables vary together, whereas correlation
measures both the direction and the strength of the linear relationship between them.
The sample covariance is computed as

Cov(x,y) = Σ (xᵢ − x̄)(yᵢ − ȳ) / (N − 1)

where
xᵢ = data value of x
yᵢ = data value of y
x̄ = mean of x
ȳ = mean of y
N = number of data values.
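A direct transcription of this formula into Python (a minimal sketch) looks like the following; it reproduces the result of Example 3.4 below.

```python
def covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((xi - mean_x) * (yi - mean_y)
               for xi, yi in zip(xs, ys)) / (n - 1)

print(covariance([10, 12, 14, 8], [40, 48, 56, 32]))  # 26.666...
```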
Example 3.4

Find the covariance of the following sample data.

X    Y
10   40
12   48
14   56
8    32

Solution

Step 1: Calculate the means of X and Y.
Mean of X (x̄) = (10+12+14+8)/4 = 11
Mean of Y (ȳ) = (40+48+56+32)/4 = 44

Step 2: Compute the deviations from the means.

xᵢ − x̄           yᵢ − ȳ
10 − 11 = −1     40 − 44 = −4
12 − 11 = 1      48 − 44 = 4
14 − 11 = 3      56 − 44 = 12
8 − 11 = −3      32 − 44 = −12

Step 3: Substitute the values in the formula.
Cov(x,y) = [(−1)(−4) + (1)(4) + (3)(12) + (−3)(−12)] / (4 − 1) = 80/3 ≈ 26.67

Hence the covariance for the above data is approximately 26.67.
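As a quick check (a sketch, not from the original slides), NumPy's cov, which divides by N − 1 by default, reproduces this value:

```python
import numpy as np

x = np.array([10, 12, 14, 8])
y = np.array([40, 48, 56, 32])

# The off-diagonal entry of the 2x2 covariance matrix is Cov(X, Y).
print(np.cov(x, y)[0, 1])  # 26.666...
```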
Activity 3.3

Suppose we have collected data on the heights and weights of 5 people, as shown in the table below (the original slide's table is an image; the values here are reconstructed from the solution that follows):

Height (cm)   Weight (kg)
170           70
160           60
180           80
165           75
175           85

Calculate the covariance between height and weight.
Solution

• To calculate the covariance between height and weight, we first need the sample mean of each
variable:
• Step 1
• x̄ = (170 + 160 + 180 + 165 + 175)/5 = 170
• ȳ = (70 + 60 + 80 + 75 + 85)/5 = 74
• Step 2
• Next, we calculate the covariance using the formula:
• cov(X,Y) = Σ(xᵢ − x̄)(yᵢ − ȳ)/(n − 1)
• = [(0)(−4) + (−10)(−14) + (10)(6) + (−5)(1) + (5)(11)] / 4 = 250/4 = 62.5
• The covariance between height and weight is +62.5, indicating a positive relationship between the two variables.
Correlation
• Correlation is a step beyond covariance in that it quantifies the strength of the relationship
between two random variables. In simple terms, it is a unit-free measure of
how the variables change with respect to each other (a normalized
covariance).
• Unlike covariance, the correlation has an upper and lower cap on a range.
It can only take values between +1 and -1.
• A correlation of +1 indicates that the random variables have a perfect direct (positive) linear
relationship.
• On the other hand, a correlation of -1 indicates a perfect inverse relationship: an increase in
one variable is accompanied by a proportional decrease in the other.
• A correlation of 0 means there is no linear relationship between the two variables (zero
correlation does not by itself imply independence).
To obtain the correlation, we normalize the covariance by dividing it by the product of the standard
deviations of the two variables:

r = Cov(X,Y) / (σx · σy)

The main result of a correlation analysis is called the correlation coefficient.
Example

We have the same sample data as before:

X    Y
10   40
12   48
14   56
8    32

Step 1: Calculate the means of X and Y.
Mean of X (x̄) = (10+12+14+8)/4 = 11
Mean of Y (ȳ) = (40+48+56+32)/4 = 44

xᵢ − x̄           yᵢ − ȳ
10 − 11 = −1     40 − 44 = −4
12 − 11 = 1      48 − 44 = 4
14 − 11 = 3      56 − 44 = 12
8 − 11 = −3      32 − 44 = −12

Step 2: Substitute the values in the formula.
As computed in Example 3.4, Cov(x,y) = 80/3 ≈ 26.67.
Before substituting into the correlation formula, we have to find the standard deviations of x
and y. Let's take the data for X as given in the table, that is 10, 12, 14, 8, and find its
standard deviation.
• To find the standard deviation of X:
• Step 1: Find the mean of x, that is x̄ = (10+12+14+8)/4 = 11.
• Step 2: Find each deviation: subtract the mean from each score.
10 − 11 = −1
12 − 11 = 1
14 − 11 = 3
8 − 11 = −3
• Step 3: Square each deviation.
−1 → 1
1 → 1
3 → 9
−3 → 9
• Step 4: Sum the squares.
1 + 1 + 9 + 9 = 20
• Step 5: Find the variance.
Divide the sum of squares by n − 1, that is 4 − 1 = 3:
20/3 ≈ 6.67
• Step 6: Take the square root.
√6.67 ≈ 2.582
• Therefore the standard deviation of x ≈ 2.582.
• Finding the standard deviation of y by the same method gives ≈ 10.33.
• Correlation = 26.67 / (2.582 × 10.33) ≈ 1.00
• In fact, since every data pair here satisfies Y = 4X exactly, the correlation is exactly +1;
carrying rounded intermediates such as 26.6/(2.581 × 10.33) gives the slightly low figure 0.998.
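A short NumPy check (a sketch, not part of the original slides) confirms each of these quantities at once:

```python
import numpy as np

x = np.array([10, 12, 14, 8])
y = np.array([40, 48, 56, 32])

cov = np.cov(x, y)[0, 1]        # sample covariance: 80/3 ≈ 26.67
sx = np.std(x, ddof=1)          # sample std of x ≈ 2.582
sy = np.std(y, ddof=1)          # sample std of y ≈ 10.33

print(cov / (sx * sy))          # 1.0 -- exact, since y = 4x
print(np.corrcoef(x, y)[0, 1])  # 1.0 -- same result via corrcoef
```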
• So, now you can see the difference between covariance and correlation.
Some uses of Correlations
• Prediction
• If there is a relationship between two variables, we can make predictions about one from
another.
• Strengths of Correlations
• It allows the researcher to investigate naturally occurring variables that may be unethical or
impractical to test experimentally. For example, it would be unethical to conduct an
experiment on whether smoking causes lung cancer.
• It also allows the researcher to clearly and easily see if there is a relationship between variables. This can then be
displayed in a graphical form.
• Validity
• Concurrent validity (correlation between a new measure and an established measure).
• Reliability
• Test-retest reliability (are measures consistent over time?).
• Inter-rater reliability (are observers consistent?).
• Theory verification
• Predictive validity.
• There is no rule for determining what size of correlation is
considered strong, moderate or weak. The interpretation of the
coefficient depends on the topic of study.
Summary
• "Correlation is not causation" means that just because two variables are related
it does not necessarily mean that one causes the other.
• A correlational study identifies variables and looks for a relationship between them, whereas
an experiment tests the effect that an independent variable has upon a dependent variable.
• This means that the experiment can predict cause and effect (causation) but a
correlation can only predict a relationship, as another extraneous variable may
be involved that is not known about.