PROF DR WAQAR AHMED AWAN
PhD In Rehabilitation Sciences
RESEARCH METHODOLOGY &
BIOSTATISTICS
STATISTICS
 In the investigation of most clinical research questions, some form of
quantitative data will be collected.
 Initially these data exist in raw form, which means that they are nothing
more than a compilation of numbers representing empirical observations
from a group of individuals.
 For these data to be useful, they must be organized, summarized, and
analyzed, so that their meaning can be communicated.
 These are the functions of the branch of mathematics called statistics.
BIOSTATISTICS
 Is the application of statistics to a wide range of topics in biology.
 It encompasses the design of biological experiments, especially in
medicine, pharmacy, agriculture and fishery; the collection,
summarization, and analysis of data from those experiments; and
the interpretation of, and inference from, the results.
 A major branch is medical biostatistics, which is exclusively
concerned with medicine and health.
RESEARCH METHOD
 Is a systematic plan for conducting research.
 Researchers draw on a variety of both qualitative and quantitative
research methods, including experiments, survey research, participant
observation, and secondary data.
 Quantitative methods aim to classify features, count them, and create
statistical models to test hypotheses and explain observations.
 Qualitative methods aim for a complete, detailed description of
observations, including the context of events and circumstances.
DESCRIPTIVE STATISTICS
 Descriptive statistics are used to characterize the shape, central
tendency, and variability within a set of data, often with the intent to
describe a population.
 Measures of population characteristics are called parameters.
 A descriptive index computed from sample data is called a statistic.
Distribution
• The total set of scores for a particular variable
is called a distribution.
• This table presents a set of hypothetical scores
of 48 therapists on a test of attitudes toward
working with geriatric clients.
• For this example, a maximum score of 20
indicates an overall positive attitude; zero
indicates a strong negative bias.
• The total number of scores in the distribution is
given the symbol n.
• In this sample, n= 48.
Frequency Distribution
• A frequency distribution is a table of rank
ordered scores that shows the number of
times each value occurred, or its frequency
• The first two columns in Table 17.1B show
the frequency distribution for the attitude
scores.
• We can see the lowest and highest scores,
where the scores tend to cluster, and which
scores occurred most often.
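A frequency distribution can be tabulated with a few lines of Python. This is a minimal sketch: the function name `frequency_distribution` and the ten scores are hypothetical illustrations, not the 48 attitude scores from the table.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, frequency) pairs in rank order of score."""
    counts = Counter(scores)
    return sorted(counts.items())

# Hypothetical attitude scores (assumed for illustration only).
scores = [12, 15, 15, 9, 18, 15, 12, 20, 9, 15]

for score, freq in frequency_distribution(scores):
    print(f"{score:>5} {freq:>5}")
```

The sorted output makes the lowest score, the highest score, and the most frequent score easy to read off.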
Percentages
 Sometimes frequencies are more meaningfully expressed as percentages
of the total distribution.
 We can look at the percentage represented by each score in the distribution, or at the
cumulative percentage obtained by adding the percentage value for
each score to all percentages that fall below that score.
 For example, it may be useful to know that 18.8% of the sample had a
score of 15 or that 56.3% of the sample had scores of 15 and below.
 Percentages are useful for describing distributions because they are
independent of sample size.
For example
 Suppose we tested another sample of 150 therapists, and
found that 84 individuals obtained a score of 15 or below.
 Although more people in this second sample fall in this
score range than in the first sample, both groups represent about the same
percentage of their total sample (56%).
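The percentage arithmetic can be checked with a short sketch. It assumes that 27 of the 48 therapists scored 15 or below (inferred from the quoted 56.3%) and that the 84 therapists in the second sample are those scoring 15 or below; the function names are illustrative.

```python
def percentage(count, n):
    """Express a count as a percentage of the sample size n."""
    return 100 * count / n

def cumulative_percentages(freq_table):
    """freq_table: (score, frequency) pairs in ascending score order.
    Returns (score, percent, cumulative percent) for each score."""
    n = sum(freq for _, freq in freq_table)
    running = 0
    rows = []
    for score, freq in freq_table:
        running += freq
        rows.append((score, percentage(freq, n), percentage(running, n)))
    return rows

first_sample = percentage(27, 48)    # 27 is an assumption inferred from 56.3%
second_sample = percentage(84, 150)  # counts as given in the example
print(first_sample, second_sample)
```

The two samples differ greatly in size, yet the percentages are nearly identical, which is the sense in which percentages are independent of sample size.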
Grouped Frequency
Distributions
• With continuous measurements, researchers will often find that very few subjects, if any,
obtain the exact same score.
• Consider a hypothetical sample of 30 patients for whom
we obtained measurements of shoulder abduction
range of motion, shown in Table 17.2A.
• Obviously, creating a frequency distribution is a useless
process if almost every score has a frequency of one.
• In this situation, a grouped frequency distribution can be
constructed by grouping the scores into classes, or
intervals, where each class represents a unique range
of scores within the distribution.
• Frequencies are then assigned to each interval.
• The classes represent ranges of 10 degrees.
• The classes are mutually exclusive (no overlap) and
exhaustive within the range of scores obtained.
• The choice of the number of classes to be used and the
range within each class is an arbitrary decision.
• It depends on the overall range of scores, the number of
observations, and how much detail is relevant for the
intended audience.
• Although information is inherently lost in grouped data, this
approach is often the only feasible way to present
comprehensible data when large amounts of information are
involved.
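Grouping scores into 10-degree classes can be sketched as follows. The ten scores are the shoulder ROM values quoted in the stem-and-leaf discussion (only part of the 30-patient sample), and the function name and the integer-score assumption are illustrative.

```python
from collections import Counter

def grouped_frequencies(scores, width=10):
    """Count integer scores in classes [lower, lower + width - 1].
    Classes are mutually exclusive; empty classes are omitted."""
    counts = Counter((s // width) * width for s in scores)
    return [(lower, lower + width - 1, counts[lower]) for lower in sorted(counts)]

# Ten of the ROM scores quoted in the slides (a partial excerpt).
rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86]

for lower, upper, freq in grouped_frequencies(rom):
    print(f"{lower}-{upper}: {freq}")
```

Changing `width` shows how the choice of class range trades detail for compactness.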
Graphing Frequency Distributions
 Graphic representation of data often communicates information
about trends and general characteristics of distributions more
clearly than a tabular frequency distribution.
 The most common methods of graphing frequency distributions are
 Stem-and-leaf Plot,
 Histogram, And
 Frequency Polygon.
Stem-and-Leaf Plot
 The stem-and-leaf plot is a refined grouped frequency distribution that is
most useful for presenting the pattern of distribution of a continuous
variable.
 The pattern is derived by separating each score into two parts.
 The leaf consists of the last or rightmost single digit of each score,
 the stem consists of the remaining leftmost digits.
 A stem-and-leaf plot for the shoulder range of motion data.
 The scores have leftmost digits of 6 through 13. These values become the
stems.
 To read the stem-and-leaf plot, we look across each row, attaching each
single leaf digit to the stem.
 Therefore, the first row represents the scores 60 and 68; the second row,
72, 77 and 77; the third row, 80, 82, 84, 85 and 86; and so on.
 This display provides a concise summary of the data, while
maintaining the integrity of the original data.
 If we compare this plot with the grouped frequency distribution, it is
clear how much more information is provided by the stem-and-leaf
plot in a small space, and
 how it provides elements of both tabular and graphic displays.
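The separation of each score into stem and leaf can be sketched in Python. The scores are the first three rows quoted above; the function name is hypothetical and the sketch assumes non-negative integer scores.

```python
def stem_and_leaf(scores):
    """Map each stem (leading digits) to its sorted list of leaves (last digit)."""
    plot = {}
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        plot.setdefault(stem, []).append(leaf)
    return plot

# Scores from the first three rows of the plot described in the slides.
rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86]

for stem, leaves in stem_and_leaf(rom).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row keeps every original score recoverable, which is what distinguishes this display from a grouped frequency table.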
Histogram
• A histogram is a bar graph, composed of a
series of columns, each representing one
score or class interval. Figure 17.1A is a
histogram showing the distribution of attitude
scores given in Table 17.1.
• The frequency for each score is plotted on the
Y-axis (vertical), and the measured variable, in
this case attitude score, is on the X-axis
(horizontal).
• The bars are centered over the scores.
Frequency Polygon
• A frequency polygon is a line plot, where
each point on the line represents
frequency or percentage.
• When grouped data are used, the dots in
the graph are located at the midpoint of
each class interval to represent the
frequency in that class.
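For grouped data, the plotted points sit at class midpoints. A minimal sketch, assuming classes are given as (lower, upper, frequency) triples; the three classes shown come from ten of the quoted ROM scores and are only a partial illustration.

```python
def polygon_points(classes):
    """Return (midpoint, frequency) pairs for a frequency polygon."""
    return [((lower + upper) / 2, freq) for lower, upper, freq in classes]

# Classes built from ten of the quoted ROM scores (partial excerpt).
classes = [(60, 69, 2), (70, 79, 3), (80, 89, 5)]

for midpoint, freq in polygon_points(classes):
    print(midpoint, freq)
```

Joining these points with straight lines gives the polygon.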
Shapes Of Distribution
• Some distributions are symmetrical; that is,
each half is a mirror image of the other.
• Curves A and B in Figure 17.2 are
symmetrical.
• When all scores occur with equal frequency throughout the
distribution, the shape is described as uniform,
or rectangular, as shown in Curve A.
• Curve B represents a special case of the
symmetrical distribution called the normal
distribution.
• In statistical terminology, "normal" refers to a
specific type of bell-shaped distribution where
most of the scores fall in the middle of the
distribution.
• A skewed distribution is asymmetrical.
• The degree to which the distribution deviates from
symmetry is its skewness.
• A distribution is positively skewed, or skewed to the right,
when most of the scores cluster at the low end
and only a few scores at the high end
cause the tail of the curve to point toward the
right.
For Example
• If we were to plot a distribution for annual family income in the
United States, it would be positively skewed, because most
families have low to moderate incomes.
• When the curve "tails off" to the left, the distribution is negatively
skewed, or skewed to the left. We might see a negatively skewed
distribution if we plotted exam scores for an easy test, on which
relatively few students achieved a low score.
VARIABLES
 Variables: the events, characteristics, behaviors, or conditions that
researchers measure and study.
 A variable is either a result of some factor (dependent) or is itself
the factor (independent) that causes a change in another variable
e.g. treatment, range of motion, pain intensity etc.
Independent Variable
 Independent variables are the aspects (factors) of a study that a
practitioner can control or choose
 They are called independent variables because they do not depend
on other variables for change
 These variables are the presumed cause of the outcome of a study
 Treatment is an independent variable in rehabilitation sciences
Dependent Variables
 These are the factors of a study that will change as a result of change in
an independent variable
 These are the factors which a clinician measures or observes
 Range of motion and pain rating are common dependent variables in
physical therapy.
 e.g. Treatment of wrist drop will prevent overstretching of wrist extensor,
maintain ROM and maintain muscle strength etc
 A dependent variable must be defined operationally.
Controlled or Constant Variables
 These are the factors or conditions of a study that are kept
unchanged in an experiment.
 There can be more than one controlled variable in an experiment
 Age will be a constant factor if the participants belong to same age
group e.g. participants are 40 years old
Confounding Variables
 An extraneous variable in a study that correlates (directly or
inversely) with both the independent and dependent variables and distorts the results
 Age can be among confounding factors in rehabilitation sciences
while studying osteoarthritis
 Limited ROMs among diabetic patients
Types of Variables with Respect to
MEASUREMENT
• Variables can be classified with respect to measurement into
– Categorical Variable
– Numerical Variable
Categorical Variable
• A categorical variable is one for which the recorded observations
fall into a set of categories
• There is a distinct demarcation between the categories
• For example:
– Gender (male and female)
– Recovery from treatment (not recovered, partially recovered and
completely recovered)
• Categorical variables are often referred to as qualitative
variables
Numerical Variable
• A numerical variable is one for which the observations
are recorded as numerical values, such as age, height,
etc.
• A numerical variable has two further types: discrete
and continuous
• A numerical variable is often referred to as a quantitative
variable
Discrete Variable
• A variable that can take only a set of discrete
numerical values such as 1, 10, 15, 199, etc., but not every
possible value between two given numbers
• For example, the number of heart beats in a fixed time
period; the number of successful operations in a hospital;
the number of cases reported at a casualty
Continuous Variable
• A variable that can take every possible value
between two given numbers is termed a continuous variable
• Age, weight, length, etc. are a few examples of continuous
variables
• Data therefore take two broad forms: categorical variables and numerical variables
Exercise
Variable              | How it will be measured            | Variable Type | Variable Subtype
Gender                | Male & Female                      |               |
Marital Status        | Divorced, Widowed, Single, Married |               |
Blood Pressure        | In mmHg                            |               |
Hypertension          | Mild, Moderate, Severe             |               |
Age                   | In Years                           |               |
Age Class             | <40, 40-60, >60                    |               |
Weight                | In Kg                              |               |
Height                | In Inches                          |               |
Number of Children    | Frequency                          |               |
Number of school days | Frequency                          |               |
Depression            | Mild, Moderate & Severe            |               |
Anxiety               | Yes & No                           |               |
Quality of Life       | Bad, average, good                 |               |
Exercise (answers)
Variable              | How it will be measured            | Variable Type    | Variable Subtype
Gender                | Male & Female                      | Categorical Data | Dichotomous/binary
Marital Status        | Divorced, Widowed, Single, Married | Categorical Data | Nominal
Blood Pressure        | In mmHg                            | Numerical Data   | Continuous
Hypertension          | Mild, Moderate, Severe             | Categorical Data | Ordinal
Age                   | In Years                           | Numerical Data   | Continuous
Age Class             | <40, 40-60, >60                    | Categorical Data | Ordinal
Weight                | In Kg                              | Numerical Data   | Continuous
Height                | In Inches                          | Numerical Data   | Continuous
Number of Children    | Frequency                          | Numerical Data   | Discrete
Number of school days | Frequency                          | Numerical Data   | Discrete
Depression            | Mild, Moderate & Severe            | Categorical Data | Ordinal
Anxiety               | Yes & No                           | Categorical Data | Dichotomous/binary
Quality of Life       | Bad, average, good                 | Categorical Data | Ordinal
MEASURES OF CENTRAL
TENDENCY
 Although frequency distributions enable us to order data and
identify group patterns, they do not provide a practical quantitative
summary of a group's characteristics.
 Numerical indices are needed to describe the "typical" nature of the
data and to reflect different concepts of the "center" of a
distribution.
 These indices are called measures of central tendency, or averages
 The term average can denote three different measures of central
tendency:
 The mode,
 The median, and
 The mean.
Mean
• The mean is the sum of a set of scores divided by the
number of scores, n .
• This is the value most people refer to as the "average."
• The symbol used to represent the mean of a population is
the Greek letter mu (μ), and the mean of a sample is
represented by X̄ (read "X-bar").
• The bar above the X indicates that the value is an average
score.
• The formula for calculation of the sample mean from raw
data is
$\bar{X} = \frac{\sum X}{n}$
• This is read, "the mean equals the sum of X
divided by n," where X represents each
individual score in the distribution.
• For example, we can apply this formula to
the ROM scores shown in Table 17.2. In this
distribution of thirty scores, the sum of
scores is 2,848. Therefore, X̄ = 2,848/30 =
94.9.
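The calculation can be reproduced in a couple of lines. This sketch uses only the sum and n quoted above, because the thirty individual ROM scores are not listed in the text; the function name `mean` is illustrative.

```python
def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

# From the slide: the sum of thirty ROM scores is 2,848.
rom_mean = 2848 / 30
print(round(rom_mean, 1))

# The same formula applied to a short list of raw scores.
print(mean([4, 5, 6, 7, 8]))
```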
Median
• The median of a series of observations is the value above which
there are as many scores as below it.
• It divides a rank-ordered distribution into two equal halves.
• When a distribution contains an odd number of scores, such as
4, 5, 6, 7, 8, the middle score, 6, is the median.
• With an even number of scores, the midpoint between the two
middle scores is the median, so that for the series 4, 5, 6, 7, 8, 9,
the median lies halfway between 6 and 7. Therefore, the median
equals 6.5.
• For the distribution of attitude scores given in the table, with n = 48,
the median lies midway between the 24th and 25th rank-ordered scores.
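The odd and even cases can be written out directly. A minimal sketch using the two series from the slide; the function name is illustrative, and Python's standard library offers `statistics.median` for the same purpose.

```python
def median(scores):
    """Middle score of a rank-ordered list, or the midpoint of the two middle scores."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([4, 5, 6, 7, 8]))
print(median([4, 5, 6, 7, 8, 9]))
```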
Mode
• The mode is the score that occurs most frequently in a
distribution.
• It is most easily determined by inspection of a frequency
distribution.
• When class intervals are used, the mode is taken as
the midpoint of the interval with the largest frequency.
• When more than one score occurs with the highest
frequency, a distribution is considered bimodal (with two
modes) or multimodal (with more than two modes).
• Many distributions of continuous variables do not have a
mode.
• The mode has only limited application as a measure of
central tendency for continuous data, but can be useful in
the assessment of categorical variables.
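Finding the mode, including the bimodal and no-mode cases, can be sketched as follows. The convention that a distribution has no mode when every score occurs exactly once is an assumption of this sketch, and the function name is hypothetical.

```python
from collections import Counter

def modes(scores):
    """Return the most frequent score(s); an empty list if no score repeats."""
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []  # every score occurs once: treated here as having no mode
    return sorted(value for value, count in counts.items() if count == top)

print(modes([60, 68, 72, 77, 77, 80]))   # one mode
print(modes([1, 1, 2, 2, 3]))            # bimodal
print(modes([4, 5, 6]))                  # no repeated score
```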
Advantage Of Median
 The advantage of the median as a measure of central tendency is
that it is unaffected by the value of extreme scores.
 It is an index of average position in a distribution, so it is a useful
measure for describing skewed distributions.
 For instance, the average cost of a house is usually cited in terms
of the median, because the distribution tends to be skewed to the
right.
Comparing Measures of Central Tendency
 All three measures of central tendency can be applied to variables
on the interval or ratio scales, although the mean is most useful.
 For data on the nominal scale, only the mode is meaningful.
 If data are ordinal, both the median and mode can be applied.
 The mean is considered the most stable; that is, if we were to
repeatedly draw random samples from a population, the means of
those samples would fluctuate less than the mode or median.
• We can also consider the utility of the three measures of
central tendency for describing distributions of different
shapes.
• With uniform and normal distributions, any of the three
averages can be applied with validity.
• With skewed distributions, however, the mean is limited as
a descriptive measure because, unlike the median and
mode, it is affected by the quantitative value of every score
in a distribution and can be biased by extreme scores.
• For instance, in the previous example of ROM scores, if
the first subject obtained a score of 20 instead of 60,
the mean would decrease from 94.9 to 93.6. The
median and mode would be unaffected by this change.
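The effect of the extreme score on the mean follows from the quoted sum. This sketch uses the sum of 2,848 and n = 30 from the slides; the five-score list is only an excerpt of the quoted ROM scores, used to show that the median does not move.

```python
import statistics

total, n = 2848, 30                      # sum and count quoted in the slides
original_mean = total / n
altered_mean = (total - 60 + 20) / n     # first subject scores 20 instead of 60
print(round(original_mean, 1), round(altered_mean, 1))

# Excerpt of five quoted ROM scores (not the full sample of 30).
excerpt = [60, 68, 72, 77, 77]
altered_excerpt = [20, 68, 72, 77, 77]
print(statistics.median(excerpt), statistics.median(altered_excerpt))
```

The mean shifts because every score contributes its value to the sum, while the median depends only on rank order.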
• The curves in Figure illustrate how measures of
central tendency are affected by skewness.
• The median will typically fall between the mode
and the mean in a skewed curve, and the mean
will be pulled toward the tail.
• Because of these properties, the choice of which
index to report with skewed distributions
depends on what facet of information is
appropriate to the analysis.
• It is often reasonable to report all three values.
MEASURES OF VARIABILITY
 The shape and central tendency of a distribution are useful but incomplete
descriptors of a sample.
 If we were to describe these two distributions using measures of central
tendency only, they would appear identical; however, a careful glance
reveals that the scores for Group B are more widely scattered than those
for Group A.
 This difference in variability, or dispersion of scores, is an essential element
in data analysis.
 The description of a sample is not complete unless we can characterize the
differences that exist among the scores as well as the central tendency of
the data.
Range
• The simplest measure of variability is the
range, which is the difference between the
highest and lowest values in a distribution.
• For the test scores reported in the table, the
range for Group A is 88 - 78 = 10, and for
Group B, 98 - 65 = 33.
• These values suggest that the first group was
more homogeneous.
• Although the range is a relatively simple
statistical measure, its applicability is limited
because it is determined using only the two
extreme scores in the distribution.
• It reflects nothing about the dispersion of scores between the two
extremes.
• One aberrant extreme score can greatly increase the range, even
though the variability within the rest of the data set is unchanged.
• In addition, the range of scores tends to increase with larger
samples.
• Therefore, although it is easily computed, the range is usually
employed only as a rough descriptive measure.
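Because the range uses only the two extreme scores, it is simple to compute and easy to distort. In this sketch the extremes (78 and 88; 65 and 98) are the values quoted above, while the scores between them are hypothetical fillers.

```python
def score_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)

# Extremes are from the slide; the middle values are assumed for illustration.
group_a = [78, 80, 83, 85, 88]
group_b = [65, 72, 83, 90, 98]

print(score_range(group_a), score_range(group_b))

# One aberrant score changes the range although the other scores are unchanged.
print(score_range(group_a + [120]))
```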
Percentile
 Percentiles are used to describe a score's position within a
distribution.
 Percentiles divide data into 100 equal portions.
 A particular score is located in one of these portions, which
represents its position relative to all other scores.
 For example, if a student taking a college entrance examination scores
in the 92nd percentile (P92), that individual's score was higher than
92% of those who took the test.
 Percentiles are helpful for converting actual scores into
comparative scores or for providing a reference point for
interpreting a particular score.
 For instance, a child who scores in the 20th percentile for weight in his
age group can be evaluated relative to his peer group.
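A score's percentile position can be approximated as the percentage of scores falling below it. This is one of several conventions in use, and both the function name and the 100 test scores are hypothetical.

```python
def percentile_rank(scores, value):
    """Percentage of scores strictly below the given value."""
    below = sum(1 for score in scores if score < value)
    return 100 * below / len(scores)

# Hypothetical examination scores 1..100 (assumed for illustration).
exam_scores = list(range(1, 101))
print(percentile_rank(exam_scores, 93))
```

A student with a score of 93 in this hypothetical set scored higher than 92% of test takers, matching the P92 interpretation in the text.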
Quartiles
 Quartiles divide a distribution into four equal parts, or quarters.
 Therefore, three quartiles exist for any data set.
 Quartiles Q1, Q2, and Q3 correspond to percentiles at 25%, 50%,
and 75% of the distribution (P25, P50, P75).
 The score at the 50th percentile or Q2 is the median.
 The distance between the first and third quartiles, Q3 - Q1, is
called the interquartile range, which represents the boundaries of
the middle 50% of the distribution.
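Quartiles and the interquartile range can be obtained with Python's standard `statistics.quantiles`. The data are ten of the ROM scores quoted earlier (a partial excerpt), and the default interpolation method is only one of several conventions, so other software may report slightly different quartiles.

```python
import statistics

# Ten of the quoted ROM scores (partial excerpt of the sample).
rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86]

# n=4 returns the three cut points Q1, Q2, Q3 (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(rom, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Q2 equals the median, as stated above.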
Box Plot
• A box plot graph, also called a box-and-
whisker plot, (Figure) is a useful way to
demonstrate visually the spread of scores in
a distribution, including the median and
interquartile range.
• Box plots may be drawn with the
"whiskers" representing highest and lowest
scores.
• The whiskers may also be drawn to
represent the 90th and 10th percentiles.
VARIANCE
• Measures of range have limited application as indices of
variability because they are not influenced by every score
in a distribution and they are sensitive to extreme scores.
• Variance is the sum of the squared differences
between each data point and the mean, divided by the
number of data values (n for a population; n - 1 when it is estimated from a sample).
• Variance reflects the variation within a full set of scores.
• Variance is small if scores are close together and large if
they are spread out.
• It should also be objective so that we can compare
samples of different sizes and determine if one is more variable than
another.
• Obviously, samples with larger deviation scores will be
more variable.
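The definition of variance translates directly into code. This sketch uses a small hypothetical data set; the `sample` flag, which switches the divisor between n - 1 and n, is a design choice of the sketch rather than something specified in the slides.

```python
def variance(scores, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 (sample) or n."""
    n = len(scores)
    mean = sum(scores) / n
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (n - 1 if sample else n)

# Hypothetical scores (assumed for illustration): mean 5, sum of squares 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data, sample=False))
print(variance(data))
```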
Standard Deviation
• The limitation of variance as a descriptive measure of a sample's
variability is that it is calculated using the squares of the deviation
scores.
• It is generally not useful to describe sample variability in terms of squared
units
• Therefore, to bring the index back into the original units of measurement,
we take the positive square root of the variance.
• This value is called the standard deviation, symbolized by s.
• The standard deviation of sample data is
usually reported along with the mean so
that the data are characterized according
to both central tendency and variability.
• A mean may be expressed as X̄ = 83.63 ±
12.22, which tells us that the standard deviation, a kind of average of
the deviations on either side of the mean, is
12.22.
• An error bar graph shows these values for
both groups, illustrating the difference in
their variability.
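Reporting a mean with its standard deviation can be sketched with Python's standard `statistics` module. The data are hypothetical; `stdev` uses the sample (n - 1) divisor and `pstdev` the population (n) divisor.

```python
import statistics

# Hypothetical scores (assumed for illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
sample_sd = statistics.stdev(data)        # square root of the sample variance
population_sd = statistics.pstdev(data)   # square root of the population variance

summary = f"{mean:.2f} ± {sample_sd:.2f}"
print(summary)
```

The standard deviation is in the same units as the original scores, which is why it is reported next to the mean.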
Coefficient of Variation
• The coefficient of variation (CV) is another measure of variability that can
be used to describe data measured on the interval or ratio scale.
• It is the ratio of the standard deviation to the mean, expressed as a
percentage:
$CV = \frac{s}{\bar{X}} \times 100$
• There are two major advantages to this index.
• First, it is independent of units of measurement because units will
mathematically cancel out. Therefore, it is a practical statistic for comparing
distributions recorded in different units.
• Second, the coefficient of variation expresses the standard deviation as a
proportion of the mean, thereby accounting for differences in the
magnitude of the mean.
• The coefficient of variation is, therefore, a measure of relative variation,
most meaningful when comparing two distributions.
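The unit independence of the coefficient of variation can be illustrated with the same measurements expressed in two units. The heights are hypothetical, and the function name is an assumption of this sketch.

```python
import statistics

def coefficient_of_variation(scores):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * statistics.stdev(scores) / statistics.mean(scores)

# Hypothetical heights in centimetres, converted to inches.
heights_cm = [150, 160, 170, 180, 190]
heights_in = [h / 2.54 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_in = coefficient_of_variation(heights_in)
print(round(cv_cm, 2), round(cv_in, 2))
```

The standard deviations differ by the conversion factor, but the ratio to the mean is the same in both units.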
Example
 Consider a study of normal values of lumbar spine range of
motion, in which data were recorded in both degrees
and inches of excursion.
 The mean ranges for 20- to 29-year-olds were X̄ = 41.2 ± 9.6 degrees,
and X̄ = 3.7 ± 0.72 inches,
respectively.
 The absolute values of the standard deviations for
these two measurements suggest that the measure of
inches, using a tape measure, was much less
variable.
• However, because the means and units are
substantially different, we would
expect the standard deviations to
be different as well.
• By calculating the coefficient of
variation, we get a better idea of
the relative variation of these two
measurements:
$CV_{degrees} = \frac{9.6}{41.2} \times 100 = 23.3\%$
$CV_{inches} = \frac{0.72}{3.7} \times 100 = 19.5\%$
• The two measurements are therefore much closer in relative variability than their standard deviations alone suggest.
Thank You

More Related Content

PPTX
Basics of biostatistic
PPTX
lupes presentation epsf mansursadjhhjgfhf.pptx
PPTX
STATISTICS.pptx
PPTX
Biostatistics.pptx
PPT
Descriptive Statistics and Data Visualization
PPT
Business statistics (Basics)
PPT
businessstatistics-stat10022-200411201812.ppt
PPT
Bio statistics 1
Basics of biostatistic
lupes presentation epsf mansursadjhhjgfhf.pptx
STATISTICS.pptx
Biostatistics.pptx
Descriptive Statistics and Data Visualization
Business statistics (Basics)
businessstatistics-stat10022-200411201812.ppt
Bio statistics 1

Similar to RMBS M1 Lecture 1a.pptx (20)

PPTX
Introduction to statistics in health care
PPTX
Biostatistics in Research Methodoloyg Presentation.pptx
PPTX
Quatitative Data Analysis
PPTX
Biostatistics
PPT
Biostatistics basics-biostatistics4734
PPT
Biostatistics basics-biostatistics4734
PDF
2. Descriptive Statistics.pdf
PPT
Tabular _ Graphical Presentation of data(Sep2020).ppt
PPT
Tabular & Graphical Presentation of data(2019-2020).ppt
PPT
20- Tabular & Graphical Presentation of data(UG2017-18).ppt
PPT
20- Tabular & Graphical Presentation of data(UG2017-18).ppt
PPTX
CO1_Session_6 Statistical Angalysis.pptx
PPTX
Biostats in ortho
PPTX
03.data presentation(2015) 2
PPTX
Biostatistics ppt
PPTX
biostatistics-210618023858.pptx bbbbbbbbbb
PPTX
PARAMETRIC TESTS.pptx
PPTX
RVO-STATISTICS_Statistics_Introduction To Statistics IBBI.pptx
PPTX
Intro to statistics
PDF
1Basic biostatistics.pdf
Introduction to statistics in health care
Biostatistics in Research Methodoloyg Presentation.pptx
Quatitative Data Analysis
Biostatistics
Biostatistics basics-biostatistics4734
Biostatistics basics-biostatistics4734
2. Descriptive Statistics.pdf
Tabular _ Graphical Presentation of data(Sep2020).ppt
Tabular & Graphical Presentation of data(2019-2020).ppt
20- Tabular & Graphical Presentation of data(UG2017-18).ppt
20- Tabular & Graphical Presentation of data(UG2017-18).ppt
CO1_Session_6 Statistical Angalysis.pptx
Biostats in ortho
03.data presentation(2015) 2
Biostatistics ppt
biostatistics-210618023858.pptx bbbbbbbbbb
PARAMETRIC TESTS.pptx
RVO-STATISTICS_Statistics_Introduction To Statistics IBBI.pptx
Intro to statistics
1Basic biostatistics.pdf

Recently uploaded (20)

PDF
LDMMIA Reiki Yoga Finals Review Spring Summer
PDF
Complications of Minimal Access-Surgery.pdf
PPTX
202450812 BayCHI UCSC-SV 20250812 v17.pptx
PDF
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
PPTX
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
PDF
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 1)
PDF
Paper A Mock Exam 9_ Attempt review.pdf.
PDF
Empowerment Technology for Senior High School Guide
PDF
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
PPTX
B.Sc. DS Unit 2 Software Engineering.pptx
PPTX
TNA_Presentation-1-Final(SAVE)) (1).pptx
PPTX
Chinmaya Tiranga Azadi Quiz (Class 7-8 )
PDF
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
PDF
advance database management system book.pdf
PDF
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
PPTX
20th Century Theater, Methods, History.pptx
PDF
IGGE1 Understanding the Self1234567891011
PPTX
Onco Emergencies - Spinal cord compression Superior vena cava syndrome Febr...
PDF
What if we spent less time fighting change, and more time building what’s rig...
PDF
FORM 1 BIOLOGY MIND MAPS and their schemes
LDMMIA Reiki Yoga Finals Review Spring Summer
Complications of Minimal Access-Surgery.pdf
202450812 BayCHI UCSC-SV 20250812 v17.pptx
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 1)
Paper A Mock Exam 9_ Attempt review.pdf.
Empowerment Technology for Senior High School Guide
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
B.Sc. DS Unit 2 Software Engineering.pptx
TNA_Presentation-1-Final(SAVE)) (1).pptx
Chinmaya Tiranga Azadi Quiz (Class 7-8 )
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
advance database management system book.pdf
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
20th Century Theater, Methods, History.pptx
IGGE1 Understanding the Self1234567891011
Onco Emergencies - Spinal cord compression Superior vena cava syndrome Febr...
What if we spent less time fighting change, and more time building what’s rig...
FORM 1 BIOLOGY MIND MAPS and their schemes

RMBS M1 Lecture 1a.pptx

  • 1. PROF DR WAQAR AHMED AWAN PhD In Rehabilitation Sciences RESEARCH METHODOLOGY & BIOSTATISTICS
  • 2. STATISTICS  In the investigation of most clinical research questions, some form of quantitative data will be collected.  Initially these data exist in raw form, which means that they are nothing more than a compilation of numbers representing empirical observations from a group of individuals.  For these data to be useful, must be organized, summarized, and analyzed, so that their meaning can be communicated.  These are the functions of the branch of mathematics called statistics.
  • 3. BIOSTATISTICS  Is the application of statistics to a wide range of topics in biology.  It encompasses the design of biological experiments, especially in medicine, pharmacy, agriculture and fishery; the collection, summarization, and analysis of data from those experiments; and the interpretation of, and inference from, the results.  A major branch is medical biostatistics, which is exclusively concerned with medicine and health.
  • 4. RESEARCH METHOD  Is a systematic plan for conducting research.  Researcher draw on a variety of both qualitative and quantitative research methods, including experiments, survey research, participant observation, and secondary data.  Quantitative methods aim to classify features, count them, and create statistical models to test hypotheses and explain observations.  Qualitative methods aim for a complete, detailed description of observations, including the context of events and circumstances.
  • 5. DESCRIPTIVE STATISTICS  Descriptive statistics are used to characterize the shape, central tendency, and variability within a set of data, often with the intent to describe a population.  Measures of population characteristics are called parameters.  A descriptive index computed from sample data is called a statistic.
  • 6. Distribution • The total set of scores for a particular variable is called a distribution. • This table presents a set of hypothetical scores of 48 therapists on a test of attitudes toward working with geriatric clients. • For this example, a maximum score of 20 indicates an overall positive attitude; zero indicates a strong negative bias. • The total number of scores in the distribution is given the symbol n. • In this sample, n= 48.
  • 7. Frequency Distribution • A frequency distribution is a table of rank ordered scores that shows the number of times each value occurred, or its frequency • The first two columns in Table 17.1B show the frequency distribution for the attitude scores. • We can see the lowest and highest scores, where the scores tend to cluster, and which scores occurred most often.
  • 8. Percentages  Sometimes frequencies are more meaningfully expressed as percentages of the total distribution.  The percentage represented by each score in the distribution, or at the cumulative percentage obtained by adding the percentage value for each score to all percentages that fall below that score.  For example, it may be useful to know that 18.8% of the sample had a score of 15 or that 56.3% of the sample had scores of 15 and below.  Percentages are useful for describing distributions because they are independent of sample size.
  • 9. For example  suppose we tested another sample with 150 therapists, and found that 84 individuals obtained a score of 15.  Although there are more people in this second sample with this score than in the first sample, they both represent the same percentage of the total sample (56%).
  • 10. Grouped Frequency Distributions • If researchers will find that very few subjects, if any, obtain the exact same score. • Consider a hypothetical sample of 30 patients for whom we obtained measurements of shoulder abduction range of motion, shown in Table 17.2A. • Obviously, creating a frequency distribution is a useless process if almost every score has a frequency of one. • In this situation, a grouped frequency distribution can be constructed by grouping the scores into classes, or intervals, where each class represents a unique range of scores within the distribution. • Frequencies are then assigned to each interval.
  • 11. • The classes represent ranges of 10 degrees. • The classes are mutually exclusive (no overlap) and exhaustive within the range of scores obtained. • The choice of the number of classes to be used and the range within each class is an arbitrary decision. • It depends on the overall range of scores, the number of observations, and how much detail is relevant for the intended audience. • Although information is inherently lost in grouped data, this approach is often the only feasible way to present comprehensible data when large amounts of information are
  • 12. Graphing Frequency Distributions • Graphic representation of data often communicates information about trends and general characteristics of distributions more clearly than a tabular frequency distribution. • The most common methods of graphing frequency distributions are the stem-and-leaf plot, the histogram, and the frequency polygon.
  • 13. Stem-and-Leaf Plot • The stem-and-leaf plot is a refined grouped frequency distribution that is most useful for presenting the pattern of distribution of a continuous variable. • The pattern is derived by separating each score into two parts. • The leaf consists of the last or rightmost single digit of each score; the stem consists of the remaining leftmost digits. • Consider a stem-and-leaf plot for the shoulder range of motion data. • The scores have leftmost digits of 6 through 13. These values become the
  • 14.  To read the stem-and-leaf plot, we look across each row, attaching each single leaf digit to the stem.  Therefore, the first row represents the scores 60 and 68; the second row, 72, 77 and 77; the third row, 80, 82, 84, 85 and 86; and so on.  This display provides a concise summary of the data, while maintaining the integrity of the original data.  If we compare this plot with the grouped frequency distribution, it is clear how much more information is provided by the stem-and-leaf plot in a small space, and  how it provides elements of both tabular and graphic displays.
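
The splitting of each score into stem and leaf can be sketched as follows. This is an illustrative sketch, assuming non-negative integer scores; the function name `stem_and_leaf` is hypothetical, and only the first three rows quoted on the slide are used because the remaining ROM scores are not reproduced in this text.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (leading digits) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)   # e.g. 68 -> stem 6, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

# Scores from the first three rows described on the slide.
rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86]
for stem, leaves in stem_and_leaf(rom).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Reading each printed row back (stem followed by each leaf) recovers the original scores, which is why the plot keeps the integrity of the raw data.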
  • 15. Histogram • A histogram is a bar graph, composed of a series of columns, each representing one score or class interval. Figure 17.1A is a histogram showing the distribution of attitude scores given in Table 17.1. • The frequency for each score is plotted on the Y-axis (vertical), and the measured variable, in this case attitude score, is on the X-axis (horizontal). • The bars are centered over the scores.
  • 16. Frequency Polygon • A frequency polygon is a line plot, where each point on the line represents frequency or percentage. • When grouped data are used, the dots in the graph are located at the midpoint of each class interval to represent the frequency in that class.
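
A grouped frequency distribution, together with the class midpoints at which a frequency polygon places its points, might be computed as in the sketch below. It assumes integer scores and 10-degree classes starting at 60; the function name `grouped_frequencies` and the short score list are illustrative, not the full data set from Table 17.2.

```python
def grouped_frequencies(scores, width, start):
    """Count scores in mutually exclusive classes [low, low + width - 1].

    Returns {(low, high): (frequency, midpoint)} for integer scores.
    """
    classes = {}
    low = start
    while low <= max(scores):
        high = low + width - 1
        count = sum(low <= s <= high for s in scores)
        midpoint = (low + high) / 2       # x-position of the polygon point
        classes[(low, high)] = (count, midpoint)
        low += width
    return classes

rom = [60, 68, 72, 77, 77, 80, 82, 84, 85, 86]   # partial, illustrative data
for (low, high), (count, midpoint) in grouped_frequencies(rom, 10, 60).items():
    print(f"{low}-{high}: frequency {count}, midpoint {midpoint}")
```

Because every score falls in exactly one class, the frequencies always sum to the sample size.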
  • 17. Shapes Of Distribution • Some distributions are symmetrical; that is, each half is a mirror image of the other. • Curves A and B in Figure 17.2 are symmetrical. • When scores are equal throughout the distribution, the shape is described as uniform, or rectangular, as shown in Curve A. • Curve B represents a special case of the symmetrical distribution called the normal distribution. • In statistical terminology, "normal" refers to a specific type of bell-shaped distribution where most of the scores fall in the middle of the
  • 18. • A skewed distribution is asymmetrical. • The degree to which the distribution deviates from symmetry is its skewness. • A distribution is positively skewed, or skewed to the right, when most of the scores cluster at the low end and only a few scores at the high end cause the tail of the curve to point toward the right.
  • 19. For Example • If we were to plot a distribution for annual family income in the United States, it would be positively skewed, because most families have low to moderate incomes. • When the curve "tails off" to the left, the distribution is negatively skewed, or skewed to the left. • We might see a negatively skewed distribution if we plotted exam scores for an easy test, on which relatively few students achieved a low score.
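
The direction of skew can also be checked numerically. The slides do not give a formula, so the sketch below uses one common definition as an assumption: the moment coefficient of skewness, the third central moment divided by the second central moment raised to the power 1.5. All three data sets are hypothetical.

```python
from statistics import fmean

def skewness(scores):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (one common definition)."""
    mean = fmean(scores)
    n = len(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n   # third central moment
    return m3 / m2 ** 1.5

right_tail = [1, 2, 2, 3, 3, 3, 4, 20]          # a few high scores
left_tail = [21 - x for x in right_tail]        # mirror image: a few low scores
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right tail", right_tail),
                   ("left tail", left_tail),
                   ("symmetric", symmetric)]:
    print(name, skewness(data))
```

A positive value corresponds to a tail pointing right, a negative value to a tail pointing left, and a value of zero to a symmetrical distribution.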
  • 20. VARIABLES • Variables: the events, characteristics, behaviors, or conditions that researchers measure and study. • A variable is either the result of some factor (dependent) or is itself the factor (independent) that causes a change in another variable, e.g. treatment, range of motion, pain intensity.
  • 21. Independent Variable • Independent variables are the aspects (factors) of a study that a practitioner can control or choose. • They are called independent variables because they do not depend on other variables for change. • These variables are the presumed cause of the outcome of a study. • Treatment is an independent variable in rehabilitation sciences.
  • 22. Dependent Variables • These are the factors of a study that will change as a result of a change in an independent variable. • These are the factors that a clinician measures or observes. • Range of motion and pain rating are common dependent variables in physical therapy. • For example, treatment of wrist drop may prevent overstretching of the wrist extensors and maintain ROM and muscle strength; these outcomes are the dependent variables. • A dependent variable must be defined operationally.
  • 23. Controlled or Constant Variables • These are the factors or conditions of a study that are kept unchanged in an experiment. • There can be more than one controlled variable in an experiment. • Age will be a constant factor if the participants belong to the same age group, e.g. all participants are 40 years old.
  • 24. Confounding Variables • An extraneous variable in a study that correlates (directly or inversely) with the independent variable, also influences the dependent variable, and so distorts the results. • Age can be a confounding factor in rehabilitation sciences when studying osteoarthritis. • Another example is limited ROM among diabetic patients.
  • 25. Types of Variables with Respect to MEASUREMENT • Variables can be classified with respect to measurement into – Categorical Variable – Numerical Variable
  • 26. Types of Variables with Respect to Measurement: Categorical Variable • A categorical variable is one for which the observations recorded result in a set of categories. • There is a distinct demarcation between the categories. • For example: – Gender (male and female) – Recovery from treatment (not recovered, partially recovered and completely recovered) • Categorical variables are often referred to as qualitative variables.
  • 27. Types of Variables with Respect to Measurement: Numerical Variable • A numerical variable is one for which the observations are recorded as numerical values, such as age, height, etc. • A numerical variable has two further types, i.e., discrete and continuous. • A numerical variable is often referred to as a quantitative variable.
  • 28. Types of Variables with Respect to Measurement: Numerical Variable • Discrete Variable – A variable that is capable of taking a set of discrete numerical values such as 10, 15, 1, 199, etc., but not every possible value between two given numbers. – For example, the number of heart beats in a fixed time period, the number of successful operations in a hospital, or the number of cases reported at a casualty department.
  • 29. Types of Variables with Respect to Measurement: Numerical Variable • Continuous Variable • A variable that is capable of taking every possible value between two given numbers is termed a continuous variable. • Age, weight, length, etc. are a few examples of continuous variables.
  • 30. Data take many different forms, grouped here as categorical variables and numerical variables.
  • 31. Exercise

    | Variable | How it will be measured | Variable Type | Variable Subtype |
    | --- | --- | --- | --- |
    | Gender | Male & Female | | |
    | Marital Status | Divorced, Widowed, Single, Married | | |
    | Blood Pressure | In mmHg | | |
    | Hypertension | Mild, Moderate, Severe | | |
    | Age | In Years | | |
    | Age Class | <40, 40-60, >60 | | |
    | Weight | In Kg | | |
    | Height | In Inches | | |
    | Number of Children | Frequency | | |
    | Number of school days | Frequency | | |
    | Depression | Mild, Moderate & Severe | | |
    | Anxiety | Yes & No | | |
    | Quality of Life | Bad, average, good | | |
  • 32. Exercise

    | Variable | How it will be measured | Variable Type | Variable Subtype |
    | --- | --- | --- | --- |
    | Gender | Male & Female | Categorical Data | Dichotomous/binary |
    | Marital Status | Divorced, Widowed, Single, Married | Categorical Data | Nominal |
    | Blood Pressure | In mmHg | Numerical Data | Continuous |
    | Hypertension | Mild, Moderate, Severe | Categorical Data | Ordinal |
    | Age | In Years | Numerical Data | Continuous |
    | Age Class | <40, 40-60, >60 | Categorical Data | Ordinal |
    | Weight | In Kg | Numerical Data | Continuous |
    | Height | In Inches | Numerical Data | Continuous |
    | Number of Children | Frequency | Numerical Data | Discrete |
    | Number of school days | Frequency | Numerical Data | Discrete |
    | Depression | Mild, Moderate & Severe | Categorical Data | Ordinal |
    | Anxiety | Yes & No | Categorical Data | Dichotomous/binary |
    | Quality of Life | Bad, average, good | Categorical Data | Ordinal |
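
The classification scheme from the preceding slides (categorical versus numerical, each with its own subtypes) can be written down as a small data structure. This is only a sketch: the names `VarType`, `SUBTYPES` and `classify` are hypothetical, and the four entries shown are examples rather than a full answer key.

```python
from enum import Enum

class VarType(Enum):
    CATEGORICAL = "categorical"
    NUMERICAL = "numerical"

# Subtypes allowed under each type, following the vocabulary of the slides.
SUBTYPES = {
    VarType.CATEGORICAL: {"dichotomous", "nominal", "ordinal"},
    VarType.NUMERICAL: {"discrete", "continuous"},
}

def classify(var_type, subtype):
    """Return (type, subtype) if the pairing is valid, else raise ValueError."""
    if subtype not in SUBTYPES[var_type]:
        raise ValueError(f"{subtype!r} is not a subtype of {var_type.value}")
    return (var_type, subtype)

answers = {
    "Gender": classify(VarType.CATEGORICAL, "dichotomous"),
    "Hypertension": classify(VarType.CATEGORICAL, "ordinal"),
    "Weight": classify(VarType.NUMERICAL, "continuous"),
    "Number of Children": classify(VarType.NUMERICAL, "discrete"),
}
for variable, (var_type, subtype) in answers.items():
    print(f"{variable}: {var_type.value}, {subtype}")
```

Pairing "discrete" with the categorical type raises an error here, reflecting the earlier definition of discrete variables as a kind of numerical variable.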
  • 33. MEASURES OF CENTRAL TENDENCY • Although frequency distributions enable us to order data and identify group patterns, they do not provide a practical quantitative summary of a group's characteristics. • Numerical indices are needed to describe the "typical" nature of the data and to reflect different concepts of the "center" of a distribution. • These indices are called measures of central tendency, or averages. • The term average can denote three different measures of central tendency: the mode, the median, and the mean.
  • 34. Mean • The mean is the sum of a set of scores divided by the number of scores, n. • This is the value most people refer to as the "average." • The symbol used to represent the mean of a population is the Greek letter mu (μ), and the mean of a sample is represented by X̄. • The bar above the X indicates that the value is an average score. • The formula for calculation of the sample mean from raw data is X̄ = ΣX / n.
  • 35. • This is read, "the mean equals the sum of X divided by n," where X represents each individual score in the distribution. • For example, we can apply this formula to the ROM scores shown in Table 17.2. In this distribution of thirty scores, the sum of scores is 2,848. Therefore, X̄ = 2,848/30 = 94.9.
  • 36. Median • The median of a series of observations is that value above which there are as many scores as below it. • It divides a rank-ordered distribution into two equal halves. • When a distribution contains an odd number of scores, such as 4, 5, 6, 7, 8, the middle score, 6, is the median. • With an even number of scores, the midpoint between the two middle scores is the median, so that for the series 4, 5, 6, 7, 8, 9, the median lies halfway between 6 and 7. Therefore, the median equals 6.5. • For the distribution of attitude scores given in Table, with n = 48,
  • 37. Mode • The mode is the score that occurs most frequently in a distribution. • It is most easily determined by inspection of a frequency distribution. • When class intervals are used, the mode is taken as the midpoint of the interval with the largest frequency. • When more than one score occurs with the highest frequency, a distribution is considered bimodal (with two modes) or multimodal (with more than two modes). • Many distributions of continuous variables do not have a mode. • The mode has only limited application as a measure of central tendency for continuous data, but can be useful in the assessment of categorical variables.
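
The three averages can be reproduced with Python's standard `statistics` module. The median examples use the series quoted on the slides and the mean uses the stated sum and n; the bimodal list is hypothetical, added only to illustrate two modes.

```python
from statistics import median, multimode

odd = [4, 5, 6, 7, 8]
even = [4, 5, 6, 7, 8, 9]

median_odd = median(odd)      # the middle score
median_even = median(even)    # halfway between the two middle scores
print(median_odd, median_even)

# Mean of the ROM scores from their sum and n; the 30 raw scores
# are not listed in this text, so only the totals are used.
rom_mean = 2848 / 30
print(round(rom_mean, 1))

# multimode returns every score tied for the highest frequency.
bimodal = [1, 2, 2, 3, 3, 4]  # hypothetical data
print(multimode(bimodal))
```

`multimode` is used rather than `mode` because it reports all tied scores, which matches the slide's description of bimodal and multimodal distributions.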
  • 38. Advantage Of Median • The advantage of the median as a measure of central tendency is that it is unaffected by the value of extreme scores. • It is an index of average position in a distribution, so it is a useful measure for describing skewed distributions. • For instance, the average cost of a house is usually cited in terms of the median, because the distribution tends to be skewed to the right.
  • 39. Comparing Measures of Central Tendency • All three measures of central tendency can be applied to variables on the interval or ratio scales, although the mean is most useful. • For data on the nominal scale, only the mode is meaningful. • If data are ordinal, both the median and mode can be applied. • Of the three, the mean is considered the most stable; that is, if we were to repeatedly draw random samples from a population, the means of those samples would fluctuate less than the mode or median.
  • 40. • We can also consider the utility of the three measures of central tendency for describing distributions of different shapes. • With uniform and normal distributions, any of the three averages can be applied with validity. • With skewed distributions, however, the mean is limited as a descriptive measure because, unlike the median and mode, it is affected by the quantitative value of every score in a distribution and can be biased by extreme scores. • For instance, in the previous example of ROM scores, if the first subject obtained a score of 20 instead of 60, the mean would decrease from 94.9 to 93.6. The median and mode would be unaffected by this change.
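
The sensitivity of the mean to a single extreme score can be seen in a short sketch. The seven scores below are hypothetical, not the 30 values from Table 17.2; the last line reproduces the slide's own figure from the stated sum.

```python
from statistics import mean, median

# Hypothetical ROM scores.
scores = [60, 72, 77, 80, 85, 90, 95]
altered = [20] + scores[1:]   # first subject scores 20 instead of 60

print(mean(scores), median(scores))
print(mean(altered), median(altered))

# The slide's figures follow from the sum: lowering one score by 40
# lowers the sum from 2,848 to 2,808 while n stays at 30.
altered_rom_mean = (2848 - 40) / 30
print(round(altered_rom_mean, 1))
```

The mean drops while the median stays where it was, because the median depends only on the rank order of the middle score.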
  • 41. • The curves in Figure illustrate how measures of central tendency are affected by skewness. • The median will typically fall between the mode and the mean in a skewed curve, and the mean will be pulled toward the tail. • Because of these properties, the choice of which index to report with skewed distributions depends on what facet of information is appropriate to the analysis. • It is often reasonable to report all three values, to
  • 42. MEASURES OF VARIABILITY • The shape and central tendency of a distribution are useful but incomplete descriptors of a sample. • Consider two sets of test scores, Group A and Group B. If we were to describe these two distributions using measures of central tendency only, they would appear identical; however, a careful glance reveals that the scores for Group B are more widely scattered than those for Group A. • This difference in variability, or dispersion of scores, is an essential element in data analysis. • The description of a sample is not complete unless we can characterize the differences that exist among the scores as well as the central tendency of the data.
  • 43. Range • The simplest measure of variability is the range, which is the difference between the highest and lowest values in a distribution. • For the test scores reported in Table, the range for Group A is 88 - 78 = 10, and for Group B, 98 - 65 = 33. • These values suggest that the first group was more homogeneous. • Although the range is a relatively simple statistical measure, its applicability is limited because it is determined using only the two extreme scores in the distribution.
  • 44. • It reflects nothing about the dispersion of scores between the two extremes. • One aberrant extreme score can greatly increase the range, even though the variability within the rest of the data set is unchanged. • In addition, the range of scores tends to increase with larger samples. • Therefore, although it is easily computed, the range is usually employed only as a rough descriptive measure, and is typically
  • 45. Percentile  Percentiles are used to describe a score's position within a distribution.  Percentiles divide data into 100 equal portions.  A particular score is located in one of these portions, which represents its position relative to all other scores.  For example, if a student taking a college entrance examination scores in the 92nd percentile (P92), that individual's score was higher than 92% of those who took the test.  Percentiles are helpful for converting actual scores into comparative scores or for providing a reference point for interpreting a particular score.  For instance, a child who scores in the 20th percentile for weight in his age group can be evaluated relative to his peer group, rather than
  • 46. Quartiles  Quartiles divide a distribution into four equal parts, or quarters.  Therefore, three quartiles exist for any data set.  Quartiles Q1, Q2, and Q3 correspond to percentiles at 25%, 50%, and 75% of the distribution (P25, P50, P75).  The score at the 50th percentile or Q2 is the median.  The distance between the first and third quartiles, Q3 - Q1, is called the interquartile range, which represents the boundaries of the middle 50% of the distribution.
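
Quartiles and the interquartile range can be computed with `statistics.quantiles`. This is a sketch on hypothetical scores; note that several conventions exist for locating quartiles, so other software may return slightly different cut points for the same data. The second data set illustrates the earlier remark that one aberrant score inflates the range while the middle 50% is unchanged.

```python
from statistics import median, quantiles

def iqr(scores):
    """Interquartile range, Q3 - Q1."""
    q1, _, q3 = quantiles(scores, n=4)
    return q3 - q1

def value_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)

scores = [1, 2, 3, 4, 5, 6, 7]           # hypothetical
with_outlier = [1, 2, 3, 4, 5, 6, 70]    # one aberrant extreme score

q1, q2, q3 = quantiles(scores, n=4)      # P25, P50, P75
print(q1, q2, q3)
print(value_range(scores), iqr(scores))
print(value_range(with_outlier), iqr(with_outlier))
```

Q2 coincides with the median, as stated on the slide.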
  • 47. Box Plot • A box plot graph, also called a box-and-whisker plot (Figure), is a useful way to demonstrate visually the spread of scores in a distribution, including the median and interquartile range. • Box plots may be drawn with the "whiskers" representing highest and lowest scores. • The whiskers may also be drawn to represent the 90th and 10th percentiles, and
  • 48. VARIANCE • Measures of range have limited application as indices of variability because they are not influenced by every score in a distribution and they are sensitive to extreme scores. • Variance is the sum of the squared differences between each data point and the mean, divided by the number of data values (for sample data, the divisor is usually n - 1). • Variance reflects the variation within a full set of scores. • Variance is small if scores are close together and large if they are spread out. • An index of variability should also be objective, so that we can compare samples of different sizes and determine if one is more variable than another. • Obviously, samples with larger deviation scores will be
  • 49. Standard Deviation • The limitation of variance as a descriptive measure of a sample's variability is that it is calculated using the squares of the deviation scores. • It is generally not useful to describe sample variability in terms of squared units. • Therefore, to bring the index back into the original units of measurement, we take the positive square root of the variance. • This value is called the standard deviation, symbolized by s.
  • 50. • The standard deviation of sample data is usually reported along with the mean so that the data are characterized according to both central tendency and variability. • A mean may be expressed as X̄ = 83.63 ± 12.22, which tells us that the average of the deviations on either side of the mean is 12.22. • An error bar graph shows these values for both groups, illustrating the difference in
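
Variance and standard deviation can be checked with the standard `statistics` module. The scores are hypothetical. Two divisors are shown because conventions differ: `pvariance` and `pstdev` divide by n, while `variance` and `stdev` divide by n - 1, the usual choice when the data are a sample.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical; the mean is 5

# Divisor n: sum of squared deviations divided by the number of values.
pop_var = pvariance(scores)
pop_sd = pstdev(scores)
print(pop_var, pop_sd)

# Divisor n - 1: the usual convention for sample data.
sample_var = variance(scores)
sample_sd = stdev(scores)
print(sample_var, sample_sd)
```

In both cases the standard deviation is the positive square root of the variance, which returns the index to the original units of measurement.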
  • 51. Coefficient of Variation • The coefficient of variation (CV) is another measure of variability that can be used to describe data measured on the interval or ratio scale. • It is the ratio of the standard deviation to the mean, expressed as a percentage: CV = (s / X̄) × 100. • There are two major advantages to this index. • First, it is independent of units of measurement because units will mathematically cancel out. Therefore, it is a practical statistic for comparing distributions recorded in different units. • Second, the coefficient of variation expresses the standard deviation as a proportion of the mean, thereby accounting for differences in the magnitude of the mean. • The coefficient of variation is, therefore, a measure of relative variation, most meaningful when comparing two distributions.
  • 52. Example • Consider a study of normal values of lumbar spine range of motion, in which data were recorded in both degrees and inches of excursion. • The mean ranges for 20- to 29-year-olds were X̄ = 41.2 ± 9.6 degrees and X̄ = 3.7 ± 0.72 inches, respectively. • The absolute values of the standard deviations for these two measurements suggest that the measure of inches, using a tape measure, was much less
  • 53. • Because the means and units are substantially different, we would expect the standard deviations to be different as well. • By calculating the coefficient of variation, we get a better idea of the relative variation of these two measurements: CV = (9.6 / 41.2) × 100 = 23.3% for degrees, and CV = (0.72 / 3.7) × 100 = 19.5% for inches.
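
The coefficient of variation for the lumbar spine example can be verified with a short sketch. The function name `coefficient_of_variation` is hypothetical; the means and standard deviations are the ones quoted on the slide.

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, expressed as a percentage."""
    return 100 * sd / mean

cv_degrees = coefficient_of_variation(9.6, 41.2)   # degrees of excursion
cv_inches = coefficient_of_variation(0.72, 3.7)    # inches of excursion
print(round(cv_degrees, 1), round(cv_inches, 1))
```

Expressed this way, the two measurements are far closer in relative variability than the raw standard deviations of 9.6 and 0.72 would suggest.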

Editor's Notes

  • #20: A distribution is positively skewed if the scores fall toward the lower side of the scale and there are very few higher scores. A distribution is negatively skewed if the scores fall toward the higher side of the scale and there are very few low scores. 
  • #48: The interquartile range is often used to find outliers in data. Outliers here are defined as observations that fall below Q1 − 1.5 IQR or above Q3 + 1.5 IQR. In a boxplot, the highest (90th) and lowest (10th) occurring value within this limit are indicated by whiskers of the box (frequently with an additional bar at the end of the whisker) and any outliers as individual points.
  • #49: A deviation score is the deviation of each score from the mean; that is, we subtract the mean from each score in the distribution to obtain a deviation score, X - X̄.
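
The outlier rule from note #48 (observations below Q1 - 1.5 IQR or above Q3 + 1.5 IQR) can be sketched as follows. The function name `find_outliers` and the scores are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, so other quartile conventions may move the fences slightly.

```python
from statistics import quantiles

def find_outliers(scores):
    """Return the scores lying outside the 1.5 * IQR fences."""
    q1, _, q3 = quantiles(scores, n=4)
    spread = q3 - q1                      # interquartile range
    low_fence = q1 - 1.5 * spread
    high_fence = q3 + 1.5 * spread
    return [x for x in scores if x < low_fence or x > high_fence]

typical = [1, 2, 3, 4, 5, 6, 7]           # hypothetical
with_outlier = [1, 2, 3, 4, 5, 6, 70]     # one aberrant score

print(find_outliers(typical))
print(find_outliers(with_outlier))
```

In a box plot, the values returned here would be drawn as individual points beyond the whiskers.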