Analyzing and Interpreting Quantitative Data
Presented By: Aroob Tariq
Khaleeqa Naveed
Sehr Suleman
STEPS IN THE PROCESS OF
QUANTITATIVE DATA ANALYSIS
• First, assign numbers to the data and choose how to
score it. Pick a statistical software, enter the data, and
clean it to remove errors.
• Next, begin analysis with descriptive statistics to show
averages and variation. Then use inferential statistics
to test your hypotheses.
• Present the results with tables, charts, and clear
explanations.
• Finally, interpret the findings by summarizing key
results, comparing them to past research, noting any
study limits, and suggesting ideas for future research.
Organizing and Scoring Data for Analysis
• Scoring data means that the researcher assigns a numeric score (or value)
to each response category for each question on the instruments used to
collect data.
• For instance, assume that parents respond to a survey asking them to
indicate their attitudes about choice of a school for children in the school
district. One question might be:
“Students should be given an opportunity to select a school of their choice.”
-Strongly agree
-Agree
-Undecided
-Disagree
-Strongly disagree
• To analyze the data, you will need to assign scores to responses such as 5
= strongly agree, 4 = agree, 3 = undecided, 2 = disagree, and 1 = strongly
disagree. Based on these assigned numbers, the parent who checks
“Agree” would receive a score of 4.
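The scoring rule above amounts to a lookup table; a minimal Python sketch (the response list is illustrative):

```python
# Map each Likert response category to its assigned score.
likert_scores = {
    "Strongly agree": 5,
    "Agree": 4,
    "Undecided": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

# A parent who checks "Agree" receives a score of 4.
responses = ["Agree", "Strongly agree", "Undecided"]
scored = [likert_scores[r] for r in responses]
print(scored)  # [4, 5, 3]
```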
Guidelines for Assigning Scores to Responses
• Continuous (Interval) Scales: When using a scale that measures levels of
agreement or opinion (like "Strongly agree" to "Strongly disagree"), you should
always assign scores in the same direction. -Example:
• Strongly agree = 5
• Agree = 4
• Undecided = 3
• Disagree = 2
• Strongly disagree = 1
• This ensures that higher numbers consistently represent more positive or
stronger agreement.
• Categorical Scales: For non-numerical categories (like education level or job
role), you can assign numbers arbitrarily, as long as you define them clearly.
• Example:
• High school = 3
• Middle school = 2
• Elementary = 1
• Pre-assign Numbers on the Questionnaire: To simplify the process, you can
include the numeric scores directly on the survey or checklist next to each
response option.
• Example question:
“Fourth graders should be tested for math proficiency.”
• (5) Strongly agree
• (4) Agree, and so on
• Use Bubble Sheets for Automated Scoring: If you use bubble sheets
(like those in standardized testing), responses can be scanned
electronically, saving time and reducing human error. These are
especially useful in large-scale studies or classroom evaluations.
• Refer to Scoring Manuals for Standardized Tools: When using
commercial instruments or surveys, the company usually provides a
scoring manual. Always follow these guidelines, as they are based on
tested scoring systems.
• Create a Codebook: A codebook is a table or list that explains how
each response will be scored. It usually includes:
• Variable Name: The name used in the dataset (e.g., Grade)
• Definition: What the variable represents (e.g., grade level of
student)
• Codes/Values: Numeric values assigned to each possible response
(e.g., 10 = 10th grade, 11 = 11th grade, etc.)
Why it's important: A codebook helps keep your scoring system
organized and consistent, especially if others will use or review your
data.
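A codebook can be kept as a plain data structure alongside the dataset; a minimal sketch (the variable names and codes below are illustrative, following the examples above):

```python
# Codebook: variable name -> definition and value labels.
codebook = {
    "grade": {
        "definition": "Grade level of student",
        "values": {10: "10th grade", 11: "11th grade", 12: "12th grade"},
    },
    "parents": {
        "definition": "Marital status of parents",
        "values": {1: "Married", 2: "Divorced", 3: "Separated"},
    },
}

def label(variable, code):
    """Translate a numeric code back to its human-readable label."""
    return codebook[variable]["values"][code]

print(label("grade", 11))    # 11th grade
print(label("parents", 2))   # Divorced
```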
Determine the Types of Scores to Analyze
Before conducting an analysis of scores, researchers
consider what types of scores to use from their instruments.
The type of score will affect how you enter data into a
computer file for analysis.
Single-item score
• A single-item score is the numeric value assigned to one
individual question or item for each participant.
• Useful for examining specific responses in isolation.
• Example: In a survey asking, “Will you vote yes or no on
the tax levy?”, a researcher might assign: 1 = No, 2 = Yes
Each participant’s answer to that question is recorded as
a single-item score.
• Helps capture specific attitudes or decisions on individual
items.
Summed Scores (Scale Scores)
• A summed score is created by adding the numeric
responses from multiple related questions that measure
the same variable.
• More reliable than single-item scores because they
reduce the impact of misunderstanding or bias in any one
question.
• Example: If a depression scale has five questions, and a
participant scores 4 on each, their summed score is 20 (4
+ 4 + 4 + 4 + 4).
• Use Case: Useful for evaluating overall attitudes,
personality traits, or behaviors based on multiple items.
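The summed-score arithmetic from the depression example:

```python
# Five related items, each scored 4, summed into one scale score.
depression_items = [4, 4, 4, 4, 4]
summed_score = sum(depression_items)
print(summed_score)  # 20
```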
• Difference Scores (Net Scores)
• These scores reflect the change in
a participant’s performance over
time, often calculated by
subtracting a pre-test score from a
post-test score.
• Measures progress, growth, or
effect of an intervention.
• Example: A student scores 60 on a
pre-test and 75 on a post-test. The
difference score is 15 (75 – 60).
• Use Case: Common in
experiments and educational
assessments to evaluate
improvement or treatment effects.
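The pre-/post-test example as code:

```python
# Difference (net) score: post-test minus pre-test.
pre_test, post_test = 60, 75
difference_score = post_test - pre_test
print(difference_score)  # 15
```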
Select a Statistical Program
After scoring the data, researchers select a computer program to analyze their data.
Here are some guidelines to follow when selecting a statistical program.
Find a program with documentation about how to use the program.
Look for intuitive interfaces, such as pull-down menus and straightforward data
entry.
Ensure the program supports the statistical tests needed to address your research
questions and hypotheses.
The software should handle your dataset size, support various data formats (e.g.,
numeric and text), and manage missing data effectively.
Opt for a tool that generates graphs and tables suitable for research reports.
Consider the cost—student versions are often more affordable but may have
limited features.
Use software supported by your institution to access help from faculty or peers
more easily.
Some of the more frequently used programs are:
• Minitab
• JMP
• SYSTAT
• SAS
Input Data
• Data entry involves transferring responses from your
surveys, checklists, or instruments into a digital format
within a computer program.
• The format is often similar to a spreadsheet (like
Microsoft Excel) with rows and columns.
• Rows = Participants
• Columns = Variables (e.g., gender, age, smoking
habits)
• The names for the variables are short and simple but
descriptive (older versions of SPSS limited names to eight
characters, such as “gender,” “smoke,” or “chewer”).
Steps to Input Data into SPSS
• Enter data directly into the SPSS grid by selecting a cell and typing
the corresponding value from your instrument or checklist.
• Each row in the grid represents a different participant.
• Each column represents a variable, such as gender, age, or smoking
status.
• Use numeric codes based on your codebook (e.g., 1 = Male, 2 =
Female) when entering responses.
• Assign a unique ID number to each participant and place it in the
first column. This can be auto-generated by SPSS or based on a
system you choose (e.g., last 3 digits of a student ID).
• SPSS assigns default variable names like var001, var002, etc. You
should rename these to meaningful names (e.g., "gender",
"parents", "grade").
• Use value labels to make your dataset easier to read (e.g., 1 =
Married, 2 = Divorced, 3 = Separated for the "parents" variable).
• Naming variables and values clearly will make the output easier to
interpret and improve overall data quality.
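The rows-as-participants, columns-as-variables layout can be sketched without any statistics package (the IDs and codes below are made up, following the codebook idea above):

```python
# Each dict is one row (participant); keys are the variable (column) names.
data = [
    {"id": 1, "gender": 1, "grade": 10, "smoke": 2},
    {"id": 2, "gender": 2, "grade": 11, "smoke": 1},
    {"id": 3, "gender": 1, "grade": 10, "smoke": 1},
]

print(len(data))               # 3 participants (rows)
print(sorted(data[0].keys()))  # variable (column) names
```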
Clean and Account for Missing Data
• Data entry errors can happen if a participant gives a score outside the
valid range or if a wrong number is typed into the grid.
• Cleaning the data is the process of inspecting the data for scores (or
values) that are outside the accepted range. One way to accomplish this
is by visually inspecting the data grid. For example, a participant may
provide a “6” on a “strongly agree” to “strongly disagree” scale when
there are only five response options.
• Missing data are data missing in the database because participants do
not supply it. To handle missing data:
You can eliminate participants with missing scores from the data
analysis.
You can substitute numbers for missing data in the database for
individuals.
Missing data in categorical variables can be replaced with placeholder
values like “-9,” while continuous variables may be filled using averages,
such as the mean of all participants’ responses. A common rule of thumb
is that up to about 15% of missing data can be substituted without
materially affecting statistical results, and more advanced techniques
are available if needed.
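The cleaning and imputation steps above can be sketched as follows; the scores are illustrative, with `None` standing in for a missing response and 6 for an out-of-range entry:

```python
# Flag out-of-range Likert responses (valid range 1-5), then
# mean-impute the missing values.
scores = [4, 5, None, 6, 3]   # 6 is outside the valid 1-5 range

cleaned = [s if s is not None and 1 <= s <= 5 else None for s in scores]
valid = [s for s in cleaned if s is not None]
mean = sum(valid) / len(valid)
imputed = [s if s is not None else mean for s in cleaned]
print(imputed)  # [4, 5, 4.0, 4.0, 3]
```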
Analyzing the Data
Once your data is organized, the next step is to
analyze it to answer your research questions or test
your hypotheses. Here's how:
• Use descriptive statistics (mean, median, mode,
standard deviation, variance, range) to summarize
data for a single variable. For example, you might
describe the average self-esteem level of middle
school students.
• Use inferential statistics to compare two or more
groups (e.g., boys vs. girls on self-esteem) and
determine if any observed differences are
statistically significant for the larger population.
• Inferential statistics are also used to examine
relationships between variables, such as the link
between self-esteem and optimism.
Conducting Descriptive Analysis
• Descriptive analysis is the process of
summarizing and understanding the
basic features of your collected data
using statistics. It helps answer
questions like:
What is the average score?
How spread out are the responses?
Where does a particular score stand
compared to others?
• There are three key types of
descriptive statistics used in
educational research: the central
tendency, variability, and relative
standing.
Central Tendency: How to Summarize Your Data
• This page teaches how to find a central value that best represents your data. This is useful
when you have a lot of scores and want to summarize them using one number.
• 1. Mean (Average)
• Add all your numbers together and divide by how many there are.
• Example: Depression scores = 60, 70, 80 → Mean = (60+70+80)/3 = 70
• Most commonly used because it uses all data points.
• But mean is affected by extremely high or low values (called "outliers"). So sometimes it’s
not the best measure.
• 2. Median (Middle Value)
• Sort your data from lowest to highest.
• If odd number of values: pick the middle one.
• If even number of values: average the two middle numbers.
• Example: Scores = 60, 70, 80, 90 → Median = (70+80)/2 = 75
• Not affected by extreme values, which makes it good when the data is skewed.
• 3. Mode (Most Frequent Value)
• The score that occurs most often.
• Example: 60, 60, 70, 80 → Mode = 60
• Can be used for both numerical and categorical data (e.g., most common favorite subject).
• 🟦 These three measures (mean, median, mode) help summarize a dataset.
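All three measures are available in Python's standard statistics module; the scores below are illustrative:

```python
import statistics

scores = [60, 60, 70, 80, 90]
print(statistics.mean(scores))    # 72
print(statistics.median(scores))  # 70
print(statistics.mode(scores))    # 60 (occurs twice)
```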
When to Use Median and Mode
• Median:
• Helps when your data has outliers.
• For example, if most students score 70 but one scores 0, the mean
would drop too much. But the median stays more accurate.
• It divides data into two halves—half the scores are below it, half
above.
• Mode:
• Most useful when data is not numerical, like survey answers:
• E.g., “Which peer group do you belong to?” — singers, athletes, punkers.
• Mode tells which group is most common.
• Example in the book:
• 50 students selected peer groups:
• Singers = 14 → Mode is “singers” because it’s the most chosen group.
• 💡 If the mode is a category, calculating a mean doesn’t make
sense. For instance, you can’t say the average of “singer = 1,
punker = 2” is 1.5 and it means something—because these are
labels, not real numbers.
How Spread Out Are the Scores?
(Measures of Variability)
• 1. Range
• Highest score minus lowest score.
• Example: Scores = 60 to 99 → Range = 39
• Simple, but doesn’t show the full picture (only two scores used).
• 2. Variance
• Measures the average squared difference from the mean.
• Steps:
• Subtract the mean from each score.
• Square each result.
• Add them up.
• Divide by number of scores.
• Variance tells us how far scores are from the mean on average, but its unit is “squared,”
which is hard to interpret.
• 3. Standard Deviation (SD)
• Square root of the variance.
• Easier to interpret than variance.
• A small SD means most scores are close to the mean. A large SD means scores vary a lot.
• Example in Table 6.3:
• Scores: 60, 64, 75, 76, 76, 83, 93, 94, 98, 99
• Mean: 81.8
• SD: 13.18
• Range: 99 - 60 = 39
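The Table 6.3 numbers can be reproduced with the statistics module. The population standard deviation (`pstdev`) is assumed here, which gives ≈ 13.19 and matches the slide's 13.18 up to rounding:

```python
import statistics

scores = [60, 64, 75, 76, 76, 83, 93, 94, 98, 99]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)          # population SD (an assumption)
score_range = max(scores) - min(scores)
print(mean)          # 81.8
print(round(sd, 2))  # 13.19
print(score_range)   # 39
```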
Understanding the Bell Curve
(Normal Distribution)
• Imagine drawing a curve over your scores: most are in the middle,
fewer at the ends. That’s the normal curve.
• Key Features:
• Symmetrical: same shape on both sides.
• Mean, median, mode are all at the center.
• Most scores (68%) fall within one SD of the mean.
• 95% fall within two SDs.
• 99.7% fall within three SDs.
• Percentiles:
• Tell you how well a person scored compared to others.
• Example: 80th percentile = you did better than 80% of people.
• z Scores:
• Show how many SDs above or below the mean a score is.
• Formula: (Raw score - Mean) ÷ SD
• Example: If a student scores 60, and mean = 81.8, SD = 13.18:
z = (60 − 81.8) / 13.18 ≈ −1.65 → this score is about
1.65 SDs below average.
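The z-score formula as code, using the mean and SD from the earlier example:

```python
# z = (raw score - mean) / SD
mean, sd = 81.8, 13.18
z = (60 - mean) / sd
print(round(z, 2))  # -1.65 (about 1.65 SDs below the mean)
```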
Moving from Description to
Prediction: Inferential Statistics
• Descriptive statistics just describe data. But what if we want to
make judgments, compare groups, or make predictions?
• This is where inferential statistics are used.
• You can:
• Compare groups – e.g., Are girls more motivated than boys?
• Relate variables – e.g., Is study time related to GPA?
• Test hypotheses – e.g., Do non-smokers have less depression than
smokers?
• Inferential statistics test if your results are real or just due to
chance.
• Three tools used:
• Hypothesis testing: Are the differences real?
• Confidence intervals: What is the likely range of true scores?
• Effect size: How big is the difference?
5 Steps of Hypothesis Testing
• Step 1: Set Hypotheses
• Null Hypothesis (H₀): No difference or relationship.
• E.g., “No difference in depression between
smokers and non-smokers.”
• Alternative Hypothesis (H₁): There is a difference.
• E.g., “Smokers are more depressed than non-
smokers.”
• Step 2: Set Significance Level (α)
• Usually 0.05 (accepting a 5% risk of rejecting a true null).
• If p-value is less than α, we reject H₀.
• Step 3: Collect Data
• Use surveys, tests, or observations.
• Step 4: Compute p-value
• Use statistical software like SPSS.
• Step 5: Make a Decision
• p < 0.05 → statistically significant → reject H₀
• p ≥ 0.05 → not significant → fail to reject H₀
• Also explains one-tailed vs. two-tailed:
• One-tailed: Expect change in one direction.
• Two-tailed: Open to both directions.
Example of a t-test
A t-test compares the means of two groups.
Example: Comparing depression levels of
smokers and non-smokers.
Mean for non-smokers: 69.77
Mean for smokers: 79.79
Difference = 10.02
p-value = 0.00 → significant → reject H₀
Also introduced:
Degrees of freedom (df): Used in calculating
statistics, usually (n − 1).
Confidence interval (CI): Range of likely
values for the population.
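A pooled-variance t statistic can be computed by hand. The group scores below are made up for illustration; the slide's means (69.77 vs. 79.79) come from the textbook's larger dataset:

```python
import statistics

# Illustrative (made-up) depression scores for two groups.
non_smokers = [65, 70, 72, 68, 74]
smokers = [78, 82, 79, 81, 80]

def t_statistic(a, b):
    """Independent-samples t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

t = t_statistic(non_smokers, smokers)
df = len(non_smokers) + len(smokers) - 2  # df for two samples: n1 + n2 - 2
print(df)  # 8
```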
Chi-Square Test Example
A chi-square test is used for categorical data.
Example: Are smokers more likely to belong to a specific
peer group?
✍ You may expect a relationship, but statistics tell you if
that relationship is real or just by chance.
Results: p = 0.635 → not significant.
Conclusion: No strong link between smoking and peer
group.
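The chi-square statistic sums (observed − expected)² / expected over all cells of a contingency table. The counts below are made up for illustration; the slide reports only the resulting p-value (.635):

```python
# Rows: smokers / non-smokers; columns: peer groups (made-up counts).
observed = [
    [10, 14, 6],
    [12, 16, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if smoking and peer group were independent.
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (obs - expected) ** 2 / expected

print(round(chi_square, 3))  # 0.056 (small: observed counts near expected)
```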
Interpreting p-values and
Decisions
A step-by-step guide to decision-making:
Look at the statistical test result (like t or chi-square).
Find the p-value.
If p < alpha (like 0.05), reject the null.
If p ≥ alpha, fail to reject the null.
Statistical significance = finding something real in
your sample.
Example: If p = 0.00 → strong evidence → reject H₀
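The decision rule above as a small helper function (the function name is illustrative):

```python
def decide(p_value, alpha=0.05):
    """Reject the null hypothesis when p < alpha; otherwise fail to reject."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.00))   # reject H0 (the t-test example)
print(decide(0.635))  # fail to reject H0 (the chi-square example)
```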
Errors in Hypothesis Testing

Decision          | Reality | Result
Reject H₀         | H₀ true | Type I error (false positive)
Fail to reject H₀ | H₁ true | Type II error (false negative)
Reject H₀         | H₁ true | Correct
Fail to reject H₀ | H₀ true | Correct

• There are 4 possible outcomes:
• Type I error: You think there's a difference, but there isn't.
• Type II error: You think there's no difference, but there is.
• 🚨 Type I is considered worse because it can mislead research
conclusions.
Confidence Intervals and Effect Size
• Confidence Interval (CI)
• A range of values likely to include the true population score.
• Example: Difference in depression between smokers and non-
smokers = 10.02
• CI = −12.71 to −7.33 → means we are 95% sure the real
difference is between these two values.
• Effect Size
• Tells how big the difference is.
• A small p-value means the difference is statistically real.
• But effect size tells if it’s practically important.
• Example:
• Depression scores: Mean difference = 10 points
• On a 100-point scale, that’s a big difference.
• This helps judge impact, not just significance.
What Are Confidence Intervals?
- Provide a range of values for the population mean
- 95% CI → in repeated sampling, about 95% of such
intervals contain the population mean
- Helps appreciate the magnitude of differences,
not just significance
Effect Size
- Measures strength of the difference or relationship
- Cohen’s d (for mean differences)
- Eta squared (proportion of variance explained) for ANOVA
- Phi coefficient for chi-square (association)
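Cohen's d divides the mean difference by a pooled standard deviation, so it expresses the difference in SD units. A sketch with made-up groups:

```python
import statistics

# Made-up scores for two groups, purely for illustration.
group_a = [70, 72, 75, 78, 80]
group_b = [60, 62, 65, 66, 70]

na, nb = len(group_a), len(group_b)
pooled_sd = (((na - 1) * statistics.variance(group_a)
              + (nb - 1) * statistics.variance(group_b))
             / (na + nb - 2)) ** 0.5

# Cohen's d: mean difference in pooled-SD units.
d = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
print(round(d, 2))  # 2.61 (a very large effect by Cohen's benchmarks)
```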
Reporting the Results
• Include p-value, confidence interval, and effect size
• Provide tables for statistical results (mean, SD, p-value)
• Figures aid understanding of relationships
• Always follow APA format in tables and figures
Interpreting the Results
• Summarize major findings in context of research questions
• Explain significance in comparison to previous studies
• Discuss limitations and weaknesses of the study
• Suggest future directions for further research
Parent Involvement Study (Example)
• Deslandes & Bertrand (2005) used multiple regression
• Parents’ perceptions of student invitations significantly
predicted parent involvement at home (beta = .44)
• p < .001 (high significance)
• This illustrates how statistical analysis guides understanding
of relationships in educational research
Summary
- Quantitative data analysis involves:
- Preparing and organizing data
- Analyzing (descriptive & inferential)
- Reporting (with p-values, confidence intervals, effect sizes)
- Interpreting (with context, limitations, future directions)
More Related Content

PPTX
Scoring_Procedures_in_Research Study.pptx
PPTX
5.Measurement and scaling technique.pptx
PPTX
Analyzing ang interpreting quantitative data.pptx
PPTX
DATA PROCESSING_Bus 221(0).pptxDATA PROCESSING_Bus 221(0).pptx
PPTX
The Process of Analyzing and Interpreting Data
PDF
Workshop on SPSS: Basic to Intermediate Level
PPTX
spssworksho9035530-lva1-app6891 (1).pptx
PPTX
statistical analysis gr12.pptx lesson in research
Scoring_Procedures_in_Research Study.pptx
5.Measurement and scaling technique.pptx
Analyzing ang interpreting quantitative data.pptx
DATA PROCESSING_Bus 221(0).pptxDATA PROCESSING_Bus 221(0).pptx
The Process of Analyzing and Interpreting Data
Workshop on SPSS: Basic to Intermediate Level
spssworksho9035530-lva1-app6891 (1).pptx
statistical analysis gr12.pptx lesson in research

Similar to Analyzing and Interpreting quantitative data.pptx (20)

PPTX
UNIT - III. pptx SCALING AND ITS MEASURE
PPTX
DATA ANALYSIS IN ACTION RESEARCH (Research Methodology)
DOCX
Assignment 2 RA Annotated BibliographyIn your final paper for .docx
PPTX
QUANTITATIVE-DATA.pptx
PPTX
Presentation.pptx
PPTX
Business statistics
PPTX
Finding the answers to the research questions.pptx
PPTX
PR 2 REPORT, MODULE 4, LESSON 5, 6, 7, 8
PPTX
Ansalysis of daata w- roough slides.pptx
PPTX
Research methodology scale of measurement
PDF
Mba724 s2 w2 spss intro & daya types
PDF
Mba724 s2 w2 spss intro & daya types
PPTX
Introduction and Data for SPSS PAR1.pptx
PPTX
Introduction to Data Analysis for Nurse Researchers
PPTX
Quantitative Research Design.pptx
DOCX
Executive Program Practical Connection Assignment - 100 poin
PPTX
Lesson 6 chapter 4
PPTX
Practical applications and analysis in Research Methodology
PPT
Slayter on planning quant design for flc projects - may 2011
PPTX
Types of Statistics.pptx
UNIT - III. pptx SCALING AND ITS MEASURE
DATA ANALYSIS IN ACTION RESEARCH (Research Methodology)
Assignment 2 RA Annotated BibliographyIn your final paper for .docx
QUANTITATIVE-DATA.pptx
Presentation.pptx
Business statistics
Finding the answers to the research questions.pptx
PR 2 REPORT, MODULE 4, LESSON 5, 6, 7, 8
Ansalysis of daata w- roough slides.pptx
Research methodology scale of measurement
Mba724 s2 w2 spss intro & daya types
Mba724 s2 w2 spss intro & daya types
Introduction and Data for SPSS PAR1.pptx
Introduction to Data Analysis for Nurse Researchers
Quantitative Research Design.pptx
Executive Program Practical Connection Assignment - 100 poin
Lesson 6 chapter 4
Practical applications and analysis in Research Methodology
Slayter on planning quant design for flc projects - may 2011
Types of Statistics.pptx
Ad

Recently uploaded (20)

PPTX
sac 451hinhgsgshssjsjsjheegdggeegegdggddgeg.pptx
PDF
Data Engineering Interview Questions & Answers Cloud Data Stacks (AWS, Azure,...
PDF
Optimise Shopper Experiences with a Strong Data Estate.pdf
PPT
statistic analysis for study - data collection
PDF
Navigating the Thai Supplements Landscape.pdf
PPT
Image processing and pattern recognition 2.ppt
PDF
Global Data and Analytics Market Outlook Report
PPTX
Copy of 16 Timeline & Flowchart Templates – HubSpot.pptx
PPTX
Leprosy and NLEP programme community medicine
PDF
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
PPTX
Steganography Project Steganography Project .pptx
PDF
Introduction to the R Programming Language
PPTX
SAP 2 completion done . PRESENTATION.pptx
PDF
[EN] Industrial Machine Downtime Prediction
PDF
Data Engineering Interview Questions & Answers Data Modeling (3NF, Star, Vaul...
PPTX
Lesson-01intheselfoflifeofthekennyrogersoftheunderstandoftheunderstanded
PPTX
DS-40-Pre-Engagement and Kickoff deck - v8.0.pptx
PPT
Predictive modeling basics in data cleaning process
PPTX
CYBER SECURITY the Next Warefare Tactics
PDF
Tetra Pak Index 2023 - The future of health and nutrition - Full report.pdf
sac 451hinhgsgshssjsjsjheegdggeegegdggddgeg.pptx
Data Engineering Interview Questions & Answers Cloud Data Stacks (AWS, Azure,...
Optimise Shopper Experiences with a Strong Data Estate.pdf
statistic analysis for study - data collection
Navigating the Thai Supplements Landscape.pdf
Image processing and pattern recognition 2.ppt
Global Data and Analytics Market Outlook Report
Copy of 16 Timeline & Flowchart Templates – HubSpot.pptx
Leprosy and NLEP programme community medicine
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
Steganography Project Steganography Project .pptx
Introduction to the R Programming Language
SAP 2 completion done . PRESENTATION.pptx
[EN] Industrial Machine Downtime Prediction
Data Engineering Interview Questions & Answers Data Modeling (3NF, Star, Vaul...
Lesson-01intheselfoflifeofthekennyrogersoftheunderstandoftheunderstanded
DS-40-Pre-Engagement and Kickoff deck - v8.0.pptx
Predictive modeling basics in data cleaning process
CYBER SECURITY the Next Warefare Tactics
Tetra Pak Index 2023 - The future of health and nutrition - Full report.pdf
Ad

Analyzing and Interpreting quantitative data.pptx

  • 2. STEPS IN THE PROCESS OF QUANTITATIVE DATA ANALYSIS • First, assign numbers to the data and choose how to score it. Pick a statistical software, enter the data, and clean it to remove errors. • Next, begin analysis with descriptive statistics to show averages and variation. Then use inferential statistics to test your hypotheses. • Present the results with tables, charts, and clear explanations. • Finally, interpret the findings by summarizing key results, comparing them to past research, noting any study limits, and suggesting ideas for future research.
  • 3. Organizing and Scoring Data for Analysis • Scoring data means that the researcher assigns a numeric score (or value) to each response category for each question on the instruments used to collect data. • For instance, assume that parents respond to a survey asking them to indicate their attitudes about choice of a school for children in the school district. One question might be: “Students should be given an opportunity to select a school of their choice.” -Strongly agree -Agree -Undecided -Disagree -Strongly disagree • To analyze the data, you will need to assign scores to responses such as 5 = strongly agree, 4 = agree, 3 = undecided, 2 = disagree, and 1 = strongly disagree. Based on these assigned numbers, the parent who checks “Agree” would receive a score of 4.
  • 4. Guidelines for Assigning Scores to Responses • Continuous (Interval) Scales: When using a scale that measures levels of agreement or opinion (like "Strongly agree" to "Strongly disagree"), you should always assign scores in the same direction. -Example: • Strongly agree = 5 • Agree = 4 • Undecided = 3 • Disagree = 2 • Strongly disagree = 1 • This ensures that higher numbers consistently represent more positive or stronger agreement. • Categorical Scales: For non-numerical categories (like education level or job role), you can assign numbers arbitrarily, as long as you define them clearly. • Example: • High school = 3 • Middle school = 2 • Elementary = 1 • Pre-assign Numbers on the Questionnaire: To simplify the process, you can include the numeric scores directly on the survey or checklist next to each response option. • Example question: “Fourth graders should be tested for math proficiency.” • (5) Strongly agree • (4) Agree … so on
  • 5. • Use Bubble Sheets for Automated Scoring: If you use bubble sheets (like those in standardized testing), responses can be scanned electronically, saving time and reducing human error. These are especially useful in large-scale studies or classroom evaluations. • Refer to Scoring Manuals for Standardized Tools: When using commercial instruments or surveys, the company usually provides a scoring manual. Always follow these guidelines, as they are based on tested scoring systems. • Create a Codebook: A codebook is a table or list that explains how each response will be scored. It usually includes: • Variable Name: The name used in the dataset (e.g., Grade) • Definition: What the variable represents (e.g., grade level of student) • Codes/Values: Numeric values assigned to each possible response (e.g., 10 = 10th grade, 11 = 11th grade, etc.) Why it's important: A codebook helps keep your scoring system organized and consistent, especially if others will use or review your data.
  • 6. Determin e the Types of Scores to Analyze Before conducting an analysis of scores, researchers consider what types of scores to use from their instruments. The type of score will affect how you enter data into a computer file for analysis. Single-item score • A single-item score is the numeric value assigned to one individual question or item for each participant. • Useful for examining specific responses in isolation. • Example: In a survey asking, “Will you vote yes or no on the tax levy?”, a researcher might assign: 1 = No, 2 = Yes Each participant’s answer to that question is recorded as a single-item score. • Helps capture specific attitudes or decisions on individual items. Summed Scores (Scale Scores) • A summed score is created by adding the numeric responses from multiple related questions that measure the same variable. • More reliable than single-item scores because they reduce the impact of misunderstanding or bias in any one question. • Example: If a depression scale has five questions, and a participant scores 4 on each, their summed score is 20 (4 + 4 + 4 + 4 + 4). • Use Case: Useful for evaluating overall attitudes, personality traits, or behaviors based on multiple items.
  • 7. • Difference Scores (Net Scores) • These scores reflect the change in a participant’s performance over time, often calculated by subtracting a pre-test score from a post-test score. • Measures progress, growth, or effect of an intervention. • Example: A student scores 60 on a pre-test and 75 on a post-test. The difference score is 15 (75 – 60). • Use Case: Common in experiments and educational assessments to evaluate improvement or treatment effects.
  • 8. Select a Statistical Program After scoring the data, researchers select a computer program to analyze their data. Here are some guidelines to follow when selecting a statistical program. Find a program with documentation about how to use the program. Look for intuitive interfaces, such as pull-down menus and straightforward data entry. Ensure the program supports the statistical tests needed to address your research questions and hypotheses. The software should handle your dataset size, support various data formats (e.g., numeric and text), and manage missing data effectively. Opt for a tool that generates graphs and tables suitable for research reports. Consider the cost—student versions are often more affordable but may have limited features. Use software supported by your institution to access help from faculty or peers more easily. Some of the more frequently used programs are: • Minitab • JMP • SYSTAT • SAS
  • 9. Input Data • Data entry involves transferring responses from your surveys, checklists, or instruments into a digital format within a computer program. • The format is often similar to a spreadsheet (like Microsoft Excel) with rows and columns. • Rows = Participants • Columns = Variables (e.g., gender, age, smoking habits) • The names for the variables are short and simple but descriptive (no more than eight characters in SPSS, such as “gender,” “smoke,” or “chewer”).
  • 10. Steps to Input Data into SPSS • Enter data directly into the SPSS grid by selecting a cell and typing the corresponding value from your instrument or checklist. • Each row in the grid represents a different participant. • Each column represents a variable, such as gender, age, or smoking status. • Use numeric codes based on your codebook (e.g., 1 = Male, 2 = Female) when entering responses. • Assign a unique ID number to each participant and place it in the first column. This can be auto-generated by SPSS or based on a system you choose (e.g., last 3 digits of a student ID). • SPSS assigns default variable names like var001, var002, etc. You should rename these to meaningful names (e.g., "gender", "parents", "grade"). • Use value labels to make your dataset easier to read (e.g., 1 = Married, 2 = Divorced, 3 = Separated for the "parents" variable). • Naming variables and values clearly will make the output easier to interpret and improve overall data quality.
  • 11. Clean and Account for Missing Data • Data entry errors can happen if a participant gives a score outside the valid range or if a wrong number is typed into the grid. • Cleaning the data is the process of inspecting the data for scores (or values) that are outside the accepted range. One way to accomplish this is by visually inspecting the data grid. For example, participants May provide a “6” for a “strongly agree” to “strongly disagree” scale when there are only five response options. • Missing data are data missing in the database because participants do not supply it. To handle missing data: You can eliminate participants with missing scores from the data analysis. You can substitute numbers for missing data in the database for individuals. Missing data in categorical variables can be replaced with placeholder values like “-9,” while continuous variables may be filled using averages, such as the mean of all participants’ responses. Up to 15% of missing data can be substituted without affecting statistical results, and more advanced techniques are available if needed.
  • 12. Analyzing the Data Once your data is organized, the next step is to analyze it to answer your research questions or test your hypotheses. Here's how: • Use descriptive statistics (mean, median, mode, standard deviation, variance, range) to summarize data for a single variable. For example, you might describe the average self-esteem level of middle school students. • Use inferential statistics to compare two or more groups (e.g., boys vs. girls on self-esteem) and determine if any observed differences are statistically significant for the larger population. • Inferential statistics are also used to examine relationships between variables, such as the link between self-esteem and optimism.
• 13. Conducting Descriptive Analysis • Descriptive analysis is the process of summarizing and understanding the basic features of your collected data using statistics. It helps answer questions like: What is the average score? How spread out are the responses? Where does a particular score stand compared to others? • There are three key types of descriptive statistics used in educational research: central tendency, variability, and relative standing.
• 14. Central Tendency: How to Summarize Your Data • This slide shows how to find a central value that best represents your data, which is useful when you have many scores and want to summarize them with one number. • 1. Mean (Average) • Add all your numbers together and divide by how many there are. • Example: Depression scores = 60, 70, 80 → Mean = (60 + 70 + 80) / 3 = 70 • Most commonly used because it uses all data points. • But the mean is affected by extremely high or low values (called “outliers”), so sometimes it’s not the best measure. • 2. Median (Middle Value) • Sort your data from lowest to highest. • If there is an odd number of values, pick the middle one. • If there is an even number of values, average the two middle numbers. • Example: Scores = 60, 70, 80, 90 → Median = (70 + 80) / 2 = 75 • Not affected by extreme values, which makes it good when the data is skewed. • 3. Mode (Most Frequent Value) • The score that occurs most often. • Example: 60, 60, 70, 80 → Mode = 60 • Can be used for both numerical and categorical data (e.g., most common favorite subject). • 🟦 These three measures (mean, median, mode) help summarize a dataset.
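The three examples above can be checked directly with Python's standard-library `statistics` module:

```python
from statistics import mean, median, mode

scores = [60, 70, 80]
print(mean(scores))        # 70: (60 + 70 + 80) / 3

even_scores = [60, 70, 80, 90]
print(median(even_scores)) # 75.0: average of the two middle values

repeated = [60, 60, 70, 80]
print(mode(repeated))      # 60: the most frequent score
```

`mode` also works on non-numeric data (e.g., `mode(["singers", "singers", "punkers"])` returns `"singers"`), which matches the slide's point that the mode suits categorical variables.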
• 15. When to Use Median and Mode • Median: • Helps when your data has outliers. • For example, if most students score 70 but one scores 0, the mean would drop too much, while the median stays more accurate. • It divides the data into two halves: half the scores fall below it, half above. • Mode: • Most useful when data is not numerical, like survey answers. • E.g., “Which peer group do you belong to?”: singers, athletes, punkers. • The mode tells which group is most common. • Example in the book: 50 students selected peer groups; singers = 14, so the mode is “singers” because it is the most chosen group. • 💡 If the mode is a category, calculating a mean doesn’t make sense. For instance, you can’t say the average of “singer = 1, punker = 2” is 1.5 and have it mean something, because these are labels, not real numbers.
• 16. How Spread Out Are the Scores? (Measures of Variability) • 1. Range • Highest score minus lowest score. • Example: Scores = 60 to 99 → Range = 39 • Simple, but doesn’t show the full picture (only two scores are used). • 2. Variance • Measures the average squared difference from the mean. • Steps: • Subtract the mean from each score. • Square each result. • Add them up. • Divide by the number of scores. • Variance tells us how far scores are from the mean on average, but its unit is “squared,” which is hard to interpret. • 3. Standard Deviation (SD) • Square root of the variance. • Easier to interpret than the variance. • A small SD means most scores are close to the mean; a large SD means scores vary a lot. • Example in Table 6.3: • Scores: 60, 64, 75, 76, 76, 83, 93, 94, 98, 99 • Mean: 81.8 • SD: 13.18 • Range: 99 − 60 = 39
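The four variance steps, applied to the Table 6.3 scores:

```python
from math import sqrt

# Scores from Table 6.3 in the slides.
scores = [60, 64, 75, 76, 76, 83, 93, 94, 98, 99]

mean = sum(scores) / len(scores)  # 81.8

# Variance: average squared deviation from the mean
# (dividing by n, as the slide's steps describe).
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
sd = sqrt(variance)  # ≈ 13.19 (the slide truncates this to 13.18)

data_range = max(scores) - min(scores)  # 99 - 60 = 39
print(mean, round(sd, 2), data_range)
```

Note that dividing by n gives the population variance; dividing by n − 1 (the sample variance, what `statistics.stdev` computes) would give a slightly larger SD.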
• 17. Understanding the Bell Curve (Normal Distribution) • Imagine drawing a curve over your scores: most are in the middle, fewer at the ends. That’s the normal curve. • Key Features: • Symmetrical: same shape on both sides. • Mean, median, and mode are all at the center. • Most scores (68%) fall within one SD of the mean. • 95% fall within two SDs. • 99.7% fall within three SDs. • Percentiles: • Tell you how well a person scored compared to others. • Example: 80th percentile = you did better than 80% of people. • z Scores: • Show how many SDs above or below the mean a score is. • Formula: (Raw score − Mean) ÷ SD • Example: If a student scores 60, and mean = 81.8, SD = 13.18, then z = (60 − 81.8) / 13.18 ≈ −1.65 → this score is about 1.65 SDs below average.
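The z-score formula translates directly into code:

```python
def z_score(raw, mean, sd):
    """How many standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

# Values from the slide's example: score 60, mean 81.8, SD 13.18.
z = z_score(60, 81.8, 13.18)
print(round(z, 2))  # -1.65: about 1.65 SDs below the mean
```

A z of 0 means the score equals the mean; positive values are above it, negative values below.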
• 18. Moving from Description to Prediction: Inferential Statistics • Descriptive statistics just describe data. But what if we want to make judgments, compare groups, or make predictions? • This is where inferential statistics come in. • You can: • Compare groups – e.g., Are girls more motivated than boys? • Relate variables – e.g., Is study time related to GPA? • Test hypotheses – e.g., Do non-smokers have less depression than smokers? • Inferential statistics test whether your results are real or just due to chance. • Three tools are used: • Hypothesis testing: Are the differences real? • Confidence intervals: What is the likely range of true scores? • Effect size: How big is the difference?
• 19. 5 Steps of Hypothesis Testing • Step 1: Set Hypotheses • Null Hypothesis (H₀): No difference or relationship. • E.g., “No difference in depression between smokers and non-smokers.” • Alternative Hypothesis (H₁): There is a difference. • E.g., “Smokers are more depressed than non-smokers.” • Step 2: Set Significance Level (α) • Usually 0.05 (a 5% chance the results are due to luck). • If the p-value is less than α, we reject H₀. • Step 3: Collect Data • Use surveys, tests, or observations. • Step 4: Compute the p-value • Use statistical software like SPSS. • Step 5: Make a Decision • p < 0.05 → statistically significant → reject H₀ • p ≥ 0.05 → not significant → fail to reject H₀ • Also note one-tailed vs. two-tailed tests: • One-tailed: You expect change in one direction. • Two-tailed: You are open to both directions.
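Steps 2 and 5 amount to a single comparison, sketched here (the α value and the p-values passed in are illustrative):

```python
ALPHA = 0.05  # significance level chosen in Step 2

def decide(p_value, alpha=ALPHA):
    """Step 5: reject H0 when p < alpha, otherwise fail to reject."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.001))  # reject H0
print(decide(0.20))   # fail to reject H0
print(decide(0.05))   # fail to reject H0 (the boundary is not rejection)
```

Note the boundary case: p exactly equal to α does not lead to rejection under the p < α rule.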
• 20. Example of a t-test A t-test compares the means of two groups. Example: Comparing depression levels of smokers and non-smokers. Mean for non-smokers: 69.77 Mean for smokers: 79.79 Difference = 10.02 p-value = 0.00 → significant → reject H₀ Also introduced: Degrees of freedom (df): Used in calculating statistics, usually (n − 1). Confidence interval (CI): Range of likely values for the population.
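The independent-samples t statistic behind this comparison can be computed by hand. The scores below are hypothetical (the book reports only the group means, 69.77 and 79.79, so this sketch will not reproduce its 10.02 difference):

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical depression scores for two groups.
non_smokers = [62, 68, 70, 72, 75]
smokers = [74, 78, 80, 82, 85]

def t_statistic(a, b):
    """Independent-samples t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    # Pool each group's sample variance, weighted by its df = n - 1.
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

t = t_statistic(non_smokers, smokers)
df = len(non_smokers) + len(smokers) - 2  # df = n1 + n2 - 2 for two groups
print(round(t, 2), df)  # -3.63 8
```

In practice the t value and df are looked up against a t distribution (or handed to software like SPSS) to obtain the p-value.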
• 21. Chi-Square Test Example A chi-square test is used for categorical data. Example: Are smokers more likely to belong to a specific peer group? ✍ You may expect a relationship, but statistics tell you whether that relationship is real or just due to chance. Results: p = 0.635 → not significant. Conclusion: No strong link between smoking and peer group.
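The Pearson chi-square statistic compares observed counts in a contingency table with the counts expected if the two variables were independent. The smoking-by-peer-group counts below are invented for illustration, so they will not reproduce the book's p = 0.635:

```python
# Hypothetical contingency table.
# Rows: smoker / non-smoker; columns: athletes, singers, punkers.
observed = [
    [10, 12, 8],
    [20, 26, 14],
]

def chi_square(table):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

stat = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows-1)(cols-1) = 2
print(round(stat, 3), df)
```

A small statistic relative to its df (as here) corresponds to a large p-value, i.e., no evidence of a relationship.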
• 22. Interpreting p-values and Decisions A step-by-step guide to decision-making: Look at the statistical test result (like t or chi-square). Find the p-value. If p < alpha (e.g., 0.05), reject the null. If p ≥ alpha, fail to reject the null. Statistical significance = finding something real in your sample. Example: If p = 0.00 → strong evidence → reject H₀
• 23. Errors in Hypothesis Testing • There are 4 possible outcomes: • Reject H₀ when H₀ is true → Type I error (false positive) • Fail to reject H₀ when H₁ is true → Type II error (false negative) • Reject H₀ when H₁ is true → Correct • Fail to reject H₀ when H₀ is true → Correct • Type I error: You think there’s a difference, but there isn’t. • Type II error: You think there’s no difference, but there is. • 🚨 Type I is considered worse because it can mislead research conclusions.
• 24. Confidence Intervals and Effect Size • Confidence Interval (CI) • A range of values likely to include the true population value. • Example: Difference in depression between smokers and non-smokers = 10.02 • CI = −12.71 to −7.33 → we are 95% sure the real difference falls between these two values (the interval is negative because it was computed as non-smokers minus smokers). • Effect Size • Tells how big the difference is. • A small p-value means the difference is statistically real, but effect size tells whether it is practically important. • Example: • Depression scores: Mean difference = 10 points • On a 100-point scale, that’s a big difference. • This helps judge impact, not just significance.
• 25. What Are Confidence Intervals? - Provide a range of values for the population mean - 95% CI → the population mean falls within the range in 95% of cases - Helps appreciate the magnitude of differences, not just significance
• 26. Effect Size - Measures the strength of the difference or relationship - Cohen’s d for mean differences - Eta squared (∼ percentage of variance explained) for ANOVA - Phi coefficient for Chi-square (association)
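Cohen's d, the effect-size measure for mean differences, is the difference between the group means divided by the pooled standard deviation. A minimal sketch with made-up scores:

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical scores for two groups (same spread, means 10 points apart).
group_a = [70, 75, 80, 85, 90]
group_b = [60, 65, 70, 75, 80]

def cohens_d(a, b):
    """Mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                     / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

d = cohens_d(group_a, group_b)
print(round(d, 2))  # 1.26
```

By Cohen's common benchmarks (0.2 small, 0.5 medium, 0.8 large), a d of about 1.26 would count as a large effect.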
  • 27. Reporting the Results • - Include p-value, confidence interval, and effect size • - Provide tables for statistical results (mean, SD, p-value) • - Figures aid understanding of relationships • - Always follow APA format in tables and figures
  • 28. Interpreting the Results • - Summarize major findings in context of research questions • - Explain significance in comparison to previous studies • - Discuss limitations and weaknesses of the study • - Suggest future directions for further research
  • 29. Parent Involvement Study (Example) • - Deslandes & Bertrand (2005) used multiple regression • - Parent’s perceptions of student invitations significantly predicted parent involvement at home (beta = .44) • - p < .001 (high significance) • - This illustrates how statistical analysis guides understanding of relationships in educational research
• 30. Summary - Quantitative data analysis involves: - Preparing and organizing data - Analyzing (descriptive & inferential) - Reporting (with p-values, confidence intervals, effect sizes) - Interpreting (with context, limitations, future directions)