Validity and Reliability in Research
•Agenda
At the end of this lesson, you should be able to:
1. Discuss validity
2. Discuss reliability
3. Discuss validity in qualitative research
4. Discuss validity in experimental design
5. Discuss how to achieve validity and reliability
 The consistency of scores or answers from one administration of
an instrument to another, or from one set of items to another.
 A reliable instrument yields similar results if given to a similar
population at different times.
Reliability
 Appropriateness, meaningfulness, correctness, and
usefulness of the inferences a researcher makes.
 Validity of what?
 The instrument?
 The data?
Validity
Validity
• Internal validity is the extent to which research findings are free
from bias and the effects of extraneous variables
• External validity is the extent to which the findings can be
generalised
• Content-related evidence of validity focuses on the content and
format of an instrument.
• Is it appropriate?
• Comprehensive?
• Is it logical?
• How well do the items or questions represent the content? Is the
format appropriate?
Validity - Content-related evidence
This refers to the relationship between the scores obtained using
the instrument and the scores obtained using one or more
other instruments or measures. For example, are students’
scores on teacher-made tests consistent with their scores on
standardized tests in the same subject areas?
Validity - Criterion-related evidence
Construct validity is defined as “establishing correct operational
measures for the concepts being studied” (Yin, 1984).
For example, if one is looking at problem solving in leaders, how
well does a particular instrument explain the relationship
between being able to problem solve and effectiveness as a
leader?
Validity - Construct-related evidence
ATTAINING VALIDITY AND
RELIABILITY
 Adequacy: the size and scope of the questions must be large
enough to cover the topic.
 Format of the instrument: Clarity of printing, type size,
adequacy of work area, appropriateness of language, clarity of
directions, etc.
Elements of content-related evidence
 Consult other experts who rate the items.
 Rate items, eliminating or changing those that do not meet the
specified content.
 Repeat until all raters agree on the questions and answers.
How to achieve content validity
To obtain criterion-related validity, researchers identify a
characteristic, assess it using one instrument (e.g., IQ test) and
compare the score with performance on an external measure,
such as GPA or an achievement test.
 A validity coefficient is obtained by correlating a set of scores on
one test (a predictor) with a set of scores on another (the
criterion).
 The degree to which the predictor and the criterion relate is the
validity coefficient. A predictor that has a strong relationship to
a criterion test would have a high coefficient.
Criterion-related validity
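The validity coefficient described above is simply the Pearson correlation between a set of predictor scores and a set of criterion scores. A minimal sketch in Python (the IQ and GPA figures below are made up purely for illustration):

```python
import math

def validity_coefficient(predictor, criterion):
    """Pearson correlation between predictor scores and criterion scores.

    A coefficient near 1.0 indicates a strong predictor-criterion
    relationship; values near 0 indicate little relationship.
    """
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    # Covariance term (numerator) and the two sum-of-squares terms
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, criterion))
    ss_x = math.sqrt(sum((x - mean_x) ** 2 for x in predictor))
    ss_y = math.sqrt(sum((y - mean_y) ** 2 for y in criterion))
    return cov / (ss_x * ss_y)

# Hypothetical example: IQ-test scores (predictor) vs. GPA (criterion)
iq = [95, 110, 102, 130, 88, 118]
gpa = [2.8, 3.4, 3.0, 3.9, 2.5, 3.6]
print(round(validity_coefficient(iq, gpa), 2))  # ≈ 0.99 (a high coefficient)
```

In practice one would use a statistics package (e.g., `scipy.stats.pearsonr`) rather than hand-rolling the formula; the sketch only shows what the coefficient measures.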
 This type of validity is more typically associated with research
studies than testing.
 It relates to psychological traits, so multiple sources are used to
collect evidence. Oftentimes a combination of observation,
surveys, focus groups, and other measures is used to identify
how much of the trait being measured is possessed by the
person being observed.
Construct-related validity
Example construct: proactive coping skills
The consistency of scores obtained from one
instrument to another, or from the same
instrument over different groups.
Reliability
 Every test or instrument has errors of measurement
associated with it.
 These can be due to a number of things: testing conditions,
student health or motivation, test anxiety, etc.
 Instrument/test developers work hard to try to ensure that their
errors are not grounded in flaws with the instrument/test itself.
Errors of measurement
 Test-retest: Same test to same group
 Equivalent-forms: A different form of the same instrument is
given to the same group of individuals
 Internal consistency: Split-half procedure
 Kuder-Richardson: Mathematically computes reliability from
the number of items, the mean, and the standard deviation of the test.
Reliability Methods
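The Kuder-Richardson estimate listed above uses only the number of items, the test mean, and the test standard deviation, which corresponds to the KR-21 formula. A small sketch (the score summary passed in is hypothetical):

```python
def kr21(k, mean, sd):
    """Kuder-Richardson formula 21: estimates the reliability of a test of
    k dichotomously scored items from the test mean and standard deviation
    alone, without item-level data."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))

# Hypothetical 50-item test with mean score 40 and standard deviation 6
print(round(kr21(k=50, mean=40, sd=6), 2))  # ≈ 0.79
```

KR-21 assumes all items are of roughly equal difficulty; when item-level responses are available, KR-20 (or Cronbach's alpha) is the more precise choice.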
• Reliability coefficient - a number that tells us how likely one
instrument is to be consistent over repeated administrations
• Alpha or Cronbach’s alpha
• used on instruments where answers aren’t scored “right” or “wrong”.
It is often used to test the reliability of survey instruments.
Reliability coefficient
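Cronbach's alpha can be computed from the variances of the individual items and the variance of respondents' total scores. A minimal sketch, using made-up survey responses for illustration:

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score matrix: one row per respondent,
    one column per item (this layout is an assumption of the sketch)."""
    k = len(item_scores[0])  # number of items

    def variance(values):
        # Sample variance (n - 1 denominator), used consistently throughout
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_scores = [sum(row) for row in item_scores]
    return (k / (k - 1)) * (1 - sum(item_vars) / variance(total_scores))

# Five respondents answering a three-item Likert-style survey (hypothetical)
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(responses), 2))  # ≈ 0.91
```

An alpha around 0.9 would suggest the three items are answered very consistently; values above roughly 0.7 are conventionally taken as acceptable for survey instruments.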
INTERNAL VALIDITY
•Validity
• The term validity is used in three ways:
• instrument or measurement validity
• external or generalization validity
• internal validity, which means that the relationship a
researcher observes between two variables should
be clear in its meaning rather than due to
some other, unintended factor (“something else”)
• Any one (or more) of these conditions:
• Age or ability of subjects
• Conditions under which the study was conducted
• Type of materials used in the study
• Technically, the “something else” is called a threat to internal validity.
What is “something else”?
• Subject characteristics
• Loss of subjects
• Location
• Instrumentation
• Testing
• History
• Maturation
• Attitude of subjects
• Implementation
Threats to internal validity
•Subject characteristics
• Subject characteristics can pose a threat if
there is selection bias, or if there are
unintended factors present within or among
groups selected for a study. For example, in
group studies, members may differ on the basis
of age, gender, ability, socioeconomic
background, etc. They must be controlled for
in order to ensure that the key variables in the
study, not these, explain differences.
•Subject characteristics
• Age, strength, maturity, gender, ethnicity, coordination, speed
• Intelligence, vocabulary, reading ability, fluency, manual
dexterity, socioeconomic status, religious/political beliefs
•Loss of subjects (mortality)
• Loss of subjects limits generalizability, but it can also
affect internal validity if the subjects who don’t
respond or participate are overrepresented in a
group.
•Location
• The place where data collection occurs (the
“location”) might pose a threat. For example,
hot, noisy, unpleasant conditions might affect
scores; situations where privacy is important
for the results, but where people are streaming
in and out of the room, might pose a threat.
• Decay: If the nature of the instrument or the scoring procedure
is changed in some way, instrument decay occurs.
• Data Collector Characteristics: The person collecting data can
affect the outcome.
• Data Collector Bias: The data collector might hold an opinion
that is at odds with the respondents’, and this affects the
administration.
Instrumentation
• In longitudinal studies, data are often collected through more
than one administration of a test.
• If the previous test influences subsequent ones by getting the
subject to engage in learning or some other behavior that he or
she might not otherwise have done, there is a testing threat.
Testing
• If an unanticipated or unplanned event occurs during a study
or intervention, there might be a history threat.
History
• Sometimes the very fact of being studied influences subjects.
The best known example of this is the Hawthorne Effect.
Attitude of subjects
• This threat can be caused by various things; different data
collectors, teachers, conditions in treatment, method bias, etc.
Implementation
• Standardize conditions of study
• Obtain more information on subjects
• Obtain as much information on details of the study: location,
history, instrumentation, subject attitude, implementation
• Choose an appropriate design
• Train data collectors
Minimizing Threats
Qualitative Research
Validity and reliability??
• Many qualitative researchers
contend that validity and
reliability are irrelevant to their
work because they study one
phenomenon and don’t seek to
generalize
• Fraenkel and Wallen - any
instrument or design used to
collect data should be credible
and backed by evidence
consistent with quantitative
studies.
• Trustworthiness
•Qualitative research
•Quantitative vs. Qualitative

Traditional criteria for judging    Alternative criteria for judging
quantitative research               qualitative research
--------------------------------    --------------------------------
Internal validity                   Credibility
External validity                   Transferability
Reliability                         Dependability
Objectivity                         Confirmability
In qualitative research
• Reliability pertains to the extent to which the study is
replicable, and to how accurately the research methods and
techniques produce data
• Objectivity of the researcher - researcher must look at her bias
and preconceived notions of what she will find before she
begins her research.
• Objectivity of the interviewee
• Triangulation
• Member check
• Audit trail
In qualitative research
Let’s look at one particular design
Validity in experimental research
Experimental designs should be developed to ensure the
internal and external validity of the study.
Internal Validity:
• Are the results of the study (DV) caused by the factors
included in the study (IV), or are they caused by other
factors (EV) that were not part of the study?
(Selection Bias/Differential Selection) -- The groups may have been
different from the start. If you were testing instructional strategies to
improve reading and one group enjoyed reading more than the
other group, they may improve more in their reading because they
enjoy it, rather than the instructional strategy you used.
Subject
Characteristics
Threats to Internal Validity
(Mortality) -- All of the high- or low-scoring subjects may
have dropped out or were missing from one of the
groups. If we collected posttest data on a day when the
debate society was on a field trip, the mean for the
treatment group would probably be much lower than it
really should have been.
Loss of Subjects
Threats to Internal Validity
Perhaps one group was at a
disadvantage because of their
location. The city may have been
demolishing a building next to one of
the schools in our study and there are
constant distractions which interfere
with our treatment.
Location
Threats to Internal Validity
The testing instruments may not be scored similarly.
Perhaps the person grading the posttest is fatigued
and pays less attention to the last set of papers
reviewed. It may be that those papers are from one
of our groups and will receive different scores than
the earlier group's papers.
Threats to Internal Validity
Instrumentation
Instrument Decay
The subjects of one group may react differently to the data collector
than the other group. A male interviewing males and females about
their attitudes toward a type of math instruction may not receive the
same responses from females as a female interviewing females would.
Threats to Internal Validity
Data Collector
Characteristics
The person collecting data may favor one group, or some
characteristic some subjects possess, over another. A principal
who favors strict classroom management may rate students'
attention under different teaching conditions with a bias toward
one of the teaching conditions.
Threats to Internal Validity
Data Collector Bias
The act of taking a pretest or posttest may influence the results of the
experiment. Suppose we were conducting a unit to increase student
sensitivity to racial prejudice. As a pretest we have the control and
treatment groups watch a movie on racism and write a reaction essay.
The pretest may have actually increased both groups' sensitivity and we
find that our treatment group didn't score any higher on a posttest given
later than the control group did. If we hadn't given the pretest, we might
have seen differences in the groups at the end of the study.
Threats to Internal Validity
Testing
Something may happen at one site during our study that influences the results.
Perhaps a classmate was injured in a car accident at the control site for a study
teaching children bike safety. The control group may actually demonstrate more
concern about bike safety than the treatment group.
Threats to Internal Validity
History
There may be natural changes in
the subjects that can account for
the changes found in a study. A
critical thinking unit may appear
more effective if it is taught during a
time when children are developing
abstract reasoning.
Threats to Internal Validity
Maturation
The subjects may respond differently just because they are being studied. The
name comes from a classic study in which researchers were studying the effect
of lighting on worker productivity. As the intensity of the factory lights increased,
so did the worker productivity. One researcher suggested that they reverse the
treatment and lower the lights. The productivity of the workers continued to
increase. It appears that being observed by the researchers was increasing
productivity, not the intensity of the lights.
Threats to Internal Validity
Hawthorne Effect
One group may feel that it is in competition with the other group and may work
harder than they would under normal circumstances. This generally is applied to
the control group "taking on" the treatment group.
Threats to Internal Validity
John
Henry
Effect
The control group may become discouraged because it is not
receiving the special attention that is given to the treatment
group. They may perform lower than usual because of this.
Threats to Internal Validity
Resentful
Demoralization of
the Control Group
(Statistical Regression) -- A class that scores particularly low can
be expected to score slightly higher just by chance. Likewise, a
class that scores particularly high will have a tendency to score
slightly lower by chance. The change in these scores may have
nothing to do with the treatment.
Threats to Internal Validity
Regression
The treatment may not be implemented as intended. A
study where teachers are asked to use student modeling
techniques may not show positive results, not because
modeling techniques don't work, but because the teacher
didn't implement them or didn't implement them as they
were designed.
Threats to Internal Validity
Implementation
Threats to Internal Validity
Compensatory
Equalization of
Treatment
Someone may feel sorry for the control group because they
are not receiving much attention and give them special
treatment. For example, a researcher could be studying the
effect of laptop computers on students' attitudes toward
math. The teacher feels sorry for the class that doesn't have
computers and sponsors a popcorn party during math
class. The control group begins to develop a more positive
attitude about mathematics.
Experimental Treatment
Diffusion
Threats to Internal Validity
Sometimes the control group actually
implements the treatment. If two different
techniques are being tested in two
different third grades in the same
building, the teachers may share what
they are doing. Unconsciously, the control
teacher may use some of the techniques he or she
learned from the treatment teacher.
Once the researchers are confident that the outcome (dependent
variable) of the experiment they are designing is the result of their
treatment (independent variable) [internal validity], they determine
to which people or situations the results of their study apply
[external validity].
External Validity:
• Are the results of the study generalizable to other
populations and settings?
• Population
• Ecological
Threats to External Validity (Population)
Population validity is the extent to which the results of a study can be
generalized from the specific sample that was studied to a larger group of
subjects. It involves the extent to which one can generalize from the study
sample to a defined population. If the sample is drawn from an accessible
population, rather than the target population, generalizing the research
results from the accessible population to the target population is risky.
Ecological validity is the extent
to which the results of an experiment can be generalized from the set
of environmental conditions created by the researcher to other
settings and conditions.
Threats to External Validity (Ecological)
There are 10 common
threats to external
validity.
(not sufficiently described for others to replicate) If the
researcher fails to adequately describe how he or
she conducted a study, it is difficult to determine
whether the results are applicable to other
settings.
Threats to External Validity (Ecological)
Explicit description of
the experimental
treatment
(catalyst effect)
If a researcher were to apply several treatments,
it is difficult to determine how well each of the
treatments would work individually. It might be
that only the combination of the treatments is
effective.
Threats to External Validity (Ecological)
Multiple-treatment
interference
(attention causes differences)
Subjects perform differently because they know they
are being studied. "...External validity of the experiment
is jeopardized because the findings might not
generalize to a situation in which researchers or others
who were involved in the research are not present"
(Gall, Borg, & Gall, 1996, p. 475)
Threats to External Validity (Ecological)
Hawthorne effect
Threats to External Validity (Ecological)
(anything different makes a difference)
A treatment may work because it is novel and the subjects respond to the
uniqueness, rather than the actual treatment. The opposite may also occur,
the treatment may not work because it is unique, but given time for the
subjects to adjust to it, it might have worked.
Novelty and
disruption effect
(it only works with this experimenter)
The treatment might have worked because of the
person implementing it. Given a different person, the
treatment might not work at all.
Threats to External Validity (Ecological)
Experimenter effect
(pretest sets the stage)
A treatment might only work if a pretest is
given. Because they have taken a pretest, the
subjects may be more sensitive to the
treatment. Had they not taken a pretest, the
treatment would not have worked.
Threats to External Validity (Ecological)
Pretest sensitization
(posttest helps treatment "fall into place")
The posttest can become a learning experience. "For
example, the posttest might cause certain ideas presented
during the treatment to 'fall into place' “ . If the subjects had
not taken a posttest, the treatment would not have worked.
Threats to External Validity (Ecological)
Posttest sensitization
Interaction of
history and
treatment effect
Threats to External Validity (Ecological)
(...to everything there is a time...)
Not only should researchers be cautious about generalizing to other
populations; caution should also be taken in generalizing to a different time
period. As time passes, the conditions under which treatments work
change.
(maybe only works with M/C tests)
A treatment may only be evident with certain types of
measurements. A teaching method may produce
superior results when its effectiveness is tested with an
essay test, but show no differences when the
effectiveness is measured with a multiple choice test.
Threats to External Validity (Ecological)
Measurement of
the dependent
variable
Interaction of time
of measurement
and treatment
effect
Threats to External Validity (Ecological)
(it takes a while for the treatment to kick in)
It may be that the treatment effect does not occur until several weeks after the end
of the treatment. In this situation, a posttest at the end of the treatment would
show no impact, but a posttest a month later might show an impact.
NEXT WEEK
Consultation

More Related Content

PDF
Validity and reliability of the instrument
PPTX
Research instruments
PPTX
measurement Data collection
PPTX
Reliability
PPTX
Validity in Research
PPTX
Presentation on validity and reliability in research methods
PDF
reliablity and validity in social sciences research
PDF
Validity and reliability
Validity and reliability of the instrument
Research instruments
measurement Data collection
Reliability
Validity in Research
Presentation on validity and reliability in research methods
reliablity and validity in social sciences research
Validity and reliability

What's hot (20)

PPTX
Qualitative Research Method
PPTX
Analysis of data in research
PPTX
Introduction to NVivo
PPT
Qualitative Research Methods
PPSX
Tools in Qualitative Research: Validity and Reliability
PPT
Data collection and analysis
PPT
Qualitative research designs
PPTX
Qualitative Research in Education
PPT
Topic 1 introduction to quantitative research
PPTX
Data Analysis, Presentation and Interpretation of Data
PPT
Advanced research methods
PPTX
Mixed research-methods (1)
PPTX
Specifying a purpose, Purpose statement, Hypostheses and research questions
PPTX
Experimental research
PPTX
Quantitative reseach method
PPTX
Mixed method research
PPSX
Inferential statistics.ppt
PPT
The research instruments
PPTX
The Research Problem
PPTX
Mixed methods research in Education pptx
Qualitative Research Method
Analysis of data in research
Introduction to NVivo
Qualitative Research Methods
Tools in Qualitative Research: Validity and Reliability
Data collection and analysis
Qualitative research designs
Qualitative Research in Education
Topic 1 introduction to quantitative research
Data Analysis, Presentation and Interpretation of Data
Advanced research methods
Mixed research-methods (1)
Specifying a purpose, Purpose statement, Hypostheses and research questions
Experimental research
Quantitative reseach method
Mixed method research
Inferential statistics.ppt
The research instruments
The Research Problem
Mixed methods research in Education pptx
Ad

Viewers also liked (20)

PDF
8. validity and reliability of research instruments
PPT
Presentation Validity & Reliability
PPTX
Validity and Reliability
PPTX
Threats to internal and external validity
PPT
Validity, its types, measurement & factors.
PPT
Louzel Report - Reliability & validity
PPTX
Validity & reliability an interesting powerpoint slide i created
PPT
Reliability and validity
PPTX
Validity and reliability of questionnaires
PPT
Threats to Internal and External Validity
PPTX
Prof. dr. Rolf Fasting
PDF
Validity & Ethics in Research
PDF
Reliability, validity, generalizability and the use of multi-item scales
PPT
Reliability & validity
PPT
Reliability and validity1
PPT
Internal and external validity factors
PDF
Experimental Research
PPTX
Validity, reliability & practicality
PPTX
validity its types and importance
PPT
Reliability and validity
8. validity and reliability of research instruments
Presentation Validity & Reliability
Validity and Reliability
Threats to internal and external validity
Validity, its types, measurement & factors.
Louzel Report - Reliability & validity
Validity & reliability an interesting powerpoint slide i created
Reliability and validity
Validity and reliability of questionnaires
Threats to Internal and External Validity
Prof. dr. Rolf Fasting
Validity & Ethics in Research
Reliability, validity, generalizability and the use of multi-item scales
Reliability & validity
Reliability and validity1
Internal and external validity factors
Experimental Research
Validity, reliability & practicality
validity its types and importance
Reliability and validity
Ad

Similar to Week 9 validity and reliability (20)

PPTX
Establlishing Reliability-Validity.pptx
PPT
Validity and Reliabilty.ppt
PPTX
VALIDITY
PDF
Validity and Reliability.pdf
PDF
Validity and Reliability.pdf
PPT
15th batch NPTI Validity & Reliablity Business Research Methods
PPT
Chapter 8 compilation
PPTX
Data collection reliability
PPTX
VALIDITY OF DATA.pptx
PPT
23APR_NR_Data collection Methods_Part 3.ppt
PPT
23APR_NR_Data collection Methods_Part 3.ppt
PPT
Reliability and validity
PPT
Test characteristics
PPTX
VALIDITY.pptx.statistics.011111917181111
PPT
Validity, reliability & Internal validity in Researches
PPTX
week_10._validity_and_reliability_0.pptx
PPT
validityitstypesmeasurementfactors-130908120814- (1).ppt
PPTX
Validity and reliability (aco section 6a) sheena jayma msgs ed
PPTX
Validity & reliability seminar
PPTX
Threats to validity
Establlishing Reliability-Validity.pptx
Validity and Reliabilty.ppt
VALIDITY
Validity and Reliability.pdf
Validity and Reliability.pdf
15th batch NPTI Validity & Reliablity Business Research Methods
Chapter 8 compilation
Data collection reliability
VALIDITY OF DATA.pptx
23APR_NR_Data collection Methods_Part 3.ppt
23APR_NR_Data collection Methods_Part 3.ppt
Reliability and validity
Test characteristics
VALIDITY.pptx.statistics.011111917181111
Validity, reliability & Internal validity in Researches
week_10._validity_and_reliability_0.pptx
validityitstypesmeasurementfactors-130908120814- (1).ppt
Validity and reliability (aco section 6a) sheena jayma msgs ed
Validity & reliability seminar
Threats to validity

More from wawaaa789 (20)

DOCX
DOCX
Research proposal
PPTX
Week 10 apa powerpoint
PPT
Week 10 writing research proposal
DOCX
Transcript qualitative
PPTX
Week 7 spss 2 2013
PPT
Week 7 spss
PPT
Qualitative
PPT
Week 7 a statistics
PPTX
Week 8 sampling and measurements 2015
PDF
Survey design
PPT
Experimental
DOCX
Ethnography
PPT
Correlation case study
PPT
Causal comparative study
PPT
Case study research by maureann o keefe
DOCX
Research proposal 1
PPT
Week 4 variables and designs
PDF
Qual and quant
PDF
Kornfeld dissertation 12 15-09-1
Research proposal
Week 10 apa powerpoint
Week 10 writing research proposal
Transcript qualitative
Week 7 spss 2 2013
Week 7 spss
Qualitative
Week 7 a statistics
Week 8 sampling and measurements 2015
Survey design
Experimental
Ethnography
Correlation case study
Causal comparative study
Case study research by maureann o keefe
Research proposal 1
Week 4 variables and designs
Qual and quant
Kornfeld dissertation 12 15-09-1

Recently uploaded (20)

PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
Insiders guide to clinical Medicine.pdf
PPTX
Cell Structure & Organelles in detailed.
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
Lesson notes of climatology university.
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PDF
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PPTX
master seminar digital applications in india
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PDF
Complications of Minimal Access Surgery at WLH
PDF
01-Introduction-to-Information-Management.pdf
PDF
Basic Mud Logging Guide for educational purpose
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
Computing-Curriculum for Schools in Ghana
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
human mycosis Human fungal infections are called human mycosis..pptx
Insiders guide to clinical Medicine.pdf
Cell Structure & Organelles in detailed.
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Lesson notes of climatology university.
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
O5-L3 Freight Transport Ops (International) V1.pdf
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
master seminar digital applications in india
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
Complications of Minimal Access Surgery at WLH
01-Introduction-to-Information-Management.pdf
Basic Mud Logging Guide for educational purpose
Final Presentation General Medicine 03-08-2024.pptx
Computing-Curriculum for Schools in Ghana
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx

Week 9 validity and reliability

  • 2. •Agenda AT the end of this lesson, you should be able to: Discuss validity Discuss reliability Discuss validity in qualitative research Discuss validity in experimental design 1 2 5 3 4 Discuss how to achieve validity and reliability
  • 3.  The consistency of scores or answers from one administration of an instrument to another, or from one set of items to another.  A reliable instrument yields similar results if given to a similar population at different times. Reliability
  • 4.  Appropriateness, meaningfulness, correctness, and usefulness of inferences a researcher makes.  Validity of ??  Instrument?  Data? Validity
  • 5. Validity • Internal validity is the extent to which research findings are free from bias and effects • External validity is the extent to which the findings can be generalised
  • 6. • Content-related evidence of validity focuses on the content and format of an instrument. • Is it appropriate? • Comprehensive? • Is it logical? • How do the items or questions represent the content? Is the format appropriate? Validity - Content-related evidence
  • 7. This refers to the relationship between the scores obtained using the instrument and the scores obtained using one or more other instruments or measures. For example, are students’ scores on teacher made tests consistent with their scores on standardized tests in the same subject areas? Validity - Criterion-related evidence
  • 8. Construct validity is defined as “establishing correct operational measures for the concepts being studied” (Yin, 1984). For example, if one is looking at problem solving in leaders, how well does a particular instrument explain the relationship between being able to problem solve and effectiveness as a leader. Validity - Construct-related evidence
  • 10.  Adequacy : the size and scope of the questions must be large enough to cover the topic.  Format of the instrument: Clarity of printing, type size, adequacy of work area, appropriateness of language, clarity of directions, etc. Elements of content-related evidence
  • 11.  Consult other experts who rate the items.  Rate items, eliminating or changing those that do not meet the specified content.  Repeat until all raters agree on the questions and answers. How to achieve content validity
  • 12. To obtain criterion-related validity, researchers identify a characteristic, assess it using one instrument (e.g., IQ test) and compare the score with performance on an external measure, such as GPA or an achievement test.  A validity coefficient is obtained by correlating a set of scores on one test (a predictor) with a set of scores on another (the criterion).  The degree to which the predictor and the criterion relate is the validity coefficient. A predictor that has a strong relationship to a criterion test would have a high coefficient. Criterion-related validity
  • 13.  This type of validity is more typically associated with research studies than testing.  It relates to psychological traits, so multiple sources are used to collect evidence. Often times a combination of observation, surveys, focus groups, and other measures are used to identify how much of the trait being measured is possessed by the observee. Construct-related validity Proactive Coping Skills
  • 14. The consistency of scores obtained from one instrument to another, or from the same instrument over different groups. Reliability
  • 15.  Every test or instrument has associated with its errors of measurement.  These can be due to a number of things: testing conditions, student health or motivation, test anxiety, etc.  Instrument/test developers work hard to try to ensure that their errors are not grounded in flaws with the instrument/test itself. Errors of measurement
  • 16.  Test-retest: Same test to same group  Equivalent-forms: A different form of the same instrument is given to the same group of individuals  Internal consistency: Split-half procedure  Kuder-Richardson: Mathematically computes reliability from the # of items, the mean, and the standard deviation of the test. Reliability Methods
  • 17. • Reliability coefficient - a number that tells us how likely one instrument is to be consistent over repeated administrations • Alpha or Cronbach’s alpha • used on instruments where answers aren’t scored “right” and “wrong”. It is often used to test the reliability of survey instruments. Reliability coefficient
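Cronbach's alpha follows the same logic for items that are not scored "right" and "wrong", such as Likert-scale survey responses. A minimal sketch using invented ratings:

```python
# Hypothetical data: five respondents rating four Likert items (1-5);
# one row per respondent, one column per item.
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
]

def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of the item variances
    divided by the variance of the total scores)."""
    k = len(rows[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in rows]) for i in range(k)]
    total_var = var([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

print(round(cronbach_alpha(ratings), 3))
```

When the items "hang together" (respondents who agree with one item tend to agree with the others), the item variances are small relative to the total-score variance and alpha approaches 1, as it does for these made-up data.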
  • 19. •Validity • Validity can be used in three ways. • instrument or measurement validity • external or generalization validity • internal validity, which means that what a researcher observes between two variables should be clear in its meaning, rather than attributable to “something else”
  • 20. • Any one (or more) of these conditions: • Age or ability of subjects • Conditions under which the study was conducted • Type of materials used in the study • Technically, the “something else” is called a threat to internal validity. What is “something else”?
  • 21. • Subject characteristics • Loss of subjects • Location • Instrumentation • Testing • History • Maturation • Attitude of subjects • Implementation Threats to internal validity
  • 22. •Subject characteristics • Subject characteristics can pose a threat if there is selection bias, or if there are unintended factors present within or among groups selected for a study. For example, in group studies, members may differ on the basis of age, gender, ability, socioeconomic background, etc. These characteristics must be controlled for in order to ensure that the key variables in the study, and not these differences, explain the results.
  • 23. •Subject characteristics • Age • Intelligence • Strength • Vocabulary • Maturity • Reading ability • Gender • Fluency • Ethnicity • Manual dexterity • Coordination • Socioeconomic status • Speed • Religious/political beliefs
  • 24. •Loss of subjects (mortality) • Loss of subjects limits generalizability, but it can also affect internal validity if the subjects who don’t respond or participate are overrepresented in one group.
  • 25. •Location • The place where data collection occurs (the “location”) might pose a threat. For example, hot, noisy, unpleasant conditions might affect scores; situations where privacy is important for the results, but where people are streaming in and out of the room, might pose a threat.
  • 26. • Decay: If the nature of the instrument or the scoring procedure is changed in some way, instrument decay occurs. • Data Collector Characteristics: The person collecting data can affect the outcome. • Data Collector Bias: The data collector might hold an opinion that is at odds with respondents’, and this affects the administration. Instrumentation
  • 27. • In longitudinal studies, data are often collected through more than one administration of a test. • If the previous test influences subsequent ones by getting the subject to engage in learning or some other behavior that he or she might not otherwise have done, there is a testing threat. Testing
  • 28. • If an unanticipated or unplanned event occurs prior to a study or intervention, there might be a history threat. History
  • 29. • Sometimes the very fact of being studied influences subjects. The best known example of this is the Hawthorne Effect. Attitude of subjects
  • 30. • This threat can be caused by various things; different data collectors, teachers, conditions in treatment, method bias, etc. Implementation
  • 31. • Standardize conditions of study • Obtain more information on subjects • Obtain as much information on details of the study: location, history, instrumentation, subject attitude, implementation • Choose an appropriate design • Train data collectors Minimizing Threats
  • 33. • Many qualitative researchers contend that validity and reliability are irrelevant to their work because they study one phenomenon and don’t seek to generalize • Fraenkel and Wallen – any instrument or design used to collect data should be credible and backed by evidence, consistent with quantitative studies • Trustworthiness •Qualitative research
  • 34. •Quantitative vs. Qualitative • Traditional criteria for judging quantitative research → alternative criteria for judging qualitative research: • Internal validity → Credibility • External validity → Transferability • Reliability → Dependability • Objectivity → Confirmability
  • 35. In qualitative research • Reliability pertains to the extent to which the study is replicable and how accurately the research methods and techniques produce data • Objectivity of the researcher – the researcher must examine her bias and preconceived notions of what she will find before she begins her research • Objectivity of the interviewee
  • 36. • Triangulation • Member check • Audit trail In qualitative research
  • 37. Let’s look at one particular design Validity in experimental research
  • 38. Experimental Designs Should be Developed to Ensure Internal and External Validity of the Study
  • 39. Internal Validity: • Are the results of the study (DV) caused by the factors included in the study (IV) or are they caused by other factors (EV) which were not part of the study?
  • 40. (Selection Bias/Differential Selection) -- The groups may have been different from the start. If you were testing instructional strategies to improve reading and one group enjoyed reading more than the other group, they may improve more in their reading because they enjoy it, rather than the instructional strategy you used. Subject Characteristics Threats to Internal Validity
  • 41. (Mortality) -- All of the high or low scoring subjects may have dropped out or were missing from one of the groups. If we collected posttest data on a day when the debate society was on a field trip, the mean for the treatment group would probably be much lower than it really should have been. Loss of Subjects Threats to Internal Validity
  • 42. Perhaps one group was at a disadvantage because of their location. The city may have been demolishing a building next to one of the schools in our study and there are constant distractions which interfere with our treatment. Location Threats to Internal Validity
  • 43. The testing instruments may not be scored similarly. Perhaps the person grading the posttest is fatigued and pays less attention to the last set of papers reviewed. It may be that those papers are from one of our groups and will receive different scores than the earlier group's papers. Threats to Internal Validity Instrumentation Instrument Decay
  • 44. The subjects of one group may react differently to the data collector than the other group. A male interviewing males and females about their attitudes toward a type of math instruction may not receive the same responses from females as a female interviewing females would. Threats to Internal Validity Data Collector Characteristics
  • 45. The person collecting data may favor one group, or some characteristic some subjects possess, over another. A principal who favors strict classroom management may rate students' attention under different teaching conditions with a bias toward one of the teaching conditions. Threats to Internal Validity Data Collector Bias
  • 46. The act of taking a pretest or posttest may influence the results of the experiment. Suppose we were conducting a unit to increase student sensitivity to racial prejudice. As a pretest we have the control and treatment groups watch a movie on racism and write a reaction essay. The pretest may have actually increased both groups' sensitivity and we find that our treatment groups didn't score any higher on a posttest given later than the control group did. If we hadn't given the pretest, we might have seen differences in the groups at the end of the study. Threats to Internal Validity Testing
  • 47. Something may happen at one site during our study that influences the results. Perhaps a classmate was injured in a car accident at the control site for a study teaching children bike safety. The control group may actually demonstrate more concern about bike safety than the treatment group. Threats to Internal Validity History
  • 48. There may be natural changes in the subjects that can account for the changes found in a study. A critical thinking unit may appear more effective if it taught during a time when children are developing abstract reasoning. Threats to Internal Validity Maturation
  • 49. The subjects may respond differently just because they are being studied. The name comes from a classic study in which researchers were studying the effect of lighting on worker productivity. As the intensity of the factory lights increased, so did the worker productivity. One researcher suggested that they reverse the treatment and lower the lights. The productivity of the workers continued to increase. It appears that being observed by the researchers was increasing productivity, not the intensity of the lights. Threats to Internal Validity Hawthorne Effect
  • 50. One group may view that it is in competition with the other group and may work harder than they would under normal circumstances. This generally is applied to the control group "taking on" the treatment group. Threats to Internal Validity John Henry Effect
  • 51. The control group may become discouraged because it is not receiving the special attention that is given to the treatment group. They may perform lower than usual because of this. Threats to Internal Validity Resentful Demoralization of the Control Group
  • 52. (Statistical Regression) -- A class that scores particularly low can be expected to score slightly higher just by chance. Likewise, a class that scores particularly high, will have a tendency to score slightly lower by chance. The change in these scores may have nothing to do with the treatment. Threats to Internal Validity Regression
  • 53. The treatment may not be implemented as intended. A study where teachers are asked to use student modeling techniques may not show positive results, not because modeling techniques don't work, but because the teacher didn't implement them or didn't implement them as they were designed. Threats to Internal Validity Implementation
  • 54. Threats to Internal Validity Compensatory Equalization of Treatment Someone may feel sorry for the control group because they are not receiving much attention and give them special treatment. For example, a researcher could be studying the effect of laptop computers on students' attitudes toward math. The teacher feels sorry for the class that doesn't have computers and sponsors a popcorn party during math class. The control group begins to develop a more positive attitude about mathematics.
  • 55. Experimental Treatment Diffusion Threats to Internal Validity Sometimes the control group actually implements the treatment. If two different techniques are being tested in two different third grades in the same building, the teachers may share what they are doing. Unconsciously, the control teacher may use the techniques she or he learned from the treatment teacher.
  • 56. Once the researchers are confident that the outcome (dependent variable) of the experiment they are designing is the result of their treatment (independent variable) [internal validity], they determine for which people or situations the results of their study apply [external validity].
  • 57. External Validity: • Are the results of the study generalizable to other populations and settings? • Population • Ecological
  • 58. Threats to External Validity (Population) Population Validity is the extent to which the results of a study can be generalized from the specific sample that was studied to a larger group of subjects. It involves the extent to which one can generalize from the study sample to a defined population -- if the sample is drawn from an accessible population, rather than the target population, generalizing the research results from the accessible population to the target population is risky.
  • 59. Ecological Validity is the extent to which the results of an experiment can be generalized from the set of environmental conditions created by the researcher to other environmental conditions (settings and conditions). Threats to External Validity (Ecological) There are 10 common threats to external validity.
  • 60. (not sufficiently described for others to replicate) If the researcher fails to adequately describe how he or she conducted a study, it is difficult to determine whether the results are applicable to other settings. Threats to External Validity (Ecological) Explicit description of the experimental treatment
  • 61. (catalyst effect) If a researcher were to apply several treatments, it is difficult to determine how well each of the treatments would work individually. It might be that only the combination of the treatments is effective. Threats to External Validity (Ecological) Multiple-treatment interference
  • 62. (attention causes differences) Subjects perform differently because they know they are being studied. "...External validity of the experiment is jeopardized because the findings might not generalize to a situation in which researchers or others who were involved in the research are not present" (Gall, Borg, & Gall, 1996, p. 475) Threats to External Validity (Ecological) Hawthorne effect
  • 63. Threats to External Validity (Ecological) (anything different makes a difference) A treatment may work because it is novel and the subjects respond to the uniqueness, rather than the actual treatment. The opposite may also occur, the treatment may not work because it is unique, but given time for the subjects to adjust to it, it might have worked. Novelty and disruption effect
  • 64. (it only works with this experimenter) The treatment might have worked because of the person implementing it. Given a different person, the treatment might not work at all. Threats to External Validity (Ecological) Experimenter effect
  • 65. (pretest sets the stage) A treatment might only work if a pretest is given. Because they have taken a pretest, the subjects may be more sensitive to the treatment. Had they not taken a pretest, the treatment would not have worked. Threats to External Validity (Ecological) Pretest sensitization
  • 66. (posttest helps treatment "fall into place") The posttest can become a learning experience. For example, the posttest might cause certain ideas presented during the treatment to "fall into place". If the subjects had not taken a posttest, the treatment would not have worked. Threats to External Validity (Ecological) Posttest sensitization
  • 67. Interaction of history and treatment effect Threats to External Validity (Ecological) (...to everything there is a time...) Not only should researchers be cautious about generalizing to other populations; caution should also be taken in generalizing to a different time period. As time passes, the conditions under which treatments work change.
  • 68. (maybe only works with M/C tests) A treatment may only be evident with certain types of measurements. A teaching method may produce superior results when its effectiveness is tested with an essay test, but show no differences when the effectiveness is measured with a multiple choice test. Threats to External Validity (Ecological) Measurement of the dependent variable
  • 69. Interaction of time of measurement and treatment effect Threats to External Validity (Ecological) (it takes a while for the treatment to kick in) It may be that the treatment effect does not occur until several weeks after the end of the treatment. In this situation, a posttest at the end of the treatment would show no impact, but a posttest a month later might show an impact.