Section 8: DATA COLLECTION AND INSTRUMENTATION
Myra Vanessa C. Teofilo
Reporter
James L. Paglinawan, Ph.D.
Professor
Objectives
1. Recognize the importance of data gathering.
2. Identify the various data collection techniques and sources of data.
3. Distinguish primary from secondary data sources.
4. Describe the various instruments for data gathering.
5. Cite the advantages of the use of such instruments.
6. Recognize the limitations of certain research instruments.
7. Design instruments for data gathering.
Data Collection Techniques
Instruments for Data Gathering
Use of Existing Data
Questioning
Interview
Opinionnaire
Observation
Reliability
Validity
Use of Existing Data
- know the various sources of existing data
Ex.
census data – NSO
statistics on health – DOH
employment statistics – DOLE
students’ attributes and characteristics – SCHOOL
Sources of Data
Primary Sources
- provide information collected for the first time
Ex. newspaper stories, personal letters, public documents, eyewitness verbal accounts…
Secondary Sources
- data collected previously and reported by some individual
- borrow the knowledge they contain from other sources
Advantages of Existing Data
o Provide all the data needed instantly
o Save time and money
o Data provided are already tabulated
Limitations
o Definitions of variables may not correspond to the researcher's own definitions
o Private files are difficult to access
o The quality of the data is difficult to evaluate
Questioning
- done through a self-administered questionnaire or an interview schedule
- a self-administered questionnaire may be mailed or personally administered
Advantages of Questioning
o Less expensive to administer
o Greater confidence in the respondent’s anonymity
o Less pressure on the respondents
Limitations
o The amount of information gathered is limited by the respondents’ availability, time, and interest span
o The researcher cannot probe into a topic if the topic is unclear to the respondent
o Problem of return (mailed questionnaires)
Guidelines for the Formulation of Questions
1. Define or qualify terms that could easily be misinterpreted.
2. Beware of double negatives.
3. Be careful of inadequate alternatives.
4. Double-barreled questions should be avoided.
5. Underline a word if you wish to indicate special emphasis.
6. When asking for ratings or comparisons, a point of reference is necessary.
7. Design questions that will give a complete response.
8. Phrase the questions so that they are appropriate for all respondents.
9. Questions must not suggest answers.
The questionnaire must be:
1. As short as possible
2. Attractive in appearance
3. Clear in its instructions
4. Easy to understand
Pretest the instrument
- try it out on a selected sample to guide modifications and revisions
Interview
- face-to-face interaction between two persons
- interviewer – the one who asks the questions
- interviewee/respondent – the one who supplies the information asked for
- interview schedule – the formal list of questions used in the interview
3 Basic Types of Interview
Scheduled-structured
- uses an instrument in which the questions are identical for every respondent
Nonscheduled-structured
- uses only guide questions for the interview
Nonscheduled
- does not use a pre-specified set of questions
- no direction from the interviewer
Guidelines for Interview
1. First establish rapport with the respondent.
2. If a scheduled-structured interview is used, ask each question precisely as specified in the schedule.
3. Conduct the interview in an informal and relaxed atmosphere.
4. Questions that are misinterpreted or misunderstood should be repeated and clarified.
5. Responses should be recorded exactly as stated.
6. Thank the respondent and make an appointment for a possible call back.
Opinionnaire
- an instrument that measures the attitudes or beliefs of an individual
SEMANTIC DIFFERENTIAL SCALE
- used to find the meanings that objects and people possess
- consists of a number of paired adjectives, opposite in meaning, with seven blanks between them
LIKERT SCALE
- another measure of attitudes, feelings, and behavior
- the most commonly used attitude scale in educational research
SEMANTIC DIFFERENTIAL SCALE
Determine the students’ attitudes toward mathematics.
Directions: Complete each statement by using the descriptors. Mark √ (check) on the part of the scale that most closely matches your description.
AS A MATHEMATICS STUDENT, I AM
Industrious ___:___:___:___:___:___:___ Lazy
Interested ___:___:___:___:___:___:___ Bored
Passive ___:___:___:___:___:___:___ Active
Illogical ___:___:___:___:___:___:___ Logical
Well-behaved ___:___:___:___:___:___:___ Disorderly
[Sample check marks (√) appeared on each row of the original slide; their positions are not recoverable from the text.]
The sample answers mean that the respondent is not very lazy, generally interested, very active, not very illogical, and very well-behaved.
LIKERT SCALE
ATTITUDES TOWARD MATHEMATICS
Directions: Read each statement carefully. Write
SA : if you STRONGLY AGREE with the statement,
A : if you AGREE with the statement,
U : if you are UNCERTAIN,
D : if you DISAGREE with the statement,
SD : if you STRONGLY DISAGREE with the statement.
Write your response on the separate answer sheet provided.
1. Mathematics is a subject I am afraid of.
2. When I work with mathematics problems, my thinking and reasoning are
sharpened.
3. I feel excited learning mathematics.
4. Learning mathematics makes me feel bored.
5. Mathematics is the subject I greatly enjoy.
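One way the five Likert items above might be turned into a single attitude score is sketched below. The slide does not give a scoring key, so the weights (SA = 5 down to SD = 1) and the decision to reverse-score the unfavorable statements (items 1 and 4) are assumptions based on common practice, and the names `WEIGHTS`, `NEGATIVE_ITEMS`, and `score_likert` are hypothetical.

```python
# Assumed scoring key (not stated on the slide): SA=5 ... SD=1.
WEIGHTS = {"SA": 5, "A": 4, "U": 3, "D": 2, "SD": 1}

# Assumption: items 1 ("afraid of") and 4 ("bored") are unfavorable
# statements, so their weights are reversed before summing.
NEGATIVE_ITEMS = {1, 4}


def score_likert(responses):
    """Return the total attitude score for one respondent.

    `responses` lists the answers to items 1..5 in order, e.g.
    ["SD", "A", "SA", "D", "A"].
    """
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        weight = WEIGHTS[answer]
        if item_number in NEGATIVE_ITEMS:
            weight = 6 - weight  # reverse-score: 5<->1, 4<->2, 3 stays 3
        total += weight
    return total


if __name__ == "__main__":
    print(score_likert(["SD", "A", "SA", "D", "A"]))
```

Under this assumed key, a higher total indicates a more favorable attitude toward mathematics; the possible range for five items is 5 to 25.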
Observation
- a process whereby the researcher watches the research situation
- used when respondents are unwilling to express themselves verbally
- most appropriate in research involving teaching-learning conditions, interactions, physical behavior, and group interaction
Guidelines
Advantages & Limitations
Methods
Guidelines
1. The observation scheme must be carefully planned. The observer is usually equipped with either a structured or an unstructured observation guide.
2. The observer must be objective.
3. The observer must be able to separate facts from interpretation of the facts.
4. Observations must be carefully and expertly recorded.
Advantages
1. Most direct means of studying a wide variety of phenomena.
2. Fewer subjects are under observation, which permits recording of data.
Limitations
1. People may deliberately try to create favorable or unfavorable impressions when they know they are being observed.
2. Limited by the duration of events.
3. Unforeseeable factors, such as weather conditions, may interfere with observational tasks.
Methods
TESTS
- a systematic procedure in which the individual being tested is presented with a set of constructed stimuli to which he or she responds
- ex. intelligence tests, aptitude tests, achievement tests, and personality tests
SCALE
- a set of symbols or numerals so constructed that the symbols or numerals can be assigned by rule to the individuals (or their behavior) to whom the scale is applied
- ex. attitude scales such as the Likert scale, the Thurstone scale (equal-appearing interval scale), and the Guttman scale (cumulative)
Reliability
- an estimate of the degree to which a measurement is free of random or unstable error
- the extent to which a test is dependable, stable, and self-consistent
- a high degree of reliability means good measurement and evaluation
3 Approaches to Reliability
Stability
Equivalence
Internal Consistency
Stability
- one can secure consistent results with repeated measurements of the same person with the same instrument
- Method: TEST-RETEST
- the same test or instrument is administered twice to the same group of subjects
Example
Student   Score on the 1st administration   Score on the 2nd administration
1         74                                78
2         56                                51
3         87                                87
4         90                                92
5         76                                80
6         66                                69
7         83                                88
8         92                                95
9         75                                75
10        80                                82
- The correlation is 0.98, which means the test is reliable.
Reliability Indices & Interpretation:
0.6 and above – Reliable
Below 0.6 – Not reliable
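The test-retest figure can be checked with a short sketch. The slide does not say which correlation coefficient was used; the code below assumes the Pearson product-moment correlation, which reproduces the reported 0.98 for the ten pairs of scores. The function names `pearson_r` and `interpret` are hypothetical.

```python
from math import sqrt

# Scores from the slide's example (students 1-10).
first_administration = [74, 56, 87, 90, 76, 66, 83, 92, 75, 80]
second_administration = [78, 51, 87, 92, 80, 69, 88, 95, 75, 82]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def interpret(r, cutoff=0.6):
    """Apply the slide's rule: 0.6 and above is reliable."""
    return "Reliable" if r >= cutoff else "Not reliable"


if __name__ == "__main__":
    r = pearson_r(first_administration, second_administration)
    print(round(r, 2), interpret(r))  # 0.98 Reliable
```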
Limitations
1. When the time interval is short, the subjects may recall their previous responses, and this tends to make the correlation coefficient high.
2. When the time interval is long, such factors as unlearning, forgetting, and so on may occur and result in a low correlation for the test.
3. Regardless of the time interval between the two test administrations, environmental conditions such as temperature, lighting, and noise may affect the correlation of the instrument.
Equivalence
- concerns how much error may be introduced by different investigators (in observations) or by different samples of items being studied (in questioning or scales)
- the major interest is how well a given set of items will categorize individuals
- GOOD EQUIVALENCY if a person is classified the same way by each test
- Method: PARALLEL FORMS (ALTERNATE/EQUIVALENT FORMS)
- both forms are administered to the same group of subjects
- criterion of parallelism – the two forms of the test must be constructed so that the content, type of item, difficulty, and instructions for administration are similar but not identical
- ex. Convert 3,000 grams to kilograms (Form A)
Convert 3 kilograms to grams (Form B)
Internal Consistency
- requires only one administration of a test in order to assess consistency or homogeneity among the items
Methods
Split-Half Method
Kuder-Richardson Method
1. K-R 20
2. K-R 21
Cronbach’s Coefficient Alpha
Split-Half Method
- used when the measuring tool has many similar statements or questions to which the subject can respond
- after the administration, results are separated by item into even and odd numbers or into two randomly selected halves
- the test is highly reliable when the correlation coefficient is very high
- SPEARMAN-BROWN PROPHECY FORMULA – used to correct or adjust for the effect of test length and to estimate the reliability of the whole test
$$ r_w = \frac{2 r_h}{1 + r_h} $$
where,
r_w = the correlation for the whole test
r_h = the correlation between the two halves of the test
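The split-half procedure can be sketched as follows, assuming the standard Spearman-Brown correction for a test split into two halves, r_w = 2 r_h / (1 + r_h), an odd/even item split, and the Pearson correlation between the half scores. The item matrix is hypothetical illustration data (rows are examinees, columns are items scored 1 or 0), and the function names are not from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def spearman_brown(r_h):
    """Estimate whole-test reliability from the half-test correlation."""
    return 2 * r_h / (1 + r_h)


def split_half_reliability(item_scores):
    """Return (r_h, r_w) using an odd/even split of the items."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    r_h = pearson_r(odd_half, even_half)
    return r_h, spearman_brown(r_h)


# Hypothetical data: 5 examinees, 4 items.
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    r_h, r_w = split_half_reliability(item_scores)
    print(round(r_h, 3), round(r_w, 3))
```

The corrected value r_w is higher than r_h because each half is only half as long as the full test, and shorter tests tend to yield lower reliability coefficients.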
Kuder-Richardson Methods
- the most widely accepted methods for estimating reliability
- measure the extent to which items within one form of the test have as much in common with one another as do the items in that one form with corresponding items in an equivalent form
- normally yield higher estimates of reliability
KUDER-RICHARDSON FORMULA 20 (K-R 20)
- advisable to use if the p values (proportion of correct responses to a particular item) vary a lot
KUDER-RICHARDSON FORMULA 21 (K-R 21)
- advisable to use if the items do not vary much in difficulty
- that is, the p values (proportion of correct responses to a particular item) are more or less similar
KUDER-RICHARDSON FORMULA 20 (K-R 20)
$$ KR_{20} = \frac{n}{n-1}\left(1 - \frac{\sum pq}{s^2}\right) $$
where,
n = the number of test items
Σ = summation
p = the proportion of correct responses to a particular item
q = 1 - p
s^2 = the variance of the scores on the test
KUDER-RICHARDSON FORMULA 21 (K-R 21)
$$ KR_{21} = \frac{n}{n-1}\left(1 - \frac{\bar{X}(n - \bar{X})}{n s^2}\right) $$
where,
n = the number of test items
\bar{X} = the mean score of the test
s^2 = the variance of the scores on the test
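A minimal sketch of both Kuder-Richardson estimates follows, assuming the standard textbook forms KR20 = (n/(n-1))(1 - Σpq/s²) and KR21 = (n/(n-1))(1 - X̄(n - X̄)/(n s²)). The slides do not say whether s² is the population or sample variance; the population variance is assumed here. The item matrix is hypothetical illustration data (rows are examinees, columns are items scored 1 for correct and 0 for incorrect).

```python
from statistics import mean, pvariance


def kr20(item_scores):
    """K-R 20 from a matrix of right/wrong (1/0) item scores."""
    n = len(item_scores[0])                      # number of test items
    totals = [sum(row) for row in item_scores]   # each examinee's total score
    s2 = pvariance(totals)                       # assumed: population variance
    sum_pq = 0.0
    for j in range(n):
        p = mean([row[j] for row in item_scores])  # proportion correct on item j
        sum_pq += p * (1 - p)                      # q = 1 - p
    return (n / (n - 1)) * (1 - sum_pq / s2)


def kr21(item_scores):
    """K-R 21, which needs only the item count, mean, and variance."""
    n = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    x_bar = mean(totals)
    s2 = pvariance(totals)
    return (n / (n - 1)) * (1 - x_bar * (n - x_bar) / (n * s2))


# Hypothetical data: 5 examinees, 4 items of varying difficulty.
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    print(round(kr20(item_scores), 3), round(kr21(item_scores), 3))
```

In this illustration the p values range from 0.2 to 0.8, so K-R 21 (which treats all items as equally difficult) comes out lower than K-R 20, which is consistent with the advice to prefer K-R 20 when item difficulty varies a lot.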
Cronbach’s Coefficient Alpha
- applicable to multiple-score tests: those whose items are not scored right or wrong, but on which the respondent receives a different numerical score for an item depending on the choice made: “SA”, “A”, “U”, “D”, “SD”
- advisable for essay items, problem solving, five-point scaled items (Likert), and the semantic differential scale
$$ \alpha = \frac{n}{n-1}\left(1 - \frac{\sum s_i^2}{s^2}\right) $$
where,
n = the number of test items
s_i^2 = the variance of a single test item
s^2 = the variance of the scores on the test
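Coefficient alpha can be sketched in the same way, assuming the standard form α = (n/(n-1))(1 - Σs_i²/s²) and, as an assumption not stated on the slide, population variances throughout. The response matrix is hypothetical illustration data: rows are respondents, columns are Likert-type items already converted to numerical scores.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's coefficient alpha for a respondents-by-items score matrix."""
    n = len(item_scores[0])                          # number of test items
    # Variance of each single item across respondents (s_i^2).
    item_variances = [
        pvariance([row[j] for row in item_scores]) for j in range(n)
    ]
    # Variance of the respondents' total scores (s^2).
    total_variance = pvariance([sum(row) for row in item_scores])
    return (n / (n - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents, 3 items scored 1-5.
responses = [
    [4, 5, 4],
    [3, 4, 3],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(responses), 3))
```

For items scored only 1 or 0, each item variance equals pq, so under these assumptions alpha gives the same value as K-R 20.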
Increasing Reliability
- reliability increases when external sources of variation are minimized and the conditions are standardized
Factors that affect reliability:
1. The coefficient will be greater for…
2. The coefficient will be lower for…
The coefficient will be greater for…
1. A long test.
2. A test over homogeneous content rather than heterogeneous content.
3. A set of scores from a group of examinees with a wide ability range, which causes a wide achievement score range, rather than from a group whose members are much alike.
4. A test composed of well-written and appropriate items.
5. Measures with few scoring errors rather than measures that vary from test to test or paper to paper because of the scoring procedure alone.
6. Test scores obtained under proper testing conditions from students with optimum motivation.
The coefficient will be lower for…
1. A test that is too short.
2. Content that is heterogeneous.
3. A test of poorly written items.
4. A test with poor format.
5. Poor testing conditions.
6. Many items of very low or very high difficulty.
7. A test that has been scored with many errors.
8. A low level of motivation for examinees.
Validity
- the extent to which an instrument measures what it claims to measure
- the inferences made from it are appropriate, meaningful, and useful
3 categories
Content Validity
Criterion-related Validity
Construct Validity
Content Validity
- the degree to which the questions and items on a test are representative of the universe of behavior the test was designed to sample
- if the specified items on the test are representative of all possible items, the test is valid
Criterion-related Validity
- the ability of the test to predict performance on another measure
- Test – predictor
- Validation measure – criterion
- predicting future performance
Ex. CMUCAT Score
- used to select students for admission to the university
- used to predict the likelihood of succeeding in college
PREDICTIVE VALIDITY STUDIES
- criterion measure is obtained in
the future
- a relevant criterion for a college entrance examination would be the freshman-year grade point average
CONCURRENT VALIDITY STUDIES
- the correlation between test scores and a current criterion measure is determined
- Ex. if we want to find out whether the CMUCAT predicts college GPA, correlate CMUCAT scores with GPA
Construct Validity
- CONSTRUCT – an intangible quality or trait in which individuals differ (mathematics, teacher, school)
- constructs are inferred from behavior, attitudes, and feelings
- CONSTRUCT VALIDITY – the appropriateness of these inferences about the underlying construct
Face Validity
- done by examining the physical appearance of the instrument
- the instrument should appear valid to test users, examiners, and especially the examinees
Editor's Notes

  • #35: Comparing psychological traits that theoretically influence scores