H. Douglas Brown
Chapter 2 : Principles of Language Assessment
Presentation by: Hamid Najaf Pour Sani
Ph.D. Candidate
Dr. Saeedi
Outline
Reliability
Validity
Practicality
Authenticity
Washback
Reliability
Reliability is the extent to which a test produces consistent scores across different administrations to the same or a similar group of examinees. It is commonly associated with dependability, stability, consistency, predictability, and accuracy.
Obtained Score = Real Ability + Other Factors
Accordingly, reliability can be defined as the extent to which a test is free of measurement error. An obtained score is only a partial representation of the true score (real ability), because factors other than the ability being tested also affect performance.
Therefore, in a reliable test, true score variance makes up most of the observed score variance and error score variance is small. If the test is error free, the observed score equals the true score.
Classical True Score Measurement
X = Xt + Xe
X = Observed Score
Xt = True Score
Xe = Error Score
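The relation X = Xt + Xe can be illustrated with a small simulation. This is a minimal sketch that is not part of the original slides: the function names, the score mean, and the standard deviations are hypothetical, and it assumes reliability is summarized as the share of observed score variance that comes from true scores, which is the usual classical reading of the slide's statement about variances.

```python
import random
import statistics


def simulate_scores(n_examinees=1000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = Xt + Xe for a group of examinees (hypothetical values)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_examinees)]
    error_scores = [rng.gauss(0.0, error_sd) for _ in range(n_examinees)]
    # Each observed score is the true score plus an error component.
    observed = [t + e for t, e in zip(true_scores, error_scores)]
    return true_scores, error_scores, observed


def reliability(true_scores, observed):
    """Share of observed score variance that is true score variance."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


true_scores, error_scores, observed = simulate_scores()
noisy = reliability(true_scores, observed)

# An error-free test: every Xe is 0, so the observed scores equal the true scores.
error_free = reliability(true_scores, true_scores)

print(f"reliability with error: {noisy:.2f}")
print(f"reliability without error: {error_free:.2f}")
```

Raising `error_sd` in this sketch lowers the ratio, which mirrors the point that more error variance means a less reliable test.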
Reliability
Reliability falls into 4 kinds, each named for a source of inconsistency:
1). Student-related Reliability: It refers to psychological and physical factors, including a “bad day”, anxiety, illness, fatigue, and the test taker’s “test wiseness”, which can make an observed score deviate from one’s true score.
2). Rater Reliability: It falls into 2 categories.
A). Inter-Rater Reliability: Two or more scorers yield inconsistent scores of the same test.
B). Intra-Rater Reliability: A single scorer is inconsistent because of unclear scoring criteria, bias, or carelessness.
3). Test Administration Reliability: It basically springs from the conditions in which the test is administered: a noisy class, the amount of light, the chairs, and so on.
4). Test Reliability: It springs from the nature of the test itself. The test should fit into the time constraints, should be neither too long nor too short, and its items should be clear.
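Inter-rater consistency can be checked numerically. The slides do not name a statistic, so the sketch below is an assumption: it uses simple percent agreement and Cohen's kappa (agreement corrected for chance), and the band scores for the two raters are made up for illustration.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of scripts on which the two raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same category
    # if each scored independently at their own category rates.
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical band scores (1 to 5) given by two raters to the same ten essays.
rater_a = [3, 4, 2, 5, 3, 4, 1, 2, 3, 4]
rater_b = [3, 4, 2, 4, 3, 4, 1, 3, 3, 4]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

The same functions could be applied to one rater's scores on two occasions as a rough check of intra-rater consistency.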
Validity
Validity is the extent to which a test measures what it is intended to measure; one aspect is the degree of correspondence between the test content and the content of the material to be tested.
Ex: A valid test of reading ability actually measures reading ability itself, not previous knowledge.
5 Ways to Establish Validity
1). Content Validity
2). Criterion Validity
3). Construct Validity
4). Consequential Validity
5). Face Validity
Validity
1). Content Validity: If a test actually samples the subject matter about which conclusions are to be drawn, it can claim content-related evidence of validity.
Ex: A speaking test that requires oral production samples the target ability; a multiple-choice test of speaking does not.
Direct Testing: It requires the test takers to perform the target task directly.
Indirect Testing: Learners are required to perform tasks that are only indirectly related to the target task.
2). Criterion Validity: It is the extent to which performance on a test is related to a criterion that is an indicator of the ability being tested. The criterion may be individuals’ performance on another test or even a known standard.
Concurrent Validity: A test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself.
Predictive Validity: It is the extent to which a test predicts a student’s likelihood of future success.
3). Construct Validity: It is the extent to which a test measures just the construct it is supposed to measure.
4). Consequential Validity: It refers to the positive or negative consequences of a particular test. Consequences include its impact on the preparation of test takers, its effect on learners, its social consequences, and washback.
5). Face Validity: It is the extent to which the measurement method “on its face” appears to measure the particular ability. It is generally based on the subjective judgment of the examinees.
[Other example labels on the slide: TOEFL, Depression]
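Criterion-related evidence is often reported as a correlation between scores on the test and scores on the criterion. The slides do not specify a statistic, so the following is a sketch under that assumption: it computes a Pearson correlation by hand, and both score lists are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical scores for six test takers on a new test and on an
# established criterion measure taken at about the same time.
new_test = [55, 62, 70, 48, 81, 66]
criterion = [58, 60, 75, 50, 85, 63]

validity_coefficient = pearson_r(new_test, criterion)
print(f"validity coefficient: {validity_coefficient:.2f}")
```

If the criterion scores were collected at the same time, a high coefficient would count as concurrent evidence; if they were collected later (for example, end-of-course grades), the same calculation would speak to predictive validity.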
Practicality and Authenticity
Practicality: It is defined as the relationship between the resources that will be required in the design, development, and use of the test and the resources that will be available for these activities.
[Figure: practicality as the relationship between required and available resources]
Brown (2004:19) defines practicality in terms of:
1) Cost
2) Time
3) Administration
4) Scoring / Evaluation
Authenticity: It is the extent to which the tasks required on a given test are similar to normal “real life” language use. In other words, it is the degree of correspondence between test tasks and the activities of target language use. Therefore, the higher the correspondence, the more authentic the test.
Authenticity may be present in the following ways:
1. The language in the test is as natural as possible.
2. Items are contextualized rather than isolated.
3. Topics are meaningful (relevant, interesting) to the learners.
4. Some thematic organization to items is provided, such as through a story or episode.
5. Tasks represent, or closely approximate, real-world tasks.
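The definition of practicality as a relationship between required and available resources can be made concrete as a ratio. This is one possible reading, not something the slides state: the sketch assumes a test is practical when every available resource is at least as large as what is required, and the resource names and amounts are hypothetical.

```python
def practicality(available, required):
    """Ratio of available to required resources, one entry per resource."""
    return {name: available[name] / required[name] for name in required}


def is_practical(available, required):
    """Assumed rule: practical only if no resource falls short (ratio >= 1)."""
    return all(ratio >= 1.0 for ratio in practicality(available, required).values())


# Hypothetical resources for one classroom test.
required = {"cost_usd": 400.0, "admin_minutes": 90.0, "scoring_hours": 6.0}
available = {"cost_usd": 500.0, "admin_minutes": 90.0, "scoring_hours": 4.0}

ratios = practicality(available, required)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
print("practical:", is_practical(available, required))
```

In this made-up case cost and administration time are covered, but scoring time is not, so the test would need a faster scoring method or more scorer hours.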
Washback/Backwash
Washback Effect: Generally, it is the influence of the nature of a test on teaching and learning.
2 kinds of washback
1). Negative Washback: It occurs when the test and testing techniques are at variance with the objectives of the course. Tests which have negative washback are considered to have a negative influence on teaching and learning.
Ex: A learner takes an English course to be trained in the 4 language skills, but the language test does not test those skills.
2). Positive Washback: It results when a testing procedure encourages “good” teaching practices.
Ex: Many reading comprehension tests can lead to the development of reading skills.