CHAPTER 32 OF ROUTLEDGE HANDBOOK
FAIRNESS
OVERVIEW
Introduction
1. Roles Of The Stakeholders
2. Roles Of The Test-takers
3. Power Relations In Testing
Test Fairness
1. Test Sensitivity Review
2. Test Bias
Fairness and validity
1. Fairness independent of validity
2. Fairness subsumes validity
3. Fairness and validity are overlapping
4. Fairness as an important aspect of validity
Xi's framework
Kunnan's Test Context Framework
INTRODUCTION
• There are many definitions of fairness.
• Your-dictionary.com (2011) states that to be fair is to be “just and honest,” “impartial,” and “unprejudiced,”
specifically, “free from discrimination based on race, religion, sex, etc.”
• Kunnan (2000) argues that fairness embraces three concerns: validity, accessibility of tests to test takers, and
justice.
INTRODUCTION
Spaan (2000)
Defines fairness as an ideal “in which opportunities are equal.”
“In the natural world, test writers and developers cannot be ‘fair’ in the ideal sense, but … they can try to [be]
equitable.” This equitability is “the joint responsibility of the test developer, the test user, and the examinee, in a
sort of social contract”.
Fairness can be seen as a system or process – as distinct from a quality.
INTRODUCTION
What most language testing views of fairness have in common is a desire to avoid the effects of any
construct-irrelevant factors on the entire testing process, from the test-design stage through post-administration
decision-making.
In this context, one dimension of fairness concerns the roles of the major stakeholders in achieving language
testing fairness. These stakeholders, to use Spaan’s (2000) tripartite “social contract” scheme, are:
test developers;
L2/FL learners;
and test users (i.e., teachers, administrators, etc.).
INTRODUCTION
Role Of Test Developers
❑They must try to ensure the validity, reliability, and practicality of their test methods;
❑They must also provide test users with easily understandable guidelines for the use of their tests;
❑They must also solicit feedback from test users, to effect further test improvements.
INTRODUCTION
Roles Of The Test Takers
❑They must become familiar with the testing format and overall test content before taking the test;
❑They must try to make sure that the level of the test matches their own skill/knowledge level.
Roles Of The Test Administrators
❑They must give tests to students for whom the tests were designed; to do otherwise would be an instance of
test abuse.
INTRODUCTION
Power Relations In Testing
Viewing test developer, test taker, and test user as parties to a social contract highlights a second issue, namely
the phenomenon of power relations in testing.
Shohamy (2001) points out that language tests have often been used by powerful agencies such as
governments, educational bureaucracies, or school staff for reasons other than the assessment of language
skills. For example, tests have been used to establish discipline, to impose sanctions on schools or teachers, or
to raise the prestige of the subject matter being tested.
INTRODUCTION
Power relations in testing
She offers a set of principles organized under the heading of critical language testing, as a way to engage
language testers “in a wider sphere of social dialogue and debate about … the roles tests … have been assigned
to play in society.”
❑Critical language testing “encourages test takers to develop a critical view of tests as well as to act on it by
questioning tests and critiquing the value which is inherent in them.”
A critical language testing perspective asks that all parties to the contract remain vigilant.
Test Fairness
Test Sensitivity Review
One approach to examining whether or not test questions are fair is through a test sensitivity review.
Such reviews are performed by trained judges employed by test-development organizations, who examine test
tasks to determine whether they contain language or content that may be considered stereotyping, patronizing,
inflammatory, or otherwise offensive to test takers belonging to subgroups defined by culture, ethnicity, or
gender.
Test Fairness
Test bias
Is a technical term indicating a testing situation in which a particular test use results in different interpretations
of test scores received by cultural, ethnic, gender, or linguistic subgroups.
Synonymous with differential item functioning (DIF).
Bias or DIF is considered to be present when a test item is differentially difficult for an ethnic, cultural, or
gender-related subgroup which is otherwise equally matched with another subgroup in terms of knowledge or
skill.
Among the statistical methods used to uncover DIF are:
the Standardization Procedure;
and the Mantel–Haenszel method.
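The two methods named above can be illustrated with a minimal sketch. It assumes dichotomously scored items, a "reference" and a "focal" subgroup, and test takers matched on a score level; the function names, the tuple layout of `responses`, and the toy counts are illustrative assumptions, not part of the chapter. The Mantel–Haenszel function returns the common odds ratio across matched levels (1.0 suggests no DIF), and the standardization function returns the focal-weighted difference in proportion correct (0.0 suggests no DIF).

```python
from collections import defaultdict


def _tabulate(responses):
    """Count correct/incorrect answers per matched score level and group.

    responses: iterable of (score_level, group, correct) tuples, where
    group is "ref" or "focal" and correct is True/False (assumed layout).
    """
    counts = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for level, group, correct in responses:
        counts[level][group][0 if correct else 1] += 1
    return counts


def mantel_haenszel_odds_ratio(responses):
    """Common odds ratio for one item across matched score levels."""
    num = den = 0.0
    for cell in _tabulate(responses).values():
        a, b = cell["ref"]    # reference group: correct, incorrect
        c, d = cell["focal"]  # focal group: correct, incorrect
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined for these counts")
    return num / den


def standardized_p_difference(responses):
    """Focal-weighted mean difference in proportion correct (focal - ref)."""
    weighted = total = 0.0
    for cell in _tabulate(responses).values():
        a, b = cell["ref"]
        c, d = cell["focal"]
        if a + b == 0 or c + d == 0:
            continue  # a level with only one group cannot be compared
        n_focal = c + d
        weighted += n_focal * (c / n_focal - a / (a + b))
        total += n_focal
    if total == 0:
        raise ValueError("no score level contains both groups")
    return weighted / total


def expand(level, group, n_correct, n_incorrect):
    """Turn summary counts into individual response tuples (toy helper)."""
    return ([(level, group, True)] * n_correct
            + [(level, group, False)] * n_incorrect)


# Invented counts for a single item at two matched score levels.
toy_responses = (
    expand(1, "ref", 10, 10) + expand(1, "focal", 5, 15)
    + expand(2, "ref", 15, 5) + expand(2, "focal", 10, 10)
)
mh_alpha = mantel_haenszel_odds_ratio(toy_responses)
std_p_dif = standardized_p_difference(toy_responses)
print(mh_alpha, std_p_dif)
```

In this toy data the item is harder for the focal group at both matched levels, so the odds ratio is above 1 and the standardized difference is negative. Operational DIF analyses add significance tests and effect-size conventions that this sketch omits.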
Fairness And Validity
Kane (2010) points out that the relationship between fairness and validity depends on how the two concepts
are defined: narrowly defining validity and broadly defining fairness will result in validity being considered a
component of fairness. On the other hand, a broad definition of validity and a narrow conceptualization of
fairness will result in fairness being understood as a part of validity.
Fairness And Validity
1. Fairness Independent Of Validity
One example of this is the Standards for Educational and Psychological Testing, which define fairness as having, at
minimum, three components: lack of item bias, the presence of equitable treatment of all test-takers in the
testing process, and equity of opportunity of examinees to learn the material on a given test. While fairness
here is not linked directly with validity, these 1999 standards do mention that fairness “promotes the validity
and reliability of inferences made from test performance”.
Fairness And Validity
2. Fairness Subsumes Validity
Kunnan (2000) articulates a framework in which fairness includes issues of validity, accessibility to test takers,
and justice. Under validity, Kunnan includes issues such as construct validity, DIF, insensitive test-item language,
and content bias. An example of the latter might be a dialect of English employed in the test prompts that differs
in some respects from another English dialect that may constitute the L2 of the test taker. Under accessibility,
Kunnan indicates issues such as affordability, geographic proximity of test taker to the testing site,
accommodations for test takers with disabilities, and opportunity to learn. “Opportunity to learn” is closely
connected with the notion of construct under-representation (Messick, 1989), which describes a test that
measures some aspects of a construct or skill, but not others.
Fairness And Validity
2. Fairness Subsumes Validity (continued)
A test may be measuring aspects of a construct, such as knowledge of a particular rule of language pragmatics,
that certain test takers will not have had the opportunity to learn. These test takers may thus score poorly on
the test, despite the fact that they may be proficient in other relevant areas of the construct. Finally, Kunnan’s
facet of justice embraces the notion of whether or not a test contributes to social equity. Kunnan (2004) later
modified this model to include absence of bias and test-administration conditions.
Fairness And Validity
3. Fairness And Validity Are Overlapping
Kane’s definition of test fairness draws on political and legal concepts. One, procedural due process, states that the
same rules should be applied to everyone in more or less the same way.
Kane also bases his definition on substantive due process, which states that the procedures applied should be
reasonable both in general and in the context in which they are applied. In applying this twin definition of fairness to
assessment, he gives two principles: the first is procedural fairness, in which test takers are treated “in essentially
the same way…take the same test or equivalent tests, under the same conditions or equivalent conditions, and …
their performances [are] evaluated using the same (or essentially the same) rules and procedures.” The second is
substantive fairness, in which the score interpretation and any test-based decision rule are reasonable and
appropriate for all test takers.
Fairness And Validity
4. Fairness As An Important Aspect Of Validity
By Willingham and Cole
In this context, a fair test is one for which both the (preferably, small) extent of statistical error of
measurement, and the inferences (hopefully, reasonable ones) from the test results regarding test-taker ability,
are comparable from individual to individual and from subgroup to subgroup.
They state that comparable validity must be met at all stages of the testing process – when designing the test,
developing the test, administering the test, and using the test results. Comparable validity must be achieved by
selecting test material that does not give an advantage to some test takers for reasons that are irrelevant to the
construct being measured.
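One way to look at whether the "statistical error of measurement" is comparable from subgroup to subgroup is to estimate it separately for each subgroup. The sketch below is an illustration under stated assumptions, not Willingham and Cole's procedure: it assumes item-level scores are available, uses Cronbach's alpha as the reliability estimate (a swapped-in choice), and applies the standard formula SEM = SD × sqrt(1 − reliability). The function names and the two toy subgroups are hypothetical.

```python
import math
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Reliability estimate from a list of per-test-taker item-score lists.

    Assumes at least two items and non-zero variance in total scores.
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(item_scores):
    """SEM = SD of total scores * sqrt(1 - reliability)."""
    totals = [sum(row) for row in item_scores]
    sd = math.sqrt(pvariance(totals))
    return sd * math.sqrt(1 - cronbach_alpha(item_scores))


# Invented item scores (rows = test takers, columns = items) for two subgroups.
group_a = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
group_b = [[1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 0, 0]]

sem_a = standard_error_of_measurement(group_a)
sem_b = standard_error_of_measurement(group_b)
print(f"SEM group A: {sem_a:.3f}, SEM group B: {sem_b:.3f}")
```

A noticeably larger SEM for one subgroup would be a prompt for further investigation rather than proof of unfairness; with samples this small the estimates are only illustrative.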
Fairness And Validity
4. Fairness As An Important Aspect Of Validity (continued)
Willingham and Cole see fairness as having three qualities linked to validity:
(1) comparable opportunities for test takers to show their knowledge and skills;
(2) comparable test tasks and scores;
(3) comparable treatment of test takers in test interpretation and use.
Xi both expands the definition of test fairness and offers a new framework for investigating fairness issues.
Xi's Framework
Xi states that fairness is “comparable validity for identifiable and relevant groups across all stages of
assessment, from assessment conceptualization to use of assessment results,” where “construct-irrelevant
factors, construct under-representation, inconsistent test administration practices, inappropriate
decision-making procedures or use of test results have no systematic or appreciable effects on test scores, test score
interpretations, score-based decisions and consequences for all relevant groups of examinees”.
Xi's Framework
Consists of a fairness argument embedded within a validity argument. A validity argument is a chain of inferences that
leads a test user to appropriate interpretations of test results. Xi’s validity argument framework consists of six
successive sub-arguments, that:
(1) there is evidence that the domain of L2 use which is of interest provides a meaningful basis for our
observations of test-taker performance on the test;
(2) there is evidence that the observed test scores reflect that domain of L2 use and not construct-irrelevant factors;
(3) there is evidence that the observed scores on the test are generalizable over similar language tasks on other,
similar tests;
(4) there is evidence that the abovementioned generalization of observed scores can be linked to a theoretical
interpretation (i.e., the construct, the theoretical skill) of such scores;
(5) there is evidence that the theoretical construct can explain the L2 use in actual situations envisioned by the
users of the test;
(6) there is evidence that the language-test results are “relevant, useful, and sufficient” for determining the level
of L2 ability.
Each of these sub-arguments is supported by certain assumptions.
Xi's Framework
Embedded in the above chain of sub-arguments and underlying assumptions, Xi proposes, can be a fairness
argument, which consists in part of a series of rebuttals, one or more posed to each of the validity
sub-arguments. One can conceive of such rebuttals as research questions into the degree of fairness of a given
language test, i.e., each rebuttal serves as a practical check on the claims of each sub-argument. For example,
to the first of the sub-arguments above, one can ask whether or not the domain of L2 use actually provides a
meaningful basis for observations of test-taker performance.
Kunnan’s Test Context Framework
This approach is intended to consider the wider political, educational, cultural, social, economic, legal, and historical
aspects of a test. In this it differs somewhat from other, more psychometrically focused approaches considered above,
such as DIF, or even Xi’s fairness argument framework. It has a certain overlap with Kane’s (2010) applications of
political and legal concepts to language testing, and it also resonates with the analyses of power relations in language
testing offered by Shohamy (2001), both mentioned above.
Kunnan’s approach thus brings wider social factors into consideration when evaluating the fairness of a language test.

The Routledge Handbook of Language Testing, Ch. 32: Fairness
