Enhancing Comparability of Standards through Validation and Moderation
A study funded by the National Quality Council
Shelley Gillis, Berwyn Clayton, Andrea Bateman
Rationale
Some key stakeholders have raised concerns about the quality and consistency of assessments being undertaken by RTOs; that is, concerns have been raised about the comparability of standards.
Aim
To develop a series of products that would:
- Improve the consistency of assessment decisions within VET;
- Increase the level of industry confidence in VET assessment;
- Increase awareness of, and consistency in, the application of reasonable adjustments when making assessment decisions; and
- Increase the capability of RTOs to demonstrate compliance with AQTF 2007 Essential Standards for Registration, Standard 1.
Products
- Guide for Developing Assessment Tools
- Code of Professional Practice for Validation and Moderation
- Implementation Guide: Validation and Moderation
http://www.nqc.tvetaustralia.com.au/nqc_publications
Changes to the AQTF User Guide
- Validity
- Reliability
- Assessment tool
- Validation
- Moderation
The Guide for Developing Assessment Tools
Essential Characteristics of an Assessment Tool
An assessment tool includes the following components:
- The learning or competency unit(s) to be assessed
- The target group, context and conditions for the assessment
- The tasks to be administered to the candidate
- An outline of the evidence to be gathered from the candidate
- The evidence criteria used to judge the quality of performance (i.e., the assessment decision-making rules)
- The administration, recording and reporting requirements
Ideal Characteristics
- The context
- Competency mapping
- The information to be provided to the candidate
- The evidence to be collected from the candidate
- Decision-making rules
- Range and conditions
- Materials/resources required
- Assessor intervention
- Reasonable adjustments
- Validity evidence
- Reliability evidence
- Recording requirements
- Reporting requirements
Competency Mapping
The components of the Unit(s) of Competency that the tool should cover should be described. This could be as simple as a mapping exercise between the components within a task (e.g., each structured interview question) and the components within a Unit or cluster of Units of Competency. The mapping will help determine the sufficiency of the evidence to be collected as well as the content validity.
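To make the mapping concrete, here is a minimal sketch, not taken from the guide, of a competency map held as a simple lookup table; the unit code UNITX01, its component identifiers and the question labels are all invented for illustration:

```python
# Hypothetical competency map linking each structured interview question
# to the components of a made-up unit of competency (UNITX01) it evidences.
competency_map = {
    "Q1": ["UNITX01/1.1", "UNITX01/1.2"],
    "Q2": ["UNITX01/2.1"],
    "Q3": [],  # maps to nothing: adds assessment load without evidence
}

required = {"UNITX01/1.1", "UNITX01/1.2", "UNITX01/2.1", "UNITX01/2.2"}
covered = {c for comps in competency_map.values() for c in comps}

# Sufficiency check: required components that no task addresses (here, 2.2).
print("Uncovered components:", sorted(required - covered))
print("Unmapped tasks:", [q for q, c in competency_map.items() if not c])
```

Gaps (uncovered components) point to insufficient evidence; unmapped tasks point to content that may threaten validity.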
Decision-Making Rules
The rules to be used to:
- Check evidence quality (i.e., the rules of evidence)
- Judge how well the candidate performed against the standard expected
- Synthesise evidence from multiple sources to make an overall judgement (see the sketch below)
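As an illustration only, assuming a simple rule that every evidence source must satisfy the rules of evidence and meet the standard (the guide does not prescribe any particular rule), decision-making rules can be written down precisely enough to apply consistently:

```python
# Hypothetical decision-making rule: competent only when every evidence
# source passes the rules of evidence AND meets the expected standard.
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str           # e.g. "observation", "interview", "portfolio"
    meets_rules: bool     # valid, sufficient, authentic and current
    meets_standard: bool  # performance at or above the expected standard

def overall_judgement(evidence: list[Evidence]) -> str:
    if not all(e.meets_rules for e in evidence):
        return "insufficient evidence: gather more"
    if all(e.meets_standard for e in evidence):
        return "competent"
    return "not yet competent"

print(overall_judgement([
    Evidence("observation", True, True),
    Evidence("interview", True, True),
]))
```

Writing the rule down this explicitly is what allows two assessors to reach the same conclusion from the same evidence.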
Reasonable Adjustments
This section should describe the guidelines for making reasonable adjustments to the way in which evidence of performance is gathered, without altering the expected performance standards (as outlined in the decision-making rules).
Validity Evidence
Validity is concerned with the extent to which an assessment decision about a candidate, based on the candidate's performance, is justified. Establishing validity requires identifying conditions that weaken the truthfulness of the decision, exploring alternative explanations for good or poor performance, and feeding these back into the assessment process to reduce errors when making inferences about competence. Evidence of validity (such as face, construct, predictive, concurrent, consequential and content validity) should be provided to support the use of the assessment evidence for the defined purpose and target group of the tool.
Reliability Evidence
Reliability is concerned with how much error is included in the evidence. If using a performance-based task that requires the professional judgement of the assessor, evidence of reliability could include:
- The level of agreement between two different assessors who have assessed the same evidence of performance for a particular candidate (i.e., inter-rater reliability; see the sketch below).
- The level of agreement of the same assessor who has assessed the same evidence of performance of the candidate, but at a different time (i.e., intra-rater reliability).
If using objective test items (e.g., multiple-choice tests), then other forms of reliability should be considered, such as the internal consistency of a test (i.e., internal reliability) and the equivalence of two alternative assessment tasks (i.e., parallel forms).
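The slides do not name an agreement statistic; Cohen's kappa is one common choice for quantifying inter-rater agreement on competent/not-yet-competent decisions because it corrects for the agreement two raters would reach by chance. A minimal sketch with invented ratings:

```python
# Cohen's kappa for two assessors rating the same eight candidates
# "C" (competent) / "N" (not yet competent). All ratings are invented.
from collections import Counter

rater_a = ["C", "C", "N", "C", "N", "C", "C", "N"]
rater_b = ["C", "N", "N", "C", "N", "C", "C", "C"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Agreement expected by chance, from each rater's own base rates.
pa, pb = Counter(rater_a), Counter(rater_b)
expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - expected) / (1 - expected)
print(f"observed={observed:.2f} expected={expected:.2f} kappa={kappa:.2f}")
```

Here observed agreement is 0.75 but kappa is only about 0.47, which is why chance-corrected agreement is more informative than raw agreement.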
Examples
- Write: Portfolio
- Say: Interview
- Do: Observation
- Create: Product
Quality Checks
- Panel
- Pilot
- Trial
A Code of Professional Practice for Validation and Moderation
Assessment Quality Management
- Quality Assurance
- Quality Control
- Quality Review
Validation Versus Moderation
Focus - Tool
- Has clear, documented evidence of the procedures for collecting, synthesising, judging and recording outcomes (i.e., to help improve the consistency of assessments across assessors [inter-rater reliability]).
- Has evidence of content validity (i.e., whether the assessment task(s), as a whole, represent the full range of knowledge and skills specified within the Unit(s) of Competency).
- Reflects work-based contexts, specific enterprise language and job tasks, and meets industry requirements (i.e., face validity).
- Adheres to the literacy and numeracy requirements of the Unit(s) of Competency (construct validity).
- Has been designed to assess a variety of evidence over time and contexts (predictive validity).
- Has been designed to minimise the influence of extraneous factors (i.e., factors not related to the unit of competency) on candidate performance (construct validity).
Focus - Tool (continued)
- Has clear decision-making rules to ensure consistency of judgements across assessors (inter-rater reliability) as well as consistency of judgements within an assessor (intra-rater reliability).
- Has clear instructions on how to synthesise multiple sources of evidence to make an overall judgement of performance (inter-rater reliability).
- Has evidence that the principles of fairness and flexibility have been adhered to.
- Has been designed to produce sufficient, current and authentic evidence.
- Is appropriate in terms of the level of difficulty of the task(s) to be performed in relation to the skills and knowledge specified within the relevant Unit(s) of Competency.
- Has outlined appropriate reasonable adjustments that could be made to the gathering of assessment evidence for specific individuals and/or groups.
- Has adhered to the relevant organisation's assessment policy.
Focus - Judgement
Check whether the judgement was too harsh or too lenient by reviewing samples of judged candidate evidence against:
- The requirements set out in the Unit(s) of Competency;
- Benchmark samples of candidate evidence at varying levels of achievement (including borderline cases); and
- The assessment decision-making rules specified within the assessment tools.
Desirable for validation; mandatory for moderation.
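The code of practice does not prescribe a method for this review; one simple illustration is to tally an assessor's decisions against agreed benchmark decisions on the same judged evidence, so lenient and harsh calls become visible. All ratings below are invented:

```python
# Tally an assessor's decisions against benchmark decisions on the same
# judged evidence samples ("C" = competent, "N" = not yet competent).
benchmark = ["C", "N", "C", "C", "N", "C", "N", "C"]  # agreed standard
assessor  = ["C", "C", "C", "C", "N", "C", "C", "C"]  # assessor under review

lenient = sum(b == "N" and a == "C" for a, b in zip(assessor, benchmark))
harsh   = sum(b == "C" and a == "N" for a, b in zip(assessor, benchmark))

print(f"Lenient calls: {lenient}/{len(benchmark)}")  # below-standard work passed
print(f"Harsh calls:   {harsh}/{len(benchmark)}")    # at-standard work failed
```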
Types of Approaches – Assessor Partnerships
- Validation only
- Informal, self-managed, collegial
- Small group of assessors
- May involve sharing, discussing and/or reviewing one another's tools and/or judgements
Benefits: low cost, personally empowering, non-threatening
Weakness: potential to reinforce misconceptions and mistakes
Types of Approaches - Consensus
- Typically involves assessors reviewing their own and their colleagues' assessment tools and judgements as a group
- Can occur within and/or across organisations
- Requires a culture of sharing
Strengths: professional development, networking; promotes collegiality and sharing
Weakness: less quality control than external and statistical approaches, as outcomes can be influenced by local values and expectations
Types of Approaches - External
Types: site visit versus central agency
Strengths:
- Offers authoritative interpretations of standards
- Improves consistency of standards across locations by identifying local bias and/or misconceptions (if any)
- Educative
Weaknesses: expensive; less quality control than statistical approaches
Types of Approaches - Statistical
- Limited to moderation
- Yet to be pursued at the national level in VET
- Requires some form of common assessment task at the national level
- Adjusts the level and spread of RTO-based assessments to match the level and spread of the same candidates' scores on a common assessment task (see the sketch below)
- Maintains RTO-based rank ordering but brings the distributions of scores across groups of candidates into alignment
Strength: strongest form of quality control
Weakness: lacks face validity; may have limited content validity
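The slides leave the adjustment method unspecified; a common choice is linear scaling, which shifts and rescales an RTO's internal scores so their mean and standard deviation match those of the same candidates on the common task, while preserving the RTO's rank ordering. A minimal sketch with invented scores:

```python
# Linear statistical moderation (illustrative): rescale RTO-based scores so
# their mean and spread match the same candidates' common-task scores.
import statistics

rto_scores = [62, 70, 75, 81, 90]      # internal RTO assessments (invented)
common_scores = [55, 60, 66, 70, 79]   # same candidates on the common task

m_r, s_r = statistics.mean(rto_scores), statistics.stdev(rto_scores)
m_c, s_c = statistics.mean(common_scores), statistics.stdev(common_scores)

# Shift and rescale; candidates keep their within-RTO rank order.
moderated = [(x - m_r) / s_r * s_c + m_c for x in rto_scores]
print([round(x, 1) for x in moderated])
```

Because the transformation is monotonic, no candidate overtakes another within the RTO; only the group's level and spread move.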
Summary of Major Distinguishing Features
- Validation is concerned with quality review, whilst moderation is concerned with quality control;
- The primary purpose of moderation is to help achieve comparability of standards across organisations, whilst validation is primarily concerned with continuous improvement of assessment practices and outcomes;
- Whilst validation and moderation can both focus on assessment tools, moderation requires access to judged (or scored) candidate evidence; such access is only desirable for validation;
- Both consensus and external approaches to validation and moderation are possible; moderation can also be based upon statistical procedures, whilst validation can include less formal arrangements such as assessor partnerships; and
- The outcomes of validation are recommendations for future improvement to the assessment tools and/or processes, whereas moderation may also include adjusting assessor judgements to bring standards into alignment, where determined necessary.
Principles
- Transparent
- Representative
- Confidential
- Educative
- Equitable
- Tolerable
Associate Professor Shelley Gillis
Deputy Director, Work-based Education Research Centre, Victoria University
Email: shelley.gillis@vu.edu.au Phone: 0432 756 638

Andrea Bateman
Director and Education Consultant, Bateman Giles Pty Ltd
Email: andrea@batemangiles.com.au Phone: 0418 585 754

Berwyn Clayton
Director, Work-based Education Research Centre, Victoria University
Email: berwyn.clayton@vu.edu.au Phone: 0411 138 205
