Enhancing Comparability of Standards through Validation and Moderation
A study funded by the National Quality Council
Shelley Gillis, Berwyn Clayton, Andrea Bateman
Rationale
Some key stakeholders have raised concerns about the quality and consistency of assessments being undertaken by RTOs; that is, concerns have been raised about the comparability of standards.
Aim
To develop a series of products that would:
- Improve the consistency of assessment decisions within VET;
- Increase the level of industry confidence in VET assessment;
- Increase awareness of, and consistency in, the application of reasonable adjustments when making assessment decisions; and
- Increase the capability of RTOs to demonstrate compliance with AQTF 2007 Essential Standards for Registration, Standard 1.
Products
- Guide for Developing Assessment Tools
- Code of Professional Practice for Validation and Moderation
- Implementation Guide: Validation and Moderation
http://www.nqc.tvetaustralia.com.au/nqc_publications
Changes to the AQTF User Guide
- Validity
- Reliability
- Assessment tool
- Validation
- Moderation
The Guide for Developing Assessment Tools
Essential Characteristics of an Assessment Tool
An assessment tool includes the following components:
- The learning or competency unit(s) to be assessed
- The target group, context and conditions for the assessment
- The tasks to be administered to the candidate
- An outline of the evidence to be gathered from the candidate
- The evidence criteria used to judge the quality of performance (i.e., the assessment decision-making rules)
- The administration, recording and reporting requirements
Ideal Characteristics
- The context
- Competency mapping
- The information to be provided to the candidate
- The evidence to be collected from the candidate
- Decision-making rules
- Range and conditions
- Materials/resources required
- Assessor intervention
- Reasonable adjustments
- Validity evidence
- Reliability evidence
- Recording requirements
- Reporting requirements
Competency Mapping
The components of the Unit(s) of Competency that the tool should cover should be described. This could be as simple as a mapping exercise between the components within a task (e.g., each structured interview question) and the components within a Unit or cluster of Units of Competency. The mapping will help determine the sufficiency of the evidence to be collected as well as the content validity.
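To make the mapping concrete, here is a minimal sketch, not taken from the guide, of a competency map held as a simple lookup table; the unit code UNITX01, its component identifiers and the question labels are all invented for illustration:

```python
# Hypothetical competency map linking each structured interview question
# to the components of a made-up unit of competency (UNITX01) it evidences.
competency_map = {
    "Q1": ["UNITX01/1.1", "UNITX01/1.2"],
    "Q2": ["UNITX01/2.1"],
    "Q3": [],  # maps to nothing: adds assessment load without evidence
}

required = {"UNITX01/1.1", "UNITX01/1.2", "UNITX01/2.1", "UNITX01/2.2"}
covered = {c for comps in competency_map.values() for c in comps}

# Sufficiency check: required components that no task addresses (here, 2.2).
print("Uncovered components:", sorted(required - covered))
print("Unmapped tasks:", [q for q, c in competency_map.items() if not c])
```

Gaps (uncovered components) point to insufficient evidence; unmapped tasks point to content that may threaten validity.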
Decision-Making Rules
The rules to be used to:
- Check evidence quality (i.e., the rules of evidence)
- Judge how well the candidate performed against the standard expected
- Synthesise evidence from multiple sources to make an overall judgement (see the sketch below)
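As an illustration only, assuming a simple rule that every evidence source must satisfy the rules of evidence and meet the standard (the guide does not prescribe any particular rule), decision-making rules can be written down precisely enough to apply consistently:

```python
# Hypothetical decision-making rule: competent only when every evidence
# source passes the rules of evidence AND meets the expected standard.
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str           # e.g. "observation", "interview", "portfolio"
    meets_rules: bool     # valid, sufficient, authentic and current
    meets_standard: bool  # performance at or above the expected standard

def overall_judgement(evidence: list[Evidence]) -> str:
    if not all(e.meets_rules for e in evidence):
        return "insufficient evidence: gather more"
    if all(e.meets_standard for e in evidence):
        return "competent"
    return "not yet competent"

print(overall_judgement([
    Evidence("observation", True, True),
    Evidence("interview", True, True),
]))
```

Writing the rule down this explicitly is what allows two assessors to reach the same conclusion from the same evidence.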
Reasonable Adjustments
This section should describe the guidelines for making reasonable adjustments to the way in which evidence of performance is gathered, without altering the expected performance standards (as outlined in the decision-making rules).
Validity Evidence
Validity is concerned with the extent to which an assessment decision about a candidate, based on the candidate's performance, is justified. Establishing validity requires identifying conditions that weaken the truthfulness of the decision, exploring alternative explanations for good or poor performance, and feeding these back into the assessment process to reduce errors when making inferences about competence. Evidence of validity (such as face, construct, predictive, concurrent, consequential and content validity) should be provided to support the use of the assessment evidence for the defined purpose and target group of the tool.
Reliability Evidence
Reliability is concerned with how much error is included in the evidence. If using a performance-based task that requires the professional judgement of the assessor, evidence of reliability could include:
- The level of agreement between two different assessors who have assessed the same evidence of performance for a particular candidate (i.e., inter-rater reliability; see the sketch below).
- The level of agreement of the same assessor who has assessed the same evidence of performance of the candidate, but at a different time (i.e., intra-rater reliability).
If using objective test items (e.g., multiple-choice tests), then other forms of reliability should be considered, such as the internal consistency of a test (i.e., internal reliability) and the equivalence of two alternative assessment tasks (i.e., parallel forms).
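The slides do not name an agreement statistic; Cohen's kappa is one common choice for quantifying inter-rater agreement on competent/not-yet-competent decisions because it corrects for the agreement two raters would reach by chance. A minimal sketch with invented ratings:

```python
# Cohen's kappa for two assessors rating the same eight candidates
# "C" (competent) / "N" (not yet competent). All ratings are invented.
from collections import Counter

rater_a = ["C", "C", "N", "C", "N", "C", "C", "N"]
rater_b = ["C", "N", "N", "C", "N", "C", "C", "C"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Agreement expected by chance, from each rater's own base rates.
pa, pb = Counter(rater_a), Counter(rater_b)
expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - expected) / (1 - expected)
print(f"observed={observed:.2f} expected={expected:.2f} kappa={kappa:.2f}")
```

Here observed agreement is 0.75 but kappa is only about 0.47, which is why chance-corrected agreement is more informative than raw agreement.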
Examples
- Write: Portfolio
- Say: Interview
- Do: Observation
- Create: Product
Quality Checks
- Panel
- Pilot
- Trial
A Code of Professional Practice for Validation and Moderation
Assessment Quality Management
- Quality Assurance
- Quality Control
- Quality Review
Validation Versus Moderation
Focus - Tool
- Has clear, documented evidence of the procedures for collecting, synthesising, judging and recording outcomes (i.e., to help improve the consistency of assessments across assessors [inter-rater reliability]).
- Has evidence of content validity (i.e., whether the assessment task(s), as a whole, represent the full range of knowledge and skills specified within the Unit(s) of Competency).
- Reflects work-based contexts, specific enterprise language and job tasks, and meets industry requirements (i.e., face validity).
- Adheres to the literacy and numeracy requirements of the Unit(s) of Competency (construct validity).
- Has been designed to assess a variety of evidence over time and contexts (predictive validity).
- Has been designed to minimise the influence of extraneous factors (i.e., factors not related to the unit of competency) on candidate performance (construct validity).
Focus - Tool (continued)
- Has clear decision-making rules to ensure consistency of judgements across assessors (inter-rater reliability) as well as consistency of judgements within an assessor (intra-rater reliability).
- Has clear instructions on how to synthesise multiple sources of evidence to make an overall judgement of performance (inter-rater reliability).
- Has evidence that the principles of fairness and flexibility have been adhered to.
- Has been designed to produce sufficient, current and authentic evidence.
- Is appropriate in terms of the level of difficulty of the task(s) to be performed in relation to the skills and knowledge specified within the relevant Unit(s) of Competency.
- Has outlined appropriate reasonable adjustments that could be made to the gathering of assessment evidence for specific individuals and/or groups.
- Has adhered to the relevant organisation's assessment policy.
Focus - Judgement
Check whether the judgement was too harsh or too lenient by reviewing samples of judged candidate evidence against:
- The requirements set out in the Unit(s) of Competency;
- Benchmark samples of candidate evidence at varying levels of achievement (including borderline cases); and
- The assessment decision-making rules specified within the assessment tools.
Desirable for validation; mandatory for moderation.
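The code of practice does not prescribe a method for this review; one simple illustration is to tally an assessor's decisions against agreed benchmark decisions on the same judged evidence, so lenient and harsh calls become visible. All ratings below are invented:

```python
# Tally an assessor's decisions against benchmark decisions on the same
# judged evidence samples ("C" = competent, "N" = not yet competent).
benchmark = ["C", "N", "C", "C", "N", "C", "N", "C"]  # agreed standard
assessor  = ["C", "C", "C", "C", "N", "C", "C", "C"]  # assessor under review

lenient = sum(b == "N" and a == "C" for a, b in zip(assessor, benchmark))
harsh   = sum(b == "C" and a == "N" for a, b in zip(assessor, benchmark))

print(f"Lenient calls: {lenient}/{len(benchmark)}")  # below-standard work passed
print(f"Harsh calls:   {harsh}/{len(benchmark)}")    # at-standard work failed
```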
Types of Approaches – Assessor Partnerships
- Validation only
- Informal, self-managed, collegial
- Small group of assessors
- May involve sharing, discussing and/or reviewing one another's tools and/or judgements
Benefits: low cost, personally empowering, non-threatening
Weakness: potential to reinforce misconceptions and mistakes
Types of Approaches - Consensus
- Typically involves assessors reviewing their own and their colleagues' assessment tools and judgements as a group
- Can occur within and/or across organisations
- Requires a culture of sharing
Strengths: professional development, networking; promotes collegiality and sharing
Weakness: less quality control than external and statistical approaches, as outcomes can be influenced by local values and expectations
Types of Approaches - External
Types: site visit versus central agency
Strengths:
- Offers authoritative interpretations of standards
- Improves consistency of standards across locations by identifying local bias and/or misconceptions (if any)
- Educative
Weaknesses: expensive; less quality control than statistical approaches
Types of Approaches - Statistical
- Limited to moderation
- Yet to be pursued at the national level in VET
- Requires some form of common assessment task at the national level
- Adjusts the level and spread of RTO-based assessments to match the level and spread of the same candidates' scores on a common assessment task (see the sketch below)
- Maintains RTO-based rank ordering but brings the distributions of scores across groups of candidates into alignment
Strength: strongest form of quality control
Weakness: lacks face validity; may have limited content validity
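The slides leave the adjustment method unspecified; a common choice is linear scaling, which shifts and rescales an RTO's internal scores so their mean and standard deviation match those of the same candidates on the common task, while preserving the RTO's rank ordering. A minimal sketch with invented scores:

```python
# Linear statistical moderation (illustrative): rescale RTO-based scores so
# their mean and spread match the same candidates' common-task scores.
import statistics

rto_scores = [62, 70, 75, 81, 90]      # internal RTO assessments (invented)
common_scores = [55, 60, 66, 70, 79]   # same candidates on the common task

m_r, s_r = statistics.mean(rto_scores), statistics.stdev(rto_scores)
m_c, s_c = statistics.mean(common_scores), statistics.stdev(common_scores)

# Shift and rescale; candidates keep their within-RTO rank order.
moderated = [(x - m_r) / s_r * s_c + m_c for x in rto_scores]
print([round(x, 1) for x in moderated])
```

Because the transformation is monotonic, no candidate overtakes another within the RTO; only the group's level and spread move.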
Summary of Major Distinguishing Features
- Validation is concerned with quality review, whilst moderation is concerned with quality control;
- The primary purpose of moderation is to help achieve comparability of standards across organisations, whilst validation is primarily concerned with continuous improvement of assessment practices and outcomes;
- Whilst validation and moderation can both focus on assessment tools, moderation requires access to judged (or scored) candidate evidence; such access is only desirable for validation;
- Both consensus and external approaches to validation and moderation are possible; moderation can also be based upon statistical procedures, whilst validation can include less formal arrangements such as assessor partnerships; and
- The outcomes of validation are recommendations for future improvement to the assessment tools and/or processes, whereas moderation may also include adjusting assessor judgements to bring standards into alignment, where determined necessary.
Principles
- Transparent
- Representative
- Confidential
- Educative
- Equitable
- Tolerable
Associate Professor Shelley Gillis
Deputy Director, Work-based Education Research Centre, Victoria University
Email: shelley.gillis@vu.edu.au Phone: 0432 756 638

Andrea Bateman
Director and Education Consultant, Bateman Giles Pty Ltd
Email: andrea@batemangiles.com.au Phone: 0418 585 754

Berwyn Clayton
Director, Work-based Education Research Centre, Victoria University
Email: berwyn.clayton@vu.edu.au Phone: 0411 138 205
