Moving Beyond Student Ratings to Evaluate Teaching
Vicki L. Wise, PhD
1
Where we are going in this presentation
• What is effective teaching
• Why collect evidence of effective teaching
• What evidence to collect
• How to bring evidence together for decision making
• One form of evidence: Student Ratings of Instruction (SRIs)
• Recognizing and addressing faculty concerns
2
What is effective teaching
Arreola (2007) defined five broad skill dimensions of college teaching: content expertise,
instructional design skills, instructional delivery skills, instructional assessment skills, and course
management skills.
L. Dee Fink (2012) distilled Ken Bain’s research on What the Best College Teachers Do down to four
fundamental tasks of teaching: knowledge of subject matter, course design, course management,
and interactions with students.
The work of these researchers overlaps and reflects much of what is used in higher education.
3
Why collect evidence of effective teaching
1. To improve teaching
2. To improve the student learning experience
3. To meet expectations for continuous employment
4
What evidence to collect
Student ratings are only one source of data; they must be combined with
additional evidence so that administrators can make an informed
judgment about teaching quality.
-Hoyt and Pallett (1999). Appraising Teaching Effectiveness: Beyond Student Ratings.
No one piece of evidence can stand on its own. Evidence of teaching quality needs to take into
account multiple sources, as teaching is multidimensional. Moreover, the likelihood of obtaining
reliable and valid data and making appropriate judgments increases with more evidence.
5
A portfolio (collection) of teaching effectiveness evidence could include at
least three forms of evidence.
Formative assessment or evaluation during a course, program, or service provides information
useful in improving learning or teaching while it is still occurring. Examples are student reflection
exercises or teacher self-reflections.
Summative assessment or evaluation is conducted at the end of a program, service, or experience
to make determinations of quality, worth, and attainment of targeted outcomes. Examples are a
final grade or end-of-course evaluations.
Direct methods of collecting information require a display of knowledge and skills. Indirect
methods require reflection on learning, behavior, and attitudes rather than a direct demonstration of them.
6
Evidence (D = Direct, I = Indirect):
• Student Ratings (I)
• Observations of Teaching (D)
• Student Interviews (I)
• Peer Ratings (D)
• Teacher Self Evaluations (I)
• Personal Goals for Teaching Improvement (D)
• Teaching Curriculum Map (D)
• Exit and Alumni Ratings (I)
• Employer Ratings (I)
• Teaching Scholarship (D)
• Student Learning Outcomes Results (D)
• Teaching Awards (D)
• Philosophy of Teaching (D)
• Innovative and Creative Teaching Techniques (D)
• Course Materials (D)
• Sample Student Work (D)
• Mentorship of Students and Faculty (D)
• Participation in Faculty Development Activities/Consultations on Teaching (D)
• Scholarship of Teaching and Learning (D)
7
How to bring evidence together for decision making
Help, I have collected multiple forms of evidence and need to make
sense of it all for decision making.
• If a portfolio system is used, it could follow a basic template, and faculty
would be asked to provide evidence of meeting specific criteria. Department
chairs would use a rubric (checklist or scale) to rate the quality of evidence in
pre-determined categories.
• A rubric can also be used on each piece of evidence. The scale would specify what constitutes
“meeting expectations”; a minimal scoring sketch follows.
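To make the aggregation step concrete, here is a minimal sketch of rolling weighted rubric ratings up into a single portfolio score. The category names, weights, and 1-4 scale are illustrative assumptions, not part of the presentation.

```python
# A minimal sketch of rubric-based portfolio scoring. The categories,
# weights, and 1-4 scale below are hypothetical examples.
RUBRIC_SCALE = {1: "Does not meet", 2: "Approaching", 3: "Meets", 4: "Exceeds"}

def portfolio_score(ratings: dict[str, int], weights: dict[str, float]) -> float:
    """Combine per-category rubric ratings into one weighted average."""
    total_weight = sum(weights[c] for c in ratings)
    return sum(ratings[c] * weights[c] for c in ratings) / total_weight

ratings = {"course_design": 3, "instructional_delivery": 4, "assessment": 2}
weights = {"course_design": 0.4, "instructional_delivery": 0.4, "assessment": 0.2}
score = portfolio_score(ratings, weights)
print(f"Weighted rubric score: {score:.2f} ({RUBRIC_SCALE[round(score)]})")
```

A checklist-style rubric could instead count the categories rated “meets expectations” or better; explicit weights simply make the relative emphasis visible and auditable.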
8
One Form of Evidence: Student Ratings of Instruction (SRIs)
• A substantial body of research supports that SRIs, overall, provide reliable and valid evidence
of teaching effectiveness (meta-analytic review: Gravestock & Gregor-Greenleaf, 2008).
• Research does not, however, support using SRIs as an indicator of student learning (Uttl,
White, & Wong Gonzalez, 2016, Meta-analysis of faculty’s teaching effectiveness: Student
evaluation of teaching ratings and student learning are not related).
• Significant concerns about the effective use of such evaluation systems
continue to exist.
9
Concerns about SRIs
• Faculty: Anxiety-provoking; sometimes outright hostility; suspicion about how ratings are used;
beliefs that evaluations are biased, that students are not competent evaluators, and that ratings
are influenced by grade expectations, to name a few.
• Administrators: Generally positive. Concerns about validity. Administrators’ own
ideas of teaching and learning may bias their judgments of teaching effectiveness.
• Students: Generally positive. Believe process is valid and they can be good
evaluators. Unclear if and how data are used.
10
11
Faculty Concerns in Detail
• The overly prominent role SRIs play in personnel decisions (tenure, post-tenure review, etc.).
• Reliance on them to make hiring, firing, and promotion decisions, rather than as performance
improvement tools.
• Initial poor ratings from students who later feel: the "I-hated-you-for-forcing-us-to-learn-this-material-but-now-I'm-glad-you-did-thank-you-and-keep-doing-what-you're-doing" phenomenon.
• Reliance on summative evaluation rather than ongoing student reflection throughout the term;
instructional successes and failures are not captured in real time.
• The problem of small numbers (averages from few respondents are unstable).
• The problem of the response scale (inappropriate, not understood, not well defined).
• Teaching effectiveness not well defined; not measuring what is valued.
• A measure of student satisfaction, not student learning.
• Not used for learning about the value added to the student from being in the course.
• Assessing popularity and/or easiness of the instructor.
• Not looking at visible outcomes of a course (e.g., an increase in student conference
presentations as a result of a course).
• Bias in scores:
o Gender bias in student scores, especially against female instructors.
o More difficult classes tend to get lower scores.
o Younger-appearing professors tend to get lower scores.
o Introductory classes tend to get lower scores.
11
Addressing Concerns
1. Follow best practices in instrument development
2. Standardize administration and policy
3. Plan for use of findings
12
1. Follow best practices in instrument development
• Clearly define effective teaching as well as its components
• Develop questions that measure these components
• Measure behaviors/practices that are observable
• Develop a response scale
• Develop two forms: one for formative assessment and one for summative evaluation
• Test questions with the intended audience and use the results to modify each form
• Determine how to analyze data appropriately, based on scale and use
• Determine how results will be reported and used
• Determine who will provide results and in what form
• Collect contextualizing information, especially if comparisons are made (discipline, yrs. taught, …, course
elective or required, course level); a sketch of such a record follows
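The contextual fields listed in the last bullet could be stored alongside the ratings themselves. Below is a minimal sketch of one such record; the schema, field names, and example values are hypothetical, not taken from the presentation.

```python
# Hypothetical record for one course's SRI administration, pairing ratings
# with the contextual fields suggested above. The schema is illustrative only.
from dataclasses import dataclass, field

@dataclass
class SRIRecord:
    course_id: str
    discipline: str
    years_taught: int
    required: bool              # required course vs. elective
    course_level: str           # e.g., "intro", "upper-division", "graduate"
    ratings: list[int] = field(default_factory=list)

record = SRIRecord("BIO-101", "Biology", years_taught=7, required=True,
                   course_level="intro", ratings=[4, 5, 3, 4])
```

Keeping context with each record means later comparisons (e.g., intro vs. upper-division courses) can be made without re-collecting data.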
13
2. Standardize administration and policy
• Standardize the administration process:
• Set the stage for importance (and use) with students and faculty; use a core set of questions; use
motivators rather than incentives (consider equity between online and face-to-face administration);
ensure anonymity; standardize the method of delivery.
• Suggest the response rate needed: at least 10 students per course or 2/3 of the class, whichever is
higher. Scores may need to be combined across terms for the same course (see the sketch after this list).
• Standardize analyses based on data type (e.g., treat ordinal variables appropriately).
• Provide data interpretation and distribution guidelines.
• Determine the audience for results and the level of information needed.
• Determine policy around data storage and access.
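A minimal sketch of the two quantitative suggestions above: the response-rate threshold (at least 10 students or 2/3 of the class, whichever is higher) and ordinal-appropriate summaries (median and frequencies rather than a mean). The function names, 5-point scale, and example data are illustrative assumptions.

```python
# Minimal sketch of the response-rate rule and an ordinal-appropriate summary.
import statistics

def meets_response_threshold(responses: int, enrolled: int) -> bool:
    """At least 10 students per course or 2/3 of the class, whichever is higher."""
    needed = max(10, -(-2 * enrolled // 3))  # -(-x // y) is ceiling division
    return responses >= needed

def summarize_ordinal(ratings: list[int]) -> dict:
    """Report the median and frequency distribution rather than a mean,
    since Likert responses are ordinal, not interval, data."""
    freq = {level: ratings.count(level) for level in sorted(set(ratings))}
    return {"n": len(ratings), "median": statistics.median(ratings), "freq": freq}

ratings = [4, 5, 3, 4, 5, 4, 2, 5, 4, 4, 3, 5]  # hypothetical 5-point responses
print(meets_response_threshold(len(ratings), enrolled=15))  # 2/3 of 15 = 10 -> True
print(summarize_ordinal(ratings))
```

For small sections where the threshold exceeds enrollment, combining scores across terms for the same course (as the slide suggests) is the practical workaround.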
14
3. Plan for use of findings
• Develop recommendations for formative use to improve teaching.
• Develop guidelines for summative use with other evidence to improve
decision making.
15
Editor's Notes
• #8: All of these can be used both formatively and summatively.
• #11: Before planning the implementation of a student ratings of instruction (SRI) process, consider the assumptions and beliefs that faculty, administrators, and students bring to it. Handout: Faculty Concerns.
• #13 to #16: Address concerns. Then consider processes that allow for inclusiveness of faculty, administrators, and students in planning and implementation.