January 16, 1998

New Research Casts Doubt on Value of Student Evaluations of Professors

Studies find that faculty members dumb down material and inflate grades to get good reviews


As an assistant professor of marketing, Robert S. Owen knows the importance of keeping the customer satisfied. His job depends on it.

That's why, in his courses at the State University of New York College at Oswego, he gives multiple-choice rather than essay exams and asks students to evaluate research papers rather than write their own. A student who challenges the fairness of a test question might receive extra credit simply for expressing interest.

"If students come to my office," he says, "I have to make sure they walk out happy."

Dr. Owen learned that lesson the hard way. Three years ago, he lost his job at Bloomsburg University of Pennsylvania when students gave his teaching mixed reviews. He describes himself as a casualty of an era in which administrators increasingly rely on student evaluations of teaching to decide who gets tenure and who doesn't.

"The student in college is being treated as a customer in a retail environment," he says, "and I have to worry about customer complaints."

Now academics are paying a lot of attention to two new studies that raise questions about the validity of student ratings of teaching and the tendency of professors to "teach to the evaluations." The reports on the studies, published last fall in Change magazine and American Psychologist, say professors who want high ratings have learned that they must dumb down material, inflate grades, and keep students entertained. The ratings can make or break a professor's career even though they do not always accurately measure teaching skills, the authors say.

"Evaluations may encourage faculty to grade easier and make course workloads lighter," says Anthony G. Greenwald, a psychology professor at the University of Washington who wrote the article in American Psychologist. He and Gerald Gilmore, director of the university's Office of Educational Assessment, examined student ratings of hundreds of courses at Washington and found that professors who are easy graders receive better evaluations than do professors who are tougher.

The Washington study and the one described in Change -- which showed that by being more enthusiastic, a professor sharply improved his student ratings even though the students did not actually learn more -- have shaken a long-standing consensus among researchers. Dozens of scholars in the United States and abroad have agreed for years that student evaluations are a good measure of a teacher's skills. Nearly 2,000 studies have been completed on the topic, making it the most extensive area of research on higher education.

As the number of studies supporting the value of student evaluations has grown, so has their use. Only about 30 per cent of colleges and universities asked students to evaluate professors in 1973, but it is hard to find an institution that doesn't today. And student ratings carry more and more weight, especially on campuses where the focus is on teaching. Such evaluations are now the most important, and sometimes the sole, measure of an instructor's teaching ability.

Good evaluations don't guarantee a professor tenure, particularly at research universities. "If you get good evaluations, that is just one hurdle you've cleared," says Tom Dickson, an associate professor of journalism at Southwest Missouri State University. But "if you don't get good evaluations," he says, "it doesn't matter how else you do." Dr. Dickson is trying to persuade administrators at his university to give less importance to student ratings.

Teaching evaluations were initiated on many U.S. campuses in the 1960s, as students clamored for more of a say in their education. Many institutions in Europe, Canada, Australia, and Asia have since adopted them as well. Typically, evaluation forms are passed out before the final exam in a course. Students are asked to rate a professor's communication skills, knowledge of the subject matter, ability to organize material, and fairness in grading. Many of the forms are designed to be analyzed by a computer, which responds with a numerical rating for each professor.

Many of the forms ask students to rate a professor's teaching techniques on a scale from "very effective" to "ineffective." They also ask students to give written comments, which can be the most devastating to professors' careers, and which often have nothing to do with teaching.

Undergraduates have been known to comment on a professor's clothing, hairstyle, and personal hygiene. Stephen J. Ceci, a Cornell University professor who wrote the article on the study in Change, the monthly magazine of the American Association for Higher Education, says one student suggested in an evaluation that he stop wearing a pair of orange corduroy pants. "You look like you work at Hardees," the student wrote.

Dr. Ceci acknowledges that the pants were a bit ostentatious. He also remembers what another student wrote in evaluating a female professor who had given a lecture in one of his courses. Asked if she should change anything about her presentation, the student wrote: "She shouldn't wear that outfit. Her hips are too big."

When students do stick to the subject, they can be just as critical. They complain that courses are boring and put them to sleep. Sometimes the evaluators get really nasty. A few years ago, a physics professor at Cornell who was using a pendulum as part of a lecture on friction had to duck to get out of its way. One student later wrote that the course would have been better had the pendulum hit the professor instead of the wall behind him.

Wendy L. Williams, a professor of human development at Cornell who wrote the Change article with Dr. Ceci, says just a few "bitter, nasty comments" can raise questions about a professor's competence. "It is worth examining whether we want people's entire careers to be derailed by a bunch of snitty undergraduates who didn't want to do an extra term paper," she says.

Proponents of the evaluations say mediocre teachers elicit low ratings from students. They note that service in most industries is judged by the customer. Besides, they contend, no one has developed a better measure of professors' performance. Faculty members have always been squeamish about passing judgment on each other's teaching, and most instructors don't relish having colleagues sit in on their lectures. "Research on peer review shows that if you have three or four different people go in and sit in on lectures by the same teacher, there will be relatively little agreement among them," says Herbert W. Marsh, a professor of education and dean of graduate research studies at the University of Western Sydney, in Australia, who has studied student evaluations for 25 years. "With student ratings, you've got someone who's sat through 40 hours of a course."

Undergraduates can be sincere in their comments, offering praise and acknowledging that a course has changed their lives. Dan Mansfield, a junior majoring in psychology at the University of Michigan, says students' comments about teaching are worthwhile. "I always try to be thoughtful about what I write. If I'm able to tell my professors what I like or dislike, I'm going to ultimately get a better education."

Michael Theall, an associate professor of educational administration at the University of Illinois at Springfield, calls the evaluations "valid measures of students' satisfaction with their experience." He and other researchers point to a set of about four dozen studies that have tested the validity of student ratings. Researchers compared the quality of teaching in several sections of the same course and gave students in each section the same final exam. The studies found that sections whose students did well on the exam tended to rate their instructor higher than sections whose students did poorly. Many scholars read that data as confirmation of the worth of student evaluations as a measure of how well professors teach. For the researchers, it also served to show that how severely a professor grades has little to do with students' ratings, because in these experiments, the exams were designed and graded by outsiders.

Dr. Marsh, the professor at the University of Western Sydney, has found that professors are likely to agree with students in choosing the best teachers on a campus. His research in Australia, he says, persuades him that high ratings are related to effectiveness. Professors have raised their ratings, he has found, by getting help to improve areas of their teaching that students have complained about.

Scholars say the tenure-and-promotion system works to counteract any incentive on the part of professors to inflate grades in order to improve their ratings. "If a promotion-and-review committee suspects somebody is grading too high, they really get dinged," says Wilbert J. McKeachie, a professor of psychology at Michigan. "Even though they may have gotten higher teaching evaluations, it is not popular among peers to give higher grades than other people are giving."

According to the article in Change magazine, however, giving high grades isn't the only way to boost evaluations. Dr. Ceci had taught developmental psychology at Cornell for nearly 20 years and was drawing mediocre reviews. Administrators had even asked him to attend a workshop with a "media consultant" to try to spice up his lectures.

To Dr. Ceci, the situation became an opportunity for research: What if he could improve his ratings simply by being a more enthusiastic lecturer, as the media consultant had advised? What would that say about the value of student ratings?

During a recent spring semester (Dr. Ceci would not identify which one), the professor taught developmental psychology again, covering the same material as in the previous semester and using the same textbook he had used for years. But he added more hand gestures to his teaching style, varied the pitch of his voice, and generally tried to be more exuberant. The outcome was astounding: Students' ratings of Dr. Ceci soared. They even gave higher marks to the textbook, a factor that shouldn't have been affected by differences in his teaching style.

Despite the higher ratings, however, Dr. Ceci found no real improvement in students' performance on exams in the spring compared to those in the fall. He concluded that his new teaching style was probably no more effective than his old one.

In their article in Change, Dr. Ceci and Dr. Williams offered a blunt indictment of evaluations: "Student ratings are far from the bias-free indicators of instructor effectiveness that many have touted them to be. Student ratings can make or break the careers of instructors on grounds unrelated to objective measures of student learning, and for factors correctable with minor coaching."

Their findings have been dismissed by many scholars who have spent their careers assessing the validity of student evaluations. Of course a professor who is enthusiastic will satisfy students, and may even encourage them to retain more information, proponents of evaluations say. That's simply common sense and a basic tenet of good teaching. It doesn't mean that a bad teacher can get great ratings simply by being entertaining, they say.

What the Cornell professors found, though, rings true in the everyday lives of many other professors. Peter Sacks has written a book in which he recounts how poor evaluations from students almost cost him his job as an assistant professor of journalism at an unidentified community college on the West Coast. He salvaged his career, he says, after changing his teaching style.

"In my mind, I became a teaching teddy bear," he writes in Generation X Goes to College (Carus Publishing, 1996). "Students could do no wrong, and I did almost anything possible to keep them happy, all of the time, no matter how childish or rude their behavior, no matter how poorly they performed in the course."

His colleagues, too, told him that if he wanted to improve his ratings and get tenure, he should be more entertaining, both in and out of class, Mr. Sacks writes. "After the field trip, meet the class for a pizza dinner," one of his colleagues suggested. "Bring donuts," advised another.

The superficial changes he made in his teaching style led to significantly better ratings from students, he writes, although he doesn't believe that students learned any more. He earned tenure in 1995, but the following year -- after four years at the community college -- he left teaching for free-lance writing. (While teaching at the college, he used his first name, which he won't reveal now; Peter is his middle name.) "This was higher education as a consumeristic, pandering enterprise," he says in an interview. "The love of learning was completely whitewashed out."

Paul A. Trout, an associate professor of English at Montana State University, says most professors have learned how to get good ratings from students. "Some professors stick to their guns and get punished. But an awful lot of people have figured out how to get their numbers high enough so that the evaluations are not a liability to them. People are changing their teaching, the rigor of their courses, to insure they get tenure."

As a young assistant professor of political science at Kalamazoo College, Jeremy D. Mayer is acutely aware of how important student ratings are. He says he has always received topnotch ratings from students, in part because he is a good teacher. But he is also aware that this generation of students, raised on Sesame Street and MTV, wants to be entertained. "A college professor today, if he wants to be effective, should be able to be a bit of a Quentin Tarantino in the classroom," he says.

Mr. Sacks says a student once told him: "We want you guys to dance, sing, and cry." But a culture that allows students to determine what is good teaching, he says, "does not lend itself to the kind of critical, messy thinking that we need to be encouraging in higher education."

Copyright (c) 1998 by The Chronicle of Higher Education
Section: The Faculty
Page: A12