A Princeton faculty committee recommended yesterday that the university rescind its ten-year-old grading guideline that advises faculty to assign grades in the A range to at most 35% of students. The committee issued a report explaining its rationale. The recommendation will probably be accepted and implemented.
It’s a good report, and I agree with its recommendation. Princeton would be better off without its grading quota.
Before explaining why, it’s worth defusing some of the likely responses. Many commentators seem to presuppose that the distribution of grades is too high—but whether that is true is part of the debate. Similarly, many commentators assume that today’s students perform no better than students in the past, even though what evidence there is tends to point toward today’s students being better. Like many debates about academia, this one tends to be short on data, and people’s positions tend to be driven more by emotional and political predisposition than fact-based reasoning.
As an example, it is widely asserted without data that the rise in average grades is a relatively recent development, usually tied to some cultural trend that the speaker dislikes. But the available information seems to show that grades have been rising for a long time. The best data I have seen on this is in Harry Lewis’s book, Excellence Without a Soul, which shows that grades at Harvard have been increasing slowly and steadily since at least the 1930’s and probably longer.
Princeton’s recent experience, as recounted in the new faculty committee report, adds a new data point to the debate. The current policy, which is officially a “guideline” of 35% A’s rather than a formal quota, went into effect in 2005. And the data show that grades declined noticeably in the period from 2002 to 2004. After the policy went into effect in 2005, grades were flat for a few years and then started rising again. So what changed grading practices was not the 35% guideline but the simple fact that faculty were discussing and thinking more deeply about grading policy during the period before the current policy was even a concrete proposal. The policy that worked was “grade mindfully”, not “give 35% A’s”.
Part of the problem with the current policy is that there isn’t a clearly stated theory of how it is supposed to operate. Partly this is because some of the policy’s most vocal advocates have tried to avoid admitting that in order to succeed the policy would have to give some students a B even if the faculty instructor thought they deserved an A. I once heard a sub-dean say that faculty weren’t supposed to change individual students’ grades; they were only supposed to lower the average grade. But of course you can’t lower the average without lowering some individuals. And if the goal is to get faculty to give different grades than they would give on their own, then the policy cannot succeed without changing some faculty grading decisions. The policy was a voluntary guideline rather than a quota—and some faculty chose to ignore it entirely—but still, if the policy was to have any effect at all, this could only occur by getting faculty to change some A’s to B’s.
This tension is exposed in yesterday’s report, where the following story told by a student is described as “reveal[ing] poor behavior on the part of the faculty”:
I received a 91 on a midterm exam in a [particular department] course this past fall (my concentration [i.e., major]), but the 91 was scratched out and replaced with an 88. When I asked my professor why he reduced my score, he told me that normally the paper would be an A-, but due to grade deflation, he was forced to lower several students’ grades to a B+.
This kind of thing—changing an A- to a B+—had to happen if the policy was going to succeed. So one suspects that the professor’s “poor behavior” here was not changing the grade, but telling the student what was really going on.
The report’s bottom line on the current policy is the same as mine: the policy’s goal of making grading more thoughtful and consistent was a good one, but the policy was not effective in achieving that goal. Now, I hope, we can learn from experience and kick off a new discussion of what grades are for and how we should assign them.
Call me completely uneducated as to how college grading works or does not work. I am certainly one of those mentioned who “tends to be short on data, and [my] positions tend to be driven more by emotional and political predisposition than fact-based reasoning.” And yet I have some fact-based reasoning below, and it argues against the whole idea.
I didn’t go to college for various reasons. One of those reasons was what was termed “grading on a scale” when I was in junior high/middle school. As I understood it at the time, that was the “normal” grading method at colleges. To me it makes no sense. Grade deflation was followed by grade inflation, I suppose. For many years I have been hearing people complain about the latter as portending a “dumbing down” of education (people getting A’s more easily suggests that an A is worth less than previously); and you mention that grades have gone up over time but do not address what has caused them to go up (dumbing down vs. smartening up).
And now, to hear of a policy where an A grade has to be “deflated” to a B just to keep a quota. To me the whole idea of grading on a curve, or quota grading… giving A’s, B’s, C’s to certain percentages, is just outrageous and has no merit in an institution of education.
In my sense of what is “just” and “fair,” the only grading that makes sense is grading on merit alone. If a person gets 91% on a midterm exam, showing they learned approximately 91% of the knowledge pertaining to (half) the course, that 91% should be 91% (which on the A-F scale would be an A- if I recall correctly). Not only should the professor NOT change the grade (up or down), he should award that grade to ANY and ALL students who earned that score. That is, it SHOULD be possible for 100% of the class to get an A grade if 100% put in the effort to learn what is to be taught.
Otherwise college isn’t about “education,” it’s about “competition.” Grading on a scale or with a quota is really saying only a few get to excel, most will be “average,” and some will be pushed down based on how well they compete with each other. While competition has its place in society, it SHOULD NOT be in an institution that proclaims it is about education. The two are mutually exclusive concepts; and hence so long as colleges show that their purpose is competition and not education, I will not and cannot in good conscience give my money to them for the purpose of furthering my education. I will educate myself, and leave the competition to the market that may or may not employ me.
Data? The only data we have are grades and they’ve been adjusted. The number of grade A students at Princeton has certainly dropped. 🙂
But I’ve always thought that the problem was that grading was relative but the range of student quality was getting narrower, at least over the last 50-100 years. The sorting process is more serious and people are more willing to travel or entertain colleges outside of their social circles. All of the colleges used to draw from a much narrower corner of the country and that meant that they enrolled a much greater range of students.
In my classes over the years, I’ve found it hard to distinguish much between the best and the worst. They’ve grown to be very similar. And so how can you assign grades that fit that range?
Peter,
Your comment gets to one of the key questions about grading, which is whether grades are supposed to reflect a common standard relative to today’s students, or a common standard that is invariant over time. If it’s the latter, and students are getting better on average, then consistency would require us to increase average grades over time.
It’s probably the case, as you suggest, that the average performance level of our students is increasing and the variance is decreasing. This could be a consequence of our student body being selected from a much larger applicant pool than in past eras. Even if the distribution of student performance in the overall population hasn’t changed, the fact that we are selecting from a larger subset of those students would mean that the students we select will be more tightly clustered toward the top of the distribution. (Certainly they’ll be more tightly clustered in quality as perceived by the admissions process, which probably correlates decently with later performance.)
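The selection argument above can be checked with a small simulation. This is only a sketch under assumptions of my own, not data from Princeton or anywhere else: applicant "quality" is drawn from a standard normal distribution, admissions observes it perfectly and admits the top scorers, and the pool sizes, class size, and function names are all hypothetical.

```python
import random
import statistics


def admitted_class(pool_size, class_size, rng):
    """Draw a hypothetical applicant pool and admit the top scorers.

    Assumption: quality is standard normal and is observed without noise.
    """
    pool = [rng.gauss(0.0, 1.0) for _ in range(pool_size)]
    return sorted(pool, reverse=True)[:class_size]


def summarize(pool_size, class_size=1000, seed=0):
    """Return (mean, standard deviation) of quality in the admitted class."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    admitted = admitted_class(pool_size, class_size, rng)
    return statistics.mean(admitted), statistics.stdev(admitted)


# Same class size, same underlying population, but a tenfold larger pool.
small_mean, small_sd = summarize(pool_size=5_000)
large_mean, large_sd = summarize(pool_size=50_000)

print(f"pool of  5,000: mean={small_mean:.2f}, sd={small_sd:.2f}")
print(f"pool of 50,000: mean={large_mean:.2f}, sd={large_sd:.2f}")
```

Under these assumptions the class admitted from the larger pool has a higher average and a smaller spread, even though the population itself is unchanged. Real admissions measure quality with noise, so the real effect would be weaker than in this idealized version.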
The reporting in the NYT said in the headline that it was going to end the quotas, but in the article it said it was going to just push the decision to the individual departments. This is not a great solution, if you ask me, because it removes the way that the quotas harmonized the grading across the university. When I was at PU, many spoke about the fact that the hard sciences graded on a harder curve than the humanities. If you wanted a good GPA (for law school etc), well, it meant staying away from the hard sciences.
Unless I’m mistaken, there are now markedly more math majors and probably more hard science majors than when I was there.
If you ask me, letting each department decide provides an incentive to use grade inflation to lure more students, get more credit hours and thus hire more faculty.
One more quick thing, wish there was an edit button. Sentence should read “whether grades are supposed to reflect [1] a common standard relative to today’s students, or [2] a common standard that is invariant over time.” With my final proposition that grading should not reflect a common standard at all but rather:
“Grading should reflect how well a student gained knowledge of a subject relative to its common standard.”
Ed,
At the end of your post you wanted to “kick off a new discussion” as to the purpose of grading, and you put forth two possible purposes in response to Peter “whether grades are supposed to [1] reflect a common standard relative to today’s students, or [2] a common standard that is invariant over time.”
I would provide a third purpose of grading; what I call the True Purpose of Education. Some definitions to start:
“education >noun 1 the process of educating or being educated. 2 the theory and practice of teaching. 3 information about or training in a particular subject. 4 (an education) informal an enlightening experience.”
and
“educate >verb 1 give intellectual, moral, and social instruction to. 2 give training in or information on a particular subject.”
Shouldn’t grading be about determining the extent and effectiveness of each student’s education? A measure of the True Purpose? You use the term “common standards” but then assume the “standards” could be either relative to people or invariant over time. Both presuppositions are false in my eyes. Isn’t that comparing apples (people) to oranges (time)? That doesn’t even make sense to me.
Educational “standards” should be variant over time (as the subject matter varies over time, be it ever-changing law or ever-changing technology or any other discipline, as mankind’s cumulative knowledge expands and cultures change). But grading should be based on concrete “educational standards.” That is, what training or information on a particular subject is set to be the standard, and thus how well the student learns the subject. If a student learns well, the student should be graded well; if a student learns poorly, graded poorly; and if a student fails to learn, then given a failing grade [e.g. F].
Common standards then should be based on common knowledge of the subject. Not relative to people but relative to knowledge that varies over time. But for any given class, the standards should be specific and concrete, clearly and explicitly defined, and even put forth in the syllabus, so every student knows what it is that he/she will be expected to learn.
That is IMHO what grading and higher education SHOULD be about.
There are indeed multiple goals that grades might be trying to further. Grades might be intended to signal student performance (a) to the student, (b) to others within the university, or (c) to outsiders. Signaling might be designed to allow comparisons over time, e.g. so the student can measure their own progress, or others can compare student performance across time. The thoughtful argument for this kind of signaling is that although learning is difficult to measure, even an imperfect metric is useful—and we should be working to improve our metrics.
An alternative view is that grades are meant to motivate students. It is often true that students will work harder on something when they believe it will affect their grade. I also suspect that even if grades weren’t reported to anyone but the student, they would still affect motivation. In this view, grades reflect a kind of gamification of education. And although gamification sometimes seems like a cheap trick, it is a trick that often works.
Current grading systems aren’t optimally designed for any one purpose, in part because they serve all of these purposes at the same time.
I agree that there could be multiple purposes for grading.
Motivation is a good one for sure; and that is a topic I could go off about, but I think I have already written plenty of my own opinions on this post, and won’t bore you with any more diatribes here.
I just don’t see how any policy reflecting percentages and grading on a curve accomplishes any of the purposes I hold dear. The only purpose I can see in such grading is signalling one’s abilities in comparison to others [e.g. competition].