Heterodoxy:

Against Criterion-referenced Assessment

We have seen that setting objectives is an error, and that what happens in educational institutions is relatively trivial in the overall scheme of things, and yet we get worked up about the so-called validity and reliability of the assessment process. Most assessment is merely about performance on a course, not about the whole learning process, and hence fundamentally flawed. Concern about the minutiae of its operation is yet another case of re-arranging the deck chairs on the Titanic.

The alternative to criterion-referencing is norm-referencing—a basically competitive exercise in which the top marks or grades go to the best students, and those at the bottom of the ranking (the bottom 10/15/20% or even 50%, according to institutional policy) fail. This used to be the model for UK General Certificate of Education and Advanced Level examinations, and in their hearts those who complain about declining standards in education are bemoaning its passing.

Achievement in norm-referenced assessment is therefore relative to the rest of the cohort, which is felt to be a Bad Thing. After all, if you have the misfortune to be part of a good year's intake, your chance of failing is increased — and vice versa.

A former colleague of mine, previously a head-teacher, told of one of his brightest pupils, who scored 99% in her mathematics A-level in those days, but only got a "B". When he queried this with the examinations board, he was told it was unfortunate but it was an exceptional year, and all the quota of "A"s had been allocated to those who got 100%.

So now we—supposedly—go for "absolute" standards. Pre-determined criteria are set, and everyone who meets them gets the appropriate grade, even if everyone passes. (Phillips' [1996] epigraph of the caucus race in Alice shows where her sympathies lie—how can a prize be worth anything if everyone gets one?)
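The difference between the two schemes can be made concrete with a small sketch. It is illustrative only: the function names, the simplified A/B/Fail scale, the quota, the thresholds and the cohort marks are all assumptions invented for the example, not any examination board's actual procedure.

```python
from typing import Dict

# Hypothetical marks for an "exceptional year" (invented for illustration).
cohort: Dict[str, int] = {"Ann": 100, "Bob": 100, "Cat": 99, "Dan": 70, "Eve": 55}


def norm_referenced(marks: Dict[str, int], a_quota: int, fail_percent: int) -> Dict[str, str]:
    """Grade by rank in the cohort: a fixed quota of As, a fixed share of fails."""
    ranked = sorted(marks, key=marks.get, reverse=True)  # best mark first
    n_fail = len(ranked) * fail_percent // 100            # bottom share fails
    grades: Dict[str, str] = {}
    for position, student in enumerate(ranked):
        if position < a_quota:
            grades[student] = "A"
        elif position >= len(ranked) - n_fail:
            grades[student] = "Fail"
        else:
            grades[student] = "B"
    # Note: tied marks are separated arbitrarily (by input order) in this sketch.
    return grades


def criterion_referenced(marks: Dict[str, int], a_threshold: int, pass_threshold: int) -> Dict[str, str]:
    """Grade against fixed thresholds: everyone who meets a criterion gets the grade."""
    grades: Dict[str, str] = {}
    for student, mark in marks.items():
        if mark >= a_threshold:
            grades[student] = "A"
        elif mark >= pass_threshold:
            grades[student] = "B"
        else:
            grades[student] = "Fail"
    return grades


# Assumed policy values: two As available, bottom 20% fail; A at 90, pass at 40.
by_rank = norm_referenced(cohort, a_quota=2, fail_percent=20)
by_criteria = criterion_referenced(cohort, a_threshold=90, pass_threshold=40)

print("norm-referenced:     ", by_rank)
print("criterion-referenced:", by_criteria)
```

In this made-up cohort, Cat's 99 earns only a "B" once the quota of "A"s has gone to the two students on 100, and Eve fails simply by being last; under the fixed thresholds Cat gets an "A" and nobody fails at all.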

But anomalies are rare. Each cohort of students is likely to be similar to its predecessors and successors. The profile of grades is likely to be very similar from year to year, and as markers we are probably better at judging the relative merits of our students than their supposedly absolute performance against notional absolute standards.

Indeed, the establishment of absolute standards is fraught with difficulty. The Quality Assurance Agency for higher education in the UK has published a set of "Subject Benchmarks" which specify what graduates in any discipline should know and be able to do on completion of their course. They are notoriously waffly and imprecise. There is a pretence that assessment can ever be other than subjective (more so in some subjects than others, of course), and in the final analysis the process is that of middle-aged academics trying and failing to find a formula to capture absolutely the self-adjusting norm-referenced formula which pertained when they graduated.

Because norm-referencing is not dead. School examination boards in the UK are in competition for schools' business; and schools understandably go to those boards which are most likely to award a good grade to their pupils. Competition here is at the organisational level, but it contributes to a process of "dumbing down" in which boards compete to display the most favourable results profile—which must mean the lowering of standards.

Much the same thing happens in colleges and universities. There is a premium on student progression and achievement (in the days of the much feared "subject reviews" it was one of the six scales on which the QAA graded departments in the UK). So it is in no-one's interest to let a student fail. Requirements are surreptitiously adjusted to yield a favourable profile—it is norm-referencing by the back door.

Everyone knows this, and employers in particular compensate. No-one really believes that a degree from Thames Valley University (against which I have nothing personally; it just happened at the time of originally writing this to be at the bottom of the "league-tables", which are of course norm-referenced) is equivalent to one from Cambridge (at the top). The system does no-one any service.

Really, it probably tells me more as an employer (or an academic recruiting students for higher degrees) to know that someone is in the top 10% of their year, than that they have a first-class honours degree judged against criteria which have been diluted to ensure practically no-one fails. After all, I may have a target of a number of graduates to recruit that year; naturally I want the best.

The UK government had a target of 50% participation in higher education by the 18-30 age group by 2005. Wasn't that norm-referenced? (It missed the target and fudged it, of course.)

The fundamental problem is that norm-referencing embraces the possibility of failure, and in a mistaken effort not to hurt anyone's feelings we have rejected that. But, in the absence of mechanisms to prolong periods of study until students can meet realistic criteria, we are stuck with a trade-off between failure and lowering standards. The nature of the system is that criterion-referencing inevitably leads to the latter. 

To reference this page copy and paste the text below:

Atherton J S (2013) Doceo; [On-line: UK] retrieved from

Original material by James Atherton: last up-dated overall 10 February 2013

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.


