Monday, November 9, 2015

The Good Assessment Paradox

So, unless your group of students is extremely non-standard, it's an exercise in the Law of Diminishing Returns to design "good" assessments.

This is something that I categorically rejected for years, by the way. Even to this day it is hard for me not to ask questions that are designed the way they're supposed to be, instead of defaulting to the regurgitation standard so commonly upheld in math classes.

The problem with this is that, typically, the easiest and possibly the most effective way to ask these questions is also the hardest to grade.

Like, really hard.

But it is possible to design effective multiple choice questions and entire multiple choice tests.

It's a lot more difficult, and the time-consumption is on the front-end instead of the back-end, but it's possible.

And if you throw in a penalty for incorrect responses, it greatly improves the accuracy of the exam (although don't get me started on hearing the students complain about that one).
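For what it's worth, here is a minimal sketch of one common way to set that penalty. I'm assuming the scheme where each wrong answer on a question with k options costs 1/(k-1) of a point and blanks cost nothing, so that blind guessing averages out to zero. The function names and the four-option default are just for illustration, and other penalty schemes exist.

```python
from fractions import Fraction


def penalized_score(correct, wrong, options=4):
    """Score with a guessing penalty (assumed scheme, not the only one).

    Each wrong answer costs 1/(options - 1) of a point; blanks cost nothing.
    """
    return correct - Fraction(wrong) / (options - 1)


def expected_guess_score(questions, options=4):
    """Expected score for a student who guesses blindly on every question."""
    p_right = Fraction(1, options)
    expected_correct = questions * p_right
    expected_wrong = questions * (1 - p_right)
    return penalized_score(expected_correct, expected_wrong, options)


# A student with 14 right and 6 wrong on a 20-question, 4-option test:
print(penalized_score(14, 6))        # 12
# Pure guessing on the same test nets nothing on average:
print(expected_guess_score(20))      # 0
```

The point of the design is that a guess is no longer free, so a raw score says more about what the student actually knows.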

None of this is the paradox by the way.

The paradox is that it doesn't matter what assessment you use.

It doesn't matter if the questions are good or bad (so long as they are on-topic, of course). Unless your students are way above or way below average as a group, they're going to fall along a normal curve for the most part. You'll have a majority of the class (about 68%) within one standard deviation of the average, another 13.5% between one and two standard deviations above it, 13.5% between one and two standard deviations below it, and your outliers symmetrically aligned on either end (amounting to an additional 5% total).
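Those percentages are the usual rounded textbook values. A quick sketch with Python's standard library shows where they come from, assuming the scores really do follow a normal distribution (the function name and band labels are mine):

```python
from statistics import NormalDist


def band_shares():
    """Share of a normal distribution falling in each standard-deviation band."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return {
        "within 1 SD of average": z.cdf(1) - z.cdf(-1),
        "1 to 2 SD above": z.cdf(2) - z.cdf(1),
        "1 to 2 SD below": z.cdf(-1) - z.cdf(-2),
        "beyond 2 SD (both ends)": z.cdf(-2) + (1 - z.cdf(2)),
    }


for band, share in band_shares().items():
    print(f"{band}: {share:.1%}")
```

The exact figures come out a little off the round numbers (roughly 68.3%, 13.6%, 13.6%, and 4.6%), but close enough for grading purposes.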

Furthermore, unless you're willing to fail the lot of them, you're probably going to apply a curve of some sort, and it probably won't be the traditional normal curve that aligns grades F-D-C-B-A according to the groups described in the paragraph above. It will probably be a flat curve that just adds points to their scores. If the standard deviation is low enough, this might mean that a good portion of your class is all at Cs by this point.
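To make the flat curve concrete, here is a rough sketch. The raw scores are made up for illustration, and the target average of 75 and the 90/80/70/60 letter cutoffs are my assumptions, not anything official.

```python
from statistics import mean


def flat_curve(scores, target_mean=75.0, cap=100.0):
    """Add the same number of points to every score so the average hits the target.

    Scores are capped at `cap`, so if anyone hits the cap the curved
    average can land slightly under the target.
    """
    shift = target_mean - mean(scores)
    return [min(cap, s + shift) for s in scores]


def letter(score):
    """Letter grade on an assumed 90/80/70/60 scale."""
    for cutoff, grade in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return grade
    return "F"


# Made-up raw scores: failing average of 55 with a small spread.
raw = [48, 50, 52, 53, 55, 55, 57, 58, 60, 62]
curved = flat_curve(raw)
grades = [letter(s) for s in curved]
print(grades)  # ['D', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'B', 'B']
```

With a spread that tight, adding 20 points to everyone puts seven of the ten students at a C, which is the bunching described above.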

I would like to be willing to fail an entire class. I really would.

But I don't have tenure, or even an option to someday have tenure. I like to think that my superiors are aware of my abilities, but I'm pretty sure I know what would happen if I had a class where a significant portion of the students were failing. I don't think I'd be let go, at least not at first, but I'm pretty sure I'd be getting a visit telling me that I need to start making my tests easier or giving a curve right around the time the 3rd or 4th student went to complain.

The reason I'd like to be willing to fail an entire class is not that I'm a sadist. It's the opposite, in fact. I still haven't found a way to detach myself from caring for my students. Spending so much time trying to help them learn and sharing what I know with them compels me to care about them. The reason I'd like to be willing to fail an entire class is that once the students believed failing was a real possibility, and if they didn't have the option of an easier professor, they'd be motivated to actually improve. Study after study has shown that expectation is one of the most powerful tools in education (although that expectation comes from multiple sides, the expectation of the teacher is one of the only things you can directly control).

But I can't fail an entire class, so I have to curve the assessments I give. That means it doesn't matter what sort of assessment I give: whatever the format, the scores are going to be roughly "normal", and after I apply a curve so that the average is where I'd like it to be, they'll be high enough that the students shouldn't complain too much.

And with homework and class participation grades, any student who does at least average is not just guaranteed to pass with a C, but will probably get a B. If they take part in any sort of extra credit or do exceptionally well on a particular assessment, maybe an A!

Does that mean they know an A-level of the material for the course?

Not bloody likely.

So here I am grading an exam that took me a long time to write and is taking much longer to grade, finding that the students might not even understand basic reasoning, with each freaking problem taking me m-i-n-u-t-e-s to grade. So each test takes a long while, on the order of half an hour or more to be honest. And I already know that the scores are going to be on average failing, and not far from "normal", with 68% within one standard deviation of whatever the failing average is, another 13.5% between one and two standard deviations above it, another 13.5% between one and two below it, and another 2.5% above that and another 2.5% below that. It's just a matter of separating them into those groups.

And with the curve, the students aren't motivated to do any better than that.

Well, some are I suppose, and that's why I started by saying it's an exercise in the "Law of Diminishing Returns". Because if you make good assessments and people aren't doing well on them before the curve, then there are a few that will be motivated enough to try harder and do better and maybe learn some strategies they wouldn't otherwise try for.

So why do I make so-called "good" assessments?

Because I'm a sucker.