Good Teacher, Bad Data
I think it’s safe to say that there’s a bit more number-crunching in the typical English classroom than there was, say, five years ago.
And you know what? That’s a good thing—a great thing if you’ve found meaningful ways to use the data gathered from formative and summative assessments.
But data can also be pretty misleading.
The idea of using data to improve instruction has always been presented as a simple, elegant solution: gather data showing which students miss which questions and, voilà, you know where to direct differentiated instruction to help every student reach mastery of the learning goals.
To wit: an easy question about an author’s tone shows that 90% of your students can correctly identify and explain the tone, but a second tone question on the same assessment, testing the same learning goal with a much more challenging passage, reveals that only 50% of your class can really decipher tone when the going gets tough (or the tone gets subtle).
This is really fantastic information to have! The 10% of your kids who missed even the easy question need to go back, review their notes, and probably do some formative practice. But there’s another 40% who answered the easy question correctly yet still need to work on applying their newfound skill. They clearly know what tone is, but when the tone isn’t smacking them in the face, they aren’t that great at recognizing it in writing. The needs of these two groups are different, but now you know whom to direct to which formative task!
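If you like seeing that sorting spelled out, here is a minimal sketch of the logic. The gradebook, student names, and two-question structure are hypothetical placeholders, not anyone’s real data; it just shows how results on an easy and a hard question for the same learning goal split a class into three groups.

```python
# Hypothetical results: student -> (got_easy_tone_question, got_hard_tone_question)
results = {
    "Student A": (True, True),    # nailed both questions
    "Student B": (True, False),   # knows tone, struggles with a subtle passage
    "Student C": (False, False),  # missed even the straightforward question
}

needs_review = []        # missed the easy question (~10% in the example above)
needs_application = []   # got the easy one but not the hard one (~40%)
mastered = []            # got both (~50%)

for student, (easy_ok, hard_ok) in results.items():
    if not easy_ok:
        needs_review.append(student)
    elif not hard_ok:
        needs_application.append(student)
    else:
        mastered.append(student)

print("Review the concept:", needs_review)
print("Practice applying it:", needs_application)
print("Ready to move on:", mastered)
```

The same three-bucket sort works just as well in a spreadsheet; the point is only that the two questions together tell you something neither one tells you alone.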
The Signal, The Noise, The Headache
The funny thing about data, though, is that numbers aren’t as clear and objective as all those charts and bar graphs would have us believe. If you don’t want to take an English teacher’s word for that, get ahold of Nate Silver’s excellent book The Signal and the Noise, which reveals just how difficult it can be to get data to tell you the truth.
Or, for that matter, believe your own experience, since I’m fairly certain you’ve also experienced the sort of data debacle I’m about to describe.
A few years ago, my professional learning community rewrote all of our assessment questions so that they were clearly labeled by learning goal. When we tested a student’s ability to support an argument using textual evidence, the question might look like this:
Using Evidence: Using at least one quote, explain how Jon Krakauer establishes Chris McCandless’s desire to live a more primitive lifestyle in Into the Wild.
Now everything should be clean and easy to parse—if kids get the question right, they have mastered the use of textual evidence. If they get it wrong, they have not. And if they can explain Krakauer’s methods but fail to use a quote, we can presume they’re halfway there.
So would it surprise you to learn that my PLC ended up getting incredibly muddled data from this question? And that we eventually had to rethink how we were interpreting much of the data? Here are some of the issues that we encountered:
- How can you tell when a student lacks a skill versus when they lack vocabulary? Three of my stronger students asked me what primitive meant—in my first period alone!
- Did all the students recognize the implicit meaning of the verb explain? Have you been clear about what various verbs (contrast, analyze, challenge) demand of them in an assessment?
- How do you decide whether a student just hasn’t written enough? And what should the takeaway be when students can vocalize an answer that is thorough and accurate?
- How much should you be concerned when a student’s example is the one you’ve already used in a class discussion? What if that brand of example shows up on every single assessment a student takes?
- If you give the students one passage to focus on, is a correct answer an indication of mastery of this skill or only partial mastery (since on their own they might not have been able to select the relevant part of the text from, say, an entire chapter)?
Any one of these is a good reason to have a careful data discussion in your PLC. But let’s take the first one, lacking a skill versus lacking vocabulary, as an example.
I couldn’t write off the students who asked what primitive meant as a trivial minority; these were the grade-conscious kids who were diligent about asking questions. If they didn’t know the term and said so, there was a good chance that a lot of the other kids also didn’t know the meaning of primitive. They just didn’t bother to ask.
Is Data Doomed?
All of a sudden, our data about this fundamental writing skill seemed really murky. And this was a learning goal we thought was pretty transparent and objective! There was a sudden temptation to go back to the more instinctive, less numbers-driven approach to gathering feedback about students.
Even though gathering good data in English is tougher than it seems, it is both possible and essential for effective instruction. I’ll revisit my own case study in my next blog post to explain a few of the countermeasures my PLC took to help avoid “fuzzy” data points.
In the meantime, think about the next assessment you give to students. Whatever data you take from it, ask yourself whether more than one “theory” about the kids’ performances on it would fit the data you’re staring at.
Michael Ziegler (@ZigThinks) is a Content Area Leader and teacher at Novi High School. This is his 15th year in the classroom. He teaches 11th Grade English and IB Theory of Knowledge. He also coaches JV Girls Soccer and has spent time as a Creative Writing Club sponsor, Poetry Slam team coach, AdvancEd Chair, and Boys JV Soccer Coach. He did his undergraduate work at the University of Michigan, majoring in English, and earned his master’s in Administration from Michigan State University.
Literacy & Technology Notes from the Classroom Professional Learning