Bad Data, Good Data, Red Data, Blue Data
Back in part one of this post, I explored a problem that my PLC had while attempting to gather accurate data from student assessments.
This post, while still recognizing some problems with data, is more upbeat, and provides some reassurance that you can (and should!) continue to gather and use data.
In my last post, I described how the word “primitive” was a foreign term even to some “A” students, and how this proved to be a problem for an assessment on the use of textual evidence. Beyond such content-specific vocabulary, there’s a secondary issue of assessment lingo: we ask kids to examine, analyze, compare, and evaluate, but few teachers directly teach the exact meanings of these terms.
The solution here is simple, though it often bothers English teachers: define terms for the kids. When students ask you what you mean by “contrast,” you should be willing to explain it to them every time.
Why? Because the term itself isn’t the skill you want data about. If diction is the learning target, independence is obviously an expectation. For all the other assessments, though, you’re damaging your own data if you don’t make sure the students understand every word.
Aim Small, Miss Small
Here’s one of the most common data failures we bring upon ourselves: writing questions that attempt to assess too many things at once.
If I’m writing a short-answer question for an assessment about a passage’s tone, my expectation is for:
- complete sentences;
- a clear response to the question;
- a quote (embedded and cited) to help prove the answer is correct;
- and an analysis of the quote to tie it all together.
Even without getting into partially correct responses, you can see where my expectations have created six (!) potential point reductions, if you count embedding and citing the quote as separate requirements.
But what have I done to my data if I take off one of two possible points for, say, not including a quote? If students paraphrased the text effectively and were right about the tone of the passage, then they’ve actually provided me two separate pieces of data about two different learning goals; they have mastered tone analysis, but they are deficient in using textual evidence to prove their arguments.
When we conflate the two and give them a ½ on the question, we have provided ourselves a sloppy data point. And by the time we’ve graded 120 copies of that assessment, we might come to a wrongheaded, broad conclusion that sets the class back needlessly. Do they even know what tone is? Or are they just averse to quotes?
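For anyone who keeps scores in a spreadsheet or a script, here is a minimal sketch of the difference between a conflated score and per-skill scores. The ten student responses are invented for illustration, and the names (`responses`, `combined_score`) are hypothetical, not part of any gradebook tool.

```python
from collections import Counter

# Hypothetical data: one (tone_correct, used_quote) pair per student.
responses = (
    [(True, False)] * 4    # right about tone, no quote
    + [(False, True)] * 1  # used a quote, wrong about tone
    + [(True, True)] * 3   # both skills shown
    + [(False, False)] * 2 # neither skill shown
)

def combined_score(tone_ok: bool, quote_ok: bool) -> int:
    """Conflated scoring: one point per skill, added into a single number."""
    return int(tone_ok) + int(quote_ok)

# How many students landed on each conflated score (0, 1, or 2).
combined = Counter(combined_score(t, q) for t, q in responses)

# Per-skill scoring: the share of students who showed each skill.
tone_mastery = sum(t for t, _ in responses) / len(responses)
quote_mastery = sum(q for _, q in responses) / len(responses)

print(f"Students scoring 1 out of 2: {combined[1]}")
print(f"Tone mastery: {tone_mastery:.0%}")
print(f"Textual evidence mastery: {quote_mastery:.0%}")
```

In this made-up class, five students earn the same 1 out of 2, yet four of them understand tone and one does not. Scored separately, tone sits at 70% and textual evidence at 40%, which points to a very different reteaching plan than "half the class got half credit."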
Consider that tone example once more. Does the question need to be rewritten? Maybe not.
As long as you’re willing to grade the assessment question for only the core skill (tone or textual evidence, but not both at once), then it can provide you some excellent data.
Writing questions that address one clear skill is ideal. But sometimes a question that entails multiple skills can be highly useful—as long as you aren’t attempting to score it for every skill at once.
Logic suggests a problem with this, though. Even if I narrow the learning target I’m assessing, it doesn’t clarify the problem’s source. Did a student choose her quote poorly because she doesn’t know tone, or because she lacks the ability to choose textual evidence well?
The solution to this, I think, is the post-assessment toolbox most teachers already put to use. Confer with students for a couple of minutes. They can usually speak effectively to where things went wrong, and the data then becomes far more reliable.
When you don’t have time for one-on-one conferencing, having students self-reflect while you go over the assessment as a class can be just as useful. Ask students to make follow-up marks that you can look over later (“T” meaning “I didn’t understand the tone that well,” or “Q” for “I didn’t know what quote to select”). This might seem like an inelegant solution, but think about what you’ve created: a robust data set that includes your initial impressions of their skills, alongside a self-evaluation where students have provided input on exactly what skill failed them.
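If you record those marks next to your own scores, the tally takes only a few lines. This is a rough sketch under assumptions: the five records are invented, and the helper names (`tally_marks`, `marks_for_score`) are hypothetical.

```python
from collections import Counter

# Hypothetical records: (student, teacher score out of 2, student's follow-up mark).
# "T" = "I didn't understand the tone that well"
# "Q" = "I didn't know what quote to select"
# ""  = the student made no mark
records = [
    ("Student A", 1, "Q"),
    ("Student B", 1, "T"),
    ("Student C", 2, ""),
    ("Student D", 0, "T"),
    ("Student E", 1, "Q"),
]

def tally_marks(records):
    """Count each follow-up mark across the whole class, skipping blanks."""
    return Counter(mark for _, _, mark in records if mark)

def marks_for_score(records, score):
    """Count the marks made by students who earned a given teacher score."""
    return Counter(mark for _, s, mark in records if s == score and mark)

class_tally = tally_marks(records)
half_credit_tally = marks_for_score(records, 1)

print("Whole class:", dict(class_tally))
print("Students with 1 out of 2:", dict(half_credit_tally))
```

Looking only at the students who earned half credit shows which skill they say failed them, and that is the question the raw score cannot answer.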
There are obviously dozens of other solutions to the problem of inexact data, but I think the simplest takeaway is to be vigilant about communicating what you want your students to know, and explicit about how your grading rubric measures each learning goal in isolation.
It takes time to fix these sorts of systemic problems. But I’d argue that it amounts to less time than we spend reviewing concepts in class that we’ve misidentified as problematic, having listened to the lies of Bad Data.
Michael Ziegler (@ZigThinks) is a Content Area Leader and teacher at Novi High School. This is his 15th year in the classroom. He teaches 11th Grade English and IB Theory of Knowledge. He also coaches JV Girls Soccer and has spent time as a Creative Writing Club sponsor, Poetry Slam team coach, AdvancEd Chair, and Boys JV Soccer coach. He did his undergraduate work at the University of Michigan, majoring in English, and earned his master’s in Administration from Michigan State University.