Offside Assessments

Or, No More Cherry Picking (and How to Stop Doing That)

The Goal of Standards-Based Grading (SBG): Use standards to give students specific, useful feedback, so that students know what to practice to improve their understanding and performance, and so that you (the teacher) know how to help them and what to keep testing.

The Temptation: Exclusively (or almost exclusively) create assessment questions that test individual standards in place of ones that test synthesis and using skills in context.

Warning signs: You think of tests you are writing now as “SBG assessments”.  Your tests ask questions to intentionally isolate every skill. You label each individual test question with the skill the student should use to solve it.

While there isn’t One True Way to grade and assess, and while there are times when a question that does a good job of isolating a skill is necessary, this post starts from an assumption that in general, good tests ask students to use skills in context and in concert with other skills. Let’s tally this one as part of my Teaching Manifesto series. (In other words, a thinking-out-loud of what I’m trying to do in my classroom and why I’m trying to do it.)

Does SBG make the class easier or harder?

But first, an aside. I think this topic of isolating skills for testing is related to the hypothetical criticism of SBG that I’ve heard about the grading system “watering down” the class or making it a “baby” class. In this context, I can start to understand that fear better. For the rest of the post, I’m going to argue that making tests simpler is a choice that should be avoided, but first, I need to comment on the idea of easy versus difficult.

In my experience, SBG makes a class much more difficult, not easier, if difficulty is related to how much content a student needs to really master and how good a student must be at consistently solving problems. On the other hand, SBG makes a class much easier, if difficulty is related to how successful students are in the class (on average and/or individually).

My Physics and Honors Physics exams of the past few years have been no joke. In fact, they have been much more challenging than the ones I gave in my first couple years of teaching physics. As my students have gotten better at learning physics, their work during the year has demanded the change. More on that later, but now back to that main issue of the work during the year—

Isolate the feedback, not the assessments

It is important to test multiple skills in context while giving more targeted and specific feedback on individual skills. That is, a single question on a quiz should probably be testing at least 3 or 4 skills, and later in the year probably even more than that, as the content keeps layering back on itself.

Every (or almost every) question should usually be forcing the student to make the choices about how they will model the situation in the problem, which representations they will draw, which fundamental principle they will use, etc.

A student should be able to get a question wrong, while still showing that he knew how to use some of the required skills. That is, I should be able to say “you’re doing this well, and here’s where you’re having trouble.” I can use the standards while giving feedback to help isolate the trouble a student had on a question. And by testing the standards multiple times on an assessment as well as through the student’s history with that assessment, I can notice patterns in their problem-solving skills that can be addressed and fixed.

Example: Unbalanced Force quiz problem

Here’s the back of a quiz done by an Honors Physics student last December. The pencil marks are his work on the quiz, the green pen marks are his notes after the quiz (while looking at my solution), and the blue pen marks are my notes. In class, we were working on the Energy Transfer Model, the 7th of 7 models in our first semester.

N2L Quiz Example Grading

Though I only listed the four Unbalanced Force Particle Model skills in the objective list*, the problem also tests at least four other, older skills**. I didn’t write every skill tested in the chart because at that point in the year (near the end of the first semester), nearly every Honors Physics student would have already shown current mastery on them. A quick scan of ActiveGrade would have shown me who needed to show those objectives, and I could then note them on those specific quizzes. In the other direction (as above), I can easily add another objective to the list if the student shows me that he doesn’t have mastery on it.

Back to the quiz. So the kid (let’s call him “Wallace”) starts off doing all the right things. After reading that question, the first five things he wants to do involve drawing diagrams (a free body diagram, a force vector addition diagram, position-, velocity-, and acceleration-time graphs). Right on, Wallace. And, in fact, he not only wants to draw diagrams, but he values them enough to spend the time to get them right (see the mistakes he corrected in both of his force diagrams—direction of forces, balanced/unbalanced forces), to care about details like relative sizes of forces on the FBD, and to do some annotation on them. So far, so great.

Then he runs into trouble with the velocity graph. He focuses on only part of the area (something he notices and notes in his correction) which leads him to write an incorrect equation. But the biggest problem is that he doesn’t use the force diagrams to find the slope on that velocity graph, so he’s left with only 2 equations for his 3 unknowns. Aha! That’s a really key skill—Newton’s 2nd Law is a pretty big idea. And I’m sure that Wallace knows that Fnet = ma. If I’d isolated the skill—given him a problem that specified two of those quantities and asked for the third, he almost definitely would have been able to do that problem in December. He’s got all the pieces, and at that point, he’s still working to put them all together.
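To spell out that missing link in general terms (a sketch with generic symbols, not the actual quantities from Wallace’s quiz, which aren’t reproduced here): the slope of the velocity-time graph is the acceleration, and the force diagrams pin down that same acceleration through Newton’s 2nd Law. Setting the two equal gives the third equation.

```latex
\[
\text{slope of the } v\text{-}t \text{ graph} = a = \frac{\Delta v}{\Delta t},
\qquad
a = \frac{F_{\text{net}}}{m}
\quad\Longrightarrow\quad
\frac{\Delta v}{\Delta t} = \frac{F_{\text{net}}}{m}
\]
```

With that relation added to the two equations he already had, the three unknowns become solvable.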

Going forward, the next time I see Wallace having trouble making the quantitative connection between forces and motion, I can start to identify the pattern. If I see it again, I can write a more targeted note on a quiz and/or have a quick conversation with him. I have a pretty good picture of his strengths and weaknesses now, and I can share that with him and give him some specific coaching.

How many standards?

The number of standards in a unit or year or semester should be determined by how grainy the feedback should be, not by ease of grading or testing. It should be normal for a student to use every skill from a unit while solving a single problem (though not every problem will be a good test of every skill from that unit)—the standards are highlighting the various aspects of using a model that require attention and consideration. So, one is too few. Twelve is probably too many—a student needing to pay close attention to twelve different things at once in a problem is probably in over his head; conversely, specifying twelve aspects of using a model might mean that you’ve started using standards to specify “types” of problems instead of skills that apply to using that model to represent any applicable problem.

Make a standard for things that you see the best students do so that you can help struggling students find where to place their focus. Example: one of the objectives on Wallace’s quiz, above (UBFPM.2  A  My FBDs look qualitatively accurate (balanced or unbalanced in the correct directions, relative sizes of forces).) helps students develop good thinking habits by pointing their attention to the relative size of forces when drawing their diagrams. The best students have always done this kind of thinking during their qualitative work, and it has helped them set up their work quantitatively and judge whether their answer makes sense. It’s worth taking the time to give feedback on that aspect of a student’s diagrams and to prompt a student who isn’t getting mastery scores on that objective to stop and think about what she could do better.

Which brings us to the next topic—how should students be practicing skills if they are going to be tested on them in an integrated fashion?

Good practice can be isolated and complicated

In general, to practice a skill students should find and try some problems that involve using that skill. If they can’t identify problems where they can practice a certain skill, they should be meeting with a teacher ASAP. Providing some extra practice for a model is probably good. Providing extra practice for specific skills is probably not helping the overall cause of being able to identify when to use a skill, why to use a skill, etc. That should be fixed with more coaching of the student.

Even while practicing multiple skills together, a student can be focusing mainly on a particular skill. They can put the teacher’s specific feedback into use by really examining their thinking and work with a certain skill. Probably the most isolated that practice gets is when a student is struggling with a qualitative skill (what I call A objectives in my conjunctive SBG system).

What about struggling students?

Students who are in the midst of the most struggle need help (extra coaching) breaking down problems to set up isolated, targeted practice when we’ve identified a specific difficulty that needs to be ironed out before we can move forward. If you can’t consistently draw velocity-time graphs, you’re going to hit a stopping point. Ditto FBDs, LOLs, etc.

Here’s the key—when you isolate a skill, you need to practice more subtly, more deeply than you would likely need for a quantitative problem. If you want to be sure you can use that skill, consistently, as the basis for moving on to the rest of a problem, you need to be able to consistently do complex qualitative work.

And the second key—that’s not the only practice you should do. Indeed, when it is the only practice a student has done, it is not only obvious from their work, but that student is rarely able to be successful on subsequent tests.

Example: Practicing free body diagrams (FBDs)

Let’s say a student is struggling with that important skill mentioned above, depicting the relative sizes of forces on a free body diagram. She’s come in to test on it a few times, but she isn’t making a lot of progress. Time for some extra coaching.

Here’s what you want to do: Go back through old packets and draw several FBDs for every object in the problem. Take “sweaty man” from our BFPM packet, for example. Classic.

Sweaty Man BFPM problem

Ignore the numbers for now. Draw an FBD for the package. Then draw one for if it’s slowing down instead of moving at a constant speed. Then draw one for if it’s speeding up. How do the forces on the object change? How do the sizes of those forces change?

Now do the same thing, but for the sweaty man in each of those cases.

Then find other old problems and give them the same kind of treatment. Then bring your work to show me (or show it to one of my Honors Physics students from last year) and we can look at it together.

Whoa. That’s great isolated practice.

Practice in general

So in physics, just as in sports or music, you isolate something you know you aren’t doing correctly. You slow it down, you examine it, you play with it. But you don’t exclusively practice and/or test it. In fact, you only ever really “test” it in context. The following snippet from a Grant Wiggins blog post about learning and soccer highlights this idea pretty well—

I often tell the story of Liz, my former co-captain whom I yelled at in a game: “Use all the drills we worked on this week!!!” In the middle of the game, she stopped running, looked at me and yelled back: “I would, but the other team isn’t lining up the way we did the drills!!!”

So while good practice can be isolated, it shouldn’t be exclusively isolated. And it should definitely be more complex than what students think they might encounter on a test.

Extra test, extra caution

Extra tests (aka reassessments, etc) come the closest to cherry picking. The students sign up for a particular set of objectives, so they know (or think they know) just what skills they should be using when they arrive.

The A objective tests in particular are the most isolated—but for the same reasons as above, they also need to be much harder. The skills tested on that extra quiz need to be each tested more than once and from multiple perspectives. Those tests need to be set up to really poke at all of a student’s weak spots in her understanding of the diagrams and concepts. When I’m writing an A objective test, I’m trying to think of classic struggles with those skills, and I want to try to make sure that the student and I won’t be tricked into thinking he is ready for using those skills to solve problems when he really isn’t quite there yet. It’s about setting students up for success in the larger picture.

Cautionary anecdote: I know a teacher who, when he was new to teaching physics and to using SBG, gave students reassessments that he knew they would be able to do based on what they had done the first time. After a while of trying that (and having the same students needing subsequent reassessments on the same skills), he realized that he needed to be giving them reassessments that they shouldn’t be able to do (given what they did the first time) so that when the students passed them, he and they were confident that they had made progress in their understanding.

Again, the key is setting students up for success in the big picture.

Extra tests aren’t re-tests

And on that note—I try to use the term “extra test” instead of “reassessment” because the latter implies (to many students and some teachers) taking the same test again. If a mastery score on a standard indicates confidence in a student’s ability to use that skill consistently on new problems, then a good test would involve problems that look very new to the student. Different numbers don’t make a problem new. Different names don’t make a problem new. One of the trickiest things to watch for is students solving an old problem instead of engaging with what is actually in front of them (that’s not going to lead to long-term success)—and giving them the same test again will make it nearly impossible for you to catch that problem.

Test early and often

When I started out with SBG, I gave unit tests after finishing each model.

When I switched to the process that I’ve used for the past couple of years (having a quiz every week (with everything always fair game) and no unit tests), I was nervous that my students would not be prepared to succeed on such a long and comprehensive test as the semester exams.

I am pretty convinced that the reason they had no problem with the exams was that they had been testing in context all along. I was rarely (in regular physics) and pretty much never (in honors classes) giving standards-isolating questions. In fact, the variety and mixture of content on the quizzes was much greater than it had been on unit tests. And actually, the difficulty of the quizzes was much greater than the unit tests as well—since I knew I would be testing the same skills again (and soon), I felt more free to push students with challenging problems. And getting a bit beat up on a quiz, while not strictly fun for the students, wasn’t nearly as spirit-crushing as getting a bit beat up on a more summative-feeling unit test.

Using SBG doesn’t necessarily mean writing all brand-new assessments. It probably doesn’t mean writing all new questions. One way it has changed how I write assessments is that I no longer write them in advance. I look at ActiveGrade before I write a test. I think about what we aren’t doing well yet. I hit the weak spots again and again.
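For anyone who keeps scores in a spreadsheet or script rather than ActiveGrade, here is a minimal sketch of that “find the weak spots” step. Everything in it is an assumption for illustration: the student names, the scores, the 0-to-1 scale, and the `weakest_standards` helper are made up, and the data layout is not ActiveGrade’s export format.

```python
from statistics import mean

# Hypothetical gradebook: standard -> {student: current score}.
# Names, scores, and the 0-to-1 scale are invented for illustration.
scores = {
    "BFPM.1": {"Ana": 1.0, "Ben": 1.0, "Cal": 1.0},
    "UBFPM.2": {"Ana": 1.0, "Ben": 0.0, "Cal": 0.0},
    "CAPM.3": {"Ana": 1.0, "Ben": 1.0, "Cal": 0.0},
}


def weakest_standards(gradebook, count):
    """Return the `count` standards with the lowest class average, weakest first."""
    averages = {
        standard: mean(by_student.values())
        for standard, by_student in gradebook.items()
    }
    return sorted(averages, key=averages.get)[:count]


# Standards to aim this week's quiz problems at.
targets = weakest_standards(scores, 2)
print(targets)
```

The sorted list only suggests where to aim; choosing problems that hit those standards in context is still a judgment call.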

As I say to my students—I don’t learn much from testing you on something that I know you know well. But we both learn a lot when I test you on things that you just barely can’t do perfectly or consistently yet. And we’re playing the long game here. You want to find out now what you can’t do, when you have time to practice it, instead of finding out on the exam, when the class is finished.

* I’d really rather not print the objective list on the same paper as the quiz, but I gave in because printing it separately would use so much paper. I have other ideas, but I’m still searching for a more perfect solution there.

** Namely:
BFPM.1  A  I draw properly labeled free body diagrams that show all forces acting on an object.
BFPM.3  A  I relate balanced/unbalanced forces to an object’s constant/changing motion.
CAPM.1  A  I can draw and interpret diagrams to represent the motion of an object moving with a changing velocity.
CAPM.3  B  I can solve problems using kinematics concepts.

11 thoughts on “Offside Assessments”

  1. As a teacher attempting to try SBG for the first time in the upcoming school year I wanted you to know that I LOVED this entry. It hit on so many ideas I’m sure would hit me halfway through the school year, but now I have them in mind before I even begin! Any advice for someone just starting SBG?

    1. Thanks! Hmm. Here are some thoughts—

      Starting advice would be, of course, to spend as much time as you can thinking through how the process will work in the environment of your classroom and school. You know your students and your school, what students’ schedules are like, what the rhythm of the year is, etc., and anything you can do to make your process fit in for them will only make things easier.

      Don’t spend the first day talking about grading with your students. Toss it into a handout somewhere and don’t mention it right away. Grades probably shouldn’t be an issue for the first few days, anyway, right? So they don’t need to get into all the details right away. It’s really overwhelming and anxiety-causing for some kids when you tell them that you are doing things differently (but then they can’t actually live through a cycle of it for a while yet). Actually, not making a huge deal of it is probably the best, along with giving some details out in 5-minute chunks here and there when the class seems interested and seems to want it. I try not to let them bring up grading until maybe 5 to 10 minutes before the end of any class period so that I know we can’t throw the whole class meeting away working ourselves up into anxiety over ridiculous hypotheticals. But yes, making sure the structure is available to them in a handout in their binders somewhere is probably a good idea so that they can sit down with it by themselves at some point to figure it out if that’s what they need to do.

      And probably one more thing: think through your side of the process before you get started. How will you keep track of everything? How will you handle extra tests (if at all)? How often will students be involved in determining what they should be testing? All the technical parts of it, and how complicated and time consuming they might be, are good to have an idea about before you are in the middle of things, so that you aren’t overwhelmed with it while you want to be spending your time on getting ready for class the next day, etc.

      I’m not sure if that all made sense or was helpful, but just some off-the-top-of-my-head thoughts.

      Oh, one more thing: the core structure to what you want to do is really important, and you want to share that with the kids early on (as in, “Mastery and consistency are important; only what you can do well at the end matters, not how early in the year you can show a skill once”), but the nitty-gritty process of how often you’ll test, how to handle reassessments, etc. might well morph to make things work for your class and for you as the year goes on. So I guess I’m doubling down here on not spending the first day outlining every detail of the whole system. It’s the heart of the system that it’s important to communicate, and you want yourself and them to be a little flexible in feeling out how the particular implementation is going to work best for all of you. If you promise something at the start that turns out not to make sense, it can be really upsetting for kids when they feel you keep changing the rules. So negotiating that kind of flexibility while you together try to strive for the spirit of the feedback system is probably the most key thing to making it work the first time through.

  2. Kelly, Great Post! I really struggled integrating both Modeling and SBG in my classroom this past year. My students really struggled with the use of skills in context, i.e. “knowing how to solve the problem.” As a result, I gave them opportunities on tests and quizzes to show mastery of skills in both isolated and integrated contexts. By the end of the school year, I was working very hard to make it very clear how to solve the problems during “instruction” and would often break down a problem by asking for each skill individually, while leaving the connection between the skills and solution to the students’ intuition. What are your thoughts on this practice (if you follow my explanation)?

    Also, my standards were broken into C, B, and A categories. C standards were the basics (Core), B standards were the more difficult skills or concepts (Building), and the A standards became “advanced solving problems with integration and application of multiple skills and concepts” from the unit. I think that isolating the skills/standards gives the students the opportunity to show that they can do some things, without always “getting the problem wrong.” Then putting it all together becomes the ticket to earning the A in Physics. Again, what do you think?

    I’m hoping to improve for this coming school year by focusing my standards and assessments to help the students see the connection between the assessments, grades, and their learning. Thanks for the feedback (if you have some time) and for continuing your posts. They are extremely insightful and inspiring, the latter being almost more important while in the trenches. Blessings at your new school!

  3. Kelly,
    This was very validating for me since it mirrored a lot of my thoughts and experiences. I ran into a similar concern about including learning objectives on assessments, so now I have a separate grade sheet that I attach to assessments later. It may use an extra sheet of paper, but it has some other advantages. I blogged about my grade book system at LOBA Grade book.

  4. Hi Kelly,
    I have been using ASU’s Modeling Physics curriculum for two years (I went to a summer training on the Modeling approach as well) and have been dissatisfied with the exams that are included in the curriculum. As a result, I have tweaked a lot of the quizzes and exams and also attempted to write some of them myself. However, I struggle to design these assessments. How do you go about writing your tests/quizzes? Do you use your creative genius and just make up the questions off the top of your head? Do you look at other physics textbooks and tweak their questions? Any helpful advice would be MUCH appreciated!

    p.s. Your blog has been a LIFESAVER for me…I used a lot of your materials during my second year of teaching physics and I saw students’ understanding skyrocket! Thank you!

    1. Hey Liz,

      I know what you mean about the tests in the Modeling materials. Many of the (non-multiple-choice) questions were at a good level for my regular sections, but were under the level I was trying to hit with the Honors Physics kids. I don’t have any really well-thought-out system for making tests, but I can tell you my general process.

      1) Sort ActiveGrade by class average on the standards and see which ideas I need to keep hitting that week.
      2) Choose a few things I want to test (or make up a few flavor choices for the week).
      3) Go through my archives of old tests and various resources (including textbooks, things I’ve found online, the Modeling materials, etc) and find some problems that hit the standards that I want to test (and not in isolation). I also try to make sure I’m getting questions that I think will be just beyond what the kids are comfortable doing, that ask things in a different way than they’ve seen, etc. Sometimes I modify questions that I find, sometimes I just find an image from a problem and write a new problem for it, and sometimes I have my own idea (usually inspired by finding things close to, but not quite, what I want) and pull that together.
      4) Put two or three problems on a page, print a copy, and do the problems. Sometimes I don’t realize until I do a problem that it is actually testing something different from what I wanted (or is too easy, or won’t really tell me that a kid has mastered an objective, etc). Then edit if necessary, then print a bunch of copies for the kids (obviously).

      You’re definitely asking good questions here. I should probably spend more time thinking more deliberately about a process for creating assessments. The only big idea I have there right now is that I make the quiz just before I give it (so I can make sure I am testing the objectives that most need to be tested for each group of students). If two sections of the same class are having trouble with really different areas, I’ve sometimes even given entirely different tests (on different topics) in a given week.

      Hope that helps (a little)! 🙂

      Keep in touch and let me know how it’s going this year!


  5. I don’t know if you’re still checking this site. This is the first I’ve heard of SBG (I’m not a trained teacher but am taking a course in science education). It seems like concept mapping would be a good way to assess in this situation. Or does that get too far away from using skills in context? Thoughts?
