A breakdown of a more reliable grading scheme for skills based grading
You can't rely on standardized tests or common assessments.
Marzano presented research which shows that large standardized tests are reliable when looking at general trends for a whole school (87% reliability), but are incredibly unreliable when using the same data to examine class or individual performance (33 to 57% reliability). Multiple choice tests are hugely unreliable. Using a single assessment is unreliable. Solely using common assessments is unreliable - formative assessment has to take into account a complete picture of teacher-student activities.
You can't rely on the 100-point scale.
Marzano went through an interesting exercise with the entire crowd. He asked everyone to grade a hypothetical test consisting of 10 simple questions, 5 complex questions, and 2 higher order questions that go beyond what was taught in class. Everyone gave a grade to a student who got all the simple questions right, half of the complex questions right, and none of the higher order questions right. The results? The highest grade from the audience: an 83%. The lowest grade from the audience: a 20%. The bottom line: percents don't mean anything with regard to student knowledge! The exact same student with the exact same knowledge demonstration received final grades more than 60 percentage points apart. So if we can't use percents to determine knowledge, what can we use?
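To see how the same answers can produce such different percents, here is a minimal sketch in Python. The three weighting schemes are hypothetical (the audience's actual schemes were not reported), and half of the 5 complex questions is treated as 2.5 questions' worth of credit, which is also an assumption.

```python
# Illustrative only: these weighting schemes are assumptions, not the
# schemes Marzano's audience actually used.
QUESTIONS = (10, 5, 2)   # simple, complex, higher order questions on the test
CORRECT = (10, 2.5, 0)   # all simple, half of the complex, no higher order


def percent_score(weights, correct=CORRECT, questions=QUESTIONS):
    """Percent grade when each question type is worth `weights` points apiece."""
    earned = sum(w * c for w, c in zip(weights, correct))
    possible = sum(w * q for w, q in zip(weights, questions))
    return 100 * earned / possible


# Points per (simple, complex, higher order) question under each scheme.
SCHEMES = {
    "higher order questions count as ungraded bonus": (1, 1, 0),
    "every question worth the same": (1, 1, 1),
    "harder questions worth more": (1, 2, 5),
}

for name, weights in SCHEMES.items():
    print(f"{name}: {percent_score(weights):.0f}%")
```

With these made-up weights the same student lands at roughly 83%, 74%, and 50%. The percent reflects the grader's weighting as much as the student's knowledge.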
A new reliable rubric for grading skills
Marzano's proposed rubric uses a 4 point scale. Broken down into student friendly language, it looks like this:
- 4: I can do it better than the teacher taught me
- 3: I can do it exactly how the teacher taught me
- 2: I can do the simpler problems
- 1: I can do the problems with help
- 0: I can't do it even with help
The real power of this grading system is that the grade actually reflects the student's knowledge (as opposed to the meaningless percent grade). This is crucial for students to track their own learning. Marzano goes on to outline different ways to use this grading system for students to track their own learning. These approaches are outlined in his book (we have several copies if you are interested!). There is also a proposed conversion to the percent ranges used for letter grades, since we still must assign a summative score for report cards:
- 3.0 - 4.0: 90% - 100%
- 2.5 - 2.9: 80% - 89%
- 2.0 - 2.4: 70% - 79%
- 1.0 - 1.9: 60% - 69%
- 0.0 - 0.9: 0% - 59%
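The conversion can be sketched as a small lookup in Python. This is only an illustration: the function name is made up, and treating each band's lower bound as the cutoff is an assumption about how boundary scores are meant to be handled. It returns the percent range rather than a single percent, since the conversion does not say where inside a range a given score should fall.

```python
# Lower cutoff on the 0-4 scale mapped to a (low %, high %) band.
# Using the lower bound of each band as its cutoff is an assumption.
BANDS = [
    (3.0, (90, 100)),
    (2.5, (80, 89)),
    (2.0, (70, 79)),
    (1.0, (60, 69)),
    (0.0, (0, 59)),
]


def percent_band(score):
    """Return the (low, high) percent band for a 0-4 rubric score."""
    if not 0.0 <= score <= 4.0:
        raise ValueError(f"score must be between 0 and 4, got {score}")
    for cutoff, band in BANDS:
        if score >= cutoff:
            return band


for score in (4.0, 3.0, 2.5, 2.0, 1.0, 0.0):
    low, high = percent_band(score)
    print(f"{score:.1f} -> {low}% - {high}%")
```

Note that a 3 ("I can do it exactly how the teacher taught me") already falls in the top band, even though 3 out of 4 would be 75% if treated as a raw fraction.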
It was only a minor point in his keynote, but well worth noting. Marzano pointed out that our current educational climate consists of high stakes test after high stakes test - students are tested out. His proposal is to assess more, test less. His claim is that using the 4-point grading scale, you can use a 5 minute probing discussion to get a more reliable assessment of student knowledge than a 30 minute pencil and paper test. In this way you can actually get more information about student knowledge while simultaneously reducing the amount of testing. Win-win!
Here are some rubrics that use the 4-point scale.


Peter's thoughts:
This is huge. I can imagine some people will disagree, but at the heart of it Marzano is saying that the percent grades we give students are completely meaningless. A 90% does not mean a student understands the material. It could mean the test had a lot of easy questions. It could mean that the teacher values certain questions more than others - the percent grade is a reflection of the assessment and the teacher grading it as much as it is a reflection of student knowledge! If you believe this, then you have to fundamentally change the way you grade things!
I have recently tried to design my quizzes with the different levels in mind. The 4-point grading scale for skills seems to work perfectly for math! Overall, I was really surprised to find that many of the things Marzano brought up were things that I have been saying for a while - percents have no meaning, I can get a better assessment of my students through a 5 minute discussion than a 40 minute test, etc. I'm really going to try and push forward with some of these ideas for skills based grading and student self tracking!
This is something we should look at more in-depth as a staff. Had I not been at the Keynote, I don't think I would fully understand how skewed averages and percentages can be, and why it is so important to change our own grading scales. In order to change the scales, we would have to discuss how they would translate into PowerTeacher. I see this as being a school-wide initiative.
I think everyone should definitely consider the 0-4 grading scale, even if you don't plan on using it for your grading! I think it's worth sharing with each other what you think a level 2, 3, or 4 question looks like. Especially level 4 - going beyond what you taught could be anything from a simple synthesis of two concepts you taught to some much higher order skills - we need to have some consistency when we grade this way. Another reason we should all consider this approach is that it builds in scaffolding and SPED modifications. For students with an IEP, you can specifically outline the skills and levels they should be shooting for. Again, I don't think it's necessary for everyone to change their grading system to match this one, but I do think it's worthwhile to design assessments with this framework in mind and have discussions within departments as to what it looks like.
Lisa - changing PowerSchool to match this grading scheme is a tricky issue. Ideally I'd like to enter all my grades from 0 to 4, but the percents that PowerSchool computes won't match up with the percents outlined above. So I've converted the 0-4 scale to percents before entering grades. In other words, we'll still use percents in a summative way, but the 0-4 scale should be used formatively so students will understand where they need to go. It's not 100% ideal, but it can work for now.
I am concerned about grade inflation and Marzano's integer grading scale. Do scales like this inflate grades? How do you ensure that students who pass have enough understanding of the material?
I'm asking these questions because the integer rubric appears to be very forgiving for underachieving students.