How can student work be used to strengthen assessments?
By Lauren Stoll and Jill Wertheim | October 4, 2023
In our first blog post, Student Work is Gold, we wrote about how student work from assessments can become the key to unlocking instruction that is truly responsive to students and advances their three-dimensional learning. But student work can also be used to make the assessment itself a better and more equitable measure of students’ learning, in turn providing deeper insights for instruction.
We all want to construct challenging, meaningful assessments that provide the support students need to show what they know and can do. So, when we initially draft assessments, we use our best professional judgment to meet that goal. But how do we actually know whether a draft meets it? This is where the power of student work again comes into play. Students' responses ultimately tell us what an assessment's strengths and weaknesses are and how to improve it.
This year, we partnered with Vivayic to support six development teams as they piloted six performance assessments built around agricultural phenomena and analyzed the resulting student data to strengthen them. When analyzing the student data, we used indicators based on our assessment design criteria to check empirically whether those criteria were met. Every team found substantial ways to revise its assessment against all three quality criteria. We'll walk through a few examples for each criterion to illustrate some common trends and how the teams addressed them.
Criterion 1: Assessment tasks require sense-making using the three dimensions. This criterion focuses on how students use the dimensions in service of sense-making about a phenomenon. A common strength was that all the assessments successfully engaged students in making sense of a phenomenon, and analyses indicated that students were motivated by working with the phenomena. One student who took the California Wildfires assessment wrote: “I liked the assessment because this is a real problem that we can fix if more people know about it. If more people know about it, we can help California with these wildfires.” Seeing many responses like this in the student feedback is a strong indicator that a majority of students find the phenomenon worthwhile and motivating.
However, while students were clearly engaging with phenomena they were interested in, student work for the original tasks also indicated that students were not always sense-making using the dimensions developers intended to assess. For example, in the Monday, Tuesday, Happy Graze assessment, the team found that there was very little evidence of the targeted engineering design concepts in student responses to any of the original prompts. To address this issue, they revised the task to include data that would create an opportunity for students to reason about design constraints for grazing systems. This revision will provide evidence of how students use constraints such as cost to evaluate different natural resource management strategies (Figure 1).
Figure 1. Data tables added to Monday, Tuesday, Happy Graze for students to use to calculate the impact of grazing strategies on ranchers’ profits.
Criterion 2: Assessment tasks are fair and equitable. When drafting an assessment, we can only guess how students will interpret what we write, so this criterion is often where we identify the greatest areas for improvement. In both the California Wildfires and Farming Systems original assessments, for example, student responses showed that students struggled to make connections between the data provided and the problem they were trying to solve. This clear trend prompted assessment developers to add more targeted scaffolds to help facilitate these connections. In Farming Systems, students are prompted to select a tilling method that enables food to be grown more sustainably. Figure 2 shows a table from the original version, where students record their ideas about factors that affect soil health. This table was revised to provide scaffolding that helps students consider the implications of each factor for their decision.
Figure 2. The original graphic organizer (left side) from the Farming Systems assessment was modified (right side) to scaffold how students considered the impact of each soil health factor with respect to the assessment phenomenon.
Analyzing student work often illuminated a need for more scaffolding, but there were also places where it indicated a need for more clarity. In the More Cheese, Please and Frog Fungus assessments, for example, the designers noticed numerous blank responses to some of their prompts. Blanks can mean a variety of things, so teacher observations and student feedback provide crucial additional insights for pinpointing the specific issue. In the More Cheese, Please assessment, students were supposed to compare two models of digestive systems to help them figure out why a person experiences lactose intolerance (Figure 3). The initial task provided the same image twice and expected students to interpret how the digestive systems would differ, leading to student feedback like this: “I didn’t like how Kim’s digestive system didn’t have a change in it. We could have seen the difference between a healthy intestine and a not so healthy intestine. I really hoped to see what the difference would look like and to learn more about the difference.” A majority of this team’s revisions focused on developing clearer, more differentiated models for students to gather information from, and on providing a side-by-side structure to facilitate their comparison.
Figure 3. Models from the More Cheese, Please assessment that were modified to show clear differentiations for students to gather information from and placed side-by-side to facilitate their comparison.
Criterion 3: Assessment tasks are coherent and meaningful. When developers of the Food Fermentation assessment analyzed their pilot data through the lens of this final quality criterion, two key themes emerged. First, this assessment surfaced a common issue with coherence: over-scaffolding. Students seemed to be writing similar things for multiple prompts in the assessment. Again, student feedback provided helpful insights, such as, “I felt as if the questions were repetitive, and I kept having to say the same thing about lactobacillus and all the other bacteria.” The prompts seemed necessary from the designers’ viewpoint to lead students through the storyline, but to the students, they were redundant. This insight helped developers make decisions about removing repetitive prompts (see Prompt 3 in Figure 4). Reducing the length of the assessment had the added benefit of making it more accessible to students.
The student work also indicated to developers that the storyline of the assessment wasn’t quite coherent to students — in this case, they didn’t really understand how to synthesize the disparate sets of data meaningfully into their argument. In the original culminating prompt, when students decide whether lactobacillus should be used to prevent food spoiling (see Prompt 4 in Figure 4), most students described the evidence requested, but they did not use reasoning to explain how each piece of evidence contributed to their decision. Thus, the designers clarified and simplified the problem throughout the assessment so it was explicit to students how each step built toward the culminating prompt, and they revised the final prompt to provide clearer guidance about the role each set of data could play in supporting the claim.
Figure 4. The culminating prompt from Food Fermentation was revised to remove unnecessary prompts and show students how each data set they analyzed during the assessment can be used to support reasoning about their claim.
The Magic of Student Work
By the end of the student work analysis process, it was clear to the assessment developers how essential this step is as part of the development process. One developer shared the following advice: “Always use actual student data if available. Don’t be afraid to revise, revise, revise.”
Centering the development cycle on what students actually show us about the strengths and weaknesses of an assessment is essential to transforming a good assessment into a high-quality one, ensuring that it is truly student centered. And the magic of student work reaches beyond the assessment itself. Revising tasks based on student work not only improves assessments; it also shapes teacher practice. As one educator noted, “I will be a better educator for having gone through this experience. I am also more confident in interpreting and using the NGSS.”
To see these free agriculture-focused performance assessments, please visit this task library.
For more support in using student work analysis to strengthen assessments, please enroll in our free EdX course that explains how to use this SCALE Science Tool.