Assessing Work in which Students Use AI
The reading below emphasizes assessing the students' process over their final deliverables. Read more to see why this is important when teaching with AI and how to implement this type of assessment.
For the reading below, we've included the estimated reading time, the resource citation, a main idea quote pulled from the resource, and key questions to consider as you think about why and how to implement these technologies.
ChatGPT is old news: How do we assess in the age of AI writing co-pilots?
Reading Time: 17 minutes
Liu, D., & Bridgeman, A. (2023, June 8). ChatGPT is old news: How do we assess in the age of AI writing co-pilots? Teaching@Sydney.
Main Idea Quote:
The next big thing on the generative AI front that we need to pay urgent attention to are AI writing co-pilots that will be directly embedded into productivity suites like Microsoft Office. These “co-pilots” are AI-powered assistants designed to assist you to generate content...
Jason Lodge from the University of Queensland and colleagues, and Michael Webb from Jisc UK’s National Centre for AI, have written about the main options regarding assessments. Webb writes that we can either avoid it, try and outrun it, or adapt to AI...
Outrunning it involves trying to design assessments that AI has more difficulty completing – but the risks are that our redesigns will only be temporarily effective as the pace of AI development accelerates and will make the assessment more inequitable for many of our students. Modifying stimulus (e.g. using images in questions) has been suggested, but GPT-4’s ability to parse images will be released to the public. Modifying the content of assessment is also a popular suggestion, such as connecting with personal events, writing reflections, or linking to class material – but recent work has shown that GPT writes higher-quality reflections than humans...
Adapting to it means we need to rethink how we assess – as Lodge writes, this is a more effective, longer-term solution, but also much harder...
Much of the early dialogue around generative AI has been around its (mis)use in the bottom right quadrant [of the image] – high AI contribution and low human contribution. As generative AI becomes more inescapable, we need to consider where along the top half of the diagram each of our assessments sit – and they will necessarily sit at different places, depending on the year level, learning outcomes, and other factors...
An integral part of the suggested model... is to be able to evaluate the process of making something with AI. Some of the following ideas may be useful criteria to include in the marking rubric, alongside other criteria. Ensure that the criteria that you include in your rubric align with your learning outcomes...
- AI prompt design that demonstrates disciplinary expertise: how thoughtfully the student has designed the prompt(s) for AI and considered the complexity and clarity of prompts. High distinction could include a well-structured and deep understanding of disciplinary concepts demonstrated through effective prompt design.
- Critical evaluation of AI suggestions: how effectively the student evaluates and utilizes AI suggestions, as in whether they simply adopt AI-generated content or make conscious choices about what to include. High distinction could include critically making nuanced and evidence-based decisions about what to accept, modify, or reject.
- Revision process: how the student has revised AI suggestions and demonstrated their critical thinking skills and disciplinary expertise. High distinction could include an insightful and critical reflection on where AI-generated content needed improvement, and why. It could also include a demonstrably significant improvement in the quality of the work.
- Information and digital literacy: how the student has evaluated AI-generated content through relevant scholarly sources to enhance the rigor and reliability of the output. High distinction could include the integration of high-quality sources that are appropriately critiqued.
- Documentation and reflection on the co-creation process: how the student has recorded appropriate decisions and interactions with the AI co-pilot, and analyzed the strengths, weaknesses, and future improvements to these interactions. High distinction could include a clear and ethical articulation of decisions and their reasoning, deep insight into the role of AI in the co-creation process, and suggestions for future practice.
- Ethical considerations: students’ awareness of the reliability, biases, and other limitations of AI-generated content. High distinction could indicate a strong understanding of these issues, with suggestions for mitigating potential problems.
Key Questions to Consider
- What impacts might AI co-pilots have on your discipline?
- What has been your response to AI (avoid, outrun, or adapt)?
- How is AI being used in your field? How can you use this to inform your teaching?
- How might you use the four-quadrant model to design your assessments? Where would your current assignments fit within those quadrants?
- What authentic tasks could your students co-create using AI? What can you do to assess the process rather than the outcome?
Sample Rubrics
Review the sample rubrics below for more ideas on how to assess student work that involves generative AI.