The second wave of the AI and assessment crisis

Researchers Thomas Corbin, Sue Sharpe, and Phillip Dawson warn that wearable AI will trigger a second wave of the assessment crisis, as devices like smart glasses provide real-time cognitive assistance that cannot be easily detected or excluded from invigilated exams. The authors argue that unlike screen-based AI tools, which require users to shift attention away from tasks, wearable AI operates transparently within ongoing activity, making it phenomenologically invisible and impossible to police without invasive bodily scrutiny. With Meta selling 7 million smart glasses in 2025 alone and the technology rapidly expanding, the study concludes that traditional methods of ensuring assessment authenticity through physical exclusion of technology are no longer tenable.

In this paper (https://www.tandfonline.com/doi/full/10.1080/02602938.2026.2661367) Thomas Corbin, Sue Sharpe & Phillip Dawson suggest that wearable AI will bring a second wave of the assessment crisis. In the first wave, there has been a reliance on the idea that physical examination provides a backstop which can underwrite authenticity: “the physical exclusion of technology at the point of performance” (pg 1). They argue that wearable AI will make it vastly more difficult to enact that exclusion, because these devices can provide real-time cognitive assistance without external markers indicating they are being used for this purpose.

This is still a new field but it is rapidly growing. Meta sold 7 million smart glasses last year (https://www.theverge.com/tech/878184/meta-sold-7-million-smart-glasses-in-2025-thats-triple-2023-and-2024-combined), with signs suggesting growth is accelerating. Meta is just one manufacturer within a broader field of wearable AI that is receiving huge investment. So while someone might be able to spot Meta’s Ray-Ban glasses, it’s hard to imagine that every wearable device could be reliably spotted. There are also equity issues which arise from the fact these devices serve real assistive functions for many users: they are dual use in a way which precludes ethical exclusion.

The assumption that we would ratchet up oversight in order to prevent them being brought into invigilated spaces raises all manner of ethical, legal and political questions. As they put it, “A regime that extends scrutiny further than simply glasses must decide how far into the student’s embodied presentation it is willing to reach” (pg 13). A commitment to excluding these devices necessitates a form of “bodily adjudication” based on two conditions which are decreasingly tenable. From pg 12:

“First, invigilators must be able to identify which objects on a student’s person are relevant candidates for scrutiny. Secondly, they must then be able to determine whether those objects are AI-enabled or not. Under conditions of wearable AI, neither condition can be assumed. The issue is not simply that smart glasses may be difficult to distinguish from ordinary eyewear. Rather, it is that the relevant class of wearable technologies no longer maps neatly onto a small set of visibly exceptional devices.”

The deeper transition they are pointing to here involves a shift from AI as a discrete tool to one which is embedded in practice in a way that might not ultimately be separable. In this sense I think we can see inline automation tools such as Copilot 365 and Grammarly, which offer ambient assistance to users, as another vector of this transition. I thought this was really important on pg 6-7:

“Screen-based AI is structurally different. Consulting ChatGPT on a laptop or smartphone requires directing attention away from the task at hand, engaging with a separate interface, reading a response, and returning to the task. Even when this process becomes routine, it remains episodic. The tool cannot become phenomenologically transparent because the architecture of use requires repeated explicit engagement with a separate object. The user must turn to it, attend to it, and return from it. Smart glasses differ because they operate within, rather than alongside, ongoing activity. They have the architectural capacity to become phenomenologically transparent, to withdraw from user awareness and become part of the structure through [...]”

The episodic character of user-model interaction for chatbots is exactly what makes meta-cognition possible. They demand articulation, even if minimally, while also making the interaction itself available as an object of reflection that can inform that articulation. This is why it’s possible to use chatbots in an active way.
In contrast, inline automation tools insert themselves into the flow of activity in a manner which is intended to render this episodic experience unnecessary. This is literally baked into the metaphor of the Copilot. It’s possible to meta-reflect while you’re in flow, but I don’t think it’s possible for learners to do this: the space is crucial for developing these capabilities in the first place. For this reason I’d suggest we see the second wave of the assessment crisis as responding to three factors: (a) the declining burden of articulation in chatbots, (b) the parallel growth of inline automation tools, and (c) the rise of AI wearables.

This is how they describe the distinction between the first wave and the second wave. From pg 9:

“The first wave, exemplified by screen-based systems such as ChatGPT, created a crisis of practice within an intact institutional framework. Tasks had to be redesigned, expectations renegotiated, and academic integrity policies rewritten, but the basic shape of the problem remained familiar: students were using an external tool, that tool produced identifiable outputs, and institutions could still, with effort, separate students from the tool at particular assessment events. The first wave was a harder version of a problem assessment had encountered before.

“The second wave is different in three ways, and each of them matters independently. First, the property itself is structurally new. Screen-based AI is episodic by architecture. The user must turn to it, attend to it, read its response, and return. Even a heavily reliant user is engaging with the tool as a discrete object on discrete occasions. Wearable AI, as the previous sections have argued, has the structural conditions for incorporation. It does not function as a tool the user consults but as a capability the user inhabits. This is not a difference of degree. It is a difference in the kind of relationship a user can have with the technology, and it is not a difference any previous educational technology has had to confront at scale.”

Once AI use is no longer “external, episodic, and, at least in principle, distinguishable from the student’s own ongoing activity” (pg 10), assessment strategies built around exclusion become fundamentally untenable. It’s another argument supporting the notion that fundamental assessment reform has to happen, so we might as well get on with it. The problem is that I still don’t believe that processual assessment is adequately scalable (https://markcarrigan.net/2026/05/27/we-need-structural-changes-to-assessment-rather-than-discursive-changes/) within mass higher education. So the vice tightens 😬

This is what my book with Milan Sturmer is essentially about. The short-form version of the argument is that post-training has made chatbots vastly more able to infer user expectations without deliberate and expansive prompting. Therefore the user has to articulate themselves much less to get what they want.