ChatGPT and Bard are examples of large language models (LLMs), in which the computer responds in text to prompts provided by the user. Both were trained using unsupervised learning to predict the next word to output, given all the words in the prompt and the response so far. Some supervised training also occurs, with human trainers providing feedback on which of several alternative responses is better.
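The next-word prediction described above can be illustrated with a deliberately simplified sketch. A real LLM uses a neural network conditioned on the whole context; here a hand-written table of word-pair probabilities stands in for the model, purely to show the autoregressive loop of repeatedly choosing the most probable next word:

```python
# Toy illustration of autoregressive generation. The probability table below
# is invented for illustration; a real LLM learns these from a huge corpus
# and conditions on the entire context, not just the previous word.

BIGRAM_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "up": 0.1},
}

def next_word(prev):
    """Greedy decoding: return the most probable word following `prev`."""
    options = BIGRAM_PROBS.get(prev)
    if not options:
        return None  # no known continuation: stop generating
    return max(options, key=options.get)

def generate(prompt, max_words=5):
    """Extend the prompt one word at a time, as an LLM extends a response."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = next_word(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # → "the cat sat down"
```

In practice LLMs also sample among likely words rather than always taking the single most probable one, which is why the same prompt can yield different responses.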
Teachers’ use of LLMs
Although there has been no requirement and limited encouragement for them to do so, many teachers have already found ways of using LLMs to support them in their work. Uses focus on creating resources for teaching and reducing some of the administrative workload of their professional role. LLMs are convincing in the creation of text in response to tightly specified prompts. Teachers are well-skilled in constructing a stimulus to which pupils respond, and thus they are often adept at the ‘prompt engineering’ needed to obtain the most helpful response from an LLM. Some professional development may help with constructing such prompts, as well as providing a better understanding of the limitations and implications of this technology.
Teachers have found LLMs helpful in, for example, drafting ‘intent’ statements for school subjects and generating medium-term plans and individual lesson plans. LLMs have been useful in suggesting ways in which planning might be adapted to meet the needs of pupils with particular special educational needs or who would benefit from additional challenge. The use of LLMs to help reduce teachers’ workload is supported by DfE advice, and would potentially allow teachers to spend more time on personal interaction with pupils.
LLMs are helpful in creating texts for pupils, e.g. for a given reading age, or in summarising existing text. LLMs are also used to produce low-stakes assessment materials, such as multiple-choice quizzes or comprehension questions, and for producing model answers to exam questions.
LLMs can draft general and individual letters to parents. LLMs can be used to provide feedback on pupils’ written work and to generate individual school reports, although there are potential intellectual property and data protection issues here, and over-reliance on LLM-generated responses would compromise the ‘personal touch’.
Pupils’ use of LLMs
Pupils’ own use of LLMs is already extensive, particularly in secondary education, although little of this is happening in school. The terms and conditions of ChatGPT prohibit its use by under 13s, and Bard currently requires users to be 18 or older. Perplexity provides access to GPT through an API without registration and thus can be easily accessed by under 13s. My AI (powered by OpenAI’s GPT) in Snapchat is provided to all the platform’s users, with many of its features appearing designed to appeal to a younger user base. The roll-outs of Copilot across Office 365 and Bard for Google Workspace seem likely to increase the exposure of secondary school (and younger) pupils to LLMs.
Some uses seem entirely appropriate and in line with the support that pupils might otherwise expect, such as explaining or simplifying complex texts, suggesting ideas for essays or stories, suggesting improvements to the child’s work, quizzing the child on their understanding, or engaging in a Socratic dialogue on a particular topic. A well-motivated pupil could use LLMs effectively to support their own independent study and revision.
On the other hand, there is a strong temptation for many pupils to have the LLM do the work set for them: to answer the exam questions, to write the story or essay or to do the coursework. LLMs generate very plausible text, particularly if well-prompted, and it seems likely that many pupils will be able to get away with submitting work done by the LLM, without having learnt the material on which this work was set. Teachers should emphasise that work is set to help pupils learn, and that getting someone / something else to complete a task on their behalf won’t help with this.
Widespread access to increasingly sophisticated LLMs raises a number of ethical issues for educators and their pupils. DfE advice highlights that use of LLMs should not compromise safeguarding, and the Data Protection Act places limits on what can be done with personal data, particularly for minors. The 18+ age limit for Bard seems wise, even if it will not always be adhered to, but few parents, teachers or teenagers seem equipped to consider the implications of interactions with My AI in Snapchat.
What use of LLMs constitutes cheating? Discussions with pupils suggest that using LLMs in a way that goes beyond the support that teachers, tutors, parents or peers would offer would cross an ethical line. Similarly, whilst few would argue against LLMs generating lesson plans or assessment materials, teachers’ use of LLMs to provide summative feedback, or to award examination grades, might raise legal and ethical issues.
Until recently, OpenAI (GPT’s developer) was at least semi-transparent about the training corpus and methods used, but how should educators, the wider population and regulators best ensure that the values expressed by LLMs are unbiased and in line with widely accepted ethical standards? It is easy to imagine how bias in the training process or corpus could subtly shift the values expressed in LLM-generated texts, and thus attributed to, or even adopted by, those using them.
More philosophically, whilst we might argue that the LLM has no ‘understanding’ of material, it is hard to say what features of its responses show that such understanding is lacking. We assess a pupil’s understanding by their ability to answer questions through language, which LLMs are particularly suited to do.
Wider educational implications
As yet, we haven’t seen extensive integration of LLMs into education-specific platforms, but such application of now widely available technology seems likely in the near future: the adaptive presentation of content and assessment has previously been limited to costly individual tutoring but will soon be accessible to all. There is potential here for the role of the teacher to shift from knowledge transmission and assessment towards facilitation and mentoring. I think it unlikely that teachers will be replaced by LLM-powered tutoring systems just yet, due to the vital motivational and non-cognitive roles that teachers fulfil: school teaches pupils how to be grownups, and other grownups seem better at this than computers.
Assessment currently plays a dominant role in driving learning and teaching in school. The prioritisation of assessment reliability and widespread access to LLMs mean that, at least in the short to medium term, we are unlikely to see more extensive use of non-exam assessment (e.g. projects and portfolios), and where these are used, stringent supervision seems likely. Alternative, possibly more valid, assessments might be reformulated to take into account the use of LLMs, such as through projects and portfolios!
It’s early days for LLMs, but it seems likely that this technology, and subsequent developments in AI, will have a profound effect on many jobs, raising the question of how best to prepare pupils for a very different future. Some understanding of how to use AI, how AI works, and its ethical implications seem essential for this, but perhaps more extensive curriculum change to prioritise affective, compassionate and inspirational aspects over knowledge may be the appropriate response.
Despite the very recent introduction of ChatGPT and Bard, we are already seeing these tools used effectively by teachers and pupils. Regulators and awarding organisations are concerned to preserve the reliability of assessment, which in the short term may place limits on the use of these tools. The curriculum should include some understanding of how LLMs work and how they can be used effectively and ethically. Any future use in or for school should seek to enhance teaching and learning, rather than replacing human interaction.
Originally written as a discussion paper for the Royal Society’s Education Committee