Adapting College Writing for the Age of Large Language Models such as ChatGPT: Some Next Steps for Educators
Large language models (LLMs) such as ChatGPT are sophisticated statistical models that predict probable word sequences in response to a prompt even though they do not “understand” language in any human-like sense. Through intensive mining, modeling, and memorization of vast stores of language data “scraped” from the internet, these text generators deliver a few paragraphs at a time that resemble writing authored by humans. This synthetic text is not directly “plagiarized” from some original, and it is usually grammatically and syntactically well-crafted.
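For readers curious about what “predicting probable word sequences” means in practice, the idea can be illustrated with a toy bigram model. This is a minimal sketch for intuition only: real LLMs are neural networks trained on vastly more data, and the tiny corpus and function names here are invented for illustration.

```python
from collections import Counter, defaultdict

# A tiny made-up corpus; a real model trains on billions of words.
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count how often each word follows each other word (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently observed after `word`."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

def generate(start, length=6):
    """Greedily extend a sequence one 'most probable' word at a time."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

print(generate("the"))
```

Even this trivial model produces fluent-looking local word sequences without any notion of meaning; scaling the same statistical principle up by many orders of magnitude is, loosely speaking, what makes LLM output read as human-like while still lacking understanding.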
From an academic integrity perspective, this means that “AI”-generated writing
1) is not easily identifiable as such to the unpracticed eye;
2) does not conform to “plagiarism” as that term is typically understood by teachers and students; and
3) encourages students to think of writing as task-specific labor disconnected from learning and the application of critical thinking.
Many teachers who assign writing are thus understandably concerned that students will use ChatGPT or other text generators to skip the learning and thinking around which their writing assignments are designed.
In the future, the producers of language models may offer tools for identifying texts generated by their systems. Such tools may arise independently, like a recent app that claims to identify ChatGPT’s outputs. It is also possible that government regulators and other policy-making bodies will become involved in overseeing the use of LLMs in educational settings. New York City schools have already banned ChatGPT, and many experts argue that the abuse of LLMs could extend far beyond the impact on students.
As a teacher and textbook author, one of us (Anna Mills) has been collecting multiple perspectives on the topic for the Writing Across the Curriculum Clearinghouse. Another of us (Lauren Goodlad) is the chair of the Critical AI @ Rutgers initiative and the editor of Critical AI. Though both of us feel strongly that unsupervised use of LLMs for student assignments is detrimental to learning, we believe that, in the short run, a combination of the practices described below will effectively discourage students from submitting machine-generated writing as their own. At the very least, any student determined to use text generation will encounter significant obstacles.
In the long run, we believe, teachers need to help students develop a critical awareness of generative machine models: how they work; why their content is often biased, false, or simplistic; and what their social, intellectual, and environmental implications might be. But that kind of preparation takes time, not least because journalism on this topic is often clickbait-driven, and “AI” discourse tends to be jargony, hype-laden, and conflated with science fiction. (We offer a few solid links below.) In the meantime, the following practices should help to protect academic integrity and student learning. At least some of these practices might also enrich your teaching.
Common practices that can be updated in the current context
- Encourage intrinsic motivation. Most educators already strive to make their assignments engaging, but it’s worth emphasizing that students who feel connected to their writing will be less interested in outsourcing their work to an automated process.
- Highlight how the writing process helps students learn. Make explicit that the goal of writing is neither a product nor a grade but, rather, a process that empowers critical thinking. Writing, reading, and research are entwined activities that help people to communicate more clearly, develop original thinking, evaluate claims, and form judgments.
- Update academic integrity policies to make them explicit about the use of automated writing tools. Academic integrity policies and honor codes should specify what, if any, use of automated writing assistance is appropriate (teachers may wish to consult departmental or institutional policies).
- Ask students to affirm that their submissions are their own work and not that of another person or of any automated system. This practice has long been used to deter plagiarism and can be adapted to include text generation. One of us asks students to add the following statement along with their initials when they turn in written work: “I certify that this assignment represents my own work. I have not used any unauthorized or unacknowledged assistance or sources in completing it, including free or commercial systems or services offered on the internet.”
Practices we recommend
- Let students know that detectors for identifying “AI”-generated text exist and are improving. While these tools should not be relied upon as a “silver bullet,” students should know that detection is possible. However, teachers who use these tools to check for potential policy violations should bear in mind that detectors produce both false positives and false negatives. The technology is also evolving rapidly: future detectors may be able to retroactively identify auto-generated prose submitted today. No one should present auto-generated writing as their own on the expectation that the deception is undiscoverable.
- Assign prompts that state-of-the-art systems such as ChatGPT are not good at. The tasks below are either impossible for ChatGPT to perform reliably or require students to supervise and edit in ways that demand significant expertise, rhetorical skill, and time. Students who simply want a good grade with minimal effort will therefore find such assignments difficult to automate. These requirements may also make assignments more robust in other ways.
- Requirement for verifiable sources and quotations. ChatGPT currently fabricates sources and quotations (though it may occasionally hit on a real author or title). GPT-3 is even more prone to this “hallucination.” Students using either model would need to find and input sources and quotations themselves.
- Analysis of specifics from images, audio, or videos. Students would need to describe these kinds of media in detail in order to generate automated outputs about them.
- Analysis of longer texts (too long for the limited prompt windows of automated systems). Although tools exist for summarizing longer texts, using them adds another obstacle to easy automation, and such programs may introduce errors that make automated text easier to detect.
- Analysis that draws on class discussion. Meeting this requirement obliges students to input notes from class discussion, which takes time and effort.
- Analysis of recent events not in the training data for the system. Students would need to do their own research and then feed that information into the automated system.
- Assignments that articulate nuanced relationships between ideas. Such assignments could entail comparing two passages that students themselves choose from two assigned texts. Students might be asked to explain a) why they chose these particular passages; b) how the chosen passages illuminate the whole of the texts from which they were excerpted; and then c) how the two passages compare according to instructions that bear on the course themes or content. LLMs usually cannot do a good job of explaining how a particular passage from a longer text illuminates the whole of that text, and ChatGPT’s outputs on comparison and contrast are often shallow. Typically the system breaks a logical comparison into bite-size pieces, conveys thin information about each piece, and then “compares” and “contrasts” in a formulaic, repetitive way.
- Assign in-class writing as a supplement to or launching point for take-home assignments. Students may be more likely to complete an assignment without automated assistance if they’ve gotten started through in-class writing. (Note: In-class writing, whether digital or handwritten, may have downsides for students with anxiety or certain disabilities.)
Additional practices you might wish to undertake
- Assign steps in the writing process and/or reflection on that process. Many instructors already include these practices in their approach to teaching writing. Note that ChatGPT can produce outputs that take the form of “brainstorms,” outlines, and drafts. It can also provide commentary in the style of peer review or self-analysis. Nonetheless, students would need to coordinate multiple submissions of automated work in order to complete this type of assignment with a text generator.
- Hold individual conferences on student writing or ask students to submit audio/video reflections on their writing. As we talk with students about their writing, or listen to them talk about it, we get a better sense of their thinking. By encouraging student engagement and building relationships, these activities could discourage reliance on automated tools.
- If you are curious about the technology, test your own writing assignments. Once you sign up for an account, it is straightforward to test an assignment using ChatGPT by feeding in the instructions and other required information (such as a short text). Users of these models can learn to improve their outputs by prompting the model to add or revise. (Anna has compiled a set of examples).
Practices you might wish to undertake once you learn more about language models
- Teach students about text generators and “AI.” Students are more likely to misuse text generators if they trust them too much. The term “Artificial Intelligence” (“AI”) has become a marketing tool for hyping products. For all their impressiveness, these systems are not intelligent in the conventional sense of that term. They are elaborate statistical models that rely on massive troves of data, much of it scraped indiscriminately from the web and used without its creators’ knowledge or consent. No matter how seemingly magical, text generators do not understand language (or anything else) the way that humans do. For the same reason, LLMs often mimic the harmful prejudices, misconceptions, and biases found in data scraped from the internet. (See the resources below for some preliminary introductions to the subject.)
- Show students examples of inaccuracies, biases, and logical and stylistic problems in automated outputs. We can build students’ cognitive abilities by modeling and encouraging this kind of critique. Given that social media and the internet are full of bogus accounts using synthetic text, alerting students to the intrinsic problems of such writing could be beneficial. (See the “ChatGPT/LLM Errors Tracker,” maintained by Gary Marcus and Ernest Davis.) Note that teaching students about LLMs in this way differs from using ChatGPT as a means of teaching students how to write (a practice that some are recommending but which we regard as having limited value). Since ChatGPT is good at grammar and syntax but suffers from formulaic, derivative, or inaccurate content, it seems like a poor foundation for building students’ skills and may circumvent their independent thinking. The tool seems more beneficial for those who already have a lot of experience writing, not for those learning how to develop ideas, organize thinking, support propositions with evidence, conduct independent research, and so on.
- Join in discussions with fellow teachers, policymakers, and industry to explore regulations and tools that will help educators support student learning in the age of “AI”-text generators.
Practices we do not recommend
- Requiring handwritten submissions. Writing by hand is difficult for many students, especially students with certain disabilities and students who use voice-to-text software. To be sure, some studies show that students remember things better when they take notes with pen and paper rather than laptops or other devices. Writing by hand might be offered as an option for some assignments.
- Adopting surveillance tools. Some advocate software that records a student’s entire writing process; such systems have been shown to be biased and inaccurate. These for-profit surveillance tools are highly intrusive and (like language models themselves) potentially exploitative. They may also be ineffective and inequitable.
- Adopting systems trained to recognize specific student writing. In theory it is possible to determine the authorship of a text by training a system to recognize an individual student’s patterns of word choice and syntax. Like the surveillance systems above, such technology is prone to unreliability, exploitation, and abuse.
Further resources on text generators
Note: Good journalism on language models is surprisingly hard to find since the technology is so new and the hype is ubiquitous. Here are a few reliable short pieces.
- “ChatGPT Advice Academics Can Use Now” edited by Susan Dagostino, Inside Higher Ed, January 12, 2023
- “University students recruit AI to write essays for them. Now what?” by Katyanna Quach, The Register, December 27, 2022
- “How to spot AI-generated text” by Melissa Heikkilä, MIT Technology Review, December 19, 2022
- The Road to AI We Can Trust, Substack by Gary Marcus, a cognitive scientist and AI researcher who writes frequently and lucidly about the topic. See also Gary Marcus and Ernest Davis, “GPT-3, Bloviator: OpenAI’s Language Generator Has No Idea What It’s Talking About” (2020).
The academic article below is now a classic:
- “On the Dangers of Stochastic Parrots” by Emily M. Bender, Timnit Gebru, et al., FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, March 2021. Association for Computing Machinery, doi: 10.1145/3442188.
- A blog post derived from a Critical AI @ Rutgers workshop on the essay: it summarizes the key arguments, reprises the discussion, and includes links to video-recorded presentations by digital humanist Katherine Bode (ANU) and computer scientist and NLP researcher Matthew Stone (Rutgers).
A sample of resources on “AI” more generally (Updated 1/19)
- AI Now Institute, A New Lexicon.
- Emily M. Bender, “On NYT Magazine on AI” (distinguished University of Washington linguist responds blow-by-blow to a misleading New York Times feature).
- Ruha Benjamin, “Assessing Risk, Automating Racism,” Science (2019).
- Federico Bianchi et al., “Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale,” arXiv, 2022.
- Meredith Broussard, Artificial Unintelligence: How Computers Misunderstand the World, MIT Press (2018).
- Timnit Gebru, “Is Ethical AI Possible?” (podcast with co-author of “Stochastic Parrots” and director of the DAIR Institute).
- Andrew Hundt et al. “Robots Enact Malignant Stereotypes.” 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT 22). 743-756.
- Frank Pasquale, New Laws of Robotics: Defending Human Expertise in the Age of AI. Harvard UP, 2020.
- Billy Perrigo, “OpenAI Used Kenyan Workers on Less than $2 Per Hour to Make ChatGPT Less Toxic,” Time, January 18, 2023.
- Meredith Whittaker, “The Steep Cost of Capture” Interactions 28.6 (December 2021): 50-55.
To share your ideas or offer advice please feel free to comment below (the comments are moderated) or write to one or both authors.
Anna Mills: firstname.lastname@example.org
Lauren Goodlad: email@example.com
In addition, the Modern Language Association and the College Conference on Composition and Communication have convened a task force on this topic (Anna is a member). Anyone who would like to contact the task force can write to Paula Krebs (MLA director) or Holly Hassel (CCCC director).