PROF. ELLWANGER: Chatbot’s essay doesn’t make the grade: Artificial intelligence and the art of writing
Can AI produce quality college-level writing? To answer this question, I experimented with a popular chatbot: ChatGPT’s “text-davinci-003” model.
In a recent article for Campus Reform, I wrote about students’ woeful under-preparation for college-level writing. Now, a new trend threatens to compound that problem: chatbots powered by artificial intelligence can mimic academic essays. For those unfamiliar with this technology, chatbots are powerful language-generating computers that can respond to commands and produce writing that is largely indistinguishable from the kind assigned by professors.
Recent advances in this technology have won significant media attention. Artificial intelligence will soon pose various threats to higher education. Many students will be tempted to use computer assistance in completing homework, writing assignments, and exams. Relying on these tools will further inhibit their development as writers. As these technologies progress, it will also become more difficult for professors to identify (and prove) violations of academic honesty such as plagiarism and collusion. And beyond all these concerns lie thorny legal questions related to intellectual property – who “owns” a text created by a computer that the public can use for free?
Before assessing the challenges that AI poses to higher education, one question must be settled: Can AI produce quality college-level writing? To answer this question, I experimented with a popular chatbot: ChatGPT’s “text-davinci-003” model. To test the computer’s abilities, I gave it a writing assignment similar to what I give my students: a causal argument. Whereas I allow my human students to choose their own topics, I provided one for Davinci:
“Write me a college essay of about 1,500 words that argues that feminism causes women to be less happy, citing at least five secondary sources in MLA format.”
The first thing I learned was that Davinci is fast – very fast. It produced the entire essay in about five seconds. You can find it here. I graded Davinci’s essay as I do those written by my students. Every essay should be evaluated on two grounds: the quality of the writing and the strength of the argument.
Where the writing itself is concerned, Davinci exhibits a total mastery of English grammar and syntax. There are no sentence-level errors in the entire essay. Still, grammatical correctness isn’t the only measure of the quality of college writing: style is another factor.
I’m tempted to call Davinci’s style sophomoric, but I expect more sophistication from my sophomores. Davinci’s essay has an introductory paragraph with a thesis, followed by four “body” paragraphs (each of which focuses on a different way that feminism causes unhappiness for women), and a concluding paragraph. This makes the essay feel like a boxed cake mix from Duncan Hines. It’s totally formulaic: add water, eggs, oil, and presto! – college essay.
Davinci’s transitional sentences also show that the computer is just going through the motions. Almost every paragraph begins with a one- or two-word transition: “To begin,” “Furthermore,” “In addition,” “Finally,” and “Overall.” Thus, while many haters of college essays take issue with the soullessness of academic writing, Davinci takes it to new heights.
In terms of argumentation, Davinci doesn’t really make an argument. While Davinci does offer evidence that feminism might make women less happy, all of it is gleaned from other sources. In other words, Davinci doesn’t advance any arguments of its own – it merely recounts claims that it encountered in its research. What Davinci has really produced is a book report – not an essay that shows some evidence of critical thinking. The computer simply regurgitated others’ ideas – the hallmark of high-school writing.
Thus, I would give Davinci a B- in terms of the writing (clean and clear, but stylistically elementary), and a D or an F in terms of argumentation (because there really isn’t any).
Still, there is another issue that would earn a grade of F by default. Davinci doesn’t follow directions. My prompt asked for about 1,500 words, but the computer wrote fewer than 600. Further, the assignment required at least five secondary sources – but Davinci only used four. The essay quotes from a fifth, but closer inspection shows that this passage was merely quoted in one of the other four. I would give a student a minor penalty for this little trick. But coming in more than 900 words under the length requirement would justify an automatic F.
The abilities of chatbots like ChatGPT are impressive, but they aren’t yet advanced enough to serve as a total shortcut for students. They will be soon, though. The takeaways are clear. Professors must anticipate innovative ways to ensure that students do not come to rely on AI for “their” writing. For their part, students should know that the only way to become a better writer is to write. Ultimately, good writing reflects the soul of the writer… and Davinci doesn’t have one. Depending on AI won’t just deprive you of the chance to refine a critical skill in college – it will ensure that “your” writing is soulless and forgettable. No one wants to be boring. And as technology continues to revolutionize education, we must remember that the “A” in “AI” stands for artificial.