Education

Teaching with AI

by José Antonio Bowen, C. Edward Watson

19 min read
7 key ideas

In Brief

AI can already write a passing essay—so what exactly are you still teaching? This practical guide shows educators how to redesign assignments, rubrics, and policies so that human thinking becomes the irreplaceable deliverable, not just a formality AI can bypass in seconds.

Key Ideas

1. Verify AI detector reliability before accusations

Run your own writing through an AI detector before accusing a student — the tools flag pre-ChatGPT peer-reviewed papers as '100% AI content', and false accusations cause documented mental health harm

2. Raise standards above freely available AI

Redesign your rubric so that work AI can produce at a C level is graded as an F — the standard for passing must now exceed what is freely available in seconds

3. Replace binary AI policies with disclosure spectrum

Replace the binary 'AI or no AI' policy with an Acknowledgment Exercise: students disclose their level of AI interaction on a spectrum, mirroring professional norms where acknowledging tools is integrity, not failure

4. Make human iteration the primary deliverable

Use the Process Assignment Template: have students generate AI output, critique it, iterate on the prompt, then explain in writing what human value they added — making process the deliverable rather than the final product

5. Structured prompts scaffold learning through questions

Build feedback prompts with five components — Role, Task, Goal, Relationship, Process — to force AI to scaffold learning through questions rather than just producing answers

6. Close AI literacy equity gap

Treat AI literacy as an equity priority: first-generation and non-native-speaking students are currently less likely to use AI and more likely to be falsely flagged by detectors — the gap will compound if left unaddressed

7. Frame prompt engineering as problem formation

Frame prompt engineering to students as problem formation — diagnosis, decomposition, reframing, constraint design — so they understand they are practicing the oldest and most transferable intellectual skill, not a soon-to-be-obsolete tech trick

Who Should Read This

Readers interested in Teaching and Artificial Intelligence, looking for practical insights they can apply to their own lives.

Teaching with AI: A Practical Guide to a New Era of Human Learning

By José Antonio Bowen & C. Edward Watson

12 min read

Why does it matter? Because AI already passed Harvard, and your rubric didn't notice.

Most faculty approached AI as a detection problem. Run the essay through GPTZero, flag the suspicious paragraphs, schedule the uncomfortable meeting. It is a reasonable response, except that it targets the symptom while the actual crisis quietly metastasizes beneath it. The real problem isn't that students are submitting AI-generated work. It's that a language model can now produce competent, organized, properly cited prose for free, in seconds, and most rubrics in use today would reward it with a B. When 'adequate' becomes effortless and ubiquitous, the entire architecture of how we define, assign, and grade academic work collapses, not at some future inflection point but right now, in courses running this semester. What the book offers instead is a map of what higher education has to become.

The Cheating Framing Is Wrong — This Is a Quality Crisis

A Harvard sophomore named Maya Bodnick ran an experiment in 2023. She took real assignments from her freshman courses, had ChatGPT write them all, then submitted the AI's work to her actual professors for regrading — telling them only that the essays might be hers or might be AI-generated. Every single one was ChatGPT's. The results came back: a 3.34 GPA. One TA grading a Conflict Resolution essay was particularly impressed, praising the 'concrete diagnoses' and 'prescriptive strategies' as 'specific, compelling, useful, and operational.' What he didn't know was that the conflict itself — a roommate supposedly using AI to cheat — had been entirely fabricated by the AI. The student, the moral dilemma, the emotional stakes: all invented. The only real thing was the A.

The instinct is to read this as a cheating story. It isn't. Notice what happened in the one case where a professor gave ChatGPT a C: the Freshman Expository Writing instructor dinged the essay because its claims were 'unmoored from analysis' and its key terms were never clearly defined. That professor accidentally wrote a diagnostic of how large language models actually work. These systems generate text by predicting what a clear, serviceable response looks like based on an enormous range of human writing. They're built to produce the statistical center — competent, organized, and ultimately shallow. The C was correct. The A was also correct, given what the rubric was actually measuring.

That's the problem educators are sitting with, and it has nothing to do with enforcement. If your rubric rewards a coherent thesis, supporting evidence, and proper citations, AI will satisfy those criteria every time, instantly and at no cost. The question isn't whether students will use it to cut corners. The question is why you're still giving points for work a probability engine produces on demand. Bowen and Watson put it bluntly: the days of hiring someone to do C-level work are over. AI has set a new floor, and any grading system that treats that floor as passing is quietly issuing a credential for work that costs nothing to produce. Fixing this requires deciding, specifically and publicly, what 'better than AI' actually looks like — and then grading only for that.

The detection instinct isn't dead yet — which is worth examining, because it's actively causing harm.

The Detector Is Already Flagging Your Own Papers as AI

A graduate student receives an alert on a Thursday morning: she may have violated her university's academic honesty policy. The meeting is set for Monday. Her professor had found one of her discussion board posts — a nonmajor writing with unusual sophistication — suspicious enough to run through two AI detectors. GPTZero gave the post a 50% probability of being AI-generated. CopyLeaks reported 100% AI content for all three of her posts. She knew she had written every word herself, but the tools said otherwise, and she had four days to figure out how to prove it.

So she ran the professor's own peer-reviewed abstract through the same tools. GPTZero gave it a 36% AI probability. CopyLeaks returned 100% AI content. The paper had been published ten months before ChatGPT existed.

At Monday's meeting, the professor dismissed the counterevidence without looking at it. He suggested, incorrectly, that his paper had probably been absorbed into some database — confusing how plagiarism detection works with how AI detection works. After an hour of argument, the complaint was withdrawn.

The tools didn't catch a cheater. They manufactured a case against an innocent student and wasted everyone's time doing it. And this wasn't a freak outcome — it was the detectors working as designed. Stanford researchers found that these same systems misclassified more than half of TOEFL essays written by non-native English speakers as AI-generated, because the limited vocabulary those writers use resembles the statistical patterns AI favors. The false-positive rate dropped from 61% to 11% only when students used AI to improve their word choice first. To avoid being accused of using AI, you have to use AI. That's the trap the system built.

Vanderbilt and Michigan State eventually disabled Turnitin's AI detection feature after the company revised its stated false-positive rate upward, from a promised 1% to a disclosed 4%. Four percent sounds small until you picture one in every twenty-five honest students sent to an honor court over work they actually did, and until you notice that the students most likely to be flagged are the ones writing in their second or third language, without access to the paid tools that make AI-generated text undetectable. Detection doesn't neutralize the inequity of AI cheating. It reproduces it, with consequences attached.
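The false-positive arithmetic is easy to check. The sketch below takes the 1% and 4% rates from the paragraph above and applies them to a hypothetical course of 200 honest submissions; the enrollment figure and the function names are assumptions made for illustration, not numbers from the book.

```python
def expected_false_flags(honest_submissions: int, false_positive_rate: float) -> float:
    """Expected number of honest submissions wrongly flagged as AI-written."""
    return honest_submissions * false_positive_rate


def one_in_n(false_positive_rate: float) -> int:
    """Express a rate as 'one in every N' honest submissions."""
    return round(1 / false_positive_rate)


# Hypothetical enrollment, chosen only for illustration.
HONEST_SUBMISSIONS = 200

for rate in (0.01, 0.04):
    flagged = expected_false_flags(HONEST_SUBMISSIONS, rate)
    print(
        f"{rate:.0%} rate: about {flagged:.0f} of {HONEST_SUBMISSIONS} honest "
        f"submissions flagged (one in every {one_in_n(rate)})"
    )
```

At the disclosed rate, the same course would expect roughly four times as many wrongly flagged students as at the promised rate, before any adjustment for the higher error rates reported for non-native English writers.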

Human Judgment Is Noisier Than You Think — And AI Knows It

Human judgment is the standard AI gets measured against — but that standard is far shakier than most people realize. Daniel Kahneman asked 828 CEOs how much two equally qualified insurance underwriters would diverge when pricing the same case. The executives guessed around 10%, meaning one underwriter might quote $9,500 where another quoted $10,500. The actual median difference was 55%. The same file, the same data, the same professional credentials — and your premium could land anywhere from $9,500 to $16,700 depending on which human happened to open the folder. Judges sentencing from identical case files handed down terms ranging from 30 days to 15 years, and their harshness tracked whether they were hungry, whether the weather was hot, whether their team had lost the previous weekend.
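The dollar figures in this paragraph can be reproduced with one line of arithmetic. The sketch below assumes divergence is measured as the gap between two quotes divided by their average; that definition is an assumption on our part, chosen because it recovers both the 10% guess and the 55% finding from the quoted premiums.

```python
def relative_divergence(quote_a: float, quote_b: float) -> float:
    """Gap between two quotes as a fraction of their average."""
    average = (quote_a + quote_b) / 2
    return abs(quote_a - quote_b) / average


# What the executives expected: $9,500 versus $10,500.
expected = relative_divergence(9_500, 10_500)

# What the paragraph reports as typical: $9,500 versus $16,700.
observed = relative_divergence(9_500, 16_700)

print(f"expected divergence: {expected:.0%}")
print(f"observed divergence: {observed:.0%}")
```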

Noise is the right word for this — not bias. Bias implies a consistent lean: some underwriters systematically favor certain demographics, some judges have a philosophy. Noise is random, structureless inconsistency that produces wildly different outcomes for people in identical situations. Research on essay grading shows the same pattern: studies have found that scores on identical papers can vary by a full letter grade depending on the reader, with some variance tied to factors as arbitrary as the order in which a stack was graded. When you imagine AI-assisted student work threatening some reliable human standard, the evidence suggests that standard was never as stable as it felt. AI produces consistent, average-quality output — what Bowen and Watson call C-level work — and that consistency, applied to a noisy system, is a correction as much as a threat. Whether we've been honest about how much human judgment was varying all along is the harder question.

The Most Creative Move Is the One You Were Taught Never to Make

AlphaZero was trained with no knowledge of human strategy at all. Its predecessor, AlphaGo, had studied thousands of games played by human masters, which meant it also absorbed human assumptions — including the consensus that certain moves were mistakes. AlphaZero was given only the rules and told to play itself, millions of times. The result was a system that defeated AlphaGo by inventing hundreds of strategies no human had ever tried, because no human had ever been permitted to try them. The move that stunned every watching professional during the World Championship match — a board placement all of them had been explicitly taught was wrong — was simply AlphaGo drifting toward territory human training had never walled off. AlphaZero lived in that territory permanently.

Human expertise is always a bundled package: you get the skill and the inhibition that comes with it. Knowing how to play Go means knowing what not to do. AI trained without human examples skips the second part entirely. That is AI's real creative edge, and it has nothing to do with processing speed.

The discomfort for educators — and the opportunity — is that this same dynamic applies to students right now. AI can generate the tenth idea, the hundredth, the move every rubric has implicitly discouraged. What it cannot do is decide which of those moves matters, or why, or whether the problem was worth solving in the first place. Formulating the right question, recognizing which unexpected idea has genuine value, knowing when 'wrong' is actually interesting — that's the liberal arts skill AI can't replicate. Prompt engineering isn't a workaround; it's a restatement of what critical judgment always was, made newly visible because the machine will cheerfully do everything else.

Prompt Engineering Is Just Problem Formation With a New Name

The World Economic Forum named 'prompt engineer' the top job of the future in 2023, which triggered the predictable academic anxiety: is higher education now supposed to train people to type better questions into a chatbox? The authors' answer is that it already does — it just called the skill something else.

The Bank of America case makes this concrete. When the bank asked how to attract more customers to savings accounts, every answer was a variation on more advertising. The real breakthrough came when a design team stopped accepting that frame and asked a different question: why do people avoid savings accounts in the first place? The answer was shame — many felt they didn't have enough money to justify one. That reframing produced 'Keep the Change,' a program that rounded up purchases and moved the difference automatically. The prompt was the innovation. The execution was secondary.

The same logic governs AI. A vague request produces confident mediocrity. A precisely reframed question — specifying task, format, voice, and context — pulls genuinely useful work out of the same system. Google DeepMind researchers found that telling an AI to 'take a deep breath and work through this step by step' measurably increased accuracy, which is a strange sentence to write about software, but the point stands: how you set up the problem changes what you get back.
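As a rough illustration of the difference between a vague request and a reframed one, the sketch below assembles a prompt from the four elements named above: task, format, voice, and context. The class name, field names, and the example wording are hypothetical; only the four-part breakdown and the savings-account reframing come from the text.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the four elements of a reframed request."""
    task: str
    output_format: str
    voice: str
    context: str

    def render(self) -> str:
        """Return the prompt as four labeled lines."""
        return "\n".join([
            f"Task: {self.task}",
            f"Format: {self.output_format}",
            f"Voice: {self.voice}",
            f"Context: {self.context}",
        ])


# The vague version: likely to produce "more advertising" answers.
vague_prompt = "How do we attract more customers to savings accounts?"

# The reframed version, built around why people avoid savings accounts.
reframed = PromptSpec(
    task="Propose three ways to make opening a savings account feel worthwhile.",
    output_format="A numbered list, one sentence per idea, each with its key assumption.",
    voice="Plain language for a non-specialist product team.",
    context="Many people avoid savings accounts because they feel they "
            "don't have enough money to justify one.",
)

print(reframed.render())
```

The design choice worth noticing is that the context line carries the diagnosis. The model is not asked to find the problem, only to work inside a frame a human has already chosen.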

Teaching students to diagnose problems before solving them, to question whether the obvious frame is the right one, to decompose complexity before reaching for answers — that's not a concession to AI. It's the oldest argument for a liberal education, newly impossible to ignore.

The Equity Trap: AI Helps the Students Who Need It Most — and Might Hurt Them Too

AI is a democratizing force for novice workers — and that may be exactly the problem. Studies of customer support agents and mid-level professionals found productivity gains of 14% to 40% when AI assistance was introduced, with the largest improvements concentrated among the least experienced workers. The same pattern appeared in healthcare counseling, legal support, and professional writing tasks. Heard one way, this sounds like a genuine equalizer: the person who used to struggle at the back of the pack now keeps pace with veterans. Heard another way, it's a warning.

The distinction that sharpens this tension comes from cognitive science. Some tools are complementary — they amplify your ability in ways that persist after the tool is gone. Arabic numerals work this way: the practice of using them builds numerical intuition you retain. Other tools are competitive — they substitute for a capability rather than building it, and their absence leaves you worse off than before. GPS is the clearest example. The more fluently you navigate by satellite, the less your internal sense of direction develops. The tool handles the task so completely that the underlying skill quietly atrophies.

Which type AI turns out to be is the question this book raises and declines to fully answer. If writing is genuinely how people discover and clarify what they think, then outsourcing the struggle to an AI doesn't just shortcut the assignment — it skips the cognitive work the assignment was designed to produce. The first-generation student who uses AI to polish an essay may out-compete a wealthier peer on the rubric while missing the reasoning practice the rubric was supposed to measure. Short-term, the gap narrows. Long-term, the cognitive muscle that narrows it for good may never form.

Ban C Work, Not AI: How to Raise the Floor Instead of Guarding the Door

The most direct lever you have isn't detection — it's your rubric. If your current rubric gives passing marks to work that is organized, coherent, and evidenced, you are grading for exactly what a large language model produces every time, for free. The fix is to reclassify that output as a failure and grade only for what sits above it.

Bowen and Watson propose a concrete mechanism: take whatever you currently accept as C-level work and move it to the 50% column — which means F. The reasoning is clean. A serviceable thesis, clear paragraph structure, and citations in the right format are now a commodity. Rewarding commodity output with a passing grade doesn't evaluate students; it evaluates whether they showed up. An adjusted rubric reserves credit for originality of argument, evidence of genuine analysis, and a distinctive voice — qualities that require a human to have actually thought about something. Grammar drops off the rubric almost entirely, because spellcheckers and AI handle it without effort.
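One way to picture the shifted rubric is as a scoring function in which the old C-level criteria can earn at most 50%. The sketch below is an illustration under stated assumptions: the point weights, the passing mark of 60, and all names are hypothetical, and only the criteria labels and the 50% ceiling come from the paragraph above.

```python
# Criteria that AI output satisfies reliably; together they cap at 50 points.
BASELINE_CRITERIA = (
    "serviceable thesis",
    "clear paragraph structure",
    "correctly formatted citations",
)
BASELINE_CEILING = 50

# Hypothetical weights for the qualities that earn credit above the floor.
ABOVE_BASELINE_WEIGHTS = {
    "originality of argument": 20,
    "genuine analysis": 20,
    "distinctive voice": 10,
}

PASSING_SCORE = 60  # hypothetical passing mark


def score(baseline_met: set, above_baseline: dict) -> float:
    """Score out of 100; above_baseline maps a criterion to a 0..1 rating."""
    met = sum(criterion in baseline_met for criterion in BASELINE_CRITERIA)
    earned = BASELINE_CEILING * met / len(BASELINE_CRITERIA)
    for criterion, weight in ABOVE_BASELINE_WEIGHTS.items():
        earned += weight * above_baseline.get(criterion, 0.0)
    return earned


# Competent, organized, cited, and nothing more: the old C.
ai_level = score(set(BASELINE_CRITERIA), {})

# The same baseline plus evidence of human thinking.
human_level = score(
    set(BASELINE_CRITERIA),
    {"originality of argument": 0.75, "genuine analysis": 0.5, "distinctive voice": 1.0},
)

print(ai_level, ai_level >= PASSING_SCORE)
print(human_level, human_level >= PASSING_SCORE)
```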

The policy layer that makes this workable is what Bowen and Watson call the Acknowledgment Exercise. Instead of asking students whether they used AI — a binary that invites a binary lie — you give them a range of statements to select from: 'I did all of this on my own,' 'I used AI to generate ideas that I then developed,' 'I used AI to draft sections that I then rewrote.' Students choose the one that fits and describe what they contributed. This mirrors how professional work actually gets disclosed: film credits, research acknowledgments, and company reports all document which tools and collaborators shaped the result. The point isn't surveillance. Deciding which statement is true requires a student to have a clear picture of what their own contribution was — which is the beginning of taking responsibility for quality.

That framing connects to Eaton's distinction: humans can hand off the prose to an AI, but they cannot hand off accountability for what the prose says. The standard shifts from 'did you write this' to 'do you own this' — a harder question, a better one, and one that forces the real conversation about what passing should mean. Better rubrics set the floor. But students still need feedback that helps them clear it, which is where the infrastructure around grading has to change too.

The AI Teaching Assistant Students Nominated for an Award

In 2016, Professor Ashok Goel at Georgia Tech quietly added a new teaching assistant to his online AI course — an AI named Jill Watson, built on IBM's platform and trained on thousands of prior student questions. Students liked her. One nominated her for a teaching award. Only 10% figured out she was a bot, and 50% concluded that a second, more personable bot was too friendly to be a real computer science TA. Their ability to detect the machine was worse than a coin flip.

By 2023, the experiment had evolved into something more interesting: a two-step process in which Jill Watson fact-checks the more conversational but hallucination-prone ChatGPT before students ever see its answers. Neither system solves the problem alone.

What this points toward is less about the technology than about the shape of good feedback. Bowen and Watson describe the ideal as a tennis net: objective, immediate, and specific. Human teachers manage two of those three on a good day. AI handles all three at 2 a.m. on a Tuesday, for every student simultaneously. Harvard's CS50 chatbot was built around exactly this — round-the-clock support designed to scaffold rather than solve, asking questions rather than handing over answers.

The prompt structure that makes this work is simple in principle and counterintuitive in practice. You specify a role — not just 'helpful tutor' but 'vicious debater who steelmans the weakest part of my argument' — and you tell the AI to ask one question at a time rather than rewrite anything. That second constraint is the one most instructors skip. Without it, students paste in a draft and get back a better draft, which is editing, not learning. First-generation students, who rarely have an expert at home running this kind of informal Socratic loop, gain the most when the AI is forced to hold back.
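A feedback prompt along these lines might be assembled as follows. This is a minimal sketch using the five components listed in the key ideas (Role, Task, Goal, Relationship, Process); the function name and the wording of each field are illustrative assumptions, not the authors' template.

```python
def build_feedback_prompt(role: str, task: str, goal: str,
                          relationship: str, process: str, draft: str) -> str:
    """Join the five labeled components, then append the student's draft."""
    sections = [
        ("Role", role),
        ("Task", task),
        ("Goal", goal),
        ("Relationship", relationship),
        ("Process", process),
    ]
    lines = [f"{label}: {text}" for label, text in sections]
    lines += ["", "Draft:", draft]
    return "\n".join(lines)


prompt = build_feedback_prompt(
    role="A vicious debater who steelmans the weakest part of my argument.",
    task="Find the claim in my draft that is least supported.",
    goal="Help me strengthen the argument myself.",
    relationship="You are my sparring partner, not my editor.",
    # The constraint most instructors skip: questions only, no rewriting.
    process="Ask one question at a time and wait for my answer. "
            "Do not rewrite any part of the draft.",
    draft="Standardized tests should be abolished because they are unfair.",
)

print(prompt)
```

The Process line is what keeps the exchange from collapsing into editing: without it, the most likely response to a pasted draft is a better draft.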

Design the Work So That Doing It Is the Point

Think of a physical therapy session. The exercise is the treatment — not a means to some separate end you'll reach once you're done with all those annoying repetitions. A patient who cheats by reducing the range of motion doesn't fail the session; they just don't heal. The work and the benefit are the same thing. That's the reframe educators need right now.

Most assignment redesign conversations start in the wrong place: how do we make tasks AI-proof? The better question is whether the task's value lives in the output or in the doing. When the answer is the doing, cheating becomes structurally pointless — not because it's harder to pull off, but because the deliverable is the struggle itself.

Bowen and Watson's Process Assignment Template makes this concrete. Instead of asking students to write an essay, the assignment asks them to generate several AI drafts, audit each one for what's missing or wrong, argue which version comes closest to good and why, then produce a human-edited result and explain — as if addressing a job interviewer — what value they personally added to the chain. That last question is the one that does the work. An AI can't answer 'what did you contribute here?' because it has no self to speak from. The student has to have actually thought something in order to respond. The transcript of that process becomes the evidence of learning, and submitting a clean AI essay with no trail simply fails to address what the assignment asked. The template also hits every motivational lever that gets humans to do hard cognitive work: finding what's wrong with AI-generated ideas requires domain knowledge, arguing for one version over another requires genuine evaluation, and explaining your contribution to a hypothetical employer makes the stakes real and personal.
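To make the structure of such a submission concrete, the sketch below models it as a record with one field per step and a check for missing parts. The class, field names, and the "at least two drafts" threshold are hypothetical; the steps themselves follow the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessSubmission:
    """Hypothetical record of a process assignment, one field per step."""
    ai_drafts: list = field(default_factory=list)   # several AI-generated drafts
    audits: list = field(default_factory=list)      # one critique per draft
    comparison: str = ""                            # which draft is closest to good, and why
    final_text: str = ""                            # the human-edited result
    contribution_statement: str = ""                # what value the student added

    def missing_parts(self) -> list:
        """List the steps of the process that the submission leaves out."""
        missing = []
        if len(self.ai_drafts) < 2:
            missing.append("several AI drafts")
        if not self.audits or len(self.audits) < len(self.ai_drafts):
            missing.append("an audit of each draft")
        if not self.comparison.strip():
            missing.append("an argument for the best draft")
        if not self.final_text.strip():
            missing.append("a human-edited result")
        if not self.contribution_statement.strip():
            missing.append("a statement of value added")
        return missing


# A clean essay with no trail addresses only one of the five steps.
clean_essay_only = ProcessSubmission(final_text="A polished essay.")
print(clean_essay_only.missing_parts())
```

A check like this does not judge quality. It only shows why a finished essay on its own does not answer what the assignment asked.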

This is what post-plagiarism assignment design looks like in practice: not a locked door, but a room where doing the work is what the work produces.

AI Literacy Is Now an Equity Issue, Not a Tech Issue

Who gets to learn how to work with AI? That question sounds like it belongs in a policy document, but the answer is already writing itself in the data. Early usage data shows that male, non-first-generation, White, and Asian students are currently the most likely to have used AI tools — which means the gap between students who graduate fluent in these systems and students who don't is tracking the same fault lines that structured educational inequality long before any chatbot existed. The students with the most to gain from AI — first-generation students, students who lack professional networks, students who've never had a mentor available at 2 a.m. — are precisely the ones least likely to arrive with this literacy already in place. If colleges treat AI as an optional enrichment skill, they hand that advantage to the students who would have been fine regardless.

The skill employers are rewarding right now is the ability to think alongside AI — to diagnose problems precisely enough to prompt well, to evaluate what comes back critically, to iterate toward something better than the machine's first attempt. That is not a technical credential. It is the liberal arts repackaged as a job requirement, and it is what every student, regardless of major, needs to practice before they leave. If graduates leave without it, the degree certifies less than it used to.

The institution that teaches this to everyone — not just the students who already know it — has a claim on continued relevance. The one that doesn't is quietly becoming optional.

The Question Higher Education Has to Answer Before AI Answers It First

Here is the uncomfortable truth the book leaves you with: everything higher education was supposed to develop — the ability to frame a problem precisely, question an obvious answer, recognize when something that looks right is actually shallow — turns out to be exactly what AI cannot replicate and employers now desperately need. The institutions still debating whether to allow ChatGPT are asking the wrong question. The ones that matter in ten years will be the ones that treated this moment as the clearest argument for rigorous thinking they'd ever been handed, and then taught it — deliberately, equitably, to every student, not just the ones who already arrived knowing how. AI didn't change what a good education was for. It just made the cost of skipping it impossible to ignore.

Notable Quotes

so you can do more of what AI can’t

Explore diet plans in a paper

write a paper about diet plans.

Frequently Asked Questions

What is 'Teaching with AI: A Practical Guide to a New Era of Human Learning' about?

"Teaching with AI" (2024) argues that educators must stop resisting AI and instead redesign courses around what it cannot replace: original thinking, critical judgment, and authentic process. The book gives faculty concrete tools to navigate this transition. Rather than implementing blanket bans, the authors propose that educators fundamentally reconsider assignment design, grading rubrics, and assessment practices to make human intellectual effort visible and irreplaceable. The guide emphasizes that AI literacy is an equity priority, as certain student populations are less likely to use AI and more vulnerable to false detection.

How should teachers respond to potential AI use in student work?

Replace the binary 'AI or no AI' policy with an Acknowledgment Exercise in which students disclose their level of AI interaction on a spectrum, mirroring professional norms where acknowledging tools is integrity, not failure. Before accusing a student of using AI, run your own writing through an AI detector first: these tools flag pre-ChatGPT peer-reviewed papers as '100% AI content', producing false positives that cause documented mental health harm. This approach treats AI literacy as a skill to develop, not a violation to punish.

How should grading rubrics be redesigned to account for AI?

Redesign your rubric so that work AI can produce at a C level is graded as an F: the standard for passing must now exceed what is freely available in seconds. This forces educators to define what constitutes human intellectual contribution beyond basic competence. By raising the bar for passing grades, you ensure that student work demonstrates original thinking and critical judgment rather than merely meeting minimum standards. This creates a clear distinction between acceptable AI-assisted work and work that demonstrates genuine learning and development of higher-order thinking skills.

What is the Process Assignment Template and how does it work?

The Process Assignment Template has students generate AI output, critique it, iterate on the prompt, then explain in writing what human value they added, making process the deliverable rather than the final product. This approach makes intellectual labor visible and measurable by requiring students to demonstrate their thinking at each stage. Students learn to engage with AI as a tool that requires human judgment and refinement. By focusing on the process of interaction with AI, educators can assess students' critical thinking, problem-solving abilities, and ability to judge and improve machine-generated outputs, skills that directly demonstrate human intellectual contribution.
