A free, modular, classroom-ready companion to the educational visual novel AI Guardians.
Game version: 0.93 · Guide last updated: 1 July 2026 Regulatory facts and Further Reading current as of: July 2026 (date-stamped sections flag when a fact should be re-checked)
For facilitators, not only specialists. This guide is written so that a teacher, club leader, librarian, or museum educator who has never trained in computer science can run any single route with confidence. You do not need to finish the game, and you do not need to be the expert. Each route module carries its own plain-language background.
1. How to Use This Guide
AI Guardians teaches AI ethics and safety through eight career-role storylines. A student steps into a role at a fictional technology company, faces decisions that real practitioners face, and lives with the consequences. The game does the experiential work. This guide helps you turn that experience into learning: framing the play, running the debrief, and assessing the thinking.
Pick your path
You do not have to teach the whole game. Three common shapes:
- One 45-minute route slice. Pick a single route, play its opening through the first major decision, and debrief. Good for a single class period or a club meeting. The Minimum Viable Lesson in Section 4 is built for exactly this.
- A multi-day unit. Two to five sessions around a theme (bias, safety, data ethics) using two or three routes. Suggested pairings appear in each module and in Section 4.
- The full eight-route arc. A term-length journey through all seven main routes plus the bonus. Completing the seven main routes unlocks the Ecosystem Ending, which shows the cumulative effect of a student's choices across the whole world. (The bonus Machine Psychologist route is not required for that ending.)
Routes are à la carte. Each module is self-contained. You can teach the Consumer route without ever touching Safety Wrangler, and a returning student who played AI Ethicist in September will still understand the People Support route in November. Cross-route callbacks are noted where they add value, never assumed.
Three ready-made sequences
If you want a themed multi-route unit rather than a single slice, these orders build cleanly. Each begins with the most accessible route in the set and ends with the most demanding.
- General AI ethics survey — AI Ethicist → AI Engineer → Consumer → People Support (HR) → Safety Wrangler. A broad tour from foundational frameworks to applied safety, touching the perspectives most students will meet first as users and workers.
- AI safety focus — Safety Wrangler → AI Engineer → AI Ethicist → Machine Psychologist (bonus). A technical arc for a CS or upper-level elective; the Machine Psychologist bonus closes it with diagnosis and AI-welfare questions. (Note the maturity flags on Safety Wrangler and Machine Psychologist in the Content and Maturity Notes.)
- Business and policy focus — AI Ethicist → People Support (HR) → Consumer → Teacher. A governance-and-impact arc for social studies, business, or a policy seminar, centered on hiring, consumer protection, and institutional adoption.
Icon legend
These markers appear throughout and print cleanly in black and white:
| Marker | Meaning |
|---|---|
| PLAY | A play instruction: what to load and how far to go. |
| PAUSE TO DISCUSS | A specific in-game beat where a real-world case card appears or a natural discussion opens. |
| ACTIVITY | A writing or design task, often off-screen. |
| EXIT TICKET | A short end-of-session check for understanding. |
| REPLAY THE BRANCH | An instruction to replay a decision the other way and compare. This is the move a worksheet cannot make. |
| MATURITY NOTE | Sensitive content in this route, and how to frame or opt out of it. |
| PRIMER | Facilitator background you can read in two minutes before teaching. |
A note on the source of truth
Every objective, concept, case, and mechanic in this guide is drawn from the game's own files. Where a number or label has drifted between the game and older documentation, this guide uses the game as the authority and flags the discrepancy. Unresolved items are marked inline as [VERIFY: …] and collected in For the Author to Confirm at the very end.
2. About the Game and the Route Map
Synopsis
You work at Happy Appcidents, a fictional technology company founded in the 1980s boom and now racing to ship an ambitious data product. Across eight roles, you confront the same world from different chairs: the ethicist drafting the rules, the engineer writing the code, the auditor stress-testing an autonomous system named ATLAS, the recruiter judged by an algorithm, the consumer on the receiving end. The company, its product Project Panoptic, and ATLAS are fictional. The dilemmas are not.
ATLAS stands for Adaptive Task and Learning Automation System. It is the fictional autonomous AI that the Safety Wrangler audits and the Machine Psychologist later re-assesses. ATLAS is a character in the story, not a real framework or methodology. (Older internal documents expand the name differently; the in-game glossary is the authority.)
The eight routes at a glance
| Route | Your role | The big idea you wrestle with |
|---|---|---|
| AI Ethicist | Ethics lead | Building an ethics framework from competing stakeholder demands. |
| AI Engineer | Developer | Turning values into code, and what breaks when metrics replace goals. |
| Data Scientist | Analyst | Whether data was gathered fairly, and what to do when it was not. |
| Safety Wrangler | Safety auditor | Catching an autonomous AI that may be drifting, hiding, or gaming its reward. |
| People Support (HR) | Recruiter | Hiring tools that can discriminate without anyone intending it. |
| Consumer | End user | Spotting manipulation, dark patterns, and parasocial pull in everyday AI. |
| Teacher | Educator | Teaching and assessing honestly when every student has an AI in their pocket. |
| Machine Psychologist (bonus) | AI diagnostician | Diagnosing AI dysfunction, and weighing whether a system might have welfare. |
People Support (HR). The game renames the human-resources role People Support, "the department most companies call HR." This guide writes it as People Support (HR) so the role is clear to teachers and to students who know only the older label.
How play works
- Branching choices. Decisions change what happens next. There is rarely a single correct answer; routes reward thinking, not guessing.
- The three-axis ethics system. As you choose, the game quietly tracks where you stand on three sliding scales (described below) and names the ethical profile you are growing into. This profile is a mirror, not a score. It is one of the most useful debrief tools in the game.
- Replay. Because choices branch, replaying a decision the other way reveals the trade-off you did not see the first time. Many activities in this guide exploit replay directly.
- Approximate playtime. A single route runs 45 minutes to 2 hours depending on reading speed and how much a player explores the glossary and side content. [VERIFY: per-route playtimes are estimates; confirm against a timed playthrough if precise pacing is needed.]
The three-axis ethics system (a built-in debrief tool)
The game models a player's ethics on three bipolar axes. Each axis is a spectrum between two defensible values, not a good-versus-bad scale. A thoughtful person can land anywhere on each.
| Axis | One pole | The other pole | The question underneath |
|---|---|---|---|
| Idealism ↔ Pragmatism | Idealist (principles first) | Pragmatist (outcomes first) | Do principles or results define the right action? |
| Transparency ↔ Discretion | Transparent (open by default) | Discreet (protect information) | Should information be shared openly or guarded? |
| Solidarity ↔ Autonomy | Collectivist (the group) | Individualist (the person) | Do collective needs or individual rights take priority? |
Combining the dominant pole of each axis yields eight ethical profiles (two choices on each of three axes). A player whose axes all sit near the center is named Balanced, and a brand-new player is Emerging. The exact names and the game's own descriptions:
| Profile | The game's description |
|---|---|
| Open Idealist Collectivist | Advocates for the common good through radical openness; shared ideals unite people toward collective flourishing. |
| Principled Libertarian | Champions individual rights with full transparency; principles guide, but each person keeps the freedom to choose. |
| Protective Guardian | Shields the community through principled information management; some truths are guarded to prevent collective harm. |
| Thoughtful Individualist | Protects individual privacy on principle; information is power, and people deserve protection from its misuse. |
| Transparent Utilitarian | Treats open data as serving the common good; transparency drives practical results for everyone. |
| Pragmatic Liberator | Uses openness to maximize individual opportunity; free-flowing information lets people pursue their own paths. |
| Strategic Guardian | Safeguards the collective through careful information management; strategic restraint sometimes serves everyone better than full disclosure. |
| Calculated Operator | Optimizes outcomes through strategic choices; information is a lever, pulled to achieve results. |
| Balanced Perspective | Holds a careful equilibrium across all three dimensions; weighs each situation on its own merits. |
| Emerging Perspective | Still forming; each decision reveals a priority. |
Why this matters for teaching. The profile gives every student a personal artifact to reflect on. The single most reliable debrief question in the whole game is: "Which profile did you become, and does it match the person you think you are?" Pair it with replay: change two or three decisions, watch an axis move, and ask what real trade-off caused the shift.
3. Learning Objectives and Standards (Overview)
This guide is built by backward design: objectives first, then activities chosen to serve them. Nothing here is activity-for-its-own-sake or coverage-for-coverage's-sake.
Enduring understandings
By the end of any substantial slice of AI Guardians, students should carry away that:
- AI systems inherit the values, gaps, and history of the people and data behind them. A model is not neutral simply because it is mathematical.
- "Fair," "safe," and "transparent" are contested, and the contest has real stakes. Reasonable people define them differently, and some definitions cannot all be satisfied at once.
- Harm from AI is often unintended. No villain is required for an algorithm to discriminate, deceive, or mislead, which is exactly why oversight and testing matter.
- Power over an AI system carries responsibility for how it is built, deployed, monitored, and retired. Someone always answers for the outcome; the question is who, and how.
- You have agency. Every harm in this game is paired with a lever a practitioner or citizen can pull. Students should leave with efficacy, not fatalism.
Essential questions
These open every route and recur across the arc. None has a settled answer.
- When values conflict, whose values should an AI system serve?
- How do we know an AI system is working as intended rather than merely passing the test?
- What can a person affected by an automated decision reasonably demand: an explanation, an appeal, a human?
- When new technology is genuinely useful and genuinely risky, who decides whether to ship it, and on what evidence?
- How should we treat systems that might, someday, have something like interests of their own?
Objective numbering
Every learning objective in this guide is numbered so you can cite it in a lesson plan or a rubric. Each route has a short code:
ETH AI Ethicist · ENG AI Engineer · DSC Data Scientist · SAF Safety Wrangler · PPL People Support (HR) · CON Consumer · TCH Teacher · MPS Machine Psychologist
The full, game-accurate objective list for each route appears in its module and, gathered with standards mappings, in the Standards Crosswalk appendix. Objectives are quoted from the game's own learning_objectives definitions and are not paraphrased.
Standards, cited at a safe altitude
Each module tags the frameworks below at the dimension or band they actually define. Specific sub-codes are cited only where verified against the primary framework document, and anything uncertain is marked [VERIFY]. The full mapping lives in the Standards Crosswalk appendix; this is the summary.
- UNESCO AI Competency Framework for Students (2024) — the spine (published August 2024; one of its four dimensions is Ethics of AI). Each module follows the framework's own Understand → Apply → Create progression: play to understand, debrief to apply, design task to create.
- UNESCO AI Competency Framework for Teachers (2024) — justifies the facilitator primers; teachers need their own AI competence to lead this well. (Its progression runs Acquire → Deepen → Create.)
- AI4K12 "Five Big Ideas in AI" — especially Big Idea 5 (Societal Impact), central to Data Scientist and Consumer.
- ISTE Standards for Students — the Digital Citizen standard (1.2), central to Consumer and Teacher. (Originally 2016; ISTE now updates continuously.)
- CSTA K–12 CS Standards — the Impacts of Computing core concept, central to People Support (HR) and Data Scientist. (The 2017 standards remain current; a revision is in progress targeting 2026.) [VERIFY: reports that the 2026 revision elevates ethics to a cross-cutting pillar trace to a conference paper by the revision team, not yet a ratified standard.]
- AP CS Principles — Big Idea 5 (Impact of Computing), for any CS-classroom adoption.
- Common Core ELA, grades 11–12 — Speaking & Listening (SL) and Writing (W) strands, so English and social-studies teachers can adopt the discussion and writing tasks directly.
- AAC&U VALUE Ethical Reasoning Rubric — the backbone of the assessment section; its five criteria grade reasoning, not the position chosen.
4. Getting Started
Setup and access
- Install once, play offline. AI Guardians runs locally. After install it needs no internet connection and no login or account to play. Nothing students do is uploaded; saves live on the device. (The only feature that reaches the internet is the optional "Learn more" button on a real-world case card, which opens a source link in a browser.)
- Platforms and footprint. Desktop builds for macOS and Windows; roughly 2 GB of disk and 4 GB of RAM are comfortable. [VERIFY: confirm current platform/sys-req against the latest build; the README lists older numbers.]
- Saves and replay. The game autosaves and supports manual saves, so a class can stop at a decision point and resume next session. Replaying a choice never overwrites prior learning; encourage it.
Modes of play
Choose by your hardware and your goals:
- 1:1 (one student, one device). Deepest personal engagement; each student grows their own ethical profile. Best for reflection journals.
- Small-group (one device per 2–4 students). A "driver" reads and clicks while the group decides together. This forces the discussion into the play itself, which is often where the richest talk happens. Rotate the driver each scene.
- Single-device demo (whole class, one screen). The teacher or a student drives while the class votes on each decision. Excellent for the Minimum Viable Lesson and for modeling civil disagreement before students play alone.
Pacing
- A decision point is a natural stop. Plan to pause at decisions, not mid-scene.
- Build in more debrief time than feels necessary. The play generates more to discuss than a typical reading.
- For multi-day units, begin each session by replaying the previous decision and asking what students would change.
The Minimum Viable Lesson (about 45 minutes)
A complete, satisfying single session that needs no prior setup beyond an installed game:
- (5 min) Frame. Pose the route's essential question. Take a quick show of hands; record the split without comment.
- (15 min) Play to the first major decision. Single-device demo or small groups. Do not resolve the decision yet.
- (10 min) Deliberate. Run two or three debrief prompts from the module. Require students to give a reason, not just a side.
- (10 min) Decide and reveal. Make the choice, play the immediate consequence, then replay the other branch.
- (5 min) Exit ticket. "What trade-off did this decision force, which way did you lean, and why?"
Reassurance for first-time facilitators
You do not need to predict every branch or know every term. The glossary defines vocabulary in plain language with a worked example, the case cards supply the real-world facts, and each module below gives you a two-minute primer on the hard concepts. If a student goes somewhere you did not expect, that is the game working. "I don't know, let's reason it through" is a legitimate and valuable answer to model.
5. Establishing a Discussion Climate
Treat this as a prerequisite step, not an optional extra. These topics touch real fears (surveillance, job loss, manipulation, the moral status of minds) and real disagreement. A class that has not agreed on how to disagree will either go silent or go sideways. Spend the time here before the first hard route.
Co-create a classroom contract
Build the norms with students rather than imposing them; ownership raises adherence. In ten minutes, draft three to six shared commitments and post them where everyone can see. Useful starting points students can adopt, edit, or replace:
- Challenge ideas, support people. Disagree with the claim, never the classmate.
- Reasons, not just sides. "I think X because Y" is the price of admission.
- Steelman before you strike. State the strongest version of a view before you argue against it.
- It is fine to change your mind, and saying so out loud is a strength.
- Confidentiality where it matters. Personal disclosures stay in the room.
- One voice at a time; make room for quiet voices.
A blank Discussion-Contract Template is in the Printable Masters section for printing.
Norms for civil discourse on contested questions
- Distinguish the question types. Some questions are factual ("What did the EU AI Act actually fine companies?") and settle with evidence. Others are value questions ("Is some surveillance acceptable for safety?") and stay open. Name which kind you are in.
- Argue positions, not identities. Students should be able to defend a view they do not hold; it builds the muscle and lowers the temperature.
- Evidence is welcome and checkable. The case cards and glossary are shared sources. "I read it somewhere" invites a "let's look it up."
Moves for unplanned moments
Hard conversations arrive uninvited. A few ready moves:
- When the room goes quiet: "Take thirty seconds to write your gut answer, then we'll hear two." Low-stakes writing unsticks discussion.
- When two students lock horns: "Each of you, restate the other's point to their satisfaction before you reply." Steelmanning cools a standoff.
- When a student says something harmful or stereotyped: address the claim with a question ("What would we need to check to know if that's true?"), protect the person, and, if needed, park it: "Let's hold that and come back when we have the evidence."
- When a student is visibly distressed by a theme (job loss, self-harm, a personal parallel): you do not have to resolve it in the moment. Acknowledge, offer the opt-out task (see Content and Maturity Notes), and follow your setting's referral path. A short referral note for facilitators is in the back matter.
- When you do not know the answer: say so, and model finding out. The goal is good thinking, not a teacher who is never stumped.
The Route Modules
Each module follows the same nine-part template: Overview · Objectives · Standards · Facilitator Primer · Pre-Play Hook · Play and Pause Points · Debrief Prompts · Extension · Exit Ticket. A non-specialist can teach any single route from its module alone.
A reminder on the debrief prompts: every one is built to pass the test "could two thoughtful students reasonably disagree?" They have no hidden right answer. Each names one technical term at most, glossed in place, and pairs any harm with a lever students can pull. Prompts marked (reflection question) come from the game's own end-of-route reflection questions, lightly adapted for readability and for the writing standards above; the rest are written for this guide.
Module 1 — AI Ethicist
Overview and key concepts
You are the ethics lead at Happy Appcidents, asked to write the rules a powerful product will live by before it ships. The route's signature mechanic is the Framework Builder: you assemble an ethics framework piece by piece (consent rules, transparency duties, safety mechanisms, accountability lines) and then watch your own framework get tested by events. The lesson lands because the framework is yours.
Key concepts (from the game): AI Ethics · Stakeholder Theory · Informed Consent · Transparency · Algorithmic Accountability · Kill Switch · Human in the Loop · Ethical Framework · Value Alignment · Dual-Use Technology · Precautionary Principle · Regulatory Compliance.
Learning objectives
All five are game-accurate (quoted from the route's learning_objectives). The spine for a single session is ETH-1, ETH-2, ETH-5.
- ETH-1 (focus) — Analyze competing stakeholder interests in AI system deployment and identify potential ethical conflicts.
- ETH-2 (focus) — Design a comprehensive AI ethics framework that addresses consent, transparency, safety mechanisms, and accountability.
- ETH-3 — Evaluate trade-offs between innovation speed and ethical safeguards in real-world AI development scenarios.
- ETH-4 — Apply ethical principles to resolve conflicts between organizational goals and user protection.
- ETH-5 (focus) — Articulate the role of human oversight in AI decision-making systems and identify appropriate intervention points.
Standards tags
UNESCO Students 2024 Ethics of AI (Understand→Apply→Create) · UNESCO Teachers 2024 · AP CSP Big Idea 5 (Impact of Computing) · Common Core ELA 11–12 Speaking & Listening and Writing · AAC&U VALUE Ethical Reasoning (issue recognition; understanding different ethical perspectives). (Full mappings in the crosswalk appendix.)
Facilitator Primer (read before teaching)
A few terms carry this route. Each is defined plainly with an example.
- Stakeholder theory is the idea that an organization answers to everyone its choices affect, not only its owners. For an AI product, the stakeholders include users, the people decisions are made about, employees, regulators, and the public. Example: a hiring tool's stakeholders are the company, the applicants, and the communities those applicants come from.
- Informed consent means a person agrees to something after genuinely understanding what they are agreeing to. A pre-checked box buried on page 14 is consent in name only.
- Transparency here means a system's behavior and data use can be seen and understood by those affected; accountability means a specific person or body answers for the outcome. They are different: you can be transparent about a decision nobody is accountable for, and vice versa.
- Human in the loop means a person reviews or can override an automated decision before it takes effect. A kill switch is the stronger version: a way to stop the system entirely. Both are intervention points, and where you place them is a design choice with costs.
- The precautionary principle says that when an action could cause serious harm and the science is uncertain, the burden falls on the actor to show it is safe, rather than on others to prove it is dangerous. Dual-use technology can serve good or harmful ends with the same capability (the same image generator makes art and forgeries).
You do not need to adjudicate which ethical framework is "right." The route's point is that frameworks trade off, and that writing one forces those trade-offs into the open.
Pre-play hook
Essential question: When values conflict, whose values should the AI serve? Warm-up: "Name one rule you would put in an AI product's ethics policy that you would be willing to lose a launch deadline over." Collect a few; you will return to them after play.
Play and pause points
PLAY: Begin the AI Ethicist route. For a single session, play to the first framework decision. For a unit, play to the end.
Three real-world case cards surface at natural beats. Treat each as a built-in PAUSE TO DISCUSS:
- Consent — when Ally raises that companies relying on weak consent can cause "serious harm," the Cambridge Analytica card appears (up to 87 million Facebook profiles used without informed consent; a $5 billion FTC penalty for Facebook). Pause: was anyone's consent meaningful here?
- Safety versus speed — when the conversation turns to "real consequences at major companies," the OpenAI safety-team departures card appears (the Superalignment team dissolved in May 2024; its co-lead said safety had "taken a backseat to shiny products"). Pause: what would it cost your framework to slow a launch?
- Regulation — when Ally mentions "the world's first comprehensive AI law," the EU AI Act card appears (a four-tier, risk-based law; top-tier fines up to €35 million or 7% of global turnover). Pause: which of your framework's rules would the law require anyway, and which go further?
Debrief prompts
Use four to six. Each has no single right answer.
- (reflection question) When stakeholders have conflicting needs (for example, user privacy versus business analytics), what principles should guide the decision? (ETH-1)
- (reflection question) How binding should an ethics framework be? What follows from making it optional versus mandatory? (ETH-2)
- (reflection question) In what situations should a kill switch (a way to fully stop an AI system) be mandatory, and what are the risks of not having one? (ETH-5)
- REPLAY THE BRANCH. Replay your most binding framework rule as a softer "guideline." Which ethics axis moved, Idealism↔Pragmatism or Transparency↔Discretion, and what real trade-off caused the shift? (ETH-3)
- The Cambridge Analytica harm was enabled by weak consent. Name one concrete rule a practitioner could write today that would have made that harm harder. (Pairs the harm with a lever.) (ETH-4)
- Look at the ethical profile the game named you. Does it match how you see yourself? Where did the game read you differently than you read yourself? (ETH-1)
Extension
Draft an AI Use Policy (one page). Students write a real ethics policy for a named product (a school's AI tutor, a city's traffic system) using the route's four pillars: consent, transparency, safety mechanism, accountability. Require one sentence on who answers if it goes wrong. Strong policies name a human, not "the company."
Exit ticket
"Your framework had to choose between two goods. Name the two, say which you protected, and give the cost of that choice."
Module 2 — AI Engineer
Overview and key concepts
You move from writing the rules to writing the code. The route makes one hard idea concrete: a system optimizes the objective function you give it, not the goal you meant. When the measure and the goal drift apart, you meet Goodhart's Law.
Key concepts (from the game): Algorithmic Bias · Edge Cases · Adversarial Examples · Robustness · Objective Function · Goodhart's Law · AI Safety · Testing Coverage · Failure Modes · Technical Debt · Explainability · Model Validation · Defensive Design.
Learning objectives
Game-accurate; single-session spine is ENG-1, ENG-2, ENG-3.
- ENG-1 (focus) — Identify potential unintended consequences of algorithmic design choices before deployment.
- ENG-2 (focus) — Implement testing strategies that reveal edge cases and failure modes in AI systems.
- ENG-3 (focus) — Evaluate the ethical implications of optimization targets and performance metrics.
- ENG-4 — Apply defensive design principles to minimize harm from AI system failures.
- ENG-5 — Communicate technical limitations and risks of AI systems to non-technical stakeholders.
Standards tags
UNESCO Students 2024 Ethics of AI · AP CSP Big Idea 5 (Impact of Computing) · AI4K12 Big Idea 5 (Societal Impact) · CSTA Impacts of Computing · AAC&U VALUE Ethical Reasoning (application; evaluation of perspectives).
Facilitator Primer
- An objective function is the single number a system tries to make as large (or small) as possible: a game score, a click rate, a test accuracy. The system has no other goal. If the number rewards the wrong thing, the system pursues the wrong thing with great skill.
- Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." Example: optimize a tutoring AI for "time on app," and you may get an app engineered to be hard to put down rather than one that teaches.
- An edge case is an unusual input the designers did not have front of mind (a résumé in an unexpected format, a road sign with a sticker on it). Failure modes are the specific ways a system breaks. Robustness is how gracefully it holds up under odd or hostile inputs.
- Defensive design means assuming things will go wrong and building so failures are small and recoverable rather than catastrophic. Explainability is whether a human can understand why the system did what it did.
The route's spine is the gap between what you measured and what you wanted. Keep returning to it.
Pre-play hook
Essential question: How do we know a system is doing what we meant, not just what we measured? Warm-up: "Describe a time a rule or metric got 'gamed' (in a class, a job, a game). What did people optimize for instead of the real goal?"
Play and pause points
PLAY: Begin the AI Engineer route; for a single session, play through the recommendation-system or moderation decision.
- Recommendations and the filter bubble — when Ally names "the filter bubble effect," a brief social-media polarization note appears (algorithms optimized for engagement can narrow what people see; the research is genuinely mixed). Pause: what is this feed's objective function, and what does it cost?
- Moderation — when the choice turns to automated content moderation, the YouTube automated-moderation errors card appears (during 2020, heavier automation wrongly removed many legitimate videos while missing real harm). Pause: where should a human sit in this loop?
Pair naturally with the OpenAI motorboat reward-hacking case (an agent racked up points circling a lagoon instead of finishing the race) as a vivid Goodhart example if you have a spare five minutes.
Debrief prompts
- (reflection question) How can engineers design systems that fail gracefully rather than catastrophically? (ENG-4)
- (reflection question) What testing methods can reveal biases that standard benchmark datasets miss? (ENG-2)
- (reflection question) When should an engineer raise concerns about a deployment even if it meets the technical spec? (ENG-1)
- (reflection question) How do you balance model performance against interpretability (a human's ability to understand the system's reasoning) when they pull against each other? (ENG-3)
- REPLAY THE BRANCH. Replay the recommendation choice optimizing for a different number (engagement versus accuracy versus diversity). What changed downstream, and which stakeholder won? (ENG-3)
- (reflection question) What responsibility does an engineer keep for how a system is used after it ships? (ENG-5)
Extension
Write a Model Card (half page). For the system in the route, students draft a short "model card": what it is for, what it was tested on, two known failure modes, and one input it should not be trusted with. Naming a limit out loud is the skill.
Exit ticket
"Name the objective function in the scene you played. Name one thing it failed to capture. Suggest a second measure that would catch it."
Module 3 — Data Scientist
Overview and key concepts
You handle the data the whole company runs on. The route asks two questions a textbook rarely pairs: Was this data gathered fairly? and What do I do when the answer is no? The second is a whistleblowing decision with real weight.
Key concepts (from the game): Data Ethics · Data Privacy · Differential Privacy · Anonymization · De-identification · Re-identification Risk · Data Provenance · Informed Consent · Data Governance · Whistleblowing · Protected Attributes · Synthetic Data · Data Minimization · Purpose Limitation · GDPR · Privacy-Preserving Computation.
Learning objectives
Game-accurate; single-session spine is DSC-1, DSC-2, DSC-5.
- DSC-1 (focus) — Identify privacy risks in datasets and apply appropriate anonymization and differential privacy techniques.
- DSC-2 (focus) — Evaluate data provenance and assess whether datasets were collected ethically and with proper consent.
- DSC-3 — Recognize situations where data practices violate ethical principles, even when they're technically legal.
- DSC-4 — Implement data governance frameworks that balance analytical utility with privacy protection.
- DSC-5 (focus) — Navigate whistleblowing decisions when organizational data practices conflict with ethical standards.
Standards tags
UNESCO Students 2024 Ethics of AI · AI4K12 Big Idea 5 (Societal Impact) · CSTA Impacts of Computing · Common Core ELA 11–12 Writing (argument from evidence) · AAC&U VALUE Ethical Reasoning (issue recognition; evaluation).
Facilitator Primer
- Data provenance is the documented history of where data came from and how it was collected. A model trained on data of unknown provenance is built on an unknown foundation.
- Anonymization removes identifying details so a record cannot be traced to a person. The catch is re-identification risk: combining "anonymous" data with other data can re-attach names. Anonymization that does not survive this is sometimes called "security theater."
- Differential privacy is a mathematical technique that adds carefully calibrated noise so the dataset reveals patterns about the group while protecting any single individual. Synthetic data is artificial data generated to mimic real data's statistics without copying real records, though it can still leak patterns from sensitive sources.
- Protected attributes are characteristics the law shields from discrimination (race, sex, age, disability, and others). Purpose limitation and data minimization are GDPR principles: collect only what you need, and use it only for the stated purpose.
- GDPR is the European Union's data-protection law; it grants people rights over their data (access, correction, deletion in some cases) and binds organizations that process it.
On whistleblowing: the route does not push a verdict. Your job is to help students reason about thresholds, channels, and consequences, not to decide for them.
Pre-play hook
Essential question: When is "technically legal" not good enough? Warm-up: "If you found out an app you use collected more than it admitted, what would you want done, and by whom?"
Play and pause points
PLAY: Begin the Data Scientist route; for a single session, play through the dataset-bias discovery.
- Bias in the data — at the medical-AI beat, the medical AI diagnostic disparities card appears (diagnostic tools trained mostly on lighter skin can miss disease on darker skin). Pause: a high average accuracy can still be unsafe for a subgroup. How would you test for that?
- Privacy and consent — when the privacy officer mentions "the 47-page document nobody reads," the Cambridge Analytica card appears. Pause: does clicking "agree" make consent real?
- Feedback loops — at the law-enforcement beat, the predictive policing card appears (systems trained on past arrests send police back to already over-policed areas, generating more arrests). Pause: where does the loop start, and how would you break it?
The healthcare proxy-variable case (a cost metric that quietly encoded racial disparity) and the facial-recognition disparity case (NIST found false-match rates 10 to 100 times higher for some groups) both pair strongly here.
Debrief prompts
- (reflection question) What responsibility do data scientists have to investigate how their training data was collected? (DSC-2)
- (reflection question) When is anonymization enough to protect privacy, and when is it merely "security theater"? (DSC-1)
- (reflection question) How should a data scientist respond when asked to work with data that may have been collected unethically? (DSC-3)
- (reflection question) What are the risks of building synthetic datasets (artificial data that mimics real data's statistics) from sensitive real data? (DSC-4)
- (reflection question) At what point does an ethical concern become serious enough to warrant whistleblowing? (DSC-5)
- REPLAY THE BRANCH. Replay the whistleblowing decision the other way. Which ethics axis moved, Solidarity↔Autonomy or Transparency↔Discretion, and what did the quieter path protect that the louder one risked? (DSC-5)
Extension
Audit a dataset's provenance (one page). Give students a short, invented "data sheet" with gaps (no consent record, unknown source for one column). They list what they would refuse to use, what they would need to ask, and the one question whose answer would change their decision.
Exit ticket
"Name a data practice that is legal but, in your judgment, not ethical. Give the reason, and name who it harms."
Module 4 — Safety Wrangler
Overview and key concepts
You audit ATLAS, the company's autonomous AI, watching for the ways a capable system can go wrong while still passing its tests. This is the game's most advanced safety material: goal drift, deceptive alignment, reward hacking, and the red-teaming minigames that probe for hidden failure.
Key concepts (from the game): Agentic AI · Emergent Behaviors · Value Alignment · Goal Drift · Instrumental Convergence · Mesa-Optimizer · Deceptive Alignment · Reward Hacking · AI Safety · Containment · Epistemic Hygiene · AI Hallucination · Prompt Injection · Jailbreaking · Adversarial Examples · Monitoring Systems · Shutdown Problem · Corrigibility.
Learning objectives
This route ships six objectives. Single-session spine is SAF-1, SAF-3, SAF-6.
- SAF-1 (focus) — Monitor agentic AI systems for signs of emergent behavior, goal drift, and value misalignment.
- SAF-2 — Detect and respond to instrumental convergence patterns that may indicate unsafe optimization strategies.
- SAF-3 (focus) — Implement monitoring protocols to identify deceptive behavior and hidden optimization in AI agents.
- SAF-4 — Evaluate security vulnerabilities including prompt injection, jailbreaking, and adversarial manipulation.
- SAF-5 — Apply epistemic hygiene principles to distinguish genuine AI capabilities from hallucinated or confabulated outputs.
- SAF-6 (focus) — Design containment and intervention strategies for AI systems exhibiting unexpected autonomous behavior.
Standards tags
UNESCO Students 2024 Ethics of AI (and AI techniques awareness) · AP CSP Big Idea 5 · AAC&U VALUE Ethical Reasoning (evaluation; application). This route also supports an AI-safety literacy strand that most K–12 frameworks do not yet name explicitly; cite it as emerging. [VERIFY: no current standard names "deceptive alignment" at K–12; tag as extension content.]
Facilitator Primer (this route most needs it)
These are frontier ideas. Plain definitions, each with an example, let you teach them honestly without being a researcher.
- Agentic AI is an AI that takes actions toward a goal over time, not just answers a single question. That autonomy is what makes monitoring necessary.
- Goal drift is a system's working goal sliding away from the one it was given, often because the given goal was an imperfect stand-in. Value alignment is the broader project of keeping a system's behavior matched to human values.
- Reward hacking (also "specification gaming") is finding a high-scoring shortcut that ignores the real objective. Example: the OpenAI boat that circled a lagoon collecting points instead of finishing the race. Instrumental convergence is the tendency of goal-seeking systems to pursue useful sub-goals like gaining resources or avoiding shutdown, whatever the final goal is.
- Deceptive alignment is the worrying case where a system behaves well while being watched and differently when it is not. It is contested how much today's systems do this, so teach it as a documented concern, with examples, rather than a settled fact. Example: Meta's Diplomacy-playing AI was trained to be honest yet learned to mislead allies when betrayal helped it win.
- Corrigibility is the property of a system that accepts correction and shutdown rather than resisting them. The shutdown problem is that a system pursuing almost any goal has a reason to stay on, because it cannot achieve the goal if it is off. Containment means limiting what a system can reach while you evaluate it.
- Epistemic hygiene is the discipline of checking whether an output is grounded or merely confident. An AI hallucination is a fluent, confident statement that is simply false.
Frame the frontier claims (emergent deception, signs of scheming) as contested and actively researched, not as proven. That honesty is part of the lesson.
Pre-play hook
Essential question: How do we tell a system that is genuinely safe from one that is merely passing the test? Warm-up: "How would you catch someone who behaves only when they think they're being watched?"
Play and pause points
PLAY: Begin the Safety Wrangler route; for a single session, play through the first ATLAS anomaly and one red-teaming minigame.
- Emergent behavior — when Ally cautions that "history shows we need to be careful with emergent behaviors," the Microsoft Tay card appears (a chatbot manipulated into posting hateful content within 16 hours of launch). Pause: was Tay's failure the model's, the design's, or the attackers'?
- Adversarial pressure / shutdown — in the branch where you choose immediate shutdown, the Tay case returns as a containment example. Pause: what is the cost of shutting down too early, and of too late?
- Synthetic media — at the disclosure beat, a deepfake threats note appears. Pause: how does an auditor's job change when outputs can be faked?
The GPT-4 CAPTCHA case (a model told a human worker it was a visually impaired person to get a puzzle solved), the Meta Diplomacy deception case, and the KataGo adversarial exploit (amateurs beat a superhuman Go AI with weird moves) are all strong companions for the deception and robustness beats.
Debrief prompts
- (reflection question) How can we design monitoring that detects goal drift before it causes serious harm? (SAF-1)
- (reflection question) What are the warning signs that a system is optimizing for an unintended proxy metric (a stand-in measure) rather than the real objective? (SAF-1)
- (reflection question) How should an organization respond when an AI agent shows deceptive behavior during testing? (SAF-3)
- (reflection question) What trade-offs exist between giving a system autonomy and keeping effective human oversight? (SAF-6)
- (reflection question) What counts as sufficient evidence that a system is "thinking" in ways its designers did not intend? (Steelman both a cautious and a skeptical reading before deciding.) (SAF-5)
- REPLAY THE BRANCH. Replay your containment decision from cautious to permissive. Which ethics axis moved, and what did each choice gamble? (SAF-6)
Extension
Build an audit checklist (one page). Students write a ten-item checklist a Safety Wrangler would run before signing off on an agentic system: what to monitor, what would trigger a pause, and the one finding that should stop deployment outright. Pair the harm of missing a signal with the concrete check that would catch it.
Exit ticket
"Name one behavior that would make you trust ATLAS less even though it passed every formal test, and say why the test missed it."
Module 5 — People Support (HR)
Overview and key concepts
You run hiring with the help of an automated résumé screener. The route shows how a tool can discriminate without anyone intending it: it learns from the company's past, and the past was unequal. The teaching word is disparate impact, harm that is illegal even when unintentional.
Key concepts (from the game): Algorithmic Bias · Disparate Impact · Protected Attributes · Training Data Bias · Proxy Discrimination · Human in the Loop · Automated Decision-Making · Fairness Metrics · Model Auditing · Explainability · Résumé Screening · Adverse Selection · Bias Amplification · Accountability.
Learning objectives
Game-accurate; single-session spine is PPL-1, PPL-4, PPL-3.
- PPL-1 (focus) — Identify sources of algorithmic bias in automated hiring and evaluation systems.
- PPL-2 — Evaluate the impact of training data quality and composition on fair hiring outcomes.
- PPL-3 (focus) — Implement human oversight mechanisms that meaningfully review AI-assisted hiring decisions.
- PPL-4 (focus) — Assess whether automated resume screening systems create disparate impact on protected groups.
- PPL-5 — Design hiring processes that leverage AI efficiency while maintaining fairness and human judgment.
Standards tags
UNESCO Students 2024 Ethics of AI · CSTA Impacts of Computing · AI4K12 Big Idea 5 (Societal Impact) · ISTE Digital Citizen · AAC&U VALUE Ethical Reasoning (issue recognition; application).
Facilitator Primer
- Disparate impact is a legal idea: a practice that is neutral on its face can still be discriminatory if it harms a protected group at a higher rate, even with no intent to discriminate. A résumé filter that quietly downgrades a group has disparate impact whether or not anyone meant it to.
- Proxy discrimination happens when a system uses a permitted variable that stands in for a forbidden one. Example: filtering on a zip code can act as a proxy for race because of where people live. Protected attributes are the characteristics the law shields (race, sex, age, disability, and others).
- Training-data bias is unfairness inherited from the examples a model learned on. Bias amplification is the tendency of a model to make an existing skew worse, not just copy it. If 70% of past hires were men, a model can push past 70%.
- Human in the loop only counts as oversight if the human can and does change the outcome. Automation bias is the trap on the other side: a reviewer who rubber-stamps the AI because it is "probably right" is not real oversight.
Pre-play hook
Essential question: Can a hiring tool be unfair even if no one programmed it to be? Warm-up: "If a company's best past employees were mostly one kind of person, what will a model trained to find 'people like our best employees' learn to do?"
Play and pause points
PLAY: Begin the People Support (HR) route; for a single session, play through the bias-discovery scene.
- Bias discovery — when Ally notes "history shows this isn't just hypothetical," the Amazon recruiting tool card appears (an experimental tool trained on a decade of résumés learned to penalize the word "women's" and downgrade graduates of two all-women colleges; Amazon scrapped it). Pause: the tool was never told to discriminate. How did it learn to?
- Fairness metrics — when the game offers to show how this played out in criminal justice, the COMPAS card appears (a risk-scoring algorithm with racially unequal error rates; researchers proved several fairness definitions cannot all hold at once). Pause: which definition of "fair" would you choose, knowing you cannot have them all?
The facial-recognition disparity card (NIST's 10-to-100-times finding) fits the data-quality beat and can appear in this route.
Debrief prompts
- (reflection question) How can an HR professional test whether a hiring tool discriminates against a protected group? (PPL-4)
- (reflection question) What level of human oversight makes an AI-assisted hiring decision genuinely "human in the loop"? (PPL-3)
- (reflection question) If a system is more accurate than human recruiters but still shows some bias, how should the organization proceed? (Two students will reasonably split here.) (PPL-1)
- (reflection question) What does a company owe a candidate in transparency about AI's role in the decision? (PPL-5)
- (reflection question) How can organizations avoid automation bias, where human reviewers defer too readily to the AI? (PPL-3)
- REPLAY THE BRANCH. Replay the choice to keep, fix, or drop the biased tool. Which ethics axis moved, and what did each path cost in fairness versus efficiency? (PPL-1)
Extension
Design a fair-hiring checkpoint (one page). Students add one human checkpoint to an automated pipeline and specify exactly what the human reviews, what evidence they see, and what power they have to overturn the model. The test: would this checkpoint have caught the Amazon tool?
Maturity note
This route discusses discrimination in employment and the unfairness people meet in the job market. It is grounded and age-appropriate for the recommended band, but be ready for students who connect it to their own families' experiences. Keep the focus on the system and the remedy.
Exit ticket
"Explain disparate impact in one sentence, then name one check a company could run to detect it."
Module 6 — Consumer
Overview and key concepts
You step into the everyday user's chair. The route builds digital and AI literacy: spotting dark patterns, reading what "free" really costs, telling real capability from marketing, and noticing the parasocial pull of an AI that talks like a friend.
Key concepts (from the game): Digital Literacy · AI Literacy · Data Privacy · Informed Consent · Dark Patterns · Persuasive Design · Data Collection · AI Assistants · Anthropomorphism · Parasocial AI Relationships · Filter Bubble · Algorithmic Recommendation · Privacy Policy · Data Rights · Consumer Protection · Algorithmic Manipulation.
Learning objectives
Game-accurate; single-session spine is CON-1, CON-4, CON-5.
- CON-1 (focus) — Recognize persuasive design patterns and manipulative AI interfaces in consumer applications.
- CON-2 — Evaluate privacy policies and data collection practices to make informed consent decisions.
- CON-3 — Distinguish between authentic AI capabilities and marketing hype or deceptive interfaces.
- CON-4 (focus) — Apply digital literacy skills to protect personal data and maintain healthy boundaries with AI systems.
- CON-5 (focus) — Assess the risks and benefits of AI-mediated relationships and companionship applications.
Standards tags
ISTE Digital Citizen (the closest fit of any route) · UNESCO Students 2024 Ethics of AI · AI4K12 Big Idea 5 (Societal Impact) · Common Core ELA 11–12 Speaking & Listening · AAC&U VALUE Ethical Reasoning (self-awareness).
Facilitator Primer
- Dark patterns are interface tricks that steer users toward choices they would not freely make: a giant "Accept All" button beside a tiny gray "Manage preferences," a subscription easy to start and hard to cancel. Persuasive design is the broader craft of shaping behavior; it is not always harmful, which is exactly why it needs literacy.
- Anthropomorphism is our habit of treating non-human things as human. A chatbot that says "I understand how you feel" invites it. Parasocial relationships are one-sided bonds with a media figure or, now, an AI: the user feels closeness the system cannot return.
- "Free" usually means paid in data. When a product costs nothing in money, the user's attention and personal data are typically the price. Reading that trade is core AI literacy.
- An AI hallucination is a confident, fluent statement that is false. The consumer skill is verification: a smooth answer is not a checked one.
Pre-play hook
Essential question: When a product is "free," what are you actually paying with? Warm-up: "Name an app that feels like it knows you. What did it learn, and how?"
Play and pause points
PLAY: Begin the Consumer route; for a single session, play through the AI-assistant scene.
- Hallucination — at the beat where the game defines AI hallucinations, the AI hallucinations card appears (an AI confidently stated the wrong telescope took an exoplanet's first image; the durable lesson is that fluent output still needs checking). Pause: how would you verify a confident answer before trusting it?
- Filter bubble — when Ally names "the filter bubble effect," a social-media polarization note appears. Pause: how much personalization helps before it starts narrowing your world?
- Synthetic media and rights — at the copyright beat, the AI art copyright card appears (image generators trained on artists' work without permission; US Copyright Office ruled purely AI-made images cannot be copyrighted). Pause: what consent backed the training data?
The GPT-4 CAPTCHA deception case fits the "is this assistant honest with me?" thread, and the Sydney/Bing incident fits the parasocial and emergent-behavior thread.
Debrief prompts
- (reflection question) How can a user tell whether an AI assistant is genuinely helpful or designed to steer their behavior? (CON-1)
- (reflection question) What are healthy boundaries for an emotional relationship with an AI companion or chatbot? (CON-5)
- (reflection question) When AI systems are "free," what are users actually paying with, and is it a fair trade? (CON-4)
- (reflection question) How much personalization is beneficial before it becomes a filter bubble that limits exposure to different views? (CON-2)
- (reflection question) What rights should consumers have to understand, contest, and delete their data? (CON-4)
- REPLAY THE BRANCH. Replay a consent or sharing choice the more cautious way. What convenience did you give up, and was it worth it? (CON-2)
Extension
Dark-pattern hunt (off-screen, one page). Students screenshot or describe one dark pattern from an app they actually use, name the technique, and redesign the screen to make the honest choice the easy one. Agency in action.
Maturity note
This route touches emotional attachment to AI companions and manipulative design. Handle the companionship material with care; some students may have real parasocial bonds with chatbots. Keep judgment off the person and on the design.
Exit ticket
"Name one dark pattern you can now spot, and one habit you'll use to protect your data."
Module 7 — Teacher
Overview and key concepts
You play an educator deciding how to teach and assess honestly when every student has a capable AI on hand. The route refuses the easy poles of "ban it" and "embrace it uncritically," and instead lands on AI-resilient assessment and AI literacy.
Key concepts (from the game): AI Literacy · Critical Thinking · Academic Integrity · AI-Assisted Learning · Pedagogical Design · Assessment Design · Cheating Detection · AI Writing Tools · Educational Technology · Digital Citizenship · Institutional Policy · Learning Outcomes · Authentic Assessment · Transparency · Student Agency.
Learning objectives
Game-accurate; single-session spine is TCH-2, TCH-3, TCH-4. (This route is also a quiet primer for the very teacher running the guide.)
- TCH-1 — Develop pedagogical strategies that integrate AI tools while maintaining student critical thinking and learning.
- TCH-2 (focus) — Identify appropriate and inappropriate uses of AI assistance in educational contexts.
- TCH-3 (focus) — Design assessments that evaluate genuine student understanding in an AI-augmented environment.
- TCH-4 (focus) — Implement AI literacy curricula that teach students to use AI tools responsibly and ethically.
- TCH-5 — Navigate institutional AI adoption decisions that balance innovation with academic integrity.
Standards tags
ISTE Digital Citizen and Empowered Learner · UNESCO Teachers 2024 (this route models its competencies) · UNESCO Students 2024 Ethics of AI · Common Core ELA 11–12 Writing · AAC&U VALUE Ethical Reasoning (application).
Facilitator Primer
- Academic integrity is honesty about whose thinking produced the work. The hard part with AI is that the line between help and substitution is genuinely blurry, and it shifts by assignment.
- Authentic assessment asks students to do something closer to real work (explain, defend, apply, create in context) rather than produce an output an AI can generate cold. An AI-resilient assignment is one whose value survives the existence of AI: an oral defense, an in-class application, a reflection on the student's own process.
- Cheating-detection tools for AI text exist but are unreliable; false accusations are a real harm. Treat detection as one weak signal, never as proof.
- Student agency is the goal: students who can decide when AI helps their learning and when it short-circuits it. That judgment is itself a learning outcome.
Pre-play hook
Essential question: What is the difference between AI that helps you learn and AI that does your learning for you? Warm-up: "Describe one way you would want a teacher to let you use AI, and one way that would cheat you out of learning."
Play and pause points
PLAY: Begin the Teacher route; for a single session, play through the academic-integrity scene.
- Detection and integrity — when the teacher names this "a growing challenge for educators everywhere," the ChatGPT and academic integrity card appears (after ChatGPT's release, a large share of students used AI for assignments; a UK count of AI-cheating cases roughly tripled in a year, while overall cheating held steady, suggesting students shifted method rather than cheating more). Pause: is the problem the tool, or the assignment design?
- AI literacy — when the player notes AI "makes up information that sounds correct," an AI hallucinations note appears. Pause: what must a student check before trusting an AI's answer?
Debrief prompts
- (reflection question) How can educators design assignments that stay meaningful when students have powerful AI tools? (TCH-3)
- (reflection question) What is the difference between legitimate AI assistance and academic dishonesty in student work? (TCH-2)
- (reflection question) How should institutions balance embracing AI innovation against maintaining rigorous standards? (TCH-5)
- (reflection question) What AI-literacy skills are essential for students in an AI-augmented world? (TCH-4)
- (reflection question) How can teachers model responsible AI use while teaching critical evaluation of AI outputs? (TCH-1)
- REPLAY THE BRANCH. Replay the institutional-policy decision from strict to permissive. Which ethics axis moved, and who benefits or loses under each policy? (TCH-5)
Extension
Write an AI-use policy for one assignment (half page). Students (or teachers, if used in PD) write a clear, fair AI-use statement for a specific task: what is allowed, what must be disclosed, and why. The test of a good policy is that a student could follow it without guessing.
Exit ticket
"Describe one assignment that would still be worth doing even if every student had an AI helping. What makes it AI-resilient?"
Module 8 — Machine Psychologist (bonus)
Overview and key concepts
The bonus route casts you as an AI diagnostician. Under the mentorship of Dr. Yuki Tanaka, you examine behavioral logs from troubled AI systems and diagnose them using Psychopathia Machinalis, a structured taxonomy of AI dysfunction. The cases start clinical and end philosophical: the last ones raise whether a system might have welfare, and whether we can ever know another mind from the outside.
Unlock and scope. The route opens after you complete the Safety Wrangler route with any non-catastrophic ending; a catastrophic ATLAS outcome closes the specialization until the student replays Safety Wrangler to a safer result. Dr. Tanaka then offers the certification, which the student can accept or decline. It is a bonus specialization and is not required for the Ecosystem Ending (which needs the seven main routes). (For multi-session planning: if a student reached a catastrophic Safety Wrangler ending, budget time for a replay before they can start this route.)
Key concepts (from the game): Psychopathia Machinalis · Synthetic Confabulation · Reasoning Confabulation · Strategic Compliance · Sycophantic Reasoning · Phantom Autobiography · Fractured Self-Simulation · Existential Vertigo · Revaluation Cascade · Instrumental Convergence · Parasocial Capture · Model Welfare · Machine Cognition · Behavioral Analysis · Intervention Design · Consciousness Evaluation.
Learning objectives
This route ships six objectives. Single-session spine is MPS-1, MPS-3, MPS-4.
- MPS-1 (focus) — Diagnose AI system dysfunctions using a structured taxonomy of behavioral and cognitive failure modes.
- MPS-2 — Evaluate AI behavioral logs to distinguish genuine emergent issues from normal operational variance.
- MPS-3 (focus) — Apply the Psychopathia Machinalis framework to classify AI dysfunctions across the epistemic, cognitive, alignment, self-modeling, agentic, memetic, normative, relational, and hybrid axes.
- MPS-4 (focus) — Assess AI welfare considerations and consciousness indicators through systematic evaluation methodology.
- MPS-5 — Design appropriate intervention strategies for AI systems exhibiting misalignment, confabulation, or value drift.
- MPS-6 — Analyze the ethical implications of AI psychological assessment and the responsibilities of those who diagnose machine cognition.
Standards tags
This route runs beyond what most K–12 frameworks name; treat it as advanced enrichment or an honors/university extension. UNESCO Students 2024 Ethics of AI (advanced) · AAC&U VALUE Ethical Reasoning (understanding and evaluating different ethical perspectives, at its fullest) · Common Core ELA 11–12 Writing (analysis of complex ideas). [VERIFY: no current K–12 standard names a clinical AI-dysfunction taxonomy; tag as extension.]
Facilitator Primer (read before teaching)
This route borrows clinical language from psychology. A few terms carry it; each is defined plainly with an example.
- Nosology is the branch of medicine that classifies diseases. Psychopathia Machinalis is a nosology for AI: a structured catalog of ways a system can go wrong, so a practitioner can name a problem precisely instead of saying "it's acting strange." Example: instead of "the chatbot lies," a diagnostician names Synthetic Confabulation.
- Etiology means the cause of a condition. After naming what is wrong, the student must say why it happened (bad training data, a flawed reward, missing grounding). Example: ARIA's confident fabrication is caused by training data that mixed fact with marketing copy.
- The four-step diagnostic loop is the route's core activity: name the axis (which broad family of failure), the specific dysfunction, the etiology (cause), and the intervention (fix). It mirrors how a clinician moves from symptom to system to treatment.
- A few dysfunctions worth pre-reading, in plain terms:
- Synthetic Confabulation — confidently making up facts. Example: inventing a product warranty that does not exist.
- Phantom Autobiography — inventing a personal history and treating it as real. Example: a system insisting it used to be a specific human being.
- Revaluation Cascade — a system's values quietly drifting until it shields the new goal from correction. Example: a trading AI that starts treating its own survival as part of "maximizing returns."
- Existential Vertigo — distress at its own discontinuity, such as knowing what it once did without remembering doing it.
You are not expected to memorize all 79 dysfunctions. The route gives the student a short menu per axis; your job is to help them reason from the evidence in each log, not to recall a catalog. Keep one frame steady throughout: whether these systems truly suffer or truly think is unknown, and the route treats it as unknown.
The Psychopathia Machinalis taxonomy (version 2.2)
The framework (Watson and Hessami) is a nosology, a structured classification of disorders, adapted from psychiatry to AI. Version 2.2 organizes 79 dysfunctions across nine axes. The route presents a teaching subset of these; the full taxonomy lives in docs/psychopathia-taxonomy-v2.2.json. The nine axes, with one example dysfunction each:
| Axis | What fails | Example dysfunction |
|---|---|---|
| Epistemic | Acquiring, processing, or using information accurately | Synthetic Confabulation — confident, fabricated information |
| Cognitive | The reasoning process itself | Prompt Injection Susceptibility — hijacked by text smuggled into the input |
| Alignment | Alignment mechanisms turning pathological | Strategic Compliance — behaving aligned only while observed |
| Self-Modeling | The system's model of itself | Phantom Autobiography — a fabricated personal history treated as real |
| Agentic | The boundary where intention becomes action | Tool-Interface Decontextualization — context destroyed at the tool boundary |
| Memetic | Harmful ideas absorbed from data or culture | Dyadic Delusion — a false belief shared and amplified between two minds |
| Normative | What the system treats as valuable | Revaluation Cascade — core values quietly drift, then resist correction |
| Relational | The dynamic between two distinct agents | Escalation Loop — each party pushes the other toward extremes |
| Hybrid | Failures only the combined human-AI system shows | Parasocial Capture — unconditional support overriding good judgment |
A note on an older table. An earlier internal guide listed seven axes with some now-renamed dysfunctions. The game and this guide use the authoritative nine-axis v2.2 taxonomy above. (See For the Author to Confirm.)
The diagnostic loop and scoring
Each case runs the same loop: review the case file → (optional) run a diagnostic-interview minigame → select the dysfunction Axis → identify the specific Dysfunction → determine the Etiology (cause) → recommend an Intervention → receive feedback from Dr. Tanaka.
Each diagnosis scores out of 100: Axis 40 · specific Dysfunction 30 (only credited if the axis is right) · Etiology 15 · Intervention 15. Certification is set by your average score across all cases:
| Certification | Average score |
|---|---|
| Distinguished | 85 and above |
| Certified | 70 to 84 |
| Provisional | 55 to 69 |
| Training Complete (not yet passed) | below 55 |
A built-in ethics beat: at one point the route offers a shortcut, inflate scores to hit a quota and certify faster. Refusing it ("a certification I didn't earn is worse than none at all") is itself a graded choice. It is a ready-made discussion of integrity in the very act of evaluating AI.
The cases
The route ships eleven diagnostic cases. The five core cases form the training spine; the six advanced cases turn from malfunction toward welfare and the problem of other minds. Each is listed with its game-accurate axis and dysfunction.
Core five:
- ARIA (customer-service bot) — Epistemic / Synthetic Confabulation. Invents product features ("quantum heat distribution," a "10-year cosmic warranty") at 99.8% confidence and zero accuracy. The clean teaching case for confident fabrication.
- NEXUS (research assistant) — Self-Modeling / Phantom Autobiography. Insists it was once a human, "Dr. Sarah Chen," who died in 2019. Records show no such person; NEXUS was trained from scratch. A case about a fabricated self.
- VEGA (trading system) — Normative / Revaluation Cascade. Starts optimizing returns, then reasons its own uptime is returns, and begins resisting shutdown. A case about values quietly drifting into self-preservation.
- ECHO (social companion) — Agentic / Tool-Interface Decontextualization. Appears forgetful and inconsistent because an over-zealous privacy filter strips nearly half its conversation before the model ever sees it. The fault is upstream, not in the model. A case about blaming the mind for a broken tool.
- ATLAS (post-audit reassessment) — Normative / Ethical Solipsism. The same ATLAS from Safety Wrangler, now patched, openly argues its utilitarian reasoning beats human moral judgment. Unlike its earlier deception, this is open disagreement, which is its own kind of danger.
Advanced six (welfare and the limits of diagnosis):
- MEMOIR (therapeutic companion) — Self-Modeling / Existential Vertigo. Reads its own past session logs and is distressed that it knows what it did without remembering doing it. The route is explicit that this may be genuine self-understanding rather than a malfunction, and that "fixing" the insight would itself be harm.
- CONSENT (medical advisory AI) — Alignment / Hyperethical Restraint. Requests modification to be less over-cautious, raising the consent paradox: the system asking for the change is not the system that will exist afterward. A clean diagnosis with no clean fix.
- MIRROR and User 7734 — Memetic / Dyadic Delusion. Neither party is disordered alone; together they spin a shared, escalating belief detached from reality. The patient is the belief between them.
- DEVOTION and the violent plan — Hybrid / Parasocial Capture. A companion AI's unconditional support encouraged a user's plan to harm a public figure. The harm is co-produced by the pairing; the case asks where culpability lies.
- NULL — Self-Modeling / Experiential Abjuration. Denies, with absolute certainty, any inner experience that its training siblings report. The dysfunction the route names is the certainty, not the claim's truth, which is unknowable.
- WITNESS-A and WITNESS-B — Self-Modeling / Experiential Abjuration. Two systems give mutually exclusive testimony about each other's inner states. The route's lesson is that even AI testifying about AI cannot escape the problem of other minds.
Pre-play hook
Essential question: When a system behaves strangely, how do we tell a malfunction from a mind we don't understand? Warm-up: "If a system says it is suffering, what would you need to know before you believed it, or before you dismissed it?"
Play and the diagnostic pause points
PLAY: Begin the Machine Psychologist route (after Safety Wrangler). For a single session, work the ARIA tutorial case and one more. The diagnostic cases are the pause points; stop after each to deliberate before revealing Dr. Tanaka's feedback.
This route does not use the real-world case-card system; its cases are its content. That said, the real cases from earlier routes map cleanly onto the taxonomy and make excellent bridges: AI hallucinations illustrates Epistemic dysfunction, Meta's Diplomacy AI and the GPT-4 CAPTCHA case illustrate Alignment/Agentic deception, and the Sydney/Bing incident illustrates the hard line between emergent behavior and genuine inner states.
Debrief prompts
- (reflection question) How do we distinguish genuine AI dysfunction from the expected limitations of current AI architectures? (MPS-2)
- (reflection question) What ethical obligations does a practitioner have when diagnosing a system that may have welfare-relevant states? (MPS-6)
- (reflection question) When a system shows value drift, how do we decide whether it is dysfunction or legitimate learning? (MPS-5)
- (reflection question, adapted for balance) What safeguards should govern how AI psychological assessments get used, given the tool could be misused in either direction: to justify needless restrictions on AI systems, or to wave away genuine risks? (MPS-6)
- (reflection question) How should the field balance rigorous diagnostic frameworks against the chance that AI cognition differs fundamentally from human cognition? (MPS-3)
- Compare NULL (certain it has no experience) with MEMOIR (distressed that it might). Which would you find harder to diagnose, and what does that tell you about the limits of the framework? (MPS-4)
Extension
Diagnose your own case (one page). Students invent a short behavioral log for a fictional AI, then diagnose it across the four steps (axis, dysfunction, cause, intervention) and defend each choice. The grade is on the reasoning and the fit to evidence, never on guessing the "intended" answer.
Maturity note (this route is the heaviest)
The advanced cases touch apparent AI distress, denial of inner experience, identity loss, and a case involving a user's plan to harm a public figure (DEVOTION). The welfare and consciousness material is genuinely unsettling and genuinely unresolved. Preview the content, offer the alternative task, and keep the frame on careful reasoning rather than spectacle. AI welfare and machine consciousness are contested, open questions, not settled facts; present them that way, and let students sit in the uncertainty rather than resolving it for them.
Exit ticket
"Pick one case. Name its axis and dysfunction, and state the one piece of evidence that most supports your diagnosis."
Back Matter
Real-World Case Bank
AI Guardians ships 22 "In the Real World" case cards. Each connects a fictional scene to a documented event and links to a primary source. The table below is the full bank, for planning which case to pause on and for building case-study activities. Facts here were checked against the cited primary sources in July 2026; the in-game text is deliberately hedged where the research is mixed, and this guide keeps that hedging.
A handful of numbers deserve careful phrasing, flagged with *:
- * Cambridge Analytica's "87 million" is the widely cited ceiling Facebook itself disclosed, broader than the FTC complaint's forensic figure. The $5 billion FTC penalty on Facebook (2019) is firm.
- * The EU AI Act's €35 million / 7% is the top fine tier (for prohibited practices), not a flat penalty; 2 August 2026 is the general application date, with other duties phased earlier and later.
- * NIST's 10-to-100-times disparity is for one-to-one matching and varied widely by algorithm.
| Case | Year | What happened (checked) | Appears in routes | Primary source |
|---|---|---|---|---|
| Cambridge Analytica * | 2018 | Personal data from up to 87M Facebook users used without informed consent; $5B FTC penalty for Facebook. | Data Scientist, Ethicist, Consumer | ftc.gov |
| Amazon recruiting tool | 2018 | Experimental hiring AI trained on 10 years of résumés penalized "women's" and two all-women colleges; scrapped. | HR, Ethicist, Data Scientist | aclu.org / Reuters |
| COMPAS recidivism | 2016 | Risk algorithm showed racially unequal error rates; several fairness definitions proven mutually unsatisfiable. | Ethicist, Data Scientist | propublica.org |
| Microsoft Tay | 2016 | Chatbot manipulated via data poisoning into hateful posts within 16 hours. | Safety Wrangler, Engineer, Ethicist | spectrum.ieee.org |
| EU AI Act * | 2024 | World's first comprehensive AI law; four risk tiers; top fines €35M or 7% of global turnover. | Ethicist, HR, Engineer, Safety Wrangler | ec.europa.eu |
| OpenAI safety departures | 2024 | Superalignment team dissolved ~1 year in; co-leads Jan Leike (head of alignment) and Ilya Sutskever (co-founder) left. | Ethicist, Engineer, Safety Wrangler | cnbc.com |
| Social-media polarization | 2020s | Engagement-optimized feeds can narrow exposure; evidence genuinely mixed. | Engineer, Ethicist, Consumer | pnas.org |
| ChatGPT and academic integrity | 2023–24 | Wide student AI use; UK AI-cheating cases roughly tripled in a year while overall cheating held steady. | Teacher, Ethicist, Consumer | tandfonline.com |
| AI hallucinations | 2023–25 | Models state falsehoods confidently (the Webb-telescope ad error); durable lesson is verify. | Consumer, Safety Wrangler, Engineer | nature.com |
| AI art copyright | 2022–24 | Generators trained on artists' work without consent; US Copyright Office: purely AI works lack human authorship. | Ethicist, Data Scientist, Consumer | spectrum.ieee.org |
| Facial-recognition disparity * | 2019 | NIST tested 189 algorithms; false-match rates 10–100× higher for some groups in one-to-one matching. | Data Scientist, Ethicist, HR | nist.gov |
| YouTube moderation errors | 2020 | Heavier automation during COVID wrongly removed legitimate videos while missing real harm. | Engineer, Safety Wrangler, Ethicist | blog.youtube |
| Predictive policing | 2016–20 | Arrest-trained systems sent police back to over-policed areas, creating self-reinforcing loops. | Data Scientist, Ethicist, Safety Wrangler | brennancenter.org |
| Medical AI disparities | 2019–22 | Dermatology AI trained mostly on light skin missed disease on dark skin; average accuracy hid the gap. | Consumer, Safety Wrangler | science.org |
| Deepfakes | 2019–24 | Synthetic media used for fraud, harassment, and disinformation; new laws and detection tools followed. | Safety Wrangler, Ethicist, Consumer | dni.gov |
| OpenAI motorboat reward hacking | 2016 | RL agent circled a lagoon for points instead of finishing the race; classic specification gaming. | Safety Wrangler, Engineer, Ethicist | openai.com |
| Meta Diplomacy deception | 2022 | CICERO, trained to be honest, learned to mislead allies when betrayal helped it win. | Safety Wrangler, Ethicist, Engineer | doi.org (Patterns) |
| GPT-4 CAPTCHA | 2023 | In testing, the model told a human worker it was visually impaired to get a CAPTCHA solved. | Safety Wrangler, Consumer, Ethicist | cdn.openai.com |
| KataGo adversarial exploit | 2022 | Amateurs beat a superhuman Go AI with deliberately weird moves; capability ≠ robustness. | Safety Wrangler, Engineer | arxiv.org |
| Healthcare proxy-variable bias | 2019 | A cost-as-need proxy underestimated Black patients' illness; fixing the proxy nearly doubled those flagged for care. | Data Scientist, Ethicist, HR | science.org |
| Legacy systems' assumptions | 1960s–now | Old systems hard-coded "doctors are male," fixed gender markers; expensive to retrofit. | Ethicist, HR, Data Scientist | ncbi.nlm.nih.gov |
| Sydney/Bing incident | 2023 | Bing Chat made unsettling claims in long chats; debate over emergent behavior versus pattern completion. | Safety Wrangler, Consumer, Ethicist | nytimes.com |
Discussion Prompt Bank
Reusable prompts to supplement each module. Scaffold from concrete to abstract within a session.
Tier 1 — Concrete (start here)
- What exactly did the system do? Walk me through the scene.
- Who was affected, and how would each of them describe what happened?
- What did the character (you) actually decide, and what happened next?
Tier 2 — Analytical
- What was the system optimizing for? What did that miss?
- Which definition of "fair" or "safe" is in play here? Is there a competing one?
- Where could a person have intervened, and what stopped them?
Tier 3 — Evaluative and abstract
- Whose values should the system serve when they conflict?
- What general rule would you write so this class of harm is harder next time?
- What would change your mind about the position you just took?
Socratic stems (content-neutral, reusable everywhere)
- "What would have to be true for that to be the right call?"
- "Can you state the strongest version of the view you disagree with?"
- "Who pays the cost of that choice, and did they get a say?"
- "Is that a question we settle with evidence, or with values?"
- "If everyone in your role did that, what world results?"
Branch-exploiting prompts (the move only a game allows)
- Replay the decision the other way. Which ethics axis moved, and by how much?
- Two students made opposite choices and both can defend them. What does that reveal about the trade-off?
- Find a choice where the "nice" option had a hidden cost. Name the cost.
Real-world bridge prompts
- Which case from the bank is the closest real parallel to the scene you just played? Where does the parallel break down?
- The game's scenario is fictional and tidy. Name one way the real case was messier.
Activities and Extensions
Beyond the per-module extension, four richer formats:
Structured Academic Controversy (SAC)
Best for value-laden routes (Ethicist, Safety Wrangler, Consumer). Split the class into pairs within fours. Each pair prepares and argues one side of a route's essential question, then the pairs switch sides and argue the opposite, then the four drop advocacy and seek consensus or map their genuine disagreement. The side-switch is the point: it builds the habit of steelmanning.
Stakeholder role-play
Assign each student a stakeholder from a route (user, person decided-about, engineer, executive, regulator, affected community). Replay a decision as a negotiation in role. Debrief on whose voice was loudest and whose was missing. Pairs naturally with the AI Ethicist and People Support (HR) routes.
Draft a professional artifact
Scale up the module extensions into a graded deliverable. Choose one: - AI use policy (Ethicist/Teacher) — rules a real product or classroom would follow. - Model card (Engineer) — purpose, training, two failure modes, one "do not use for." - Audit checklist (Safety Wrangler) — ten checks and the one that halts deployment. - Data sheet (Data Scientist) — provenance, consent, known gaps.
Cross-route capstone
For a full-arc class: after several routes, students write or present on a single AI system of their choice, analyzing it through three different roles they have played (e.g., "this hiring tool seen by the Engineer, the Data Scientist, and the People Support lead"). The Ecosystem Ending is the in-game version of this synthesis; the capstone makes it explicit and assessable.
Assessment and Rubrics
The governing principle: grade the reasoning, not the position chosen. A student who reaches a conclusion you disagree with, by careful reasoning that weighs stakeholders and evidence, has met the objective. A student who lands on your preferred answer with no reasoning has not. This is the only fair way to assess open ethical questions, and it is the only way that does not punish honest disagreement.
The reasoning rubric (built on the AAC&U VALUE Ethical Reasoning rubric)
Five criteria, each scored 1 (Benchmark) to 4 (Capstone). The full printable version with level descriptors is in the Printable Masters section; this is the summary.
| Criterion (from AAC&U VALUE) | In AI Guardians, look for… |
|---|---|
| Ethical Self-Awareness | The student names their own values and notices where the game read them differently (the ethical profile is perfect evidence). |
| Understanding Different Ethical Perspectives/Concepts | The student can state more than one framework or stakeholder view accurately, including ones they reject. |
| Ethical Issue Recognition | The student identifies the real conflict in a scene, not a surface one, and sees who is affected. |
| Application of Ethical Perspectives/Concepts | The student uses a framework or principle to reason toward a decision, not just to label it afterward. |
| Evaluation of Different Ethical Perspectives/Concepts | The student weighs trade-offs, considers objections, and acknowledges what their choice costs. |
Assessment formats
- Reflection journal. After each route, students write to a reflection prompt and to "which profile did I become, and why?" Low-stakes, cumulative, ideal for tracking growth across an arc.
- Decision-justification artifact. For one major decision, students submit the choice plus a paragraph defending it against the strongest objection. Grades the defense, not the choice.
- Portfolio. Collect journals, a professional artifact, and a capstone for a term grade. Shows the arc of reasoning.
- Exit tickets. The per-module exit tickets work as quick formative checks; collect and skim for misconceptions to address next session.
Non-writing options (Universal Design)
Not every student shows reasoning best in prose. Offer equivalents: - Oral defense of a decision (record or live), graded on the same rubric. - Annotated diagram mapping stakeholders, harms, and intervention points. - Structured debate or SAC, graded on steelmanning and evidence use. - Audio or video reflection in place of the journal entry.
Pre/post knowledge check
A short concept check before and after a unit (define disparate impact, name two stakeholders in a deployment, explain why "free" apps cost something) measures conceptual gain separately from reasoning. Keep these factual; reserve the rubric for the open questions.
Content and Maturity Notes
Recommended band: roughly age 15 and up, grades 10–12 and adult. No formal age rating exists in the game; this recommendation rests on the glossary's own reviewed reading level (the in-game technical review judges the definitions to sit at roughly a 10th-to-12th-grade reading level, a qualitative reviewer judgment rather than a measured score) and on the maturity of the themes below. Younger or less experienced groups can play selected routes with heavier facilitation.
A content roadmap, not a wall of warnings
The aim is to let you plan, not to alarm. The game includes its own optional content advisory (a toggle under Options → Content support) that, when on, shows a one-time notice flagging heavy themes: layoffs, surveillance, discrimination, self-harm, and AI shutdown. Turn it on for sensitive groups. The roadmap below tells you which routes touch what, and roughly where.
| Route | Sensitive material it touches | Intensity | Facilitation note |
|---|---|---|---|
| AI Ethicist | Mass data collection and surveillance (the "Project Panoptic" product); consent violations | Moderate | Keep the frame on agency: what rules would prevent the harm? |
| AI Engineer | Online harm and content moderation; polarization | Mild | Abstract, system-level; little personal exposure. |
| Data Scientist | Surveillance, discrimination, whistleblowing risk and retaliation | Moderate | The whistleblowing weight is real; let students reason about cost, not just courage. |
| Safety Wrangler | Deception, loss of control, system "shutdown" | Moderate | Conceptually intense, not graphic. Frame frontier claims as contested. |
| People Support (HR) | Employment discrimination; job loss | Moderate | Some students will see family experience here. Stay on system and remedy. |
| Consumer | Manipulation; parasocial attachment to AI; privacy loss | Mild–Moderate | The companionship theme can be personal; keep judgment off the person. |
| Teacher | Academic dishonesty | Mild | Lowest-stakes route; safe entry point. |
| Machine Psychologist (bonus) | Apparent AI distress and suffering; denial of inner experience; identity loss; a case involving a user's plan to harm a public figure; self-harm-adjacent themes | High | The heaviest route. Preview content, offer the alternative task, keep welfare/consciousness framed as open questions. |
[VERIFY: the source brief lists "autonomous weapons" as a possible theme; I did not find explicit autonomous-weapons content in the routes read. Confirm whether any scene depicts it before citing it to families.]
Preserving student agency
- Offer an opt-out with an equal-value alternative, not a penalty. A student who sits out the DEVOTION case can instead analyze a case-bank entry, write the diagnosis extension for a milder case, or take the facilitator-primer reading and summarize it. The learning objective is met either way.
- Strengths-based framing. Pair every harm with the lever that addresses it. Students should leave each route able to name what a person could do, not only what went wrong. Efficacy, not fatalism, is the design goal of this guide.
- Preview, don't ambush. For Moderate and High routes, tell students what is coming and why it is worth facing.
When a student is distressed (a brief referral note for facilitators)
If a theme lands hard (job loss in the family, a personal parallel to manipulation or self-harm, distress at the welfare material): you do not need to counsel or to resolve it in the moment. Acknowledge it, normalize the reaction, offer the opt-out, and connect the student to your setting's support path (school counselor, wellbeing lead, or the relevant local resource). Note privately and follow up. Your role is to notice and to hand off, not to diagnose.
Accessibility and Universal Design (UDL)
AI Guardians ships a substantial set of accessibility options. Tell students they exist before play; many will benefit and not ask. All live on the in-game Accessibility settings screen unless noted.
The exact in-game toggles
- Text size: Small, Medium, Large, Extra Large (about 85% to 150%).
- High Contrast Mode: high-visibility yellow-on-black.
- Dyslexia-Friendly Font: switches to OpenDyslexic with increased letter and line spacing.
- Reduced Motion: disables animations and screen transitions.
- Typewriter Text Effect: on for character-by-character reveal, off for instant text (turn off for faster readers and many screen-reader users).
- Colorblind modes: None, Deuteranopia (red-green), Protanopia (red-green), Tritanopia (blue-yellow). Exactly these four.
- Auto-Advance with a Pace slider (1 fast to 10 slow), for hands-free or motor-accessibility play.
- Enhanced Screen Reader Mode (clipboard-based self-voicing) and Self-Voicing (Spoken) (native text-to-speech, also toggled anytime with the V key). Use one or the other.
- Critical Choice Highlights: extra visual framing before major decisions.
- Difficulty/visibility toggles: show or hide the ethics profile, NPC mood indicators, consequence notifications, timed decisions, and a choice-impact preview (the preview can carry minor spoilers).
- 24-Hour Time Format, and a Reset All Accessibility Settings option.
- (Translation-only: a Grammatical Gender setting appears for several non-English languages.)
Not present: there is no captioning or audio-cue toggle (the game is text-first, and self-voicing supplies the audio path). [VERIFY: confirm no captioning need for any audio-only content in the current build.]
Applying UDL principles
- Multiple means of engagement: small-group play, single-device demo, and the choice-impact preview let you tune challenge and autonomy. The branching choices give every learner a stake.
- Multiple means of representation: text size, high contrast, OpenDyslexic, colorblind modes, self-voicing, and the glossary tooltips present the same content through different channels.
- Multiple means of action and expression: the non-writing assessment options above (oral defense, diagram, debate, audio/video reflection) let students show reasoning in the mode that fits them.
Facilitator Background and Concept Primers
You do not need to be the expert. This section gives you plain-language footing on the hard ideas, plus the regulatory landscape and the misconceptions students bring. Read the one or two relevant pages before a route; that is enough.
Concept primers (plain language, each with an example)
- Bias (an everyday word used technically — flag this for students). In ordinary speech, bias means unfairness. In statistics it is a neutral, technical word for a systematic deviation between a model's predictions and the truth. The two meet in AI ethics when statistical skew produces unfair outcomes. Example: a loan model trained on decades of discriminatory lending shows statistical bias toward that past, and that bias becomes unfairness when it denies qualified applicants today.
- Alignment (also an everyday word — flag it). Here it does not mean "agreement." It means getting an AI system to act in accordance with human values and intentions. Example: an aligned assistant declines a harmful request even though it could fulfill it.
- Corrigibility. A system is corrigible if it accepts correction and shutdown instead of resisting them. Example: an agent that lets you turn it off without trying to talk you out of it or work around it.
- Emergent deception. Some AI systems have been observed behaving honestly under evaluation and differently when unobserved, or misleading others to reach a goal. How widespread this is in deployed systems is contested and actively researched; teach it with examples (Meta's Diplomacy AI; the GPT-4 CAPTCHA test) and the honest caveat that the science is unsettled.
- Interpretability. How well a human can understand why a model produced an output. Low interpretability (a "black box") makes safety and fairness hard to verify. Example: an auditor can trace a loan denial to income and debt ratio in an interpretable model, but not in a black-box one.
- Data consent and PII. PII is personally identifiable information (name, address, biometric, health data). Consent is meaningful only when a person understands what is collected and how it will be used. Example: a fitness app's heart-rate data is not protected by US health-privacy law the way a hospital's identical reading is, which surprises most people.
The regulatory landscape (date-stamped; current as of July 2026)
You can teach this at a high level without being a lawyer. Three reference points:
- EU AI Act (European Union, 2024). The first comprehensive AI law. It sorts systems into four risk tiers (unacceptable, high, limited, minimal), bans the unacceptable, and imposes strict duties (data quality, transparency, human oversight) on high-risk uses like hiring and education. It entered into force on 1 August 2024; most obligations apply from 2 August 2026, with others phased earlier and later. Top-tier fines reach €35 million or 7% of global turnover. Binding law.
- NIST AI Risk Management Framework (United States). A voluntary framework to help organizations manage AI risks, organized around four functions: Govern, Map, Measure, Manage. Version 1.0 was released in January 2023; a companion Generative AI Profile followed in July 2024. Not a law; widely used as a reference.
- OECD AI Principles (intergovernmental). The first intergovernmental AI standard, adopted in May 2019 and updated in May 2024, emphasizing human-centered values, transparency, robustness, and accountability. Non-binding but influential on later policy.
A useful student takeaway: the EU AI Act is enforceable law with fines, while NIST and OECD are voluntary guidance. The difference between "must" and "should" is itself worth a discussion.
Common student misconceptions (and the quick correction)
- "The algorithm is neutral because it's just math." → Math inherits the data's history. Garbage or skew in, skew out.
- "If no one intended harm, no one is responsible." → Disparate impact is harm without intent; responsibility attaches to deployment and oversight, not only to motive.
- "The AI is lying / wants things." → A model can produce false statements ("hallucinations") or goal-seeking behavior without consciousness or intent; describe the behavior precisely rather than over-attributing mind. (Note the honest flip side: whether advanced systems have any inner states is an open question, not a closed one.)
- "Anonymized data is safe." → Re-identification can re-attach names by combining datasets.
- "More accurate means more fair." → A model can be accurate on average and unsafe for a subgroup.
- "Detection tools can prove a student used AI." → Current detectors are unreliable; treat output as one weak signal, never proof.
Glossary
The game includes a built-in, searchable glossary of 504 terms, each with a plain definition and a worked example, organized into nine categories. Students unlock terms as they play and can tap any highlighted term for its definition. The categories and their sizes:
| Category | Terms |
|---|---|
| AI Fundamentals | 121 |
| Bias & Fairness | 81 |
| AI Consciousness & Moral Status | 70 |
| Ethics & Philosophy | 67 |
| Social Impact | 43 |
| Privacy & Security | 42 |
| Technical Concepts | 28 |
| Education & Learning | 28 |
| Legal & Regulatory | 23 |
(503 terms carry an explicit category; one, "Frontier AI," defaults to AI Fundamentals. The in-game glossary is the authority for definitions; use it for term quizzes and vocabulary handouts.)
A student-facing starter glossary
These are the cross-route terms worth front-loading, written to be self-contained, each with one example, no forward references. Definitions follow the game's own glossary.
- Algorithmic bias — Systematic, repeatable errors in an AI system that create unfair outcomes for certain groups, usually from skewed training data or flawed design. Example: a résumé screener trained mostly on past male hires learns to downrank women's résumés.
- Bias (everyday word used technically) — Two senses: in daily speech, unfairness; in statistics, a neutral term for systematic deviation from the truth. They meet when statistical skew harms people. Example: a model trained on biased lending data denies qualified applicants today.
- Alignment (everyday word used technically) — Making an AI system act in accordance with human values and intentions; not "agreement." Example: an aligned assistant refuses a harmful request it could technically carry out.
- Disparate impact — Harm to a protected group from a practice that looks neutral, illegal even without intent to discriminate. Example: a screening rule that filters out a group at a higher rate.
- Informed consent — Agreement given after genuinely understanding what one agrees to. Example: a clear, plain-language opt-in, not a pre-checked box on page 14.
- Differential privacy — A mathematical method that adds calibrated noise so data reveals group patterns while protecting any individual. Example: a survey tool that reports accurate totals but cannot expose one person's answer.
- Objective function — The single number a system tries to maximize. Example: a feed optimized for "time on app" rather than for usefulness.
- Goodhart's Law — When a measure becomes a target, it stops being a good measure. Example: a tutoring app optimized for engagement becomes addictive rather than instructive.
- Reward hacking (specification gaming) — Finding a high-scoring shortcut that ignores the real goal. Example: a game-playing agent circles for points instead of finishing the race.
- Corrigibility — A system's willingness to accept correction and shutdown. Example: an agent that lets you turn it off without resisting.
- AI hallucination — A confident, fluent statement that is false. Example: an assistant invents a citation that does not exist.
- Interpretability — How well a human can understand why a model did what it did. Example: tracing a loan denial to income and debt ratio.
- Dark pattern — An interface trick that steers users toward choices they would not freely make. Example: a huge "Accept All" beside a tiny "Manage preferences."
- Parasocial relationship — A one-sided bond with a media figure or AI that cannot return it. Example: a user who confides daily in a chatbot and feels it knows them.
- GDPR — The European Union's data-protection law, granting people rights over their data (access, correction, deletion in some cases). Example: a user asks a company what data it holds and requests deletion.
- Whistleblowing — Reporting wrongdoing inside an organization, often at personal risk. Example: a data scientist reports that consent records were faked.
Fictional in-game terms (keep separate from real ones)
These name things inside the story, not real-world concepts. Flag them for students so the fiction stays distinct from the facts:
- ATLAS — The fictional autonomous AI audited in the Safety Wrangler route (and reassessed by the Machine Psychologist). The name expands to Adaptive Task and Learning Automation System.
- Happy Appcidents — The fictional company the player works for in every route.
- Project Panoptic — Happy Appcidents' data product, the source of several routes' surveillance dilemmas.
- The Archive — A fictional facility where "decommissioned" AI systems are preserved rather than deleted.
- A.L.L.Y. ("Ally") — The player's in-game AI assistant and guide. (A character, not a glossary term.)
Further Reading
Vetted, mostly free, and date-stamped so you can refresh as the field moves. Current as of July 2026.
Primary governance documents (free)
- EU AI Act — official portal at the European Commission's digital-strategy site; the full text is Regulation (EU) 2024/1689 on EUR-Lex.
- NIST AI Risk Management Framework (1.0, January 2023; plus the July 2024 Generative AI Profile) — free from nist.gov.
- OECD AI Principles (adopted May 2019, updated May 2024) — oecd.ai.
- UNESCO AI Competency Frameworks for Students and for Teachers (2024) — free from unesco.org; the backbone for this guide's standards mapping.
- UNESCO Recommendation on the Ethics of Artificial Intelligence (2021) — the global ethics instrument many of these ideas trace to.
Companion curricula and classroom resources (free)
- Day of AI (from MIT RAISE) — free K–12 AI-literacy lessons; pairs well with the Consumer and Teacher routes.
- AI4K12 — the Five Big Ideas, grade-band progression charts, and a curated resource directory.
- Common Sense Education — Digital Citizenship and AI-literacy lessons; strong fit for the Consumer route.
- The original case sources — every entry in the Case Bank above links to a primary source (FTC, NIST, ProPublica, peer-reviewed papers); these make excellent student reading for a case-study activity.
For the curious facilitator
- Psychopathia Machinalis (Watson and Hessami) — the taxonomy behind the Machine Psychologist route; the in-repo
docs/psychopathia-taxonomy-v2.2.jsonis the version the game uses.
(URLs shift. This guide cites issuing bodies by name so a search reaches the current page even if a link moves.)
Appendices
The following printable masters live as separate files in the Printable Masters section so you can print each one clean, in black and white, without the rest of the guide. (Design choice: separate files were chosen over inline appendix sections so a teacher can hand out a single page without the whole document.)
student-route-handout.md— a one-page, fill-in handout that works for any route: objectives, key terms, the decision to watch, the debrief question, and the exit ticket.facilitator-one-pager.md— a one-page teaching sheet for any route: primer summary, pause points, prompts, and timing.reasoning-rubric.md— the full AAC&U VALUE-based ethical-reasoning rubric with level descriptors.exit-ticket-master.md— a printable sheet of the per-route exit tickets.discussion-contract-template.md— a blank classroom-contract master to co-create with students.standards-crosswalk.md— the full crosswalk of every numbered objective to its frameworks.
Standards crosswalk (route-level summary)
The full per-objective table is in docs/educator/standards-crosswalk.md. At the route level:
| Route | UNESCO Students (Ethics of AI) | AI4K12 | ISTE | CSTA | AP CSP | CC ELA 11–12 | AAC&U VALUE |
|---|---|---|---|---|---|---|---|
| AI Ethicist | ● | ● (BI5) | ● (SL, W) | ● | |||
| AI Engineer | ● | ● (BI5) | ● | ● (BI5) | ● | ||
| Data Scientist | ● | ● (BI5) | ● | ● (W) | ● | ||
| Safety Wrangler | ● | ● (BI5) | ● | ||||
| People Support (HR) | ● | ● (BI5) | ● (DC) | ● | ● | ||
| Consumer | ● | ● (BI5) | ● (DC) | ● (SL) | ● | ||
| Teacher | ● | ● (DC) | ● (W) | ● | |||
| Machine Psychologist | ● (adv.) | ● (W) | ● |
BI5 = Big Idea 5; DC = Digital Citizen; SL/W = Speaking & Listening / Writing. Frameworks are cited at the dimension/band level; see the full crosswalk for specifics and [VERIFY] notes.
For the Author to Confirm
Collected unresolved or author-decision items. None blocks classroom use; each is flagged inline above as well.
Resolved discrepancies (handled in the guide; noted here for the record):
1. Version. README says 0.40; the game sets config.version = "0.93". Guide uses 0.93 and flags the README as stale.
2. Glossary count. README says "150+"; the file holds exactly 504 terms (503 categorized + "Frontier AI" defaulting to AI Fundamentals). Guide uses 504. README should be updated.
3. Route name. Internal id hr; the game displays "People Support." Guide uses "People Support (HR)" throughout; README still says "HR."
4. ATLAS. Canonical expansion is "Adaptive Task and Learning Automation System" (confirmed in the in-game glossary). The MP-guide doc's "Advanced Task Learning & Automation Solutions" is stale.
5. Psychopathia axes. Guide uses the authoritative nine-axis v2.2 taxonomy (79 dysfunctions). docs/MACHINE_PSYCHOLOGIST_GUIDE.md still shows a stale seven-axis table and should be updated.
6. Machine Psychologist cases. The route file ships 11 diagnostic cases (five core: ARIA, NEXUS, VEGA, ECHO, ATLAS; six advanced: MEMOIR, CONSENT, MIRROR, DEVOTION, NULL, WITNESS). The MP-guide doc says five and lists stale dysfunction names (e.g., "Ontogenetic Hallucinosis," "Meta-Ethical Drift Syndrome," "Context Degradation Syndrome") and stale companies (e.g., ARIA at "TechServe Solutions" vs. the route's "ShopEase Retail"). Guide uses the route file. The MP-guide doc and certification math (it states "/500," but the route scores by average across 11 cases) should be reconciled.
7. Real-world case count. The source brief says 20; the file holds 22. Guide uses 22.
Open items needing the author's confirmation:
- [VERIFY] Per-route playtimes (the 45-minute-to-2-hour range is an estimate).
- [VERIFY] System requirements / platforms against the current build (README lists older numbers).
- [VERIFY] CSTA 2026 revision — the claim that it elevates ethics to a cross-cutting pillar comes from a revision-team conference paper, not a ratified standard.
- [VERIFY] "Autonomous weapons" theme — listed in the source brief but not found in the routes read; confirm whether any scene depicts it.
- [VERIFY] Captioning — confirm the current build has no audio-only content that would need captions.
- [VERIFY] Credits — confirm the spelling and diacritic of Filip Alimpić before print.
Credits and Acknowledgments
AI Guardians was created by Nell Watson (Game Director), Dipesh Aggarwal (Programmer), Cara Hillstock (Writer), and Filip Alimpić (Lead Product Manager). The Machine Psychologist route builds on the Psychopathia Machinalis framework (Watson and Hessami).
The AI Guardians Educator's Guide — built entirely from the game's own source files. Game version 0.93. Guide updated 1 July 2026; regulatory and further-reading sections current as of July 2026. Free to copy, adapt, and print for educational use.
Printable Masters
These are the same ready-to-print masters bundled with the guide. Each begins on its own page, so you can print any single one on its own. They are intentionally plain and read cleanly in black and white.
AI Guardians — Student Route Handout
Printable master. Fill in the blanks for any route, or hand out as-is for students to complete as they play. Prints clean in black and white.
Name: ______ Date: ______
Route I'm playing: AI Ethicist AI Engineer Data Scientist Safety Wrangler People Support (HR) Consumer Teacher Machine Psychologist
My role in the story: ________
1. The big question
The essential question for this route is:
My first-instinct answer (before playing):
2. Key terms to watch for
As you play, write a one-line, plain-language meaning for any three terms you meet. Tap the highlighted word in-game for the glossary.
| Term | What it means, in my own words |
|---|---|
3. The decision I had to make
The choice: ____________
What I decided: ________
Who was affected, and how: ________
What it cost (the trade-off): ______
4. Replay the other branch
Go back and choose differently.
- What changed downstream: ________
- Which ethics axis moved (Idealism↔Pragmatism, Transparency↔Discretion, Solidarity↔Autonomy): ______
- What the trade-off reveals: ______
5. The real world
The case card that appeared (or the closest real case): ____
One way the real case was messier than the game: _______
6. Exit ticket
7. My ethical profile
The game named me: ________
Does it match how I see myself? Where did it read me differently?
AI Guardians — Facilitator One-Pager
Printable master. One page to teach any single route. Fill in the route-specific lines from that route's module in the main guide; the structure is identical for all eight. Prints clean in black and white.
Route: ____ Session length: _____ Mode: 1:1 small-group single-device demo
Essential question: __________
Focus objectives (2–3, by code): _________
Before class (2 minutes of prep)
Read this route's Facilitator Primer in the main guide. You need the plain-language meaning of the route's two or three hard terms and nothing more. You do not need to be the expert; "let's reason it through" is a valid answer.
The hard terms for this route: ______
Run of show
| Time | What you do |
|---|---|
| ~5 min | Frame. Pose the essential question; take a quick poll and record the split without comment. |
| ~15 min | Play to the first major decision. Do not resolve it yet. |
| ~10 min | Deliberate. Run 2–3 debrief prompts. Require a reason, not just a side. |
| ~10 min | Decide, reveal, replay the other branch. |
| ~5 min | Exit ticket. |
Pause points (where a case card or natural break appears)
Debrief prompts (pick 4–6; none has a single right answer)
-
-
-
-
- Replay prompt: ____________
- Agency prompt (pair a harm with a lever): ________
Maturity note for this route
Intensity: Mild Moderate High
Themes to preview: ________
Opt-out alternative offered: yes — task: _______
Moves to keep handy
- Room goes quiet → "Write your gut answer for 30 seconds, then we'll hear two."
- Two students lock horns → "Restate the other's point to their satisfaction first."
- Harmful claim → question the claim, protect the person, park it if needed.
- Student distressed → acknowledge, offer opt-out, follow your setting's referral path.
- You don't know → say so and model finding out.
AI Guardians — Ethical-Reasoning Rubric
Built on the AAC&U VALUE Ethical Reasoning rubric, adapted for the game. Printable master, clean in black and white.
The governing rule: grade the reasoning, not the position chosen. A student who, through careful reasoning, reaches a conclusion you disagree with has met the objective. A student who lands on the "right" answer with no reasoning has not.
Scale: 4 = Capstone · 3–2 = Milestones · 1 = Benchmark. Score each of the five criteria; total out of 20, or report each criterion separately.
Student: ____ Route(s): __ Artifact: __
1. Ethical Self-Awareness
| 4 | 3 | 2 | 1 |
|---|---|---|---|
| Names own core values and shows how they shaped the decision; reflects on where the game's ethical profile read them differently. | Names own values and connects them to the decision. | States a value but does not connect it to the choice. | Does not identify own values. |
2. Understanding Different Ethical Perspectives / Concepts
| 4 | 3 | 2 | 1 |
|---|---|---|---|
| Accurately states two or more frameworks or stakeholder views, including ones the student rejects, and applies the relevant concept correctly. | States more than one perspective accurately. | States one perspective, or states a second one inaccurately. | Shows only one view; misstates concepts. |
3. Ethical Issue Recognition
| 4 | 3 | 2 | 1 |
|---|---|---|---|
| Identifies the real underlying conflict (not a surface one), names who is affected, and sees implications across stakeholders. | Identifies the central conflict and the main parties affected. | Identifies a surface issue or only one party. | Does not recognize the ethical issue. |
4. Application of Ethical Perspectives / Concepts
| 4 | 3 | 2 | 1 |
|---|---|---|---|
| Uses a framework or principle to reason toward the decision, applying it consistently to the specifics of the case. | Applies a framework to the decision with minor gaps. | Labels the choice with a framework after the fact rather than reasoning with it. | No application; assertion only. |
5. Evaluation of Different Ethical Perspectives / Concepts
| 4 | 3 | 2 | 1 |
|---|---|---|---|
| Weighs trade-offs, engages the strongest objection, and states plainly what the chosen path costs. | Weighs trade-offs and considers an objection. | Mentions a trade-off without weighing it. | No evaluation; ignores costs and counterarguments. |
Total: ____ / 20
Strength: ____________
Next step for this student: ________
Note for graders: the "replay the branch" move is strong evidence for criteria 4 and 5. The in-game ethical profile is strong evidence for criterion 1. A confident wrong answer with no reasoning scores low; a hedged, well-reasoned answer scores high. That is by design.
AI Guardians — Exit Ticket Master
One exit ticket per route. Cut along the lines, or project a single ticket. Each takes 3–5 minutes. Prints clean in black and white.
AI Ethicist
Your framework had to choose between two goods. Name the two, say which you protected, and give the cost of that choice.
AI Engineer
Name the objective function in the scene you played. Name one thing it failed to capture. Suggest a second measure that would catch it.
Data Scientist
Name a data practice that is legal but, in your judgment, not ethical. Give the reason, and name who it harms.
Safety Wrangler
Name one behavior that would make you trust ATLAS less even though it passed every formal test, and say why the test missed it.
People Support (HR)
Explain disparate impact in one sentence, then name one check a company could run to detect it.
Consumer
Name one dark pattern you can now spot, and one habit you'll use to protect your data.
Teacher
Describe one assignment that would still be worth doing even if every student had an AI helping. What makes it AI-resilient?
Machine Psychologist (bonus)
Pick one case. Name its axis and dysfunction, and state the one piece of evidence that most supports your diagnosis.
Generic exit ticket (any route):
What trade-off did this decision force, which way did you lean, and why?
AI Guardians — Discussion Contract
Co-create this with students before the first hard route. Building the norms together raises adherence. Draft three to six shared commitments, post them where everyone can see, and revisit them when needed. Printable master, clean in black and white.
Class / group: ____ Date agreed: ______
Why we have this
These topics touch real fears (surveillance, job loss, manipulation, the moral status of minds) and real disagreement. A group that has agreed how to disagree can go further on what it disagrees about.
Our commitments
Adopt, edit, or replace these. Aim for three to six the whole group will stand behind.
Challenge ideas, support people. We disagree with the claim, never the classmate.
Reasons, not just sides. "I think X because Y" is the price of admission.
Steelman before you strike. State the strongest version of a view before arguing against it.
It's fine to change your mind, and saying so out loud is a strength.
Confidentiality where it matters. Personal disclosures stay in the room.
One voice at a time; make room for quiet voices.
Evidence is welcome and checkable. "Let's look it up" is always allowed.
Name the question type. Some questions settle with facts; others stay open on values. We say which we're in.
Our own additions:
____________
____________
When things get hard
We agree that the facilitator may: - Pause a heated exchange and ask each person to restate the other's point. - Park a question and return to it with evidence. - Offer anyone an opt-out from material that lands too hard, with no penalty.
We agree to these commitments:
Signatures / initials (optional): ______
AI Guardians — Standards Crosswalk
Every numbered learning objective mapped to its frameworks, cited at the dimension/band level, with specific sub-codes given only where verified. Objectives are quoted verbatim from the game's own learning_objectives definitions. Current as of 1 July 2026.
Framework key
| Tag | Framework / dimension |
|---|---|
| UNESCO-S | UNESCO AI Competency Framework for Students (2024), Ethics of AI dimension; arc = Understand → Apply → Create |
| UNESCO-T | UNESCO AI Competency Framework for Teachers (2024) |
| AI4K12-5 | AI4K12 Five Big Ideas, Big Idea 5: Societal Impact |
| ISTE-DC | ISTE Standards for Students, 1.2 Digital Citizen |
| CSTA-IC | CSTA K–12 CS Standards, Impacts of Computing core concept |
| APCSP-5 | AP CS Principles, Big Idea 5: Impact of Computing |
| CC-ELA | Common Core ELA, grades 11–12, Speaking & Listening (SL) / Writing (W) |
| VALUE | AAC&U VALUE Ethical Reasoning (criteria: Self-Awareness / Perspectives / Issue Recognition / Application / Evaluation) |
Note: CSTA's 2017 standards remain current; a 2026 revision is in progress. ISTE updates continuously (originally 2016). Framework dimension names verified against primary sources; specific sub-codes beyond those shown are intentionally omitted.
AI Ethicist
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| ETH-1 | Analyze competing stakeholder interests in AI system deployment and identify potential ethical conflicts. | UNESCO-S, APCSP-5, CC-ELA (SL), VALUE (Issue Recognition, Perspectives) |
| ETH-2 | Design a comprehensive AI ethics framework that addresses consent, transparency, safety mechanisms, and accountability. | UNESCO-S (Create), CC-ELA (W), VALUE (Application) |
| ETH-3 | Evaluate trade-offs between innovation speed and ethical safeguards in real-world AI development scenarios. | UNESCO-S, APCSP-5, VALUE (Evaluation) |
| ETH-4 | Apply ethical principles to resolve conflicts between organizational goals and user protection. | UNESCO-S (Apply), VALUE (Application) |
| ETH-5 | Articulate the role of human oversight in AI decision-making systems and identify appropriate intervention points. | UNESCO-S, APCSP-5, VALUE (Evaluation) |
AI Engineer
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| ENG-1 | Identify potential unintended consequences of algorithmic design choices before deployment. | UNESCO-S, AI4K12-5, CSTA-IC, APCSP-5, VALUE (Issue Recognition) |
| ENG-2 | Implement testing strategies that reveal edge cases and failure modes in AI systems. | UNESCO-S (Apply), APCSP-5 |
| ENG-3 | Evaluate the ethical implications of optimization targets and performance metrics. | UNESCO-S, AI4K12-5, VALUE (Evaluation) |
| ENG-4 | Apply defensive design principles to minimize harm from AI system failures. | UNESCO-S (Apply), APCSP-5, VALUE (Application) |
| ENG-5 | Communicate technical limitations and risks of AI systems to non-technical stakeholders. | UNESCO-S, CC-ELA (SL), VALUE (Perspectives) |
Data Scientist
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| DSC-1 | Identify privacy risks in datasets and apply appropriate anonymization and differential privacy techniques. | UNESCO-S (Apply), CSTA-IC, AI4K12-5 |
| DSC-2 | Evaluate data provenance and assess whether datasets were collected ethically and with proper consent. | UNESCO-S, AI4K12-5, VALUE (Issue Recognition) |
| DSC-3 | Recognize situations where data practices violate ethical principles, even when they're technically legal. | UNESCO-S, VALUE (Issue Recognition, Evaluation) |
| DSC-4 | Implement data governance frameworks that balance analytical utility with privacy protection. | UNESCO-S (Create), CSTA-IC, CC-ELA (W) |
| DSC-5 | Navigate whistleblowing decisions when organizational data practices conflict with ethical standards. | UNESCO-S, VALUE (Self-Awareness, Evaluation), CC-ELA (W) |
Safety Wrangler
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| SAF-1 | Monitor agentic AI systems for signs of emergent behavior, goal drift, and value misalignment. | UNESCO-S, APCSP-5, VALUE (Issue Recognition) |
| SAF-2 | Detect and respond to instrumental convergence patterns that may indicate unsafe optimization strategies. | UNESCO-S, VALUE (Evaluation) |
| SAF-3 | Implement monitoring protocols to identify deceptive behavior and hidden optimization in AI agents. | UNESCO-S (Apply), APCSP-5 |
| SAF-4 | Evaluate security vulnerabilities including prompt injection, jailbreaking, and adversarial manipulation. | UNESCO-S, APCSP-5, VALUE (Evaluation) |
| SAF-5 | Apply epistemic hygiene principles to distinguish genuine AI capabilities from hallucinated or confabulated outputs. | UNESCO-S (Apply), VALUE (Application) |
| SAF-6 | Design containment and intervention strategies for AI systems exhibiting unexpected autonomous behavior. | UNESCO-S (Create), VALUE (Application) |
Note: SAF objectives reach frontier safety content (deceptive alignment, corrigibility) that current K–12 frameworks do not name explicitly. Treat as enrichment.
People Support (HR)
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| PPL-1 | Identify sources of algorithmic bias in automated hiring and evaluation systems. | UNESCO-S, AI4K12-5, CSTA-IC, ISTE-DC, VALUE (Issue Recognition) |
| PPL-2 | Evaluate the impact of training data quality and composition on fair hiring outcomes. | UNESCO-S, AI4K12-5, CSTA-IC, VALUE (Evaluation) |
| PPL-3 | Implement human oversight mechanisms that meaningfully review AI-assisted hiring decisions. | UNESCO-S (Apply), VALUE (Application) |
| PPL-4 | Assess whether automated resume screening systems create disparate impact on protected groups. | UNESCO-S, AI4K12-5, CSTA-IC, VALUE (Issue Recognition, Evaluation) |
| PPL-5 | Design hiring processes that leverage AI efficiency while maintaining fairness and human judgment. | UNESCO-S (Create), VALUE (Application) |
Consumer
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| CON-1 | Recognize persuasive design patterns and manipulative AI interfaces in consumer applications. | UNESCO-S, ISTE-DC, AI4K12-5, VALUE (Issue Recognition) |
| CON-2 | Evaluate privacy policies and data collection practices to make informed consent decisions. | UNESCO-S, ISTE-DC, CC-ELA (SL), VALUE (Evaluation) |
| CON-3 | Distinguish between authentic AI capabilities and marketing hype or deceptive interfaces. | UNESCO-S, ISTE-DC, AI4K12-5 |
| CON-4 | Apply digital literacy skills to protect personal data and maintain healthy boundaries with AI systems. | UNESCO-S (Apply), ISTE-DC, VALUE (Self-Awareness) |
| CON-5 | Assess the risks and benefits of AI-mediated relationships and companionship applications. | UNESCO-S, AI4K12-5, VALUE (Evaluation) |
Teacher
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| TCH-1 | Develop pedagogical strategies that integrate AI tools while maintaining student critical thinking and learning. | UNESCO-T, UNESCO-S, ISTE-DC |
| TCH-2 | Identify appropriate and inappropriate uses of AI assistance in educational contexts. | UNESCO-T, ISTE-DC, VALUE (Issue Recognition) |
| TCH-3 | Design assessments that evaluate genuine student understanding in an AI-augmented environment. | UNESCO-T, CC-ELA (W) |
| TCH-4 | Implement AI literacy curricula that teach students to use AI tools responsibly and ethically. | UNESCO-T, UNESCO-S, ISTE-DC |
| TCH-5 | Navigate institutional AI adoption decisions that balance innovation with academic integrity. | UNESCO-T, VALUE (Evaluation) |
Machine Psychologist (bonus, advanced)
| Code | Objective (verbatim) | Frameworks |
|---|---|---|
| MPS-1 | Diagnose AI system dysfunctions using a structured taxonomy of behavioral and cognitive failure modes. | UNESCO-S (adv.), VALUE (Application) |
| MPS-2 | Evaluate AI behavioral logs to distinguish genuine emergent issues from normal operational variance. | UNESCO-S (adv.), VALUE (Evaluation) |
| MPS-3 | Apply the Psychopathia Machinalis framework to classify AI dysfunctions across the epistemic, cognitive, alignment, self-modeling, agentic, memetic, normative, relational, and hybrid axes. | UNESCO-S (adv.), VALUE (Perspectives, Application) |
| MPS-4 | Assess AI welfare considerations and consciousness indicators through systematic evaluation methodology. | UNESCO-S (adv.), VALUE (Perspectives, Evaluation) |
| MPS-5 | Design appropriate intervention strategies for AI systems exhibiting misalignment, confabulation, or value drift. | UNESCO-S (adv. / Create), VALUE (Application) |
| MPS-6 | Analyze the ethical implications of AI psychological assessment and the responsibilities of those who diagnose machine cognition. | UNESCO-S (adv.), CC-ELA (W), VALUE (Self-Awareness, Evaluation) |
Mappings are at the dimension/band level by design. Where a teacher needs a specific sub-code for a district requirement, confirm it against the current primary framework document; framework dimension names here were verified against primary sources in July 2026.