# Narrative Intelligence in AI Coaching
*Why our coaches don't follow scripts--they follow meaning*

**Summary.** Most chatbots respond with pattern-matching logic; wa4u coaches respond with story logic. We call this narrative intelligence: recognising where a user is in their personal arc and asking questions that advance--not interrupt--the plot.
## Context & Problem
Large language models excel at factual Q&A, but emotional conversations stall when answers arrive too quickly or too literally. Gen Z test users told us:
"I don't need a solution in the first reply--I need to feel the bot gets the point of my story."
Conventional therapeutic chat flows either over-diagnose ('You may have anxiety--here's a worksheet') or under-engage ('That sounds hard'). Both break narrative momentum.
## Conceptual Approach: Narrative Intelligence
We draw on psycholinguistics (Labov, 1997) and narrative coaching (Drake, 2014):
| Layer | What the model listens for | Prompt style |
|---|---|---|
| Setting | "Semester just started; I moved cities" | Mirror context in one line |
| Complication | "Now I'm frozen before every reading" | Name the tension, avoid advice |
| Evaluation | "Feels like I'm faking it" | Ask for meaning: 'What does faking cost you?' |
| Resolution path | User proposes micro-action | Offer reflective scaffold, not prescription |
The coach's first goal is to locate the user's place in the story arc; only then does it invite forward motion.
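
To ground the table, here is a minimal sketch of how these layers might be encoded as data for a prompt builder to consume. The structure, field names, and example cues are illustrative assumptions, not wa4u's production schema.

```python
# Illustrative encoding of the narrative layers in the table above.
# Field names and cue descriptions are assumptions for this sketch,
# not wa4u's production schema.
STORY_ARC = {
    "setting": {
        "listens_for": "time, place, and life-change markers",
        "prompt_style": "mirror the context in one line",
    },
    "complication": {
        "listens_for": "blockers and repeated friction",
        "prompt_style": "name the tension; avoid advice",
    },
    "evaluation": {
        "listens_for": "self-judgement and meaning-making",
        "prompt_style": "ask what the feeling costs or means",
    },
    "resolution_path": {
        "listens_for": "user-proposed micro-actions",
        "prompt_style": "offer a reflective scaffold, not a prescription",
    },
}
```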
## Application in wa4u
**Prompt stack.** Each coach system prompt includes a `story_stage` slot (unknown / setup / tension / insight / action). The model must:
- Detect the stage from user text (via few-shot examples).
- Answer with a reply of at most 45 words that mirrors the stage.
- Suggest a next-stage question, not a solution (a minimal sketch follows this list).
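
Here is a minimal sketch of that prompt stack, assuming an OpenAI-style chat message format. The stage list matches the `story_stage` slot above, but the system-prompt wording, the few-shot pairs, and the `build_messages` helper are illustrative assumptions, not the shipped stack.

```python
# Sketch of the coach prompt stack. Prompt wording and few-shot
# examples are illustrative, not the production prompts.
STAGES = ["unknown", "setup", "tension", "insight", "action"]

SYSTEM_PROMPT = """You are a wa4u narrative coach.
story_stage: {story_stage}

Rules:
1. Detect the user's current stage from their message.
2. Reply in at most 45 words, mirroring that stage.
3. End with one question that invites the next stage, never a solution.
"""

# Few-shot stage-detection examples (assumed format).
FEW_SHOT = [
    {"user": "Just moved cities and the semester started.", "stage": "setup"},
    {"user": "Now I freeze before every reading.", "stage": "tension"},
]

def build_messages(user_text: str, story_stage: str = "unknown") -> list[dict]:
    """Assemble the chat messages for one coaching turn."""
    messages = [{"role": "system",
                 "content": SYSTEM_PROMPT.format(story_stage=story_stage)}]
    for shot in FEW_SHOT:
        messages.append({"role": "user", "content": shot["user"]})
        messages.append({"role": "assistant",
                         "content": f"[stage: {shot['stage']}]"})
    messages.append({"role": "user", "content": user_text})
    return messages
```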
**Example.**
User: 'Week 2 at uni and I'm already three chapters behind. I just stare at the pdf.'
Coach (Rich<3rt):
'Sounds like the story is stuck at tension: the pdf vs. your focus.
When did staring start beating reading?'
In alpha testing, 92 percent of testers rated such replies 'feels like you get me', versus 41 percent for baseline FAQ-style answers.
## Open Questions
Automatic stage detection is about 80 percent accurate; misfires typically label *evaluation* as *tension*. We're testing a lightweight self-check:

> "If I've got this wrong, let me know what really feels stuck."
Future work: can sentiment plus temporal references improve stage tagging without heavy classification models?
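
As a toy illustration of that idea (not a validated tagger), a rule-based pass could combine temporal markers with negative-tone words before falling back to a model. The word lists and rules below are placeholders.

```python
import re

# Toy heuristic for the "sentiment plus temporal references" idea:
# present-tense friction leans toward tension, past-tense scene-setting
# toward setup. Word lists are placeholders, not a validated tagger.
TEMPORAL_PAST = re.compile(r"\b(last week|back then|used to|started)\b", re.I)
TEMPORAL_NOW = re.compile(r"\b(now|every|keep|always|today)\b", re.I)
NEGATIVE = re.compile(r"\b(stuck|frozen|behind|faking|stare)\b", re.I)

def guess_stage(text: str) -> str:
    """Very rough stage guess from temporal and tone cues."""
    negative = bool(NEGATIVE.search(text))
    if TEMPORAL_NOW.search(text) and negative:
        return "tension"
    if TEMPORAL_PAST.search(text) and not negative:
        return "setup"
    return "unknown"
```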
## Insight
The best prompt isn't the cleverest; it's the one that helps a user hear their own narrative beat more clearly. By embedding story logic into every turn, wa4u coaches move conversations from noise to meaning in a way static scripts can't.