Narrative Intelligence in AI Coaching

Why our coaches don't follow scripts--they follow meaning

Summary. Most chatbots respond with pattern-matching logic; wa4u coaches respond with story logic. We call this narrative intelligence: recognising where a user is in their personal arc and tailoring questions that advance--not interrupt--the plot.


Context & Problem

Large language models excel at factual Q&A, but emotional conversations stall when answers arrive too quickly or too literally. Gen Z test users told us:

"I don't need a solution in the first reply--I need to feel the bot gets the point of my story."

Conventional therapeutic chat flows either over-diagnose ('You may have anxiety--here's a worksheet') or under-engage ('That sounds hard'). Both break narrative momentum.

Conceptual Approach: Narrative Intelligence

We draw on psycholinguistics (Labov, 1997) and narrative coaching (Drake, 2014):

| Layer | What the model listens for | Prompt style |
| --- | --- | --- |
| Setting | "Semester just started; I moved cities" | Mirror context in one line |
| Complication | "Now I'm frozen before every reading" | Name the tension, avoid advice |
| Evaluation | "Feels like I'm faking it" | Ask for meaning: 'What does faking cost you?' |
| Resolution path | User proposes micro-action | Offer reflective scaffold, not prescription |

The coach's first goal is to locate the user's place in the story arc; only then does it invite forward motion.
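The layer model above can be sketched as a simple lookup that pairs each narrative layer with what the coach listens for and how it should respond. This is an illustrative encoding, not wa4u's actual schema; all names are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LayerGuidance:
    """What the coach listens for at a layer, and how it should reply."""
    listens_for: str
    prompt_style: str

# Hypothetical encoding of the Labov-style layers described above.
NARRATIVE_LAYERS = {
    "setting": LayerGuidance(
        listens_for="context markers (time, place, life change)",
        prompt_style="mirror context in one line",
    ),
    "complication": LayerGuidance(
        listens_for="a tension or blockage",
        prompt_style="name the tension, avoid advice",
    ),
    "evaluation": LayerGuidance(
        listens_for="self-judgement about the tension",
        prompt_style="ask for meaning, e.g. 'What does that cost you?'",
    ),
    "resolution_path": LayerGuidance(
        listens_for="a user-proposed micro-action",
        prompt_style="offer a reflective scaffold, not a prescription",
    ),
}
```

Making the mapping explicit keeps the listening rule ("locate the layer first") separate from the responding rule, so either side can change without touching the other.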

Application in wa4u

Prompt stack. Each coach system prompt includes a story_stage slot (unknown / setup / tension / insight / action). The model must:

  • Detect the stage from the user's text (few-shot examples).
  • Reply in 45 words or fewer, mirroring the stage.
  • Suggest a next-stage question, not a solution.
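A minimal sketch of how the story_stage slot and the 45-word rule could fit together; the prompt wording and function names here are illustrative, not wa4u's production prompt stack:

```python
STAGES = ("unknown", "setup", "tension", "insight", "action")

def build_system_prompt(story_stage: str) -> str:
    """Assemble a coach system prompt with the detected stage slotted in."""
    if story_stage not in STAGES:
        raise ValueError(f"unknown story_stage: {story_stage}")
    return (
        "You are a narrative coach.\n"
        f"story_stage: {story_stage}\n"
        "Rules:\n"
        "1. Mirror the user's current stage; do not jump ahead.\n"
        "2. Reply in 45 words or fewer.\n"
        "3. End with a question that invites the next stage, not a solution.\n"
    )

def within_word_limit(reply: str, limit: int = 45) -> bool:
    """Post-hoc check that the model actually honoured the length rule."""
    return len(reply.split()) <= limit
```

Because models occasionally ignore length instructions, the post-hoc `within_word_limit` check can gate a regeneration rather than trusting the prompt alone.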

Example.

User: 'Week 2 at uni and I'm already three chapters behind. I just stare at the pdf.'

Coach (Rich<3rt):

'Sounds like the story is stuck at tension: the pdf vs. your focus.

When did staring start beating reading?'

92 percent of alpha testers rated such replies as 'feels like you get me', versus 41 percent for baseline FAQ-style answers.

Open Questions

Automatic stage detection is about 80 percent accurate; misfires typically label evaluation as tension. We're testing a lightweight self-check:

"If I've got this wrong, let me know what really feels stuck."

Future work: can sentiment plus temporal references improve stage tagging without heavy classification models?
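One way to probe that question without a heavy classifier is a cue-based tagger that checks later stages first, so that evaluation is not swallowed by tension. The cue lists below are illustrative stand-ins, not wa4u's actual features:

```python
# Illustrative cue lists: temporal references hint at setup, blockage
# words at tension, self-evaluation phrases at insight (the story_stage
# name for the Labov "evaluation" layer), intentions at action.
TEMPORAL_CUES = ("just started", "last week", "semester", "moved")
TENSION_CUES = ("stuck", "frozen", "can't", "behind", "stare")
EVALUATION_CUES = ("feels like", "i'm faking", "what's the point")
ACTION_CUES = ("i could", "maybe i'll", "i want to try", "plan to")

def tag_stage(text: str) -> str:
    """Heuristic stage tagger. Later stages are checked first so that
    evaluation-style statements are not mislabelled as tension."""
    t = text.lower()
    if any(cue in t for cue in ACTION_CUES):
        return "action"
    if any(cue in t for cue in EVALUATION_CUES):
        return "insight"
    if any(cue in t for cue in TENSION_CUES):
        return "tension"
    if any(cue in t for cue in TEMPORAL_CUES):
        return "setup"
    return "unknown"
```

Even a crude tagger like this gives a cheap baseline to measure whether sentiment plus temporal features close the gap to the few-shot detector.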

Insight

The best prompt isn't the cleverest; it's the one that helps a user hear their own narrative beat more clearly. By embedding story logic into every turn, wa4u coaches move conversations from noise to meaning in a way static scripts can't.