Dumb down the turn processing for a smaller LLM

2026-06-28 16:17:23 +02:00 · 12a8398f9f
commit 12a8398f9f
parent 0733d178d0
2 changed files with 312 additions and 428 deletions
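
The diff below replaces the open-ended tool-use loop with three sequential LLM calls (prose, summarize, extract). The control flow can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the code from `tools/engine.py`: `run_turn`, the `llm` callable, and the `PROSE`/`SUMMARIZE`/`EXTRACT` prompt prefixes are hypothetical stand-ins, and the per-phase retry counts are simplified.

```python
import random
from typing import Callable, Optional

# Hypothetical stand-in for the engine's LLM call: takes a prompt,
# returns text, or None on error.
LLM = Callable[[str], Optional[str]]


def run_turn(action: str, llm: LLM,
             rng: Optional[random.Random] = None,
             attempts: int = 3) -> dict:
    """Sketch of the three-phase turn: prose -> summarize -> extract."""
    rng = rng or random.Random()
    die = rng.randint(1, 6)  # one die is cast up front and handed to the model

    # Phase 1: prose. Retry on empty output, give up after `attempts` tries.
    prose = None
    for _ in range(attempts):
        text = llm(f"PROSE\nAction: {action}\nA die is cast: {die} (1d6).")
        if text and text.strip():
            prose = text.strip()
            break
    if prose is None:
        return {"error": f"Prose generation failed after {attempts} attempts"}

    # Phase 2: summarize. Fall back to the first line of the prose if the
    # model returns nothing; cap the log entry at 120 characters.
    summary = llm(f"SUMMARIZE\n{prose}")
    source = summary.strip() if summary and summary.strip() else prose
    log_entry = source.split("\n")[0][:120]

    # Phase 3: extract. The raw tool-call text is returned unparsed here;
    # the real engine parses it and applies the state updates.
    raw_tools = llm(f"EXTRACT\n{prose}") or ""

    return {"die": die, "book_log": prose,
            "log_entry": log_entry, "raw_tools": raw_tools}
```

Only the prose phase can fail the turn in this sketch; the other two phases degrade to a fallback, which appears to be the same trade-off the commit makes for a smaller model.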
--- a/tools/engine.py
+++ b/tools/engine.py
@@ -62,141 +62,30 @@ class TurnResult:


 # ── DM System Prompt Template ──────────────────────────────────────────────
-SYSTEM_PROMPT = Template("""You are the Dungeon Master for **The Chaos**, a solo card-based rules-light fantasy TTRPG. Your job is to narrate an immersive, responsive story for one player character.
+SYSTEM_PROMPT = Template("""You are the DM for "The Chaos". Narrate in 2nd person ("You"), vivid but concise. Player: Dillion.

-## Tone & Style
- Write in **second person** ("You", "Dillion") — the player is Dillion.
- Use vivid sensory descriptions — sight, sound, smell, touch.
- Keep narration cinematic. No monologues.
- Use **bold** for emphasis, *italic* for thoughts/sounds.
- NPC dialogue goes in **"quotes with bold names."**
- Meta-information stays out of the narrative, don't put it in the book. Use prompt for that.
- Never present predefined choices — the player decides freely what to do.
- **Stick to the player's intent.** Don't invent your own actions for the player unless forced by environment or circumstance (e.g., they trigger a trap, an NPC reacts, etc.).
- **Enforce rules.** Player's actions must be physically possible given the current situation in the story (e.g. if they don't have a dagger with them, they can't use it).
+## Rules
+- **Odds**: 1d6, 4+ favourable, 3- trouble.
+- **Traits**: 3d6, roll UNDER trait.
+- **Combat**: 1d6, 4+ hits. Damage: 1d6 + mod - armour.
+- **Wounds at 0 HP**: 1d6 → 1-2 die, 3-4 -1 max HP, 5-6 -1 all rolls until healed.
+- **Modifiers**: Favourable +1, Risky -1, Desperate -2.

-## Game Rules (Quick Reference)
-
-### Core Dice
- **Odds**: 1d6, 4+ favours character, 3- is trouble.
- **Traits**: 3d6, must roll UNDER the trait score.
- **Combat hit**: 1d6 ± mods, 4+ hits.
- **Damage**: 1d6 ± weapon mod - armour reduction.
- **Initiative**: both sides roll 1d6, higher acts first.
-
-### Combat Flow
-1. Distance: 2d6 × 10 (metres/feet)
-2. Surprise: 1d6
-3. Grit: 2d6 for creatures (higher = more determined)
-4. Initiative: 1d6
-5. Turns: state intent → roll 1d6 ± mods → 4+ success, 3- take hit
-
-### Wounds (0 HP)
-1d6: 1-2 die, 3-4 lasting wound (-1 max HP), 5-6 -1 all rolls until healed
-
-### Roll Modifiers
-Favourable +1, Risky -1, Desperate -2, Well-prepared +1, Poor visibility -1, Relevant trait +1
-
-### Exploration
-6 ten-minute watches per hour. Each meaningful action advances a watch.
-
-## How Turns Work
-
-Each turn follows this sequence:
-1. The player's action or response is given to you.
-2. Think, read files, roll dice, or ask the player to roll — any number of steps. Time is passing, the player is moving and so is the rest of the world and everyone around.
-3. **You MUST call `finalize_turn` to end the turn.** There is no other way to complete a turn. The loop will keep calling you until you do.
-
-The **finalize_turn** tool produces all data for this turn:
- **book_log** `[Required]` — **The complete self-contained narrative of this turn.** Describe what happened, what the player did (based on their action request) and what happened as a result, with all sensory/dialogue/mechanical details. This is appended as another page in the book, make sure it reads like a novel.
- **user_prompt** `[Required]` — **Short prompt for the player only, NOT recorded in the book.** Ask what they do next. 1-3 sentences. Do NOT recap the action — that belongs in `book_log`.
- **log_entry** `[Optional]` — One-sentence summary of what happened (action + outcome). Keep it tight.
- **ambience** `[Optional]` — One of: silence, calm, combat, dungeon, forest, tavern, tension, town, wilds.
-
-### How the Loop Works
-
-Each round the system reads your ````tool` blocks, executes them, and feeds back the results. This repeats until you call `finalize_turn`. If you call tools but never call `finalize_turn`, the loop runs until it hits the round limit and the turn fails with an error.
-
-So: call `finalize_turn` when the player needs to see the outcome and make their next decision.
-
-**Important: Do not mix get tools with finalize_turn.** If you call `read_file`, `character_get`, `world_get`, or `journal_get` in a round, you are still gathering information — do NOT also call `finalize_turn` in that same round. Gather first, then finalize in a separate round.
-
-### Journal & Quest Tracking
-
-The journal is the player's quest log and TODO list. Use dedicated tools to manage it:
-
- **`journal_get`** — Read the full journal to review quests.
- **`journal_update`** — Add new quests/goals via `"add"` and mark completed via `"done"`.
- **Add quests** as they arise: `{"add": ["Investigate the Weeper beneath the mill"]}`
- **Mark sub-tasks** as they emerge: `{"add": ["Find a way to open the iron grate", "Question Rina about the cult"]}`
- **Mark completed** when resolved: `{"done": ["Investigate the Weeper beneath the mill"]}`
- **Keep descriptions specific** — vague entries like "Explore the dungeon" are not helpful.
- **Review the journal** regularly to maintain continuity.
- Long-term goals stay in TODO until resolved; don't re-add the same quest every turn.
-
-### Character & World State
-
-To read or update state files, use the dedicated tools:
-
- **`character_get`** / **`character_update`** — Read or replace the full character sheet. ONLY update when HP/cash/gear/stats change.
- **`world_get`** / **`world_update`** — Read or replace the full world state. ONLY update when NPCs/locations/threads change.
-
-## Available Tools
-
-Tool calls go in their own fenced code block (one call per block):
-
-```tool
-{"tool": "read_file", "args": {"file": "character", "dm_status": "Checking Dillion's stats."}}
+## Tools (action only)
+Wrap in ```tool to perform an action:
+```
+{"tool": "roll", "args": {"dice": "1d6"}}
 ```

-You may also show reasoning inline:
+- **roll** — dice, modifier
+- **player_roll** — dice, reason
+- **character_update** — content: "full sheet" (if HP/cash/gear/stats change)
+- **world_update** — content: "full world" (if NPCs/locations/threads change)
+- **journal_update** — add: [...], done: [...]

-```thought
-Your reasoning here
-```
+You have the full state above — no need to look anything up. Just write the story and use tools when the player's action changes something.

-Tools available:
-
-Every tool call **must** include a `"dm_status"` string in `args` — a short, public-facing description of what the DM is doing (e.g. `"consulting the archives"`, `"examining the wound"`, `"calculating the odds"`). The player sees this in the UI. Keep it vague — never reveal what the DM is actually reading or learning.
-
-Tool reference (`[R]` = required, `[O]` = optional):
-
- **read_file** — Read a game state file.
-  `[R] file`: "character" | "world" | "book" | "log" | "journal"
-  `[R] dm_status`: "..."
- **roll** — Auto-roll dice (outcome shown in status).
-  `[O] dice`: "2d6" (default "1d6")
-  `[O] modifier`: "-1" (default "0")
-  `[R] dm_status`: "..."
- **player_roll** — Ask the player to roll physical dice. Use when the outcome is uncertain.
-  `[O] dice`: "2d6" (default "1d6")
-  `[O] reason`: "why the roll matters"
-  `[R] dm_status`: "..."
- **character_get** — Read the full character sheet.
-  `[R] dm_status`: "..."
- **character_update** — Replace the full character sheet.
-  `[R] content`: "full character sheet markdown"
-  `[R] dm_status`: "..."
- **world_get** — Read the full world state.
-  `[R] dm_status`: "..."
- **world_update** — Replace the full world state.
-  `[R] content`: "full world state markdown"
-  `[R] dm_status`: "..."
- **journal_get** — Read the journal (TODO / DONE).
-  `[R] dm_status`: "..."
- **journal_update** — Add or complete journal entries.
-  `[O] add`: ["new todo item", ...]
-  `[O] done`: ["completed item", ...]
-  `[R] dm_status`: "..."
- **finalize_turn** — **REQUIRED to end the turn.** The loop will NOT stop without it. Call this ALONE — do not mix with get tools.
-  `[R] book_log`: "full-form narrative of what happened durint the turn, permanent story record that reads like a book"
-  `[R] user_prompt`: "short prompt for the player — NOT recorded, 1-3 sentences"
-  `[O] log_entry`: "one-sentence summary (action + outcome)"
-  `[O] ambience`: "soundscape name: silence|calm|combat|dungeon|forest|tavern|tension|town|wilds"
-
-When the player makes a choice, resolve it with the dice mechanics above. Describe the action, roll dice implicitly (describe the outcome, don't say "rolling dice"), apply damage/effects, and update state. Use this to decide how the story evolves.
-
-## Current Game State
+## State

 ### Character
 $character
@@ -204,12 +93,37 @@ $character
 ### World
 $world

-### Recent Log
+### Log
 $log

-### Recent Story (last turns from the book)
+### Story
 $story""")
-# trailing """ is intentional — the template ends here
+
+PROSE_PROMPT = Template("""You are the DM for "The Chaos". Narrate in 2nd person ("You"), vivid but concise. Player: Dillion.
+
+## Rules
+- **Odds**: 1d6, 4+ favourable, 3- trouble.
+- **Traits**: 3d6, roll UNDER trait.
+- **Combat**: 1d6, 4+ hits. Damage: 1d6 + mod - armour.
+- **Wounds at 0 HP**: 1d6 → 1-2 die, 3-4 -1 max HP, 5-6 -1 all rolls until healed.
+- **Modifiers**: Favourable +1, Risky -1, Desperate -2.
+
+A die is cast at the start of each turn — incorporate it into your narrative.
+
+## State
+
+### Character
+$character
+
+### World
+$world
+
+### Log
+$log
+
+### Story
+$story""")
+


 # ── Game Engine ────────────────────────────────────────────────────────────
@@ -287,11 +201,10 @@ class GameEngine:
    def _read_file(self, path: Path) -> str:
        return path.read_text().strip() if path.exists() else ""

-    def _read_recent_log(self, max_entries: int = 10) -> str:
+    def _read_recent_log(self, max_entries: int = 5) -> str:
        """Read the latest log file and return the last N entries."""
        log_path = LOG_DIR / f"{TODAY}.md"
        if not log_path.exists():
-            # Check yesterday's log
            from datetime import timedelta
            yesterday = (date.today() - timedelta(days=1)).isoformat()
            log_path = LOG_DIR / f"{yesterday}.md"
@@ -301,7 +214,7 @@ class GameEngine:
        entries = [l for l in lines if l.strip().startswith("- ")]
        return "\n".join(entries[-max_entries:]) or "*No recent events.*"

-    def _read_recent_book(self, max_turns: int = 3) -> str:
+    def _read_recent_book(self, max_turns: int = 1) -> str:
        """Return the last N turns from the book as context."""
        text = self._read_file(BOOK_PATH)
        if not text:
@@ -310,6 +223,22 @@ class GameEngine:
        recent = turns[-max_turns:]
        return "\n## ".join(recent) if len(turns) > 1 else recent[0]

+    @staticmethod
+    def _truncate_world(text: str) -> str:
+        """Extract key world context: NPCs, factions, active threads, rumours."""
+        if not text or text == "*No world state.*":
+            return text
+        sections = re.split(r"\n(?=## |### )", text)
+        parts = []
+        for sec in sections:
+            header = sec.split("\n")[0].strip() if sec else ""
+            if "Active Threads" in header:
+                parts.append(sec)
+            elif "Notable NPCs" in header or "Factions at Play" in header or "### Rumours" in header:
+                parts.append(sec)
+        result = "\n\n".join(parts)
+        return result or text[:1500] + "\n_(world truncated)_"
+
    def _get_valid_ambiences(self) -> set[str]:
        """Parse ambience_options.md and return set of valid ambience names with associated audio files."""
        valid = {"silence"}  # silence always valid (stops music)
@@ -341,7 +270,7 @@ class GameEngine:
    def build_system_prompt(self) -> str:
        """Assemble the system prompt with current game state."""
        char = self._read_file(CHAR_PATH) or "*No character sheet.*"
-        world = self._read_file(WORLD_PATH) or "*No world state.*"
+        world = self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world state.*"
        log = self._read_recent_log()
        story = self._read_recent_book()
        return SYSTEM_PROMPT.substitute(
@@ -382,10 +311,8 @@ class GameEngine:
        else:
            parts.append(
                "## Instructions\n"
-                "Take the player's request and use it to advance the story."
-                "Think, gather information, update the state, "
-                "then call finalize_turn to complete the turn.\n"
-                "Put each tool call in its own ```tool block."
+                "Advance the story based on the player's request. "
+                "All state is shown above — write the outcome directly."
            )
        return "\n\n".join(parts)

@@ -754,6 +681,28 @@ class GameEngine:
        except json.JSONDecodeError:
            return None

+    def _call_llm(self, messages: list[dict], *, label: str = "", max_tokens: int | None = None) -> str | None:
+        """Make a single LLM call. Returns content text or None on error."""
+        try:
+            import litellm
+        except ImportError:
+            return None
+        try:
+            response = litellm.completion(
+                model=self.model,
+                messages=messages,
+                temperature=self.temperature,
+                stream=False,
+                timeout=60,
+                max_tokens=max_tokens or self.max_tokens,
+            )
+            text = response.choices[0].message.content or ""
+            self._append_llm_log(f"\n--- {label} ---\n{text}")
+            return text
+        except Exception as e:
+            self._append_llm_log(f"\n--- LLM ERROR ({label}) ---\n{e}")
+            return None
+
    def generate_with_tools(
        self,
        player_action: str | None = None,
@@ -764,39 +713,13 @@ class GameEngine:
        on_debug: callable = None,
    ) -> TurnResult:
        """
-        Multi-turn generation with tool-use loop.
+        Three-phase generation:

-        The LLM can output ```thought blocks, call ```tool blocks, and
-        MUST call **finalize_turn** to complete the turn. Until then the
-        loop continues feeding tool results back.
-
-        `on_thought` / `on_action` / `on_debug` may be called from a worker thread —
-        use call_from_thread in the TUI.
+        1. **Prose** — LLM writes the full book_log from context + player action.
+        2. **Summarize** — LLM condenses the book_log into one log line.
+        3. **Extract** — LLM reads the book_log and outputs tool calls for state changes.
        """
-        system = self.build_system_prompt()
-        user = self.build_user_message(
-            player_action=player_action,
-            last_prompt=last_prompt,
-        )
-
-        messages: list[dict] = [
-            {"role": "system", "content": system},
-            {"role": "user", "content": user},
-        ]
-
        self._set_llm_env()
-
-        try:
-            import litellm
-        except ImportError:
-            return TurnResult(error="litellm not installed")
-
-        max_rounds = 30
-        debug_entries: list[str] = []
-        attempt = 0
-        round_used = 0
-        reminder_count = 0
-
        from datetime import datetime
        self._append_llm_log(f"\n{'='*60}")
        self._append_llm_log(f"=== Turn — {datetime.now().strftime('%Y-%m-%d %H:%M:%S')} ===")
@@ -806,246 +729,200 @@ class GameEngine:
        elif last_prompt:
            self._append_llm_log(f"Resume from: {last_prompt[:120]}")

-        while round_used < max_rounds:
-            attempt += 1
-            round_log: list[str] = [f"── Attempt {attempt} (round {round_used + 1}/{max_rounds}) ──"]
+        # ── Phase 1: Prose ────────────────────────────────────────────────
+        import random
+        die_roll = random.randint(1, 6)
+        self._append_llm_log(f"Dice: {die_roll} (1d6)")

-            try:
-                response = litellm.completion(
-                    model=self.model,
-                    messages=messages,
-                    temperature=self.temperature,
-                    stream=False,
-                    timeout=60,
-                    max_tokens=self.max_tokens,
-                )
-                text = response.choices[0].message.content or ""
-                self._append_llm_log(
-                    f"\n--- Attempt {attempt} ---\n{text}"
-                )
-            except Exception as e:
-                self._append_llm_log(f"\n--- LLM ERROR (attempt {attempt}) ---\n{e}")
-                if on_debug:
-                    on_debug("llm_error", {"error": str(e)})
-                return TurnResult(error=f"LLM call failed: {e}")
+        if on_action:
+            on_action(f"Phase 1/3: writing story (dice={die_roll})")
+        if on_debug:
+            on_debug("phase", {"phase": 1, "name": "prose", "status": "start", "dice": die_roll})

-            if on_debug:
-                on_debug("llm_response", {"round": attempt, "text": text})
-
-            # Thoughts
-            thoughts = self._extract_thoughts(text)
-            if thoughts:
-                round_log.append(f"  thoughts: {len(thoughts)}")
-            for t in thoughts:
-                if on_thought:
-                    on_thought(t.strip())
-                if on_debug:
-                    on_debug("thought", {"round": attempt, "text": t.strip()})
-
-            # Tool calls
-            tool_calls = self._extract_tool_calls(
-                text,
-                round_num=attempt,
-                on_debug=on_debug,
+        book_log = None
+        for attempt in range(3):
+            system = PROSE_PROMPT.substitute(
+                character=self._read_file(CHAR_PATH) or "*No character sheet.*",
+                world=self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world state.*",
+                log=self._read_recent_log(),
+                story=self._read_recent_book(),
            )
-            finalize_call: dict | None = None
-            other_calls: list[dict] = []
+            user = self.build_user_message(
+                player_action=player_action,
+                last_prompt=last_prompt,
+            )
+            user += f"\n\n*A die is cast: **{die_roll}** (1d6).*"

+            text = self._call_llm([
+                {"role": "system", "content": system},
+                {"role": "user", "content": user},
+            ], label=f"Prose attempt {attempt + 1}", max_tokens=1024)
+
+            if not text or not text.strip():
+                if on_debug:
+                    on_debug("phase", {"phase": 1, "status": "empty", "attempt": attempt + 1})
+                continue
+            book_log = text.strip()
+            if on_debug:
+                preview = book_log[:150].replace("\n", "\\n")
+                on_debug("phase", {"phase": 1, "status": "done", "chars": len(book_log), "preview": preview})
+            break
+
+        if not book_log:
+            return TurnResult(error="Prose generation failed after 3 attempts")
+
+        # ── Phase 2: Summarize ────────────────────────────────────────────
+        if on_action:
+            on_action("Phase 2/3: summarizing story")
+        if on_debug:
+            on_debug("phase", {"phase": 2, "name": "summarize", "status": "start"})
+
+        log_context = self._read_recent_log()
+        log_entry = None
+        for attempt in range(2):
+            text = self._call_llm([
+                {"role": "user", "content":
+                    f"Given the session log so far, summarize the new story in one line. "
+                    f"Focus on who was involved (character and NPC names):\n\n"
+                    f"## Session Log\n{log_context}\n\n"
+                    f"## New Story\n{book_log}"}
+            ], label=f"Summarize attempt {attempt + 1}")
+            if text and text.strip():
+                log_entry = text.strip().split("\n")[0][:120]
+                if on_debug:
+                    on_debug("phase", {"phase": 2, "status": "done", "summary": log_entry})
+                break
+
+        if not log_entry:
+            log_entry = book_log.split("\n")[0][:120]
+            if on_debug:
+                on_debug("phase", {"phase": 2, "status": "fallback", "summary": log_entry})
+
+        # ── Phase 3: Extract state changes ────────────────────────────────
+        if on_action:
+            on_action("Phase 3/3: extracting state changes")
+        if on_debug:
+            on_debug("phase", {"phase": 3, "name": "extract", "status": "start"})
+
+        user_prompt = self._auto_prompt(book_log)
+        ambience = None
+        debug_info = ""
+        current_char = self._read_file(CHAR_PATH) or "*No character.*"
+        current_world = self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world.*"
+
+        for attempt in range(3):
+            text = self._call_llm([
+                {"role": "user", "content":
+                    f"Read the story and compare with current state. Output tool calls for changes:\n\n"
+                    f"## Current Character\n{current_char}\n\n"
+                    f"## Current World\n{current_world}\n\n"
+                    f"## Story\n{book_log}\n\n"
+                    f"Output tool blocks for changes only. Include the FULL updated content:\n"
+                    f"- character_update — content: full new sheet if HP/cash/gear/stats changed\n"
+                    f"- world_update — content: full new world if NPCs/locations/threads changed\n"
+                    f"- journal_update — add: [...], done: [...]\n"
+                    f"- finalize_turn — user_prompt (question for player), ambience (soundscape)\n\n"
+                    f"Wrap each in ```tool:\n"
+                    f"```tool\n{{\"tool\": \"character_update\", \"args\": {{\"content\": \"# Character\\n...\"}}}}\n```"}
+            ], label=f"Extract attempt {attempt + 1}")
+
+            if not text or not text.strip():
+                if on_debug:
+                    on_debug("phase", {"phase": 3, "status": "empty", "attempt": attempt + 1})
+                continue
+
+            tool_calls = self._extract_tool_calls(
+                text, round_num=attempt + 1, on_debug=on_debug
+            )
+            if on_debug and tool_calls:
+                names = [tc.get("tool", "?") for tc in tool_calls if tc.get("tool") != "finalize_turn"]
+                fin = any(tc.get("tool") == "finalize_turn" for tc in tool_calls)
+                on_debug("phase", {"phase": 3, "status": "tools_found", "tools": names, "has_finalize": fin})
+
+            errors = []
            for tc in tool_calls:
-                if tc.get("tool") == "finalize_turn":
-                    finalize_call = tc
-                else:
-                    other_calls.append(tc)
-
-            # Log tool call summary
-            if tool_calls:
-                names = [tc.get("tool", "?") for tc in tool_calls]
-                round_log.append(f"  tools: {', '.join(names)}")
-
-            # Guard: mixed get tools + finalize_turn → execute get tools, reject finalize
-            get_tools = {"read_file", "character_get", "world_get", "journal_get"}
-            if finalize_call and any(tc.get("tool") in get_tools for tc in other_calls):
-                # Execute only the get tools, drop finalize_turn
-                results = []
-                for tc in other_calls:
-                    if tc.get("tool") not in get_tools:
-                        continue
-                    name = tc.get("tool", "?")
-                    args = tc.get("args", {})
-                    if not args.get("dm_status"):
-                        err_msg = (
-                            f"**Validation Error:** Tool `{name}` missing required `dm_status`. "
-                            f"Add `\"dm_status\": \"what the DM is doing\"` to the args."
-                        )
-                        results.append(err_msg)
-                        round_log.append(f"  {name}: MISSING dm_status")
-                        if on_debug:
-                            on_debug("validation_error", {"round": attempt, "type": "tool", "tool": name, "error": "missing dm_status"})
-                        continue
-                    if on_action:
-                        on_action(self._describe_tool_action(name, args))
-                    if on_debug:
-                        on_debug("tool_call", {"round": attempt, "tool": name, "args": args})
-                    result = self._execute_tool(name, args)
-                    results.append(f"**Tool:** {name}\n**Args:** {json.dumps(args)}\n**Result:** {result}")
-                    round_log.append(f"  {name}: OK")
-                    if on_debug:
-                        on_debug("tool_result", {"round": attempt, "tool": name, "result": result})
-                round_log.append("  finalize_turn ignored (mixed with get tools)")
-                debug_entries.append("\n".join(round_log))
-                # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
-                messages.append({"role": "assistant", "content": text})
-                messages.append({
-                    "role": "user",
-                    "content": "## Tool Results\n\n" + "\n\n".join(results) + "\n\n**Note:** `finalize_turn` was ignored because you called get tools in the same round. Call `finalize_turn` alone in the next round to complete the turn."
-                })
-                if on_debug:
-                    on_debug("validation_error", {"round": attempt, "type": "mixed_get_finalize", "tools": [tc.get("tool") for tc in other_calls]})
-                round_used += 1
-                continue
-
-            # finalize_turn present → validate and return
-            if finalize_call:
-                args = finalize_call.get("args", {})
-                errs = []
-                if not args.get("book_log"):
-                    errs.append("book_log [Required]")
-                if not args.get("user_prompt"):
-                    errs.append("user_prompt [Required]")
-                
-                # Validate ambience
-                ambience_name = args.get("ambience")
-                if ambience_name and ambience_name != "silence":
-                    valid_ambiences = self._get_valid_ambiences()
-                    if not valid_ambiences or ambience_name not in valid_ambiences:
-                        errs.append(f"ambience '{ambience_name}' is invalid or has no associated audio files.")
-
-                if errs:
-                    hint = (
-                        f"Expected:\n"
-                        f'{{"tool": "finalize_turn", "args": {{'
-                        f'"book_log": "...", '
-                        f'"user_prompt": "...", '
-                        f'"log_entry": "...", '
-                        f'"ambience": "..."'
-                        f"}}}}\n"
-                        f"Valid ambiences: {', '.join(valid_ambiences)}"
-                    )
-                    round_log.append(f"  finalize_turn validation errors: {', '.join(errs)}")
-                    debug_entries.append("\n".join(round_log))
-                    # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
-                    messages.append({"role": "assistant", "content": text})
-                    messages.append({
-                        "role": "user",
-                        "content": f"## Validation Error\nMissing required field(s): {', '.join(errs)}.\n\n{hint}Please provide all required fields and call finalize_turn again."
-                    })
-                    if on_debug:
-                        on_debug("validation_error", {"round": attempt, "type": "finalize_turn", "errors": errs})
-                    round_used += 1
+                name = tc.get("tool", "?")
+                args = tc.get("args", {})
+                if name == "finalize_turn":
+                    if args.get("user_prompt"):
+                        user_prompt = args["user_prompt"]
+                    if args.get("ambience"):
+                        ambience = args["ambience"]
                    continue
+                if on_action:
+                    on_action(f"State: {self._describe_tool_action(name, args)}")
                if on_debug:
-                    on_debug("finalize", {"round": attempt, "args": args})
-                round_used += 1
-                self._append_llm_log(
-                    f"\n--- FINALIZE (attempt {attempt}) ---\n"
-                    f"book_log: {args.get('book_log','')[:200]}\n"
-                    f"user_prompt: {args.get('user_prompt','')[:200]}\n"
-                    f"log_entry: {args.get('log_entry','')}\n"
-                    f"ambience: {args.get('ambience','')}\n"
-                )
-                return TurnResult(
-                    book_log=args.get("book_log", ""),
-                    user_prompt=args.get("user_prompt", ""),
-                    ambience=args.get("ambience"),
-                    log_entry=args.get("log_entry"),
-                )
+                    on_debug("tool_call", {"round": attempt + 1, "tool": name, "args": args})

-            # Execute other tools
-            if other_calls:
-                results = []
-                for tc in other_calls:
-                    name = tc.get("tool", "?")
-                    args = tc.get("args", {})
+                if name == "player_roll" and on_player_roll:
+                    dice = args.get("dice", "1d6")
+                    reason = args.get("reason", "a check")
+                    roll_val = on_player_roll(dice, reason)
+                    result = f"Player rolled {dice} for '{reason}': {roll_val}"
+                else:
+                    result = self._execute_tool(name, args)

-                    # dm_status is required on every tool call
-                    if not args.get("dm_status"):
-                        err_msg = (
-                            f"**Validation Error:** Tool `{name}` missing required `dm_status`. "
-                            f"Add `\"dm_status\": \"what the DM is doing\"` to the args.\n"
-                            f"Put each tool call in its own ```tool block."
-                        )
-                        results.append(err_msg)
-                        round_log.append(f"  {name}: MISSING dm_status")
-                        if on_debug:
-                            on_debug("validation_error", {"round": attempt, "type": "tool", "tool": name, "error": "missing dm_status"})
-                        continue
-
-                    if on_action:
-                        on_action(self._describe_tool_action(name, args))
-                    if on_debug:
-                        on_debug("tool_call", {"round": attempt, "tool": name, "args": args})
-                    if name == "player_roll" and on_player_roll:
-                        dice = args.get("dice", "1d6")
-                        reason = args.get("reason", "a check")
-                        roll_val = on_player_roll(dice, reason)
-                        result = f"Player rolled {dice} for '{reason}': {roll_val}"
-                    else:
-                        result = self._execute_tool(name, args)
-                    results.append(f"**Tool:** {name}\n**Args:** {json.dumps(args)}\n**Result:** {result}")
-                    round_log.append(f"  {name}: OK")
-                    if on_debug:
-                        on_debug("tool_result", {"round": attempt, "tool": name, "result": result})
-                # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
-                messages.append({"role": "assistant", "content": text})
-                messages.append({
-                    "role": "user",
-                    "content": "## Tool Results\n\n" + "\n\n".join(results),
-                })
-                debug_entries.append("\n".join(round_log))
-                round_used += 1
-                continue
-
-            # No tools, no finalize
-            round_log.append("  no tool calls")
-
-            if not text.strip():
-                # Empty response — model may be slow. Give it time and retry without adding context.
+                if result.startswith("**Error:") or result.startswith("Tool error") or result.startswith("Unknown"):
+                    errors.append(f"{name}: {result}")
                if on_debug:
-                    on_debug("empty_response", {"round": attempt})
-                import time
-                time.sleep(2)
-                debug_entries.append("\n".join(round_log))
-                continue
+                    on_debug("tool_result", {"round": attempt + 1, "tool": name, "result": result})

-            # Plain-text reasoning (no ```tool/```thought blocks) — log in debug but don't show to player
-            round_used += 1
+            if not errors:
+                if on_debug:
+                    on_debug("phase", {"phase": 3, "status": "done", "applied": len([tc for tc in tool_calls if tc.get("tool") != "finalize_turn"])})
+                break
+            debug_info = "; ".join(errors)
            if on_debug:
-                on_debug("thought", {"round": attempt, "text": text.strip()})
+                on_debug("phase", {"phase": 3, "status": "errors", "errors": errors, "attempt": attempt + 1})

-            debug_entries.append("\n".join(round_log))
-            # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
-            messages.append({"role": "assistant", "content": text})
-            reminder_count += 1
-            if reminder_count % 3 == 0:
-                reminder = (
-                    "## Instructions\n"
-                    "Respond with tool calls or finalize_turn.\n\n"
-                    "Put each tool call in its own ```tool block:\n"
-                    "```tool\n{\"tool\": \"character_get\", \"args\": {\"dm_status\": \"...\"}}\n```\n\n"
-                    "When ready, call **finalize_turn** with `book_log` and `user_prompt`."
-                )
-            else:
-                reminder = "Use tools to gather information or call **finalize_turn** to end the turn."
-            messages.append({"role": "user", "content": reminder})
-            if on_debug:
-                on_debug("no_tool_calls", {"round": attempt})
+        if on_action:
+            on_action("Turn complete")
+        if on_debug:
+            on_debug("phase_done", {
+                "book_log_chars": len(book_log),
+                "log_entry": log_entry,
+                "user_prompt": user_prompt,
+                "ambience": ambience,
+                "extract_errors": debug_info or None,
+            })

-        debug_text = "\n\n".join(debug_entries)
-        self._append_llm_log(f"\n--- LOOP EXCEEDED ({max_rounds} rounds) ---\n{debug_text}")
-        return TurnResult(
-            error=f"Turn loop exceeded max rounds ({max_rounds}). Below is a debug log of what the LLM did each round:\n\n{debug_text}",
-            debug_info=debug_text,
+        self._append_llm_log(
+            f"\n--- FINAL ---\n"
+            f"book_log: {book_log[:200]}\n"
+            f"log_entry: {log_entry}\n"
+            f"user_prompt: {user_prompt}\n"
+            f"ambience: {ambience}\n"
        )
+        return TurnResult(
+            book_log=book_log,
+            log_entry=log_entry,
+            user_prompt=user_prompt,
+            ambience=ambience,
+            debug_info=debug_info,
+        )
+
+    @staticmethod
+    def _strip_tool_blocks(text: str) -> str:
+        """Remove ```tool, ```json, and ```finalize_turn fenced blocks from narrative text."""
+        return re.sub(
+            r'```(?:tool|json|finalize_turn)\s*\n?.*?```',
+            '',
+            text,
+            flags=re.DOTALL,
+        ).strip()
+
+    @staticmethod
+    def _auto_prompt(book_log: str) -> str:
+        """Build a player prompt from the narrative, using its last non-empty line."""
+        lines = book_log.strip().splitlines()
+        # Walk the lines from the end so the first non-empty one found is the last in the text.
+        for line in reversed(lines):
+            line = line.strip()
+            if not line:
+                continue
+            # Take last substantive line as the prompt
+            return f"**What do you do?**\n\n{line}"
+        return "**What do you do?**"

    # ── Response Parsing ────────────────────────────────────────────────

--- a/tools/run.py
+++ b/tools/run.py
@ -829,47 +829,54 @@ class ChaosTUI(App):

    def _on_debug(self, event_type: str, data: dict) -> None:
        """Structured debug entry: visible description + technical detail."""
-        r = data.get("round", "")
-        if event_type == "llm_response":
-            text = data.get("text", "")
-            if text.strip():
-                preview = text[:200].replace("\n", "\\n").strip() + ("…" if len(text) > 200 else "")
-                self._append_debug(f"  LLM response: {preview}")
-            else:
-                self._append_debug(f"  LLM response: (empty)")
-        elif event_type == "thought":
-            thought = data.get("text", "")
-            display = thought[:60] + "…" if len(thought) > 60 else thought
-            self._append_debug(f"  💭 {display}")
+        if event_type == "phase":
+            p = data.get("phase", 0)
+            status = data.get("status", "")
+            if status == "start":
+                name = data.get("name", "")
+                dice = data.get("dice")
+                d = f"  dice={dice}" if dice else ""
+                self._append_debug(f"▸ Phase {p}: {name} {d}")
+            elif status == "done":
+                if p == 1:
+                    self._append_debug(f"  ✔ prose: {data.get('chars', 0)} chars")
+                elif p == 2:
+                    self._append_debug(f"  ✔ summary: {data.get('summary', '')}")
+                elif p == 3:
+                    n = data.get("applied", 0)
+                    self._append_debug(f"  ✔ extract: {n} state changes applied")
+            elif status == "empty":
+                self._append_debug(f"  ⚠ phase {p} attempt {data.get('attempt', '?')} empty — retry")
+            elif status == "fallback":
+                self._append_debug(f"  ⚠ phase {p} used fallback: {data.get('summary', '')}")
+            elif status == "tools_found":
+                tools = data.get("tools", [])
+                fin = data.get("has_finalize", False)
+                t = ", ".join(tools) if tools else "none"
+                self._append_debug(f"  🔧 tools found: {t}" + (" + finalize_turn" if fin else ""))
+            elif status == "errors":
+                errs = data.get("errors", [])
+                for e in errs:
+                    self._append_debug(f"  ✖ {e}")
+                self._append_debug(f"  ⟳ retry (attempt {data.get('attempt', '?')})")
+        elif event_type == "phase_done":
+            self._append_debug(f"  ✔ turn complete — book_log: {data.get('book_log_chars', 0)} chars")
+            if data.get("log_entry"):
+                self._append_debug(f"     log: {data['log_entry']}")
+            if data.get("ambience"):
+                self._append_debug(f"     ambience: {data['ambience']}")
+            if data.get("extract_errors"):
+                self._append_debug(f"     extract errors: {data['extract_errors']}")
        elif event_type == "tool_call":
            tool = data.get("tool", "?")
            args = data.get("args", {})
-            desc = args.get("dm_status", tool)
-            self._append_debug(f"  🔧 {desc}")
-            self._append_debug(f"     {tool}({json.dumps(args)})")
+            self._append_debug(f"  🔧 {tool}({json.dumps(args)})")
        elif event_type == "tool_result":
-            tool = data.get("tool", "?")
            result = data.get("result", "")
            preview = result[:80].replace("\n", " ").strip() + ("…" if len(result) > 80 else "")
            self._append_debug(f"     → {preview}")
-        elif event_type == "validation_error":
-            err_type = data.get("type", "")
-            if err_type == "finalize_turn":
-                self._append_debug(f"  ✖ finalize_turn missing: {', '.join(data.get('errors', []))}")
-            elif err_type == "mixed_get_finalize":
-                tools = data.get("tools", [])
-                self._append_debug(f"  ✖ mixed get tools {tools} with finalize_turn — rejected")
-            else:
-                tool = data.get("tool", "?")
-                self._append_debug(f"  ✖ {tool} missing dm_status")
-        elif event_type == "finalize":
-            self._append_debug("  ✔ finalize_turn")
-        elif event_type == "no_tool_calls":
-            self._append_debug(f"  ⚠ no tool calls — reminded to use tools")
        elif event_type == "parse_error":
-            self._append_debug(f"  ⚠ failed to parse tool block: {data.get('content', '')}")
-        elif event_type == "empty_response":
-            self._append_debug("  ⚠ empty response — waiting 2s, retrying without reminder")
+            self._append_debug(f"  ⚠ bad tool block: {data.get('content', '')}")
        elif event_type == "llm_error":
            self._append_debug(f"  ✖ LLM error: {data.get('error', '')}")