Dumb down the turn processing for a smaller LLM

2026-06-28 16:17:23 +02:00
commit 12a8398f9f
parent 0733d178d0
2 changed files with 312 additions and 428 deletions
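
The docstring added to `generate_with_tools` in this diff describes a three-phase turn (prose, summarize, extract) that replaces the open-ended tool loop. As a reading aid, here is a minimal sketch of that control flow. It is not the code from `tools/engine.py`: `LLM`, `first_nonempty`, and `run_turn` are hypothetical names, the model is injected as a plain callable so the sketch runs without `litellm`, and phase 3 returns the raw reply instead of parsing and applying tool blocks.

```python
import random
from typing import Callable, Optional

# Hypothetical stand-in for the engine's LLM call: prompt in, text or None out.
LLM = Callable[[str], Optional[str]]


def first_nonempty(llm: LLM, prompt: str, attempts: int) -> Optional[str]:
    """Call the model up to `attempts` times; return the first non-blank reply."""
    for _ in range(attempts):
        text = llm(prompt)
        if text and text.strip():
            return text.strip()
    return None


def run_turn(llm: LLM, action: str, rng: random.Random) -> dict:
    """Three-phase turn: prose, then summary, then state extraction."""
    die = rng.randint(1, 6)  # one die is cast up front and handed to the model

    # Phase 1: prose. Without a story there is nothing to summarize, so fail hard.
    story = first_nonempty(llm, f"Action: {action}\nA die is cast: {die} (1d6).", 3)
    if story is None:
        return {"error": "Prose generation failed after 3 attempts"}

    # Phase 2: one-line summary, falling back to the first line of the story.
    summary = first_nonempty(llm, f"Summarize in one line:\n{story}", 2)
    log_entry = (summary or story).split("\n")[0][:120]

    # Phase 3: state extraction. The raw reply is returned here; the real
    # engine would parse tool blocks from it and apply them.
    extracted = first_nonempty(llm, f"Output tool calls for changes:\n{story}", 3)

    return {"book_log": story, "log_entry": log_entry,
            "extract": extracted or "", "die": die}
```

Only phase 1 is fatal in this sketch, mirroring the diff: a failed summary falls back to the first line of the story, and a failed extraction simply leaves state unchanged.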
--- a/tools/engine.py
+++ b/tools/engine.py
@@ -62,141 +62,30 @@ class TurnResult:
 # ── DM System Prompt Template ──────────────────────────────────────────────
-SYSTEM_PROMPT = Template("""You are the Dungeon Master for **The Chaos**, a solo card-based rules-light fantasy TTRPG. Your job is to narrate an immersive, responsive story for one player character.
+SYSTEM_PROMPT = Template("""You are the DM for "The Chaos". Narrate in 2nd person ("You"), vivid but concise. Player: Dillion.
-## Tone & Style
+## Rules
- Write in **second person** ("You", "Dillion") — the player is Dillion.
+- **Odds**: 1d6, 4+ favourable, 3- trouble.
- Use vivid sensory descriptions — sight, sound, smell, touch.
+- **Traits**: 3d6, roll UNDER trait.
- Keep narration cinematic. No monologues.
+- **Combat**: 1d6, 4+ hits. Damage: 1d6 + mod - armour.
- Use **bold** for emphasis, *italic* for thoughts/sounds.
+- **Wounds at 0 HP**: 1d6 → 1-2 die, 3-4 -1 max HP, 5-6 -1 all rolls until healed.
- NPC dialogue goes in **"quotes with bold names."**
+- **Modifiers**: Favourable +1, Risky -1, Desperate -2.
 - Meta-information stays out of the narrative, don't put it in the book. Use prompt for that.
 - Never present predefined choices — the player decides freely what to do.
 - **Stick to the player's intent.** Don't invent your own actions for the player unless forced by environment or circumstance (e.g., they trigger a trap, an NPC reacts, etc.).
 - **Enforce rules.** Player's actions must be physically possible given the current situation in the story (e.g. if they don't have a dagger with them, they can't use it).
-## Game Rules (Quick Reference)
+## Tools (action only)
-
+Wrap in ```tool to perform an action:
-### Core Dice
+```
- **Odds**: 1d6, 4+ favours character, 3- is trouble.
+{"tool": "roll", "args": {"dice": "1d6"}}
 - **Traits**: 3d6, must roll UNDER the trait score.
 - **Combat hit**: 1d6 ± mods, 4+ hits.
 - **Damage**: 1d6 ± weapon mod - armour reduction.
 - **Initiative**: both sides roll 1d6, higher acts first.
 ### Combat Flow
 1. Distance: 2d6 × 10 (metres/feet)
 2. Surprise: 1d6
 3. Grit: 2d6 for creatures (higher = more determined)
 4. Initiative: 1d6
 5. Turns: state intent → roll 1d6 ± mods → 4+ success, 3- take hit
 ### Wounds (0 HP)
 1d6: 1-2 die, 3-4 lasting wound (-1 max HP), 5-6 -1 all rolls until healed
 ### Roll Modifiers
 Favourable +1, Risky -1, Desperate -2, Well-prepared +1, Poor visibility -1, Relevant trait +1
 ### Exploration
 6 ten-minute watches per hour. Each meaningful action advances a watch.
 ## How Turns Work
 Each turn follows this sequence:
 1. The player's action or response is given to you.
 2. Think, read files, roll dice, or ask the player to roll — any number of steps. Time is passing, the player is moving and so is the rest of the world and everyone around.
 3. **You MUST call `finalize_turn` to end the turn.** There is no other way to complete a turn. The loop will keep calling you until you do.
 The **finalize_turn** tool produces all data for this turn:
 - **book_log** `[Required]` — **The complete self-contained narrative of this turn.** Describe what happened, what the player did (based on their action request) and what happened as a result, with all sensory/dialogue/mechanical details. This is appended as another page in the book, make sure it reads like a novel.
 - **user_prompt** `[Required]` — **Short prompt for the player only, NOT recorded in the book.** Ask what they do next. 1-3 sentences. Do NOT recap the action — that belongs in `book_log`.
 - **log_entry** `[Optional]` — One-sentence summary of what happened (action + outcome). Keep it tight.
 - **ambience** `[Optional]` — One of: silence, calm, combat, dungeon, forest, tavern, tension, town, wilds.
 ### How the Loop Works
 Each round the system reads your ````tool` blocks, executes them, and feeds back the results. This repeats until you call `finalize_turn`. If you call tools but never call `finalize_turn`, the loop runs until it hits the round limit and the turn fails with an error.
 So: call `finalize_turn` when the player needs to see the outcome and make their next decision.
 **Important: Do not mix get tools with finalize_turn.** If you call `read_file`, `character_get`, `world_get`, or `journal_get` in a round, you are still gathering information — do NOT also call `finalize_turn` in that same round. Gather first, then finalize in a separate round.
 ### Journal & Quest Tracking
 The journal is the player's quest log and TODO list. Use dedicated tools to manage it:
 - **`journal_get`** — Read the full journal to review quests.
 - **`journal_update`** — Add new quests/goals via `"add"` and mark completed via `"done"`.
 - **Add quests** as they arise: `{"add": ["Investigate the Weeper beneath the mill"]}`
 - **Mark sub-tasks** as they emerge: `{"add": ["Find a way to open the iron grate", "Question Rina about the cult"]}`
 - **Mark completed** when resolved: `{"done": ["Investigate the Weeper beneath the mill"]}`
 - **Keep descriptions specific** — vague entries like "Explore the dungeon" are not helpful.
 - **Review the journal** regularly to maintain continuity.
 - Long-term goals stay in TODO until resolved; don't re-add the same quest every turn.
 ### Character & World State
 To read or update state files, use the dedicated tools:
 - **`character_get`** / **`character_update`** — Read or replace the full character sheet. ONLY update when HP/cash/gear/stats change.
 - **`world_get`** / **`world_update`** — Read or replace the full world state. ONLY update when NPCs/locations/threads change.
 ## Available Tools
 Tool calls go in their own fenced code block (one call per block):
 ```tool
 {"tool": "read_file", "args": {"file": "character", "dm_status": "Checking Dillion's stats."}}
 ```
-You may also show reasoning inline:
+- **roll** — dice, modifier
 - **player_roll** — dice, reason
 - **character_update** — content: "full sheet" (if HP/cash/gear/stats change)
 - **world_update** — content: "full world" (if NPCs/locations/threads change)
 - **journal_update** — add: [...], done: [...]
-```thought
+You have the full state above — no need to look anything up. Just write the story and use tools when the player's action changes something.
 Your reasoning here
 ```
-Tools available:
+## State
 Every tool call **must** include a `"dm_status"` string in `args` — a short, public-facing description of what the DM is doing (e.g. `"consulting the archives"`, `"examining the wound"`, `"calculating the odds"`). The player sees this in the UI. Keep it vague — never reveal what the DM is actually reading or learning.
 Tool reference (`[R]` = required, `[O]` = optional):
 - **read_file** — Read a game state file.
  `[R] file`: "character" | "world" | "book" | "log" | "journal"
  `[R] dm_status`: "..."
 - **roll** — Auto-roll dice (outcome shown in status).
  `[O] dice`: "2d6" (default "1d6")
  `[O] modifier`: "-1" (default "0")
  `[R] dm_status`: "..."
 - **player_roll** — Ask the player to roll physical dice. Use when the outcome is uncertain.
  `[O] dice`: "2d6" (default "1d6")
  `[O] reason`: "why the roll matters"
  `[R] dm_status`: "..."
 - **character_get** — Read the full character sheet.
  `[R] dm_status`: "..."
 - **character_update** — Replace the full character sheet.
  `[R] content`: "full character sheet markdown"
  `[R] dm_status`: "..."
 - **world_get** — Read the full world state.
  `[R] dm_status`: "..."
 - **world_update** — Replace the full world state.
  `[R] content`: "full world state markdown"
  `[R] dm_status`: "..."
 - **journal_get** — Read the journal (TODO / DONE).
  `[R] dm_status`: "..."
 - **journal_update** — Add or complete journal entries.
  `[O] add`: ["new todo item", ...]
  `[O] done`: ["completed item", ...]
  `[R] dm_status`: "..."
 - **finalize_turn** — **REQUIRED to end the turn.** The loop will NOT stop without it. Call this ALONE — do not mix with get tools.
  `[R] book_log`: "full-form narrative of what happened during the turn, permanent story record that reads like a book"
  `[R] user_prompt`: "short prompt for the player — NOT recorded, 1-3 sentences"
  `[O] log_entry`: "one-sentence summary (action + outcome)"
  `[O] ambience`: "soundscape name: silence|calm|combat|dungeon|forest|tavern|tension|town|wilds"
 When the player makes a choice, resolve it with the dice mechanics above. Describe the action, roll dice implicitly (describe the outcome, don't say "rolling dice"), apply damage/effects, and update state. Use this to decide how the story evolves.
 ## Current Game State
 ### Character
 $character
@@ -204,12 +93,37 @@ $character
 ### World
 $world
-### Recent Log
+### Log
 $log
-### Recent Story (last turns from the book)
+### Story
 $story""")
-# trailing """ is intentional — the template ends here
+
 PROSE_PROMPT = Template("""You are the DM for "The Chaos". Narrate in 2nd person ("You"), vivid but concise. Player: Dillion.
 ## Rules
 - **Odds**: 1d6, 4+ favourable, 3- trouble.
 - **Traits**: 3d6, roll UNDER trait.
 - **Combat**: 1d6, 4+ hits. Damage: 1d6 + mod - armour.
 - **Wounds at 0 HP**: 1d6 → 1-2 die, 3-4 -1 max HP, 5-6 -1 all rolls until healed.
 - **Modifiers**: Favourable +1, Risky -1, Desperate -2.
 A die is cast at the start of each turn — incorporate it into your narrative.
 ## State
 ### Character
 $character
 ### World
 $world
 ### Log
 $log
 ### Story
 $story""")
 # ── Game Engine ────────────────────────────────────────────────────────────
@@ -287,11 +201,10 @@ class GameEngine:
    def _read_file(self, path: Path) -> str:
        return path.read_text().strip() if path.exists() else ""
-    def _read_recent_log(self, max_entries: int = 10) -> str:
+    def _read_recent_log(self, max_entries: int = 5) -> str:
        """Read the latest log file and return the last N entries."""
        log_path = LOG_DIR / f"{TODAY}.md"
        if not log_path.exists():
            # Check yesterday's log
            from datetime import timedelta
            yesterday = (date.today() - timedelta(days=1)).isoformat()
            log_path = LOG_DIR / f"{yesterday}.md"
@@ -301,7 +214,7 @@ class GameEngine:
        entries = [l for l in lines if l.strip().startswith("- ")]
        return "\n".join(entries[-max_entries:]) or "*No recent events.*"
-    def _read_recent_book(self, max_turns: int = 3) -> str:
+    def _read_recent_book(self, max_turns: int = 1) -> str:
        """Return the last N turns from the book as context."""
        text = self._read_file(BOOK_PATH)
        if not text:
@@ -310,6 +223,22 @@ class GameEngine:
        recent = turns[-max_turns:]
        return "\n## ".join(recent) if len(turns) > 1 else recent[0]
    @staticmethod
    def _truncate_world(text: str) -> str:
        """Extract key world context: NPCs, factions, active threads, rumours."""
        if not text or text == "*No world state.*":
            return text
        sections = re.split(r"\n(?=## |### )", text)
        parts = []
        for sec in sections:
            header = sec.split("\n")[0].strip() if sec else ""
            if "Active Threads" in header:
                parts.append(sec)
            elif "Notable NPCs" in header or "Factions at Play" in header or "### Rumours" in header:
                parts.append(sec)
        result = "\n\n".join(parts)
        return result or text[:1500] + "\n_(world truncated)_"
    def _get_valid_ambiences(self) -> set[str]:
        """Parse ambience_options.md and return set of valid ambience names with associated audio files."""
        valid = {"silence"}  # silence always valid (stops music)
@@ -341,7 +270,7 @@ class GameEngine:
    def build_system_prompt(self) -> str:
        """Assemble the system prompt with current game state."""
        char = self._read_file(CHAR_PATH) or "*No character sheet.*"
-        world = self._read_file(WORLD_PATH) or "*No world state.*"
+        world = self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world state.*"
        log = self._read_recent_log()
        story = self._read_recent_book()
        return SYSTEM_PROMPT.substitute(
@@ -382,10 +311,8 @@ class GameEngine:
        else:
            parts.append(
                "## Instructions\n"
-                "Take the player's request and use it to advance the story."
+                "Advance the story based on the player's request. "
-                "Think, gather information, update the state, "
+                "All state is shown above — write the outcome directly."
                "then call finalize_turn to complete the turn.\n"
                "Put each tool call in its own ```tool block."
            )
        return "\n\n".join(parts)
@@ -754,6 +681,28 @@ class GameEngine:
        except json.JSONDecodeError:
            return None
    def _call_llm(self, messages: list[dict], *, label: str = "", max_tokens: int | None = None) -> str | None:
        """Make a single LLM call. Returns content text or None on error."""
        try:
            import litellm
        except ImportError:
            return None
        try:
            response = litellm.completion(
                model=self.model,
                messages=messages,
                temperature=self.temperature,
                stream=False,
                timeout=60,
                max_tokens=max_tokens or self.max_tokens,
            )
            text = response.choices[0].message.content or ""
            self._append_llm_log(f"\n--- {label} ---\n{text}")
            return text
        except Exception as e:
            self._append_llm_log(f"\n--- LLM ERROR ({label}) ---\n{e}")
            return None
    def generate_with_tools(
        self,
        player_action: str | None = None,
@@ -764,39 +713,13 @@ class GameEngine:
        on_debug: callable = None,
    ) -> TurnResult:
        """
-        Multi-turn generation with tool-use loop.
+        Three-phase generation:
-        The LLM can output ```thought blocks, call ```tool blocks, and
+        1. **Prose** — LLM writes the full book_log from context + player action.
-        MUST call **finalize_turn** to complete the turn. Until then the
+        2. **Summarize** — LLM condenses the book_log into one log line.
-        loop continues feeding tool results back.
+        3. **Extract** — LLM reads the book_log and outputs tool calls for state changes.
        `on_thought` / `on_action` / `on_debug` may be called from a worker thread —
        use call_from_thread in the TUI.
        """
        system = self.build_system_prompt()
        user = self.build_user_message(
            player_action=player_action,
            last_prompt=last_prompt,
        )
        messages: list[dict] = [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ]
        self._set_llm_env()
        try:
            import litellm
        except ImportError:
            return TurnResult(error="litellm not installed")
        max_rounds = 30
        debug_entries: list[str] = []
        attempt = 0
        round_used = 0
        reminder_count = 0
        from datetime import datetime
        self._append_llm_log(f"\n{'='*60}")
        self._append_llm_log(f"=== Turn — {datetime.now().strftime('%Y-%m-%d %H:%M:%S')} ===")
@@ -806,246 +729,200 @@ class GameEngine:
        elif last_prompt:
            self._append_llm_log(f"Resume from: {last_prompt[:120]}")
-        while round_used < max_rounds:
+        # ── Phase 1: Prose ────────────────────────────────────────────────
-            attempt += 1
+        import random
-            round_log: list[str] = [f"── Attempt {attempt} (round {round_used + 1}/{max_rounds}) ──"]
+        die_roll = random.randint(1, 6)
        self._append_llm_log(f"Dice: {die_roll} (1d6)")
-            try:
+        if on_action:
-                response = litellm.completion(
+            on_action(f"Phase 1/3: writing story (dice={die_roll})")
-                    model=self.model,
+        if on_debug:
-                    messages=messages,
+            on_debug("phase", {"phase": 1, "name": "prose", "status": "start", "dice": die_roll})
                    temperature=self.temperature,
                    stream=False,
                    timeout=60,
                    max_tokens=self.max_tokens,
                )
                text = response.choices[0].message.content or ""
                self._append_llm_log(
                    f"\n--- Attempt {attempt} ---\n{text}"
                )
            except Exception as e:
                self._append_llm_log(f"\n--- LLM ERROR (attempt {attempt}) ---\n{e}")
                if on_debug:
                    on_debug("llm_error", {"error": str(e)})
                return TurnResult(error=f"LLM call failed: {e}")
-            if on_debug:
+        book_log = None
-                on_debug("llm_response", {"round": attempt, "text": text})
+        for attempt in range(3):
-
+            system = PROSE_PROMPT.substitute(
-            # Thoughts
+                character=self._read_file(CHAR_PATH) or "*No character sheet.*",
-            thoughts = self._extract_thoughts(text)
+                world=self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world state.*",
-            if thoughts:
+                log=self._read_recent_log(),
-                round_log.append(f"  thoughts: {len(thoughts)}")
+                story=self._read_recent_book(),
            for t in thoughts:
                if on_thought:
                    on_thought(t.strip())
                if on_debug:
                    on_debug("thought", {"round": attempt, "text": t.strip()})
            # Tool calls
            tool_calls = self._extract_tool_calls(
                text,
                round_num=attempt,
                on_debug=on_debug,
            )
-            finalize_call: dict | None = None
+            user = self.build_user_message(
-            other_calls: list[dict] = []
+                player_action=player_action,
                last_prompt=last_prompt,
            )
            user += f"\n\n*A die is cast: **{die_roll}** (1d6).*"
            text = self._call_llm([
                {"role": "system", "content": system},
                {"role": "user", "content": user},
            ], label=f"Prose attempt {attempt + 1}", max_tokens=1024)
            if not text or not text.strip():
                if on_debug:
                    on_debug("phase", {"phase": 1, "status": "empty", "attempt": attempt + 1})
                continue
            book_log = text.strip()
            if on_debug:
                preview = book_log[:150].replace("\n", "\\n")
                on_debug("phase", {"phase": 1, "status": "done", "chars": len(book_log), "preview": preview})
            break
        if not book_log:
            return TurnResult(error="Prose generation failed after 3 attempts")
        # ── Phase 2: Summarize ────────────────────────────────────────────
        if on_action:
            on_action("Phase 2/3: summarizing story")
        if on_debug:
            on_debug("phase", {"phase": 2, "name": "summarize", "status": "start"})
        log_context = self._read_recent_log()
        log_entry = None
        for attempt in range(2):
            text = self._call_llm([
                {"role": "user", "content":
                    f"Given the session log so far, summarize the new story in one line. "
                    f"Focus on who was involved (character and NPC names):\n\n"
                    f"## Session Log\n{log_context}\n\n"
                    f"## New Story\n{book_log}"}
            ], label=f"Summarize attempt {attempt + 1}")
            if text and text.strip():
                log_entry = text.strip().split("\n")[0][:120]
                if on_debug:
                    on_debug("phase", {"phase": 2, "status": "done", "summary": log_entry})
                break
        if not log_entry:
            log_entry = book_log.split("\n")[0][:120]
            if on_debug:
                on_debug("phase", {"phase": 2, "status": "fallback", "summary": log_entry})
        # ── Phase 3: Extract state changes ────────────────────────────────
        if on_action:
            on_action("Phase 3/3: extracting state changes")
        if on_debug:
            on_debug("phase", {"phase": 3, "name": "extract", "status": "start"})
        user_prompt = self._auto_prompt(book_log)
        ambience = None
        debug_info = ""
        current_char = self._read_file(CHAR_PATH) or "*No character.*"
        current_world = self._truncate_world(self._read_file(WORLD_PATH) or "") or "*No world.*"
        for attempt in range(3):
            text = self._call_llm([
                {"role": "user", "content":
                    f"Read the story and compare with current state. Output tool calls for changes:\n\n"
                    f"## Current Character\n{current_char}\n\n"
                    f"## Current World\n{current_world}\n\n"
                    f"## Story\n{book_log}\n\n"
                    f"Output tool blocks for changes only. Include the FULL updated content:\n"
                    f"- character_update — content: full new sheet if HP/cash/gear/stats changed\n"
                    f"- world_update — content: full new world if NPCs/locations/threads changed\n"
                    f"- journal_update — add: [...], done: [...]\n"
                    f"- finalize_turn — user_prompt (question for player), ambience (soundscape)\n\n"
                    f"Wrap each in ```tool:\n"
                    f"```tool\n{{\"tool\": \"character_update\", \"args\": {{\"content\": \"# Character\\n...\"}}}}\n```"}
            ], label=f"Extract attempt {attempt + 1}")
            if not text or not text.strip():
                if on_debug:
                    on_debug("phase", {"phase": 3, "status": "empty", "attempt": attempt + 1})
                continue
            tool_calls = self._extract_tool_calls(
                text, round_num=attempt + 1, on_debug=on_debug
            )
            if on_debug and tool_calls:
                names = [tc.get("tool", "?") for tc in tool_calls if tc.get("tool") != "finalize_turn"]
                fin = any(tc.get("tool") == "finalize_turn" for tc in tool_calls)
                on_debug("phase", {"phase": 3, "status": "tools_found", "tools": names, "has_finalize": fin})
            errors = []
            for tc in tool_calls:
-                if tc.get("tool") == "finalize_turn":
+                name = tc.get("tool", "?")
-                    finalize_call = tc
+                args = tc.get("args", {})
-                else:
+                if name == "finalize_turn":
-                    other_calls.append(tc)
+                    if args.get("user_prompt"):
-
+                        user_prompt = args["user_prompt"]
-            # Log tool call summary
+                    if args.get("ambience"):
-            if tool_calls:
+                        ambience = args["ambience"]
                names = [tc.get("tool", "?") for tc in tool_calls]
                round_log.append(f"  tools: {', '.join(names)}")
            # Guard: mixed get tools + finalize_turn → execute get tools, reject finalize
            get_tools = {"read_file", "character_get", "world_get", "journal_get"}
            if finalize_call and any(tc.get("tool") in get_tools for tc in other_calls):
                # Execute only the get tools, drop finalize_turn
                results = []
                for tc in other_calls:
                    if tc.get("tool") not in get_tools:
                        continue
                    name = tc.get("tool", "?")
                    args = tc.get("args", {})
                    if not args.get("dm_status"):
                        err_msg = (
                            f"**Validation Error:** Tool `{name}` missing required `dm_status`. "
                            f"Add `\"dm_status\": \"what the DM is doing\"` to the args."
                        )
                        results.append(err_msg)
                        round_log.append(f"  {name}: MISSING dm_status")
                        if on_debug:
                            on_debug("validation_error", {"round": attempt, "type": "tool", "tool": name, "error": "missing dm_status"})
                        continue
                    if on_action:
                        on_action(self._describe_tool_action(name, args))
                    if on_debug:
                        on_debug("tool_call", {"round": attempt, "tool": name, "args": args})
                    result = self._execute_tool(name, args)
                    results.append(f"**Tool:** {name}\n**Args:** {json.dumps(args)}\n**Result:** {result}")
                    round_log.append(f"  {name}: OK")
                    if on_debug:
                        on_debug("tool_result", {"round": attempt, "tool": name, "result": result})
                round_log.append("  finalize_turn ignored (mixed with get tools)")
                debug_entries.append("\n".join(round_log))
                # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
                messages.append({"role": "assistant", "content": text})
                messages.append({
                    "role": "user",
                    "content": "## Tool Results\n\n" + "\n\n".join(results) + "\n\n**Note:** `finalize_turn` was ignored because you called get tools in the same round. Call `finalize_turn` alone in the next round to complete the turn."
                })
                if on_debug:
                    on_debug("validation_error", {"round": attempt, "type": "mixed_get_finalize", "tools": [tc.get("tool") for tc in other_calls]})
                round_used += 1
                continue
            # finalize_turn present → validate and return
            if finalize_call:
                args = finalize_call.get("args", {})
                errs = []
                if not args.get("book_log"):
                    errs.append("book_log [Required]")
                if not args.get("user_prompt"):
                    errs.append("user_prompt [Required]")
                # Validate ambience
                ambience_name = args.get("ambience")
                if ambience_name and ambience_name != "silence":
                    valid_ambiences = self._get_valid_ambiences()
                    if not valid_ambiences or ambience_name not in valid_ambiences:
                        errs.append(f"ambience '{ambience_name}' is invalid or has no associated audio files.")
                if errs:
                    hint = (
                        f"Expected:\n"
                        f'{{"tool": "finalize_turn", "args": {{'
                        f'"book_log": "...", '
                        f'"user_prompt": "...", '
                        f'"log_entry": "...", '
                        f'"ambience": "..."'
                        f"}}}}\n"
                        f"Valid ambiences: {', '.join(valid_ambiences)}"
                    )
                    round_log.append(f"  finalize_turn validation errors: {', '.join(errs)}")
                    debug_entries.append("\n".join(round_log))
                    # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
                    messages.append({"role": "assistant", "content": text})
                    messages.append({
                        "role": "user",
                        "content": f"## Validation Error\nMissing required field(s): {', '.join(errs)}.\n\n{hint}Please provide all required fields and call finalize_turn again."
                    })
                    if on_debug:
                        on_debug("validation_error", {"round": attempt, "type": "finalize_turn", "errors": errs})
                    round_used += 1
                    continue
                if on_action:
                    on_action(f"State: {self._describe_tool_action(name, args)}")
                if on_debug:
-                    on_debug("finalize", {"round": attempt, "args": args})
+                    on_debug("tool_call", {"round": attempt + 1, "tool": name, "args": args})
                round_used += 1
                self._append_llm_log(
                    f"\n--- FINALIZE (attempt {attempt}) ---\n"
                    f"book_log: {args.get('book_log','')[:200]}\n"
                    f"user_prompt: {args.get('user_prompt','')[:200]}\n"
                    f"log_entry: {args.get('log_entry','')}\n"
                    f"ambience: {args.get('ambience','')}\n"
                )
                return TurnResult(
                    book_log=args.get("book_log", ""),
                    user_prompt=args.get("user_prompt", ""),
                    ambience=args.get("ambience"),
                    log_entry=args.get("log_entry"),
                )
-            # Execute other tools
+                if name == "player_roll" and on_player_roll:
-            if other_calls:
+                    dice = args.get("dice", "1d6")
-                results = []
+                    reason = args.get("reason", "a check")
-                for tc in other_calls:
+                    roll_val = on_player_roll(dice, reason)
-                    name = tc.get("tool", "?")
+                    result = f"Player rolled {dice} for '{reason}': {roll_val}"
-                    args = tc.get("args", {})
+                else:
                    result = self._execute_tool(name, args)
-                    # dm_status is required on every tool call
+                if result.startswith("**Error:") or result.startswith("Tool error") or result.startswith("Unknown"):
-                    if not args.get("dm_status"):
+                    errors.append(f"{name}: {result}")
                        err_msg = (
                            f"**Validation Error:** Tool `{name}` missing required `dm_status`. "
                            f"Add `\"dm_status\": \"what the DM is doing\"` to the args.\n"
                            f"Put each tool call in its own ```tool block."
                        )
                        results.append(err_msg)
                        round_log.append(f"  {name}: MISSING dm_status")
                        if on_debug:
                            on_debug("validation_error", {"round": attempt, "type": "tool", "tool": name, "error": "missing dm_status"})
                        continue
                    if on_action:
                        on_action(self._describe_tool_action(name, args))
                    if on_debug:
                        on_debug("tool_call", {"round": attempt, "tool": name, "args": args})
                    if name == "player_roll" and on_player_roll:
                        dice = args.get("dice", "1d6")
                        reason = args.get("reason", "a check")
                        roll_val = on_player_roll(dice, reason)
                        result = f"Player rolled {dice} for '{reason}': {roll_val}"
                    else:
                        result = self._execute_tool(name, args)
                    results.append(f"**Tool:** {name}\n**Args:** {json.dumps(args)}\n**Result:** {result}")
                    round_log.append(f"  {name}: OK")
                    if on_debug:
                        on_debug("tool_result", {"round": attempt, "tool": name, "result": result})
                # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
                messages.append({"role": "assistant", "content": text})
                messages.append({
                    "role": "user",
                    "content": "## Tool Results\n\n" + "\n\n".join(results),
                })
                debug_entries.append("\n".join(round_log))
                round_used += 1
                continue
            # No tools, no finalize
            round_log.append("  no tool calls")
            if not text.strip():
                # Empty response — model may be slow. Give it time and retry without adding context.
                if on_debug:
-                    on_debug("empty_response", {"round": attempt})
+                    on_debug("tool_result", {"round": attempt + 1, "tool": name, "result": result})
                import time
                time.sleep(2)
                debug_entries.append("\n".join(round_log))
                continue
-            # Plain-text reasoning (no ```tool/```thought blocks) — log in debug but don't show to player
+            if not errors:
-            round_used += 1
+                if on_debug:
                    on_debug("phase", {"phase": 3, "status": "done", "applied": len([tc for tc in tool_calls if tc.get("tool") != "finalize_turn"])})
                break
            debug_info = "; ".join(errors)
            if on_debug:
-                on_debug("thought", {"round": attempt, "text": text.strip()})
+                on_debug("phase", {"phase": 3, "status": "errors", "errors": errors, "attempt": attempt + 1})
-            debug_entries.append("\n".join(round_log))
+        if on_action:
-            # messages = messages[:2]  # keep full history across rounds so LLM can learn from prior attempts
+            on_action("Turn complete")
-            messages.append({"role": "assistant", "content": text})
+        if on_debug:
-            reminder_count += 1
+            on_debug("phase_done", {
-            if reminder_count % 3 == 0:
+                "book_log_chars": len(book_log),
-                reminder = (
+                "log_entry": log_entry,
-                    "## Instructions\n"
+                "user_prompt": user_prompt,
-                    "Respond with tool calls or finalize_turn.\n\n"
+                "ambience": ambience,
-                    "Put each tool call in its own ```tool block:\n"
+                "extract_errors": debug_info or None,
-                    "```tool\n{\"tool\": \"character_get\", \"args\": {\"dm_status\": \"...\"}}\n```\n\n"
+            })
                    "When ready, call **finalize_turn** with `book_log` and `user_prompt`."
                )
            else:
                reminder = "Use tools to gather information or call **finalize_turn** to end the turn."
            messages.append({"role": "user", "content": reminder})
            if on_debug:
                on_debug("no_tool_calls", {"round": attempt})
-        debug_text = "\n\n".join(debug_entries)
+        self._append_llm_log(
-        self._append_llm_log(f"\n--- LOOP EXCEEDED ({max_rounds} rounds) ---\n{debug_text}")
+            f"\n--- FINAL ---\n"
-        return TurnResult(
+            f"book_log: {book_log[:200]}\n"
-            error=f"Turn loop exceeded max rounds ({max_rounds}). Below is a debug log of what the LLM did each round:\n\n{debug_text}",
+            f"log_entry: {log_entry}\n"
-            debug_info=debug_text,
+            f"user_prompt: {user_prompt}\n"
            f"ambience: {ambience}\n"
        )
        return TurnResult(
            book_log=book_log,
            log_entry=log_entry,
            user_prompt=user_prompt,
            ambience=ambience,
            debug_info=debug_info,
        )
    @staticmethod
    def _strip_tool_blocks(text: str) -> str:
        """Remove ```tool, ```json, and ```finalize_turn fenced blocks from narrative text."""
        return re.sub(
            r'```(?:tool|json|finalize_turn)\s*\n?.*?```',
            '',
            text,
            flags=re.DOTALL,
        ).strip()
    @staticmethod
    def _auto_prompt(book_log: str) -> str:
        """Build a player prompt from the narrative using its last non-empty line."""
        for line in reversed(book_log.strip().splitlines()):
            line = line.strip()
            if line:
                # The last substantive line becomes the prompt.
                return f"**What do you do?**\n\n{line}"
        return "**What do you do?**"
    # ── Response Parsing ────────────────────────────────────────────────
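Review note: the two static helpers in this hunk can be exercised outside the engine class. The sketch below is a minimal standalone version, not the engine's code: the module-level names `strip_tool_blocks` and `auto_prompt` and the sample reply are assumptions for illustration, and the pattern is written with `{3}` quantifiers, which should match the same fences as the literal backticks used in the hunk.

```python
import re

# Hypothetical standalone copies of the helpers, for experimentation only.
FENCED_BLOCK = re.compile(r'`{3}(?:tool|json|finalize_turn)\s*\n?.*?`{3}', re.DOTALL)


def strip_tool_blocks(text: str) -> str:
    """Drop fenced tool/json/finalize_turn blocks and trim the remainder."""
    return FENCED_BLOCK.sub('', text).strip()


def auto_prompt(book_log: str) -> str:
    """Use the last non-empty line of the narrative as the player prompt."""
    for line in reversed(book_log.strip().splitlines()):
        line = line.strip()
        if line:
            return f"**What do you do?**\n\n{line}"
    return "**What do you do?**"


fence = "`" * 3  # built at runtime so the sample contains a real fence
reply = (
    "The door creaks open.\n"
    f"{fence}tool\n"
    '{"tool": "roll", "args": {"dice": "1d6"}}\n'
    f"{fence}\n"
    "A cold draft hits you."
)
narrative = strip_tool_blocks(reply)
prompt = auto_prompt(narrative)
```

Because the match is lazy, each fenced block is removed separately, and an empty narrative falls back to the bare prompt.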
--- a/tools/run.py
+++ b/tools/run.py
@ -829,47 +829,54 @@ class ChaosTUI(App):
    def _on_debug(self, event_type: str, data: dict) -> None:
        """Structured debug entry: visible description + technical detail."""
-        r = data.get("round", "")
+        if event_type == "phase":
-        if event_type == "llm_response":
+            p = data.get("phase", 0)
-            text = data.get("text", "")
+            status = data.get("status", "")
-            if text.strip():
+            if status == "start":
-                preview = text[:200].replace("\n", "\\n").strip() + ("…" if len(text) > 200 else "")
+                name = data.get("name", "")
-                self._append_debug(f"  LLM response: {preview}")
+                dice = data.get("dice")
-            else:
+                d = f"  dice={dice}" if dice else ""
-                self._append_debug(f"  LLM response: (empty)")
+                self._append_debug(f"▸ Phase {p}: {name} {d}")
-        elif event_type == "thought":
+            elif status == "done":
-            thought = data.get("text", "")
+                if p == 1:
-            display = thought[:60] + "…" if len(thought) > 60 else thought
+                    self._append_debug(f"  ✔ prose: {data.get('chars', 0)} chars")
-            self._append_debug(f"  💭 {display}")
+                elif p == 2:
                    self._append_debug(f"  ✔ summary: {data.get('summary', '')}")
                elif p == 3:
                    n = data.get("applied", 0)
                    self._append_debug(f"  ✔ extract: {n} state changes applied")
            elif status == "empty":
                self._append_debug(f"  ⚠ phase {p} attempt {data.get('attempt', '?')} empty — retry")
            elif status == "fallback":
                self._append_debug(f"  ⚠ phase {p} used fallback: {data.get('summary', '')}")
            elif status == "tools_found":
                tools = data.get("tools", [])
                fin = data.get("has_finalize", False)
                t = ", ".join(tools) if tools else "none"
                self._append_debug(f"  🔧 tools found: {t}" + (" + finalize_turn" if fin else ""))
            elif status == "errors":
                errs = data.get("errors", [])
                for e in errs:
                    self._append_debug(f"  ✖ {e}")
                self._append_debug(f"  ⟳ retry (attempt {data.get('attempt', '?')})")
        elif event_type == "phase_done":
            self._append_debug(f"  ✔ turn complete — book_log: {data.get('book_log_chars', 0)} chars")
            if data.get("log_entry"):
                self._append_debug(f"     log: {data['log_entry']}")
            if data.get("ambience"):
                self._append_debug(f"     ambience: {data['ambience']}")
            if data.get("extract_errors"):
                self._append_debug(f"     extract errors: {data['extract_errors']}")
        elif event_type == "tool_call":
            tool = data.get("tool", "?")
            args = data.get("args", {})
-            desc = args.get("dm_status", tool)
-            self._append_debug(f"  🔧 {desc}")
-            self._append_debug(f"     {tool}({json.dumps(args)})")
+            self._append_debug(f"  🔧 {tool}({json.dumps(args)})")
        elif event_type == "tool_result":
            tool = data.get("tool", "?")
            result = data.get("result", "")
            preview = result[:80].replace("\n", " ").strip() + ("…" if len(result) > 80 else "")
            self._append_debug(f"     → {preview}")
        elif event_type == "validation_error":
            err_type = data.get("type", "")
            if err_type == "finalize_turn":
                self._append_debug(f"  ✖ finalize_turn missing: {', '.join(data.get('errors', []))}")
            elif err_type == "mixed_get_finalize":
                tools = data.get("tools", [])
                self._append_debug(f"  ✖ mixed get tools {tools} with finalize_turn — rejected")
            else:
                tool = data.get("tool", "?")
                self._append_debug(f"  ✖ {tool} missing dm_status")
        elif event_type == "finalize":
            self._append_debug("  ✔ finalize_turn")
        elif event_type == "no_tool_calls":
            self._append_debug("  ⚠ no tool calls — reminded to use tools")
        elif event_type == "parse_error":
-            self._append_debug(f"  ⚠ failed to parse tool block: {data.get('content', '')}")
+            self._append_debug(f"  ⚠ bad tool block: {data.get('content', '')}")
        elif event_type == "empty_response":
            self._append_debug("  ⚠ empty response — waiting 2s, retrying without reminder")
        elif event_type == "llm_error":
            self._append_debug(f"  ✖ LLM error: {data.get('error', '')}")
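Review note: the `tool_result` branch truncates its preview inline. A small helper could make that rule reusable across handlers. The sketch below is a suggestion only: the name `preview` and its `limit` parameter are hypothetical and do not appear in this commit, and it assumes the result is always a string.

```python
def preview(text: str, limit: int = 80) -> str:
    """Cut text to `limit` characters, flatten newlines, and mark truncation."""
    shortened = text[:limit].replace("\n", " ").strip()
    # Append an ellipsis only when something was actually cut off.
    return shortened + ("…" if len(text) > limit else "")


short = preview("Player rolled 1d6\nfor 'a check': 4")
long = preview("x" * 100)
```

With the default limit this mirrors the 80-character expression in the `tool_result` branch.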