Dejvino 91b1b35cfa Add more logging. Selectable LLM strategy.

2026-06-29 22:59:45 +02:00

14 KiB

Raw Blame History

The Chaos — Game Architecture

This document describes the system architecture for developers and AI agents working on the codebase.

Design Principle

The Chaos is a self-contained terminal game. The TUI owns the full game loop — including LLM calls — so there is no split between a "DM agent" in chat and a "dashboard" in the terminal. The player runs one command:

python3 tools/run.py

Everything — narrative, choices, character sheet, log, archive, ambience — lives in that process.

Project Layout

the-chaos/
├── rules/                    # LOCKED — game rules, do not modify
│   ├── deck/                 # Card tables
│   └── mechanics.md          # Core mechanics reference
├── tools/                    # Game system code
│   ├── __init__.py
│   ├── engine.py             # Game engine (prompt builder, LLM client, parser, state)
│   ├── run.py                # TUI (Textual app, game loop, narrative, input)
│   ├── ambience.py           # CLI shortcut for ambience switching
│   ├── draw.py               # Card drawing tool
│   ├── music-fetch.py        # YouTube audio downloader
│   ├── roll.py               # Dice rolling tool
│   ├── store_turn.py         # DEPRECATED — use engine.py archive_turn instead
│   ├── test_imports.py       # Import validation test
│   └── test_runtime.py       # Runtime import test
├── scripts/                  # UNLOCKED — helper scripts
├── run.sh                    # Entry point (just calls tools/run.py)
└── session/                  # Game state (read/write by engine)
    ├── config.json           # LLM provider config
    ├── character.md          # Player character sheet
    ├── world.md              # Keep & Realm state
    ├── book.md               # Story book (append-only turn archive)
    ├── journal.md            # TODO / DONE tracking
    ├── ambience.md           # Current ambience name
    ├── ambience_options.md   # Ambience → file mapping
    ├── ambience_sources.md   # Track source URLs
    ├── tweaks.md             # House rules log
    ├── audio/                # Music files
    └── log/                  # Session logs by date

How It Works

Tools

Tool	Role
`tools/engine.py`	Game engine. Owns the LLM interaction, prompt assembly, response parsing, and state persistence. Can be used standalone from the CLI for debugging.
`tools/run.py`	TUI (Textual app). Owns the game loop: display narrative → get player input → call engine → display result.

The Game Loop (run.py)

Mount: Load engine, build system prompt (rules + character + world + log).
Scene: Call engine.generate() → receive narrative + choices.
Display: Show narrative in main pane, render choice buttons.
Input: Player clicks a choice or types free text, presses Enter.
Resolve: Call engine.generate(player_action) → receive outcome + state changes.
Archive: Append the full turn (scene + action + outcome) to book.md.
Apply: Write state changes to character.md, world.md, log/, ambience.md, journal.md.
Loop: Display the next scene → go to step 3.

The Engine (engine.py)

GameEngine class loads config from session/config.json.
build_system_prompt() assembles the DM prompt from game rules + current state.
build_user_message() builds the per-turn message with player action context.
generate() calls litellm, returns parsed GenerationResult.
parse_response() extracts the JSON block from the LLM response.
apply_state() writes state changes to session files.
archive_turn() appends the narrative to book.md.

LLM Output Format

The LLM must end every response with a JSON fenced code block:

{
  "choices": ["Choice 1", "Choice 2"],
  "log_entry": "- **time** — description.",
  "ambience": "ambience_name_or_null",
  "character_updates": null,
  "world_updates": null,
  "journal_add": [],
  "journal_done": []
}

choices: 2-4 action options for the player.
log_entry: Single-line summary appended to today's log.
ambience: One of: silence, calm, combat, dungeon, forest, tavern, tension, town, wilds.
character_updates: Full character sheet markdown only if HP/cash/gear/stats changed.
world_updates: Full world markdown only if NPCs/locations/threads changed.
journal_add / journal_done: TODO list management.

Session Config

{
  "llm": {
    "model": "ollama/llama3.1",
    "api_key": null,
    "api_base": null,
    "temperature": 0.8
  }
}

The model field accepts any litellm provider string: openai/gpt-4, anthropic/claude-sonnet-4-20250514, ollama/llama3.1, groq/llama3-70b-8192, etc. Set api_key for remote providers.

Running

# Start the game
./run.sh

# Or directly
python3 tools/run.py

# No music
python3 tools/run.py --no-music

# Test a generation from CLI (no TUI)
python3 tools/engine.py --action "I head to the market"

Testing Commands

Always run tests before making changes. This prevents runtime errors like missing imports.

# Quick test (runs import and runtime validation)
./run.sh

# Test with engine action
./run.sh --action "I check on Rina"

# Run tests only
python3 tools/test_imports.py
python3 tools/test_runtime.py

Test Coverage

tools/test_imports.py — Checks for missing imports using AST analysis
tools/test_runtime.py — Verifies module loads without errors, checks for missing classes/methods
Both tests should pass before proceeding with development

LLM Strategies

Two strategies for LLM interaction:

"conversational" — 3-phase approach (prose → summarize → extract)
"tools" — Single-call approach with tool blocks

Default is "tools" for faster single-call generation.

Configuration

{
  "llm": {
    "model": "openai/llama3",
    "api_key": "sk-bogus-key",
    "api_base": "http://localhost:8080/v1",
    "temperature": 0.8,
    "timeout": 120,
    "max_tokens": 10000,
    "strategy": "tools"
  }
}

Important Notes

The random module must be imported before use — it's used in dice rolling and die roll generation
All LLM responses go through _call_llm which logs complete output with markers
The engine extracts both content and reasoning_content fields from responses (for OpenAI-compatible servers)
The generate_with_tools_single() method handles single-call tool-based generation

LLM Logging

The engine logs detailed information to llm.log:

============================================================
=== Turn — 2026-06-28 23:21:58 ===
============================================================
Player: I smash the demon
Dice: 2 (1d6)

[TOOL] Single call — 8615 chars system, 219 chars user
System preview: You are an RPG dungeon master. The player just took an action....
User preview: ## Situation...

┌─ Single tool call ───────────────────────────────────────────────────────────
├─ Model: openai/llama3 | Temp: 0.80 | Tokens: 4096
├─ Messages:
├   [SYSTEM]: You are an RPG dungeon master. The player just took an action.

Narrate the outcome in engaging, vivid prose. Use tools for any mechanics (rolls, damage, state changes). Only use ```tool blocks — no p...
├   [USER]: ## Situation
What do you do?

## Player's Request
I smash the demon

## Instructions
Advance the story based on the player's request. All state is shown above — write the outcome directly.

*A ...
└─ Response:
└   The die cast for this turn is a 2. The player wants to smash the demon. I need to narrate the outcome of that action, incorporating the die result and the combat mechanics.

First, determine if the attack hits. The demon is a large creature (size 5). I assume the player's DEX is 14, so the roll to hit is 1d6 with a 4+ favourable. The die result is 2, which is a failure. However, the player might have a modifier. The demon is a weaver? No, it's a demon. The player is Dillion, who just took -4 HP ...
└───────────────────────────────────────────────────────────────────────


[TOOL] got 17372 chars in 97396.3ms

[TOOL] no tool blocks found
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│ Turn Details — 2026-06-28 23:23:36.097                                                                            │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│ Input: I smash the demon              │
│ Last Prompt: What do you do?          │
│ Strategy: tools                                                         │
│ Dice: 2 (1d6)                                                                │
│ Model: openai/llama3 | Temp: 0.8 | Tokens: 10000                            │
│ Output: 0 chars (0 words)                                    │
│ Log Entry:                                                          │
│ Ambience: None                                                    │
│ Tool Calls: 0 ()       │
│─────────────────────────────────────────────────────────────────────────────────────────┘

Debug Markers

┌─ / └─ — Response markers
├─ — Message lines
[TOOL] — Tool execution logs
[DEBUG LLM RESPONSE] — Raw LLM response logging
[DEBUG RESPONSE LENGTH] — Response length logging

TUI Debug Tab

The TUI has a DEBUG tab that shows:

LLM configuration
Tool calls with arguments
LLM errors with tracebacks
Turn details (timestamps, dice rolls, response sizes, tool call counts)
Phase progress (Phase 1/3, Phase 2/3, Phase 3/3)

Key Code References

_call_llm — Core LLM interaction with logging
generate_with_tools — 3-phase conversational approach
generate_with_tools_single — Single-call tool-based approach
_log_turn_details — Comprehensive turn logging
_on_debug — Structured debug output to TUI

Common Errors & Fixes

NameError: name 'random' is not defined

Add import random at the top of tools/engine.py. The random module is used in:

Line ~488: Dice rolling in _tool_roll
Line ~926: Random die roll in generate_with_tools
Line ~1232: Random die roll in generate_with_tools_single

NameError: name 'read_todo' is not defined

The read_todo function must be defined in tools/run.py. It reads TODO items from journal.md.

NameError: name 'read_log_tail' is not defined

The read_log_tail function must be defined in tools/run.py. It reads the tail of the session log.

Testing Workflow

Before coding: Run ./run.sh to ensure imports and runtime are valid
After coding: Run ./run.sh --action "test action" to test the engine
Before committing: Run both tests to ensure no missing imports

# Quick validation
./run.sh

# Test with engine action
./run.sh --action "I check on Rina"

# Manual tests
python3 tools/test_imports.py
python3 tools/test_runtime.py

LLM Response Handling

The engine handles both content and reasoning_content fields from LLM responses:

text = response.choices[0].message.content or response.choices[0].message.reasoning_content or ""

This allows compatibility with OpenAI-compatible servers that return content in the reasoning_content field instead of content.

Timeout & Token Configuration

Default timeout: 120 seconds (configurable in config.json)
Default max tokens: 10000 (configurable in config.json)
Adjust these values based on your LLM provider's limits

Session Files

session/config.json — LLM config (edit directly)
session/character.md — PC state (written by engine)
session/world.md — Realm state (written by engine)
session/book.md — Story archive (written by engine)
session/journal.md — TODO/DONE list (written by engine)
session/ambience.md — Current ambience (written by engine)
session/log/<date>.md — Session logs (written by engine)
session/tweaks.md — House rules (manual edit)

LLM Strategies Explained

"conversational" Strategy

Uses three separate LLM calls:

Prose — Writes full book_log from context + player action
Summarize — Condenses book_log into one log line
Extract — Reads book_log and outputs tool calls for state changes

Retry loop: 3 attempts, Phase 3 fallback to Phase 1 if extraction fails.

"tools" Strategy

Uses single LLM call with all tools available:

System prompt instructs LLM to use tools for changes
Single call outputs narrative + tool blocks together
No retry loop — if it fails, turn fails
Extracts tool blocks, applies changes, summarizes in one pass

Debugging Tips

Check llm.log for detailed LLM interaction logs
Use the TUI's DEBUG tab for structured debug output
Run tests before making changes
Check config.json for LLM settings
Look for missing imports in the engine.py file
Verify that the LLM provider is correctly configured in config.json

14 KiB Raw Blame History