splinter-keep/AGENTS.md
2026-06-30 19:40:58 +02:00

The Chaos — Game Architecture

This document describes the system architecture for developers and AI agents working on the codebase.

Design Principle

The Chaos is a self-contained terminal game. The TUI owns the full game loop — including LLM calls — so there is no split between a "DM agent" in chat and a "dashboard" in the terminal. The player runs one command:

python3 tools/run.py

Everything — narrative, choices, character sheet, log, archive, ambience — lives in that process.

Project Layout

the-chaos/
├── rules/                    # LOCKED — game rules, do not modify
│   ├── deck/                 # Card tables
│   └── mechanics.md          # Core mechanics reference
├── tools/                    # Game system code
│   ├── __init__.py
│   ├── engine.py             # Game engine (prompt builder, LLM client, parser, state)
│   ├── run.py                # TUI (Textual app, game loop, narrative, input)
│   ├── ambience.py           # CLI shortcut for ambience switching
│   ├── draw_card.py          # Card drawing tool
│   ├── music-fetch.py        # YouTube audio downloader
│   ├── roll_dice.py          # Dice rolling tool
│   ├── store_turn.py         # DEPRECATED — use engine.py archive_turn instead
│   ├── test_imports.py       # Import validation test
│   └── test_runtime.py       # Runtime import test
├── scripts/                  # UNLOCKED — helper scripts
├── run.sh                    # Entry point (just calls tools/run.py)
└── session/                  # Game state (read/write by engine)
    ├── config.json           # LLM provider config
    ├── character.md          # Player character sheet
    ├── world.md              # Keep & Realm state
    ├── book.md               # Story book (append-only turn archive)
    ├── journal.md            # TODO / DONE tracking
    ├── ambience.md           # Current ambience name
    ├── ambience_options.md   # Ambience → file mapping
    ├── ambience_sources.md   # Track source URLs
    ├── tweaks.md             # House rules log
    ├── audio/                # Music files
    └── log/                  # Session logs by date

How It Works

Tools

Tool              Role
tools/engine.py   Game engine. Owns the LLM interaction, prompt assembly, response parsing, and state persistence. Can be used standalone from the CLI for debugging.
tools/run.py      TUI (Textual app). Owns the game loop: display narrative → get player input → call engine → display result.

The Game Loop (run.py)

  1. Mount: Load engine, build system prompt (rules + character + world + log).
  2. Scene: Call engine.generate() → receive narrative + choices.
  3. Display: Show narrative in main pane, render choice buttons.
  4. Input: Player clicks a choice or types free text, presses Enter.
  5. Resolve: Call engine.generate(player_action) → receive outcome + state changes.
  6. Archive: Append the full turn (scene + action + outcome) to book.md.
  7. Apply: Write state changes to character.md, world.md, log/, ambience.md, journal.md.
  8. Loop: Display the next scene → go to step 3.
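
The eight steps above can be sketched as a plain loop. This is a minimal illustration, not the code in run.py: StubEngine, the GenerationResult fields, and the argument lists of archive_turn() and apply_state() are assumptions made so the example runs without an LLM or a Textual app.

```python
from dataclasses import dataclass, field


@dataclass
class GenerationResult:
    """Hypothetical shape of what engine.generate() returns."""
    narrative: str
    choices: list = field(default_factory=list)


class StubEngine:
    """Stand-in for GameEngine so the loop can run without an LLM."""

    def __init__(self):
        self.book = []  # stands in for session/book.md

    def generate(self, player_action=None):
        if player_action is None:
            return GenerationResult("You stand at the keep gate.", ["Enter", "Wait"])
        return GenerationResult(f"You chose: {player_action}.", ["Continue", "Rest"])

    def archive_turn(self, scene, action, outcome):
        # Assumed signature: the full turn is scene + action + outcome.
        self.book.append((scene.narrative, action, outcome.narrative))

    def apply_state(self, outcome):
        pass  # the real engine writes character.md, world.md, and so on


def play(engine, actions):
    """Run the scene -> input -> resolve -> archive -> apply cycle."""
    shown = []
    scene = engine.generate()                        # step 2: opening scene
    for action in actions:                           # step 4: player input
        shown.append(scene.narrative)                # step 3: display
        outcome = engine.generate(action)            # step 5: resolve
        engine.archive_turn(scene, action, outcome)  # step 6: archive
        engine.apply_state(outcome)                  # step 7: apply
        scene = outcome                              # step 8: next scene
    shown.append(scene.narrative)
    return shown
```

Calling play(StubEngine(), ["Enter", "Rest"]) returns the three narratives that would have been displayed, in order.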

The Engine (engine.py)

  • GameEngine class loads config from session/config.json.
  • build_system_prompt() assembles the DM prompt from game rules + current state.
  • build_user_message() builds the per-turn message with player action context.
  • generate() calls litellm, returns parsed GenerationResult.
  • parse_response() extracts the JSON block from the LLM response.
  • apply_state() writes state changes to session files.
  • archive_turn() appends the narrative to book.md.

LLM Output Format

The LLM must end every response with a JSON fenced code block:

{
  "choices": ["Choice 1", "Choice 2"],
  "log_entry": "- **time** — description.",
  "ambience": "ambience_name_or_null",
  "character_updates": null,
  "world_updates": null,
  "journal_add": [],
  "journal_done": []
}

  • choices: 2-4 action options for the player.
  • log_entry: Single-line summary appended to today's log.
  • ambience: One of: silence, calm, combat, dungeon, forest, tavern, tension, town, wilds, or null.
  • character_updates: Full character sheet markdown only if HP/cash/gear/stats changed.
  • world_updates: Full world markdown only if NPCs/locations/threads changed.
  • journal_add / journal_done: TODO list management.
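
A rough sketch of how a trailing JSON block could be separated from the narrative is shown below. It is not the actual parse_response() implementation; the function name split_response and the fallback of returning an empty dict on malformed output are assumptions.

```python
import json

FENCE = "`" * 3  # built programmatically so the example nests cleanly


def split_response(text):
    """Return (narrative, data) for a response that ends in a fenced JSON block."""
    end = text.rfind(FENCE)                                # closing fence
    start = text.rfind(FENCE, 0, end) if end != -1 else -1  # opening fence
    if start == -1:
        return text.strip(), {}
    body = text[start + len(FENCE):end]
    # Drop the rest of the opening fence line (for example a "json" tag).
    _, _, body = body.partition("\n")
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return text.strip(), {}
    if not isinstance(data, dict):
        return text.strip(), {}
    return text[:start].strip(), data
```

Searching from the end of the text means earlier fenced blocks in the narrative are left alone, which matches the rule that the JSON block comes last.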

Session Config

{
  "llm": {
    "model": "ollama/llama3.1",
    "api_key": null,
    "api_base": null,
    "temperature": 0.8
  }
}

The model field accepts any litellm provider string: openai/gpt-4, anthropic/claude-sonnet-4-20250514, ollama/llama3.1, groq/llama3-70b-8192, etc. Set api_key for remote providers.
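
As an illustration of how these fields might reach litellm, the sketch below builds the keyword arguments for a completion call and leaves out any field set to null. The helper name completion_kwargs is hypothetical, and the engine may assemble its call differently.

```python
def completion_kwargs(config, messages):
    """Build keyword arguments for litellm.completion() from the session config."""
    llm = config["llm"]
    kwargs = {"model": llm["model"], "messages": messages}
    # Pass optional settings only when they are set, so provider defaults apply otherwise.
    for key in ("api_key", "api_base", "temperature"):
        if llm.get(key) is not None:
            kwargs[key] = llm[key]
    return kwargs
```

With litellm installed, the result can be passed as litellm.completion(**kwargs).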

Running

# Start the game
./run.sh

# Or directly
python3 tools/run.py

# No music
python3 tools/run.py --no-music

# Test a generation from CLI (no TUI)
python3 tools/engine.py --action "I head to the market"

Testing Commands

When committing, also use the pre-commit validator to check for oversized Python files.

Pre-commit Validation

Before committing, run the pre-commit validator script to ensure no Python file is too large.

./pre-commit.sh

Always run tests before making changes. This prevents runtime errors like missing imports.

# Quick test (runs import and runtime validation)
./run.sh

# Test with engine action
./run.sh --action "I check on Rina"

# Run tests only
python3 tools/test_imports.py
python3 tools/test_runtime.py

Test Coverage

  • tools/test_imports.py — Checks for missing imports using AST analysis
  • tools/test_runtime.py — Verifies module loads without errors, checks for missing classes/methods
  • Both tests should pass before proceeding with development
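
To give a feel for AST-based import checking, here is a deliberately narrow sketch. It is not the code in tools/test_imports.py: the function name and the watched-module list are assumptions, and it only catches the pattern module.attribute where the module was never imported.

```python
import ast


def missing_module_imports(source, modules=("random", "json", "re")):
    """Return watched modules used as `name.attr` in source but never imported."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x".
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            used.add(node.value.id)
    return sorted((used & set(modules)) - imported)
```

Run against a file that calls random.randint() without importing random, this reports ["random"], which is the same mistake that surfaces at runtime as a NameError.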

LLM Strategies

The engine supports two strategies for LLM interaction:

  1. "conversational" — 3-phase approach (prose → summarize → extract)
  2. "tools" — Single-call approach with tool blocks

Default is "tools" for faster single-call generation.
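
One way the strategy setting could be mapped to a generation method is sketched below. The method names generate_with_tools and generate_with_tools_single come from this document; the dispatch helper pick_generator and its error handling are assumptions.

```python
def pick_generator(engine, config):
    """Return the engine method matching the configured strategy."""
    strategy = config.get("llm", {}).get("strategy", "tools")  # "tools" is the default
    handlers = {
        "conversational": engine.generate_with_tools,   # 3-phase approach
        "tools": engine.generate_with_tools_single,     # single-call approach
    }
    if strategy not in handlers:
        raise ValueError(f"unknown LLM strategy: {strategy!r}")
    return handlers[strategy]
```

Failing loudly on an unknown value keeps a typo in config.json from silently selecting the default.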

Configuration

{
  "llm": {
    "model": "openai/llama3",
    "api_key": "sk-bogus-key",
    "api_base": "http://localhost:8080/v1",
    "temperature": 0.8,
    "timeout": 120,
    "max_tokens": 10000,
    "strategy": "tools"
  }
}

Important Notes

  • The random module must be imported in tools/engine.py before use; it drives the dice rolls in _tool_roll, generate_with_tools, and generate_with_tools_single
  • All LLM responses go through _call_llm which logs complete output with markers
  • The engine extracts both content and reasoning_content fields from responses (for OpenAI-compatible servers)
  • The generate_with_tools_single() method handles single-call tool-based generation

LLM Logging

The engine logs detailed information to llm.log:

============================================================
=== Turn — 2026-06-28 23:21:58 ===
============================================================
Player: I smash the demon
Dice: 2 (1d6)

[TOOL] Single call — 8615 chars system, 219 chars user
System preview: You are an RPG dungeon master. The player just took an action....
User preview: ## Situation...

┌─ Single tool call ───────────────────────────────────────────────────────────
├─ Model: openai/llama3 | Temp: 0.80 | Tokens: 4096
├─ Messages:
├   [SYSTEM]: You are an RPG dungeon master. The player just took an action.

Narrate the outcome in engaging, vivid prose. Use tools for any mechanics (rolls, damage, state changes). Only use ```tool blocks — no p...
├   [USER]: ## Situation
What do you do?

## Player's Request
I smash the demon

## Instructions
Advance the story based on the player's request. All state is shown above — write the outcome directly.

*A ...
└─ Response:
└   The die cast for this turn is a 2. The player wants to smash the demon. I need to narrate the outcome of that action, incorporating the die result and the combat mechanics.

First, determine if the attack hits. The demon is a large creature (size 5). I assume the player's DEX is 14, so the roll to hit is 1d6 with a 4+ favourable. The die result is 2, which is a failure. However, the player might have a modifier. The demon is a weaver? No, it's a demon. The player is Dillion, who just took -4 HP ...
└───────────────────────────────────────────────────────────────────────


[TOOL] got 17372 chars in 97396.3ms

[TOOL] no tool blocks found
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│ Turn Details — 2026-06-28 23:23:36.097                                                                            │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│ Input: I smash the demon              │
│ Last Prompt: What do you do?          │
│ Strategy: tools                                                         │
│ Dice: 2 (1d6)                                                                │
│ Model: openai/llama3 | Temp: 0.8 | Tokens: 10000                            │
│ Output: 0 chars (0 words)                                    │
│ Log Entry:                                                          │
│ Ambience: None                                                    │
│ Tool Calls: 0 ()       │
│─────────────────────────────────────────────────────────────────────────────────────────┘

Debug Markers

  • ┌─ / └─ — Response markers
  • ├─ — Message lines
  • [TOOL] — Tool execution logs
  • [DEBUG LLM RESPONSE] — Raw LLM response logging
  • [DEBUG RESPONSE LENGTH] — Response length logging
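
These markers make llm.log easy to scan with a short script. The sketch below pulls response size and latency out of lines shaped like the "[TOOL] got ... chars in ...ms" entry in the sample above; the helper name is hypothetical, and the pattern assumes that line format stays stable.

```python
import re

# Matches lines such as "[TOOL] got 17372 chars in 97396.3ms".
TOOL_TIMING = re.compile(r"^\[TOOL\] got (\d+) chars in ([\d.]+)ms$")


def tool_timings(lines):
    """Yield (chars, milliseconds) for each timing line; other lines are skipped."""
    for line in lines:
        match = TOOL_TIMING.match(line.strip())
        if match:
            yield int(match.group(1)), float(match.group(2))
```

For a real log, pass an open file handle, for example list(tool_timings(open("llm.log", encoding="utf-8"))).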

TUI Debug Tab

The TUI has a DEBUG tab that shows:

  • LLM configuration
  • Tool calls with arguments
  • LLM errors with tracebacks
  • Turn details (timestamps, dice rolls, response sizes, tool call counts)
  • Phase progress (Phase 1/3, Phase 2/3, Phase 3/3)

Key Code References

  • _call_llm — Core LLM interaction with logging
  • generate_with_tools — 3-phase conversational approach
  • generate_with_tools_single — Single-call tool-based approach
  • _log_turn_details — Comprehensive turn logging
  • _on_debug — Structured debug output to TUI

Common Errors & Fixes

NameError: name 'random' is not defined

Add import random at the top of tools/engine.py. The random module is used in:

  • Line ~488: Dice rolling in _tool_roll
  • Line ~926: Random die roll in generate_with_tools
  • Line ~1232: Random die roll in generate_with_tools_single

NameError: name 'read_todo' is not defined

The read_todo function must be defined in tools/run.py. It reads TODO items from journal.md.

NameError: name 'read_log_tail' is not defined

The read_log_tail function must be defined in tools/run.py. It reads the tail of the session log.

Testing Workflow

  1. Before coding: Run ./run.sh to ensure imports and runtime are valid
  2. After coding: Run ./run.sh --action "test action" to test the engine
  3. Before committing: Run both tests to ensure no missing imports

# Quick validation
./run.sh

# Test with engine action
./run.sh --action "I check on Rina"

# Manual tests
python3 tools/test_imports.py
python3 tools/test_runtime.py

LLM Response Handling

The engine handles both content and reasoning_content fields from LLM responses:

text = response.choices[0].message.content or response.choices[0].message.reasoning_content or ""

This keeps the engine compatible with OpenAI-compatible servers that return their output in the reasoning_content field rather than in content.
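
A slightly more defensive variant of the same fallback is sketched here, on the assumption that some servers omit the reasoning_content attribute entirely. The helper name response_text is hypothetical, and SimpleNamespace objects stand in for a real response.

```python
from types import SimpleNamespace


def response_text(response):
    """Prefer content, fall back to reasoning_content, else an empty string."""
    message = response.choices[0].message
    content = getattr(message, "content", None)
    # getattr avoids an AttributeError when the field is missing altogether.
    reasoning = getattr(message, "reasoning_content", None)
    return content or reasoning or ""


def fake_response(**fields):
    """Build a minimal response-shaped object for experimentation."""
    return SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(**fields))])
```

For example, response_text(fake_response(content=None, reasoning_content="The demon staggers.")) returns the reasoning text.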

Timeout & Token Configuration

  • Default timeout: 120 seconds (configurable in config.json)
  • Default max tokens: 10000 (configurable in config.json)
  • Adjust these values based on your LLM provider's limits
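
The defaults can be pictured as a dictionary that explicit config values override. This is a sketch rather than the engine's loader: the names DEFAULTS, with_defaults, and load_llm_config are hypothetical, and treating null values as unset is an assumption.

```python
import json
from pathlib import Path

# Defaults stated in this document.
DEFAULTS = {"timeout": 120, "max_tokens": 10000, "strategy": "tools"}


def with_defaults(raw):
    """Merge the "llm" section over DEFAULTS, ignoring null values."""
    llm = raw.get("llm", {})
    explicit = {key: value for key, value in llm.items() if value is not None}
    return {**DEFAULTS, **explicit}


def load_llm_config(path="session/config.json"):
    """Read the config file and return the effective LLM settings."""
    return with_defaults(json.loads(Path(path).read_text(encoding="utf-8")))
```

A config that sets only "timeout": 300 would therefore still get max_tokens 10000 and the "tools" strategy.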

Session Files

  • session/config.json — LLM config (edit directly)
  • session/character.md — PC state (written by engine)
  • session/world.md — Realm state (written by engine)
  • session/book.md — Story archive (written by engine)
  • session/journal.md — TODO/DONE list (written by engine)
  • session/ambience.md — Current ambience (written by engine)
  • session/log/<date>.md — Session logs (written by engine)
  • session/tweaks.md — House rules (manual edit)

LLM Strategies Explained

"conversational" Strategy

Uses three separate LLM calls:

  1. Prose — Writes full book_log from context + player action
  2. Summarize — Condenses book_log into one log line
  3. Extract — Reads book_log and outputs tool calls for state changes

Retry loop: 3 attempts; if extraction fails in Phase 3, the engine falls back to Phase 1.
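
The sketch below shows one reading of that retry behavior: when extraction fails, the whole turn restarts from the prose phase, up to three times. The three phases are passed in as plain callables, and signalling extraction failure with ValueError is an assumption; the engine's generate_with_tools may differ on both points.

```python
def run_conversational(prose, summarize, extract, max_attempts=3):
    """Run prose -> summarize -> extract, restarting from prose if extract fails."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        book_log = prose()                    # Phase 1: full narrative
        log_entry = summarize(book_log)       # Phase 2: one log line
        try:
            tool_calls = extract(book_log)    # Phase 3: state changes
        except ValueError as exc:
            last_error = exc
            continue                          # fall back to Phase 1
        return {
            "book_log": book_log,
            "log_entry": log_entry,
            "tool_calls": tool_calls,
            "attempts": attempt,
        }
    raise RuntimeError(f"extraction failed after {max_attempts} attempts") from last_error
```

Each retry costs three LLM calls under this reading, which is part of why the single-call strategy is faster.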

"tools" Strategy

Uses a single LLM call with all tools available:

  • System prompt instructs LLM to use tools for changes
  • Single call outputs narrative + tool blocks together
  • No retry loop: if the call fails, the turn fails
  • Extracts tool blocks, applies changes, summarizes in one pass
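
The extraction step might look roughly like the sketch below, which separates fenced tool blocks from the surrounding narrative. This document does not specify what goes inside a tool block, so the JSON payload with "name" and "args" keys is purely an assumption for illustration, as is the helper name.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example nests cleanly
TOOL_BLOCK = re.compile(FENCE + r"tool[ \t]*\n(.*?)\n[ \t]*" + FENCE, re.DOTALL)


def split_tool_blocks(text):
    """Return (narrative, calls): text with tool blocks removed, plus parsed calls."""
    calls = []
    for match in TOOL_BLOCK.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))  # assumed JSON payload
        except json.JSONDecodeError:
            continue  # skip blocks that do not parse
    narrative = TOOL_BLOCK.sub("", text).strip()
    return narrative, calls
```

A response with no tool blocks yields an empty call list, which corresponds to the "[TOOL] no tool blocks found" line in the log sample.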

Debugging Tips

  1. Check llm.log for detailed LLM interaction logs
  2. Use the TUI's DEBUG tab for structured debug output
  3. Run tests before making changes
  4. Check config.json for LLM settings
  5. Look for missing imports in the engine.py file
  6. Verify that the LLM provider is correctly configured in config.json