splinter-keep/AGENTS.md
2026-06-30 19:40:58 +02:00

# The Chaos — Game Architecture
This document describes the system architecture for developers and AI agents
working on the codebase.
## Design Principle
The Chaos is a **self-contained terminal game**. The TUI owns the full game
loop — including LLM calls — so there is no split between a "DM agent" in chat
and a "dashboard" in the terminal. The player runs one command:
```bash
python3 tools/run.py
```
Everything — narrative, choices, character sheet, log, archive, ambience — lives
in that process.
## Project Layout
```
the-chaos/
├── rules/                    # LOCKED — game rules, do not modify
│   ├── deck/                 # Card tables
│   └── mechanics.md          # Core mechanics reference
├── tools/                    # Game system code
│   ├── __init__.py
│   ├── engine.py             # Game engine (prompt builder, LLM client, parser, state)
│   ├── run.py                # TUI (Textual app, game loop, narrative, input)
│   ├── ambience.py           # CLI shortcut for ambience switching
│   ├── draw_card.py          # Card drawing tool
│   ├── music-fetch.py        # YouTube audio downloader
│   ├── roll_dice.py          # Dice rolling tool
│   ├── store_turn.py         # DEPRECATED — use engine.py archive_turn instead
│   ├── test_imports.py       # Import validation test
│   └── test_runtime.py       # Runtime import test
├── scripts/                  # UNLOCKED — helper scripts
├── run.sh                    # Entry point (just calls tools/run.py)
└── session/                  # Game state (read/write by engine)
    ├── config.json           # LLM provider config
    ├── character.md          # Player character sheet
    ├── world.md              # Keep & Realm state
    ├── book.md               # Story book (append-only turn archive)
    ├── journal.md            # TODO / DONE tracking
    ├── ambience.md           # Current ambience name
    ├── ambience_options.md   # Ambience → file mapping
    ├── ambience_sources.md   # Track source URLs
    ├── tweaks.md             # House rules log
    ├── audio/                # Music files
    └── log/                  # Session logs by date
```
## How It Works
### Tools
| Tool | Role |
|------|------|
| `tools/engine.py` | Game engine. Owns the LLM interaction, prompt assembly, response parsing, and state persistence. Can be used standalone from the CLI for debugging. |
| `tools/run.py` | TUI (Textual app). Owns the game loop: display narrative → get player input → call engine → display result. |
### The Game Loop (run.py)
1. **Mount**: Load engine, build system prompt (rules + character + world + log).
2. **Scene**: Call `engine.generate()` → receive narrative + choices.
3. **Display**: Show narrative in main pane, render choice buttons.
4. **Input**: Player clicks a choice or types free text, presses Enter.
5. **Resolve**: Call `engine.generate(player_action)` → receive outcome + state changes.
6. **Archive**: Append the full turn (scene + action + outcome) to `book.md`.
7. **Apply**: Write state changes to `character.md`, `world.md`, `log/`, `ambience.md`, `journal.md`.
8. **Loop**: Display the next scene → go to step 3.
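The steps above can be sketched in a few lines. This is a minimal illustration and not the code in `tools/run.py`: `StubEngine`, `Result`, and `play_turn` are hypothetical names, and the stub returns canned text instead of calling an LLM or writing session files.
```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in for the engine's GenerationResult."""
    narrative: str
    choices: list = field(default_factory=list)


class StubEngine:
    """Stand-in engine: returns canned text instead of calling an LLM."""

    def __init__(self):
        self.book = []  # plays the role of book.md (append-only)

    def generate(self, player_action=None):
        if player_action is None:
            return Result("You stand at the gate.", ["Enter", "Wait"])
        return Result(f"You chose: {player_action}.", ["Continue", "Rest"])

    def archive_turn(self, scene, action, outcome):
        self.book.append((scene.narrative, action, outcome.narrative))


def play_turn(engine, scene, action):
    """Resolve the action, archive the full turn, and return the next scene."""
    outcome = engine.generate(action)
    engine.archive_turn(scene, action, outcome)
    return outcome


engine = StubEngine()
scene = engine.generate()                            # opening scene
scene = play_turn(engine, scene, scene.choices[0])   # player picks "Enter"
print(scene.narrative)
```
Keeping the turn logic in a plain function like `play_turn` makes it possible to exercise the loop without starting the TUI.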
### The Engine (engine.py)
- `GameEngine` class loads config from `session/config.json`.
- `build_system_prompt()` assembles the DM prompt from game rules + current state.
- `build_user_message()` builds the per-turn message with player action context.
- `generate()` calls litellm, returns parsed `GenerationResult`.
- `parse_response()` extracts the JSON block from the LLM response.
- `apply_state()` writes state changes to session files.
- `archive_turn()` appends the narrative to `book.md`.
### LLM Output Format
The LLM must end every response with a JSON fenced code block:
```json
{
  "choices": ["Choice 1", "Choice 2"],
  "log_entry": "- **time** — description.",
  "ambience": "ambience_name_or_null",
  "character_updates": null,
  "world_updates": null,
  "journal_add": [],
  "journal_done": []
}
```
- `choices`: 2-4 action options for the player.
- `log_entry`: Single-line summary appended to today's log.
- `ambience`: One of: silence, calm, combat, dungeon, forest, tavern, tension, town, wilds.
- `character_updates`: Full character sheet markdown only if HP/cash/gear/stats changed.
- `world_updates`: Full world markdown only if NPCs/locations/threads changed.
- `journal_add` / `journal_done`: TODO list management.
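A minimal sketch of how the trailing JSON block could be separated from the narrative. `split_response` is a hypothetical helper, not the engine's `parse_response()`, and the choice to return `None` on a missing or invalid block is an assumption. The fence marker is built from single backticks so the example can live inside a fenced block itself.
```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime
JSON_BLOCK = re.compile(FENCE + r"json\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def split_response(text):
    """Return (narrative, data) for a response that ends with a json block.

    Returns (text, None) when no block is found or the block is not valid JSON.
    """
    matches = list(JSON_BLOCK.finditer(text))
    if not matches:
        return text.strip(), None
    last = matches[-1]  # the format puts the block at the end of the response
    try:
        data = json.loads(last.group(1))
    except json.JSONDecodeError:
        return text.strip(), None
    return text[: last.start()].strip(), data


reply = (
    "The gate creaks open.\n\n"
    + FENCE + "json\n"
    + '{"choices": ["Enter", "Wait"], "log_entry": "- **dawn** - gate opened.",\n'
    + ' "ambience": "tension", "character_updates": null, "world_updates": null,\n'
    + ' "journal_add": [], "journal_done": []}\n'
    + FENCE
)
narrative, data = split_response(reply)
print(narrative)
print(data["choices"])
```
JSON `null` becomes Python `None`, so a check such as `data["character_updates"] is None` is enough to decide whether a session file needs rewriting.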
### Session Config
```json
{
  "llm": {
    "model": "ollama/llama3.1",
    "api_key": null,
    "api_base": null,
    "temperature": 0.8
  }
}
```
The `model` field accepts any litellm provider string: `openai/gpt-4`,
`anthropic/claude-sonnet-4-20250514`, `ollama/llama3.1`, `groq/llama3-70b-8192`,
etc. Set `api_key` for remote providers.
## Running
```bash
# Start the game
./run.sh
# Or directly
python3 tools/run.py
# No music
python3 tools/run.py --no-music
# Test a generation from CLI (no TUI)
python3 tools/engine.py --action "I head to the market"
```
## Pre-commit Validation
Before committing, run the pre-commit validator script to check that no Python file is oversized.
```bash
./pre-commit.sh
```
## Testing Commands
Always run the tests before making changes. This catches runtime errors such as missing imports early.
```bash
# Quick test (runs import and runtime validation)
./run.sh
# Test with engine action
./run.sh --action "I check on Rina"
# Run tests only
python3 tools/test_imports.py
python3 tools/test_runtime.py
```
### Test Coverage
- `tools/test_imports.py` — Checks for missing imports using AST analysis
- `tools/test_runtime.py` — Verifies module loads without errors, checks for missing classes/methods
- Both tests should pass before proceeding with development
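To illustrate the AST approach, here is a simplified sketch of a missing-import check. It is not the code in `tools/test_imports.py`, whose details this document does not describe. `find_unimported_modules` is a hypothetical name, and the check deliberately ignores scoping: any import, assignment, function, or class anywhere in the file counts as binding the name.
```python
import ast
import builtins


def find_unimported_modules(source, candidates):
    """Return names from `candidates` used as `name.attr` but never bound."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x"
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            used.add(node.value.id)
    return sorted((used & set(candidates)) - bound)


broken = "def roll():\n    return random.randint(1, 6)\n"
fixed = "import random\n" + broken
print(find_unimported_modules(broken, {"random", "json"}))  # ['random']
print(find_unimported_modules(fixed, {"random", "json"}))   # []
```
Because the source is only parsed and never executed, a check like this can flag the `random` problem described under Common Errors without calling the LLM.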
## LLM Strategies
The engine supports two strategies for LLM interaction:
1. **"conversational"** — 3-phase approach (prose → summarize → extract)
2. **"tools"** — Single-call approach with tool blocks
The default is "tools", which is faster because it generates each turn in a single call.
### Configuration
```json
{
  "llm": {
    "model": "openai/llama3",
    "api_key": "sk-bogus-key",
    "api_base": "http://localhost:8080/v1",
    "temperature": 0.8,
    "timeout": 120,
    "max_tokens": 10000,
    "strategy": "tools"
  }
}
```
### Important Notes
- `tools/engine.py` must import the `random` module, which it uses for dice rolling in `_tool_roll` and for the per-turn die roll in both generation methods
- All LLM responses go through `_call_llm` which logs complete output with markers
- The engine extracts both `content` and `reasoning_content` fields from responses (for OpenAI-compatible servers)
- The `generate_with_tools_single()` method handles single-call tool-based generation
## LLM Logging
The engine logs detailed information to `llm.log`:
```
============================================================
=== Turn — 2026-06-28 23:21:58 ===
============================================================
Player: I smash the demon
Dice: 2 (1d6)
[TOOL] Single call — 8615 chars system, 219 chars user
System preview: You are an RPG dungeon master. The player just took an action....
User preview: ## Situation...
┌─ Single tool call ───────────────────────────────────────────────────────────
├─ Model: openai/llama3 | Temp: 0.80 | Tokens: 4096
├─ Messages:
├ [SYSTEM]: You are an RPG dungeon master. The player just took an action.
Narrate the outcome in engaging, vivid prose. Use tools for any mechanics (rolls, damage, state changes). Only use ```tool blocks — no p...
├ [USER]: ## Situation
What do you do?
## Player's Request
I smash the demon
## Instructions
Advance the story based on the player's request. All state is shown above — write the outcome directly.
*A ...
└─ Response:
└ The die cast for this turn is a 2. The player wants to smash the demon. I need to narrate the outcome of that action, incorporating the die result and the combat mechanics.
First, determine if the attack hits. The demon is a large creature (size 5). I assume the player's DEX is 14, so the roll to hit is 1d6 with a 4+ favourable. The die result is 2, which is a failure. However, the player might have a modifier. The demon is a weaver? No, it's a demon. The player is Dillion, who just took -4 HP ...
└───────────────────────────────────────────────────────────────────────
[TOOL] got 17372 chars in 97396.3ms
[TOOL] no tool blocks found
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│ Turn Details — 2026-06-28 23:23:36.097 │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│ Input: I smash the demon │
│ Last Prompt: What do you do? │
│ Strategy: tools │
│ Dice: 2 (1d6) │
│ Model: openai/llama3 | Temp: 0.8 | Tokens: 10000 │
│ Output: 0 chars (0 words) │
│ Log Entry: │
│ Ambience: None │
│ Tool Calls: 0 () │
│─────────────────────────────────────────────────────────────────────────────────────────┘
```
### Debug Markers
- `┌─` / `└─` — Response markers
- `├─` — Message lines
- `[TOOL]` — Tool execution logs
- `[DEBUG LLM RESPONSE]` — Raw LLM response logging
- `[DEBUG RESPONSE LENGTH]` — Response length logging
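When a turn is slow, the `[TOOL] got ... chars in ...ms` lines are a quick place to look. The sketch below pulls response size and latency out of log text. `tool_timings` is a hypothetical helper, and the pattern is based only on the sample log line shown above, so it may need adjusting if the real format varies.
```python
import re

# Pattern derived from the sample line "[TOOL] got 17372 chars in 97396.3ms".
GOT_LINE = re.compile(r"\[TOOL\] got (\d+) chars in ([\d.]+)ms")


def tool_timings(log_text):
    """Yield (chars, seconds) for each '[TOOL] got ...' line in the text."""
    for match in GOT_LINE.finditer(log_text):
        yield int(match.group(1)), float(match.group(2)) / 1000.0


sample = (
    "[TOOL] got 17372 chars in 97396.3ms\n"
    "[TOOL] no tool blocks found\n"
)
for chars, seconds in tool_timings(sample):
    print(f"{chars} chars in {seconds:.1f}s")
```
In practice the text would come from reading `llm.log`; the sample string keeps the example self-contained.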
## TUI Debug Tab
The TUI has a DEBUG tab that shows:
- LLM configuration
- Tool calls with arguments
- LLM errors with tracebacks
- Turn details (timestamps, dice rolls, response sizes, tool call counts)
- Phase progress (Phase 1/3, Phase 2/3, Phase 3/3)
## Key Code References
- `_call_llm` — Core LLM interaction with logging
- `generate_with_tools` — 3-phase conversational approach
- `generate_with_tools_single` — Single-call tool-based approach
- `_log_turn_details` — Comprehensive turn logging
- `_on_debug` — Structured debug output to TUI
## Common Errors & Fixes
### NameError: name 'random' is not defined
Add `import random` at the top of `tools/engine.py`. The `random` module is used in:
- Line ~488: Dice rolling in `_tool_roll`
- Line ~926: Random die roll in `generate_with_tools`
- Line ~1232: Random die roll in `generate_with_tools_single`
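For reference, a minimal dice roller with the import in place. This is a sketch, not the body of `_tool_roll`: the `roll` function and its `NdS` notation parsing are assumptions made for illustration.
```python
import random  # without this line, random.randint below raises NameError
import re


def roll(notation, rng=random):
    """Roll dice written as "NdS" (for example "1d6") and return the total."""
    match = re.fullmatch(r"(\d+)d(\d+)", notation.strip())
    if match is None:
        raise ValueError(f"bad dice notation: {notation!r}")
    count, sides = int(match.group(1)), int(match.group(2))
    return sum(rng.randint(1, sides) for _ in range(count))


print(roll("1d6"))
```
Passing a seeded `random.Random` instance as `rng` makes rolls reproducible in tests.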
### NameError: name 'read_todo' is not defined
The `read_todo` function must be defined in `tools/run.py`. It reads TODO items from `journal.md`.
### NameError: name 'read_log_tail' is not defined
The `read_log_tail` function must be defined in `tools/run.py`. It reads the tail of the session log.
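If either helper is missing, something along these lines could stand in. Both signatures are hypothetical, and the journal format (`- [ ]` checkbox lines for open items) is an assumption, since this document does not specify how `journal.md` is laid out.
```python
import tempfile
from pathlib import Path


def read_log_tail(path, n=5):
    """Return the last n non-empty lines of a file ([] if it is missing)."""
    path = Path(path)
    if not path.exists():
        return []
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return lines[-n:]


def read_todo(path):
    """Return open TODO items, assuming '- [ ] item' checkbox lines."""
    path = Path(path)
    if not path.exists():
        return []
    prefix = "- [ ] "
    return [
        ln[len(prefix):].strip()
        for ln in path.read_text(encoding="utf-8").splitlines()
        if ln.startswith(prefix)
    ]


with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / "journal.md"
    journal.write_text(
        "- [ ] Visit Rina\n- [x] Buy rope\n- [ ] Repair the gate\n", encoding="utf-8"
    )
    todo = read_todo(journal)
    tail = read_log_tail(journal, n=2)
    missing = read_log_tail(Path(tmp) / "nope.md")

print(todo)
```
Returning an empty list for a missing file lets the TUI start on a fresh session before any log or journal exists.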
## Testing Workflow
1. **Before coding**: Run `./run.sh` to ensure imports and runtime are valid
2. **After coding**: Run `./run.sh --action "test action"` to test the engine
3. **Before committing**: Run both tests to ensure no missing imports
```bash
# Quick validation
./run.sh
# Test with engine action
./run.sh --action "I check on Rina"
# Manual tests
python3 tools/test_imports.py
python3 tools/test_runtime.py
```
## LLM Response Handling
The engine handles both `content` and `reasoning_content` fields from LLM responses:
```python
text = response.choices[0].message.content or response.choices[0].message.reasoning_content or ""
```
This keeps the engine working with OpenAI-compatible servers that return their output in the `reasoning_content` field instead of `content`.
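A slightly more defensive variant is sketched below. It assumes that some response objects may lack a `reasoning_content` attribute entirely, in which case plain attribute access would raise `AttributeError`. `extract_text` and `fake_response` are hypothetical names; the fake objects only mimic the `choices[0].message` shape.
```python
from types import SimpleNamespace


def extract_text(response):
    """Return content, falling back to reasoning_content, else ""."""
    message = response.choices[0].message
    # getattr tolerates message objects that lack either attribute.
    content = getattr(message, "content", None)
    reasoning = getattr(message, "reasoning_content", None)
    return content or reasoning or ""


def fake_response(**fields):
    """Build a minimal object shaped like a chat completion (test double)."""
    message = SimpleNamespace(**fields)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


print(extract_text(fake_response(content="Hello")))
print(extract_text(fake_response(content=None, reasoning_content="Thinking")))
print(repr(extract_text(fake_response())))
```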
## Timeout & Token Configuration
- Default timeout: 120 seconds (configurable in config.json)
- Default max tokens: 10000 (configurable in config.json)
- Adjust these values based on your LLM provider's limits
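One way to apply these defaults is to merge them under whatever `config.json` provides. The sketch below is an assumption about how that could look, not the engine's actual loader; `load_llm_config` is a hypothetical name and the example writes a temporary file rather than touching `session/config.json`.
```python
import json
import tempfile
from pathlib import Path

# Default values as stated in this document; the merge logic is an assumption.
DEFAULTS = {"timeout": 120, "max_tokens": 10000, "strategy": "tools"}


def load_llm_config(path):
    """Read the "llm" section of a config file and fill in missing keys."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return {**DEFAULTS, **raw.get("llm", {})}


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(
        json.dumps({"llm": {"model": "ollama/llama3.1", "timeout": 300}}),
        encoding="utf-8",
    )
    cfg = load_llm_config(config_path)

print(cfg["timeout"], cfg["max_tokens"])  # 300 10000
```
Values from the file win over the defaults, so a slow local model can raise `timeout` without restating the other keys.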
## Session Files
- `session/config.json` — LLM config (edit directly)
- `session/character.md` — PC state (written by engine)
- `session/world.md` — Realm state (written by engine)
- `session/book.md` — Story archive (written by engine)
- `session/journal.md` — TODO/DONE list (written by engine)
- `session/ambience.md` — Current ambience (written by engine)
- `session/log/<date>.md` — Session logs (written by engine)
- `session/tweaks.md` — House rules (manual edit)
## LLM Strategies Explained
### "conversational" Strategy
Uses three separate LLM calls:
1. **Prose** — Writes full book_log from context + player action
2. **Summarize** — Condenses book_log into one log line
3. **Extract** — Reads book_log and outputs tool calls for state changes
Retry loop: up to 3 attempts. If Phase 3 (extraction) fails, the engine falls back to Phase 1.
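The control flow can be sketched as follows. This is one reading of the retry behaviour (a failed extraction restarts the turn from the prose phase) and not the code in `generate_with_tools`. The three callables are hypothetical stand-ins for the LLM calls, and `extract` is assumed to signal failure by returning `None`.
```python
def run_conversational(prose, summarize, extract, max_attempts=3):
    """Run prose -> summarize -> extract, restarting from prose on failure."""
    for attempt in range(1, max_attempts + 1):
        book_log = prose()               # Phase 1: full narrative
        log_line = summarize(book_log)   # Phase 2: one-line summary
        tool_calls = extract(book_log)   # Phase 3: state changes
        if tool_calls is not None:
            return {
                "book_log": book_log,
                "log_line": log_line,
                "tool_calls": tool_calls,
                "attempts": attempt,
            }
    raise RuntimeError(f"extraction failed after {max_attempts} attempts")


calls = {"extract": 0}


def flaky_extract(book_log):
    """Fail on the first call, succeed afterwards (simulates a bad reply)."""
    calls["extract"] += 1
    return None if calls["extract"] == 1 else [{"tool": "roll", "dice": "1d6"}]


result = run_conversational(
    prose=lambda: "The demon staggers.",
    summarize=lambda text: "- demon staggered.",
    extract=flaky_extract,
)
print(result["attempts"])  # 2
```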
### "tools" Strategy
Uses a single LLM call with all tools available:
- System prompt instructs LLM to use tools for changes
- Single call outputs narrative + tool blocks together
- No retry loop: if the call fails, the turn fails
- Extracts tool blocks, applies changes, summarizes in one pass
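A sketch of the extraction step, under a loud assumption: the system prompt asks for fenced `tool` blocks, but this document does not show what goes inside them, so the example treats each block body as a JSON object. `split_tool_blocks` is a hypothetical helper, and the fence marker is built from single backticks so the example can sit inside a fenced block.
```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime
TOOL_BLOCK = re.compile(FENCE + r"tool\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def split_tool_blocks(text):
    """Return (narrative, calls) with tool blocks removed from the narrative.

    Assumes each block body is a JSON object; unparseable blocks are skipped.
    """
    calls = []
    for match in TOOL_BLOCK.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue
    narrative = TOOL_BLOCK.sub("", text)
    narrative = re.sub(r"\n{3,}", "\n\n", narrative).strip()
    return narrative, calls


reply = (
    "You swing and miss.\n\n"
    + FENCE + 'tool\n{"name": "roll", "dice": "1d6"}\n' + FENCE
    + "\n\nThe demon laughs."
)
narrative, calls = split_tool_blocks(reply)
print(narrative)
print(calls)
```
An empty `calls` list corresponds to the `[TOOL] no tool blocks found` line that appears in `llm.log`.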
## Debugging Tips
1. Check `llm.log` for detailed LLM interaction logs
2. Use the TUI's DEBUG tab for structured debug output
3. Run tests before making changes
4. Check config.json for LLM settings
5. Look for missing imports in the engine.py file
6. Verify that the LLM provider is correctly configured in config.json