The Dungeon Master Pattern

Durable State Machines That Survive Server Crashes

I spent years building multiplayer backends with Node.js, and the operational pain was real. Every time I restarted a server, every in-flight game session died. Players lost their progress. The state machine driving the game logic -- whose turn it was, what phase the round was in, which cards had been played -- all gone. I had to build external coordination layers: Redis for state, RabbitMQ for message queuing, Postgres for checkpointing. The saga pattern crept in with its compensating transactions, and suddenly my "simple card game server" had more infrastructure than a banking system.

The dungeon master pattern is the alternative I built into Adama. The server controls the workflow -- like a tabletop game's Dungeon Master -- and the entire execution state survives crashes, restarts, and migrations between hosts. No external coordinators. No saga compensations. Just straight-line code that happens to be immortal.

The Core Idea: Server-Driven Workflows

Most backend architectures are passive. A request arrives, you process it, you respond. The client orchestrates everything. But games, approval workflows, auctions, and onboarding flows all share a common property: the server needs to drive the conversation. It decides whose turn it is, asks specific people for input, enforces ordering, and handles timeouts.

In Adama, this is expressed through state machines defined with the # notation. Each state is a named block of code that executes when the document enters that state:

#waitingForPlayers {
  // Logic runs when we enter this state
}

#gameInProgress {
  // Different logic here
}

#gameOver {
  // Final state
}

The transition keyword moves between states. Crucially, transitions are not immediate jumps -- they schedule the next state to run after the current transaction commits. This means every state transition is a durable checkpoint.
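Nothing runs until the document enters its first state. A minimal sketch of kicking off the machine from the document's constructor (assuming Adama's @construct event; the state name is illustrative):

@construct {
  // Runs once, when the document is created
  transition #waitingForPlayers;
}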

State Machine Lifecycle with Durable Persistence

[Diagram: state machine lifecycle with durable persistence. States flow #setup -> #playerTurn -> #scoring -> #gameOver, with a "next player" loop back into #playerTurn. After every transition, durable storage persists the state label, document fields, pending futures, and delayed transitions. A server crash at #playerTurn while awaiting player2's move loses nothing: the state and the pending fetch were persisted to disk, so on restart the server loads the snapshot, resumes at #playerTurn, and player2's pending request is still active.]

Channels: The Client-Server Communication Layer

Channels are how clients talk to documents. A channel is a typed endpoint that accepts a specific message type and runs handler code. Think of them as the API surface of a document:

message Say { string text; }

channel say(Say msg) {
  _chat <- {who: @who, text: msg.text, when: Time.datetime()};
}

But the real power shows up with incomplete channels -- channels declared without handlers. These exist so the state machine can ask specific clients for input, flipping the usual client-server relationship:

message Move { int x; int y; }
channel<Move> move_channel;

Fetch and Await: Durable Async That Survives Anything

The fetch and await pair is where the dungeon master pattern gets interesting. fetch requests input from a specific principal (player). await blocks until that input arrives. But this is not a busy-wait. The document state -- including the pending request -- is persisted to durable storage. The server can crash, restart on a different machine, and the await will resume exactly where it left off when the player finally responds.

#playerTurn {
  future<Move> f = move_channel.fetch(current_player);
  Move m = f.await();
  applyMove(m);

  if (checkWinner()) {
    transition #gameOver;
  } else {
    current_player = getNextPlayer();
    transition #playerTurn;
  }
}

Beyond fetch, channels provide decide and choose for structured input. decide(principal, options[]) restricts the player to picking from a set of valid options -- no cheating. choose(principal, options[], limit) lets the player select a subset. Both return futures and are fully durable:

channel<Play> play;

#playerTurn {
  list<Play> open = iterate _board where piece == @no_one;
  if (play.decide(current_player, @convert<Play>(open)).await() as pick) {
    applyMove(pick);
  }
  transition #nextTurn;
}

The separation of fetch and await into two calls is deliberate. It enables parallel requests to multiple players:

#getResponses {
  future<Answer> f1 = answer.fetch(player1);
  future<Answer> f2 = answer.fetch(player2);

  // Both players respond independently. Order doesn't matter.
  Answer a1 = f1.await();
  Answer a2 = f2.await();

  processAnswers(a1, a2);
  transition #nextPhase;
}

Even though we await player1 first, player2 can respond at any time. If player2 responds before player1, their answer is queued and immediately available when we reach f2.await().

Delayed Transitions: Durable Timers Without Cron

To delay a transition, add the in keyword followed by a number of seconds:

#bidding {
  transition #auctionClosed in 3600;  // Close after 1 hour
}

These delayed transitions are durable. Schedule a transition for one hour from now, crash the server after 30 minutes, and the transition still fires 30 minutes after recovery. No external cron jobs. No Redis-backed timers. No SQS delay queues.

This matters for real applications. Auction deadlines, session timeouts, scheduled events -- they all execute reliably as a built-in property of the document.
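The delay does not have to be a literal. A sketch of a configurable auction deadline (assuming the in clause accepts any numeric expression; auction_duration_sec is an illustrative document field):

public int auction_duration_sec = 3600;

#bidding {
  // The deadline is read when the transition is scheduled,
  // then honored durably even across crashes and restarts
  transition #auctionClosed in auction_duration_sec;
}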

Why This Kills the Saga Pattern

The saga pattern exists because distributed systems need multi-step workflows that can fail partway through. Each step has a compensating transaction to undo it if a later step fails. In practice, this means:

  1. You need an external saga orchestrator (or choreography bus)
  2. Every step needs a forward action AND a compensating action
  3. You need to handle the case where the compensating action itself fails
  4. You need idempotency keys everywhere
  5. You need a persistent store for saga state
  6. You need monitoring for stuck sagas

Saga Pattern (Microservices)

An orchestrator and message queue drive each step -- Step 1: Reserve, Step 2: Charge, Step 3: Ship -- with a compensating action for each (Unreserve, Refund, Cancel), plus a saga state store (Redis or a database), an idempotency key store, and a dead letter queue with monitoring. That is 7+ moving parts per workflow.

Dungeon Master Pattern (Adama)

A single document holds the whole workflow:

#reserve {
  reserveItem();
  chargeCard();
  shipOrder();
  transition #done;
}

Everything is auto-persisted, rolls back automatically on abort, and is durable across crashes with zero external dependencies. One document, straight-line code.

The dungeon master pattern eliminates all of this. Your workflow is a state machine inside a single document. Each state transition is atomically persisted. If anything goes wrong, the abort keyword rolls back all state changes in the current transaction -- no compensating transactions needed because the state never committed in the first place.

The transactional boundary in Adama is a single message (or a multi-message state machine transition that commits atomically at the end). The runtime monitors all state changes and builds a data differential. If you abort, the inverse differential is applied. If you commit, the differential is durably persisted. There is no partial failure state to handle.
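That rollback can be seen in an ordinary handler. A sketch (assuming abort is usable inside channel handlers as described; Purchase and balance are illustrative names):

message Purchase { int cost; }

public int balance = 100;

channel buy(Purchase p) {
  balance -= p.cost;  // tentative: the runtime records this in the differential
  if (balance < 0) {
    abort;            // inverse differential applied; the deduction never persists
  }
  // on commit, the differential is durably written
}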

The Invoke Keyword: State Subroutines

The invoke keyword calls a state as a subroutine, returning to the caller when done:

#setupGame {
  invoke #shuffleDeck;
  invoke #dealCards;
  invoke #setupBoard;
  transition #playRound;
}

This reads like a recipe. invoke gives you reusable state logic without leaving your current flow, while transition schedules a genuine state change with a new transaction boundary.

What This Costs

I should be honest about the tradeoffs. The dungeon master pattern constrains your architecture in specific ways:

Single-writer per document. Only one host processes a document at a time. For a board game with 4 players, this is irrelevant. For a global chat with 100,000 concurrent users in one room, you will hit limits. The scaling model is many small documents, not one giant document.

Latency depends on persistence. Every state transition requires a durable write. If your storage backend has 5ms write latency, that is your floor for state transitions. For turn-based games, 5ms is invisible. For a real-time action game at 60fps, it is a non-starter.

You are learning a new language. Adama is not JavaScript or Python. The # notation, fetch/await semantics, transition vs invoke -- these are concepts you need to internalize. The payoff is eliminating 5-7 infrastructure components, but the learning curve is real.

Document size limits. Everything lives in memory when active. A document with 10MB of state is fine. A document with 1GB of state will cause problems. Design your data model with this in mind -- keep documents focused.

The Pattern Applied: A Complete Card Game Round

Here is what a real game round looks like, combining everything discussed:

#round {
  future<DrawCount> draw_fut = how_many.fetch(current_player);
  DrawCount draw = draw_fut.await();

  invoke #drawCards;
  invoke #playCard;

  current_player = (current_player == player1) ? player2 : player1;

  if (deckEmpty()) {
    transition #scoring;
  } else {
    transition #round;
  }
}

This code asks the current player how many cards to draw, waits (durably) for the answer, draws cards, asks them to play a card, switches players, and loops. If the server crashes between the draw and the play, it resumes at exactly the right point. If a player disconnects for three days and comes back, their pending request is still waiting.

No message queues. No saga orchestrators. No Redis. No cron jobs. No compensating transactions. Just code that reads like the game's rules and survives anything the infrastructure throws at it.

That is the dungeon master pattern. The server is in charge, and it does not forget.