Precomputed Graph Traversal in a Reactive Document Database

I hit a wall with a customer. Their data model had users in groups, and groups could sign up for activities. Activities had a nested table of signed-up groups. The query they needed was simple: "Which activities am I signed up for?" and "Who else from my group is in this activity?"

Without any join mechanism, answering those questions required iterating over every activity and scanning its nested members table. For their data set it was fast enough -- everything is in memory -- but it was O(activities * members) per query, and with 1,000 connected users each needing personalized results, the wasted compute was real. Worse, every single mutation to any activity would invalidate every user's query, causing all 1,000 viewers to recompute.

This is the N+1 query problem, but in a reactive system where every query runs continuously. The cost is not paid once per request -- it is paid on every state change for every viewer. I needed a fundamentally different approach.

The Problem with Joins in Streaming Systems

Traditional SQL joins are a request-time operation. You run a query, the planner builds an execution strategy, and you get results. Do it again and the planner runs again. This is fine when queries are sporadic, but in a streaming environment where queries are standing (always active, always watching for changes), join planning becomes a permanent cost on the write path.

Jamie Brandon wrote an excellent piece on why query planning for streaming systems is hard, and I agree with his analysis. The fundamental tension is that streaming hooks into the write path. Every write must be evaluated against every standing query. If your standing queries involve joins, every write to any joined table potentially triggers recomputation of every joined result.

My opinion, informed by building BladeRunner (a streaming system at Amazon), is that starting with a query language first is a recipe for churn. Better to build the primitives bottom-up and let the ergonomics emerge. That is exactly what I did with Adama's graph indexing: three keywords (assoc, join, traverse) that solve the problem at the data structure level rather than the query planner level.

Three Primitives: assoc, join, traverse

The graph indexing system has three parts that compose together:

assoc declares a named association between two record types. Think of it as naming a relationship:

assoc<Person, Tag> tagged;

This says "there exists a relationship called tagged from Person records to Tag records." It creates an RxAssocGraph internally -- a reactive data structure that maps from-IDs to sets of to-IDs with reference counting.
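The idea behind that structure can be sketched in Python (my own illustrative model, not Adama's actual implementation; the class and method names are assumed):

```python
# Sketch of a reactive association graph: a map from from-IDs to sets of
# to-IDs, with per-edge reference counts so duplicate contributions from
# multiple sources are handled correctly.
from collections import defaultdict

class AssocGraph:
    def __init__(self):
        self.refs = defaultdict(int)   # (from_id, to_id) -> contribution count
        self.edges = defaultdict(set)  # from_id -> {to_id, ...}

    def add(self, from_id, to_id):
        self.refs[(from_id, to_id)] += 1
        if self.refs[(from_id, to_id)] == 1:       # first contribution: edge appears
            self.edges[from_id].add(to_id)

    def remove(self, from_id, to_id):
        key = (from_id, to_id)
        self.refs[key] -= 1
        if self.refs[key] == 0:                    # last contribution: edge vanishes
            del self.refs[key]
            self.edges[from_id].discard(to_id)

    def traverse(self, from_ids):
        # Union of to-IDs reachable from the given sources: O(sources + results).
        out = set()
        for fid in from_ids:
            out |= self.edges.get(fid, set())
        return out
```

The reference counting is what makes multiple joins into the same association compose cleanly, as discussed later.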

join connects an edge table to an association, defining how edge records map to from/to IDs:

join tagged via _edges[e] from e.person_id to e.tag_id;

This tells the runtime: "watch the _edges table. When an edge record appears, extract person_id and tag_id and add that edge to the tagged graph. When an edge record disappears, remove that edge."

traverse is a query operator that follows an association from a set of source records to destination records:

public formula all_tags = iterate _people traverse tagged;

This collects Person IDs from the iterate, looks up precomputed edges in the tagged graph, and returns the matching Tag records directly. No scanning.
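As a minimal model (illustrative Python, with the person/tag data assumed), traverse reduces to pure hash lookups over the precomputed map:

```python
# Precomputed edge map as built by the join (person_id -> set of tag_ids),
# plus the destination table keyed by ID.
tagged = {1: {1, 2}, 2: {1}, 3: {3}}
tags = {1: "engineer", 2: "manager", 3: "intern"}

def traverse(source_ids, graph):
    # Resolve destinations for a set of sources: no scanning, just lookups.
    result = set()
    for sid in source_ids:
        result |= graph.get(sid, set())
    return result

tag_ids = traverse({1, 2}, tagged)        # people 1 and 2
names = sorted(tags[t] for t in tag_ids)  # → ['engineer', 'manager']
```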

[Diagram: how assoc/join/traverse creates precomputed graph edges]

  _people (Person): id:1 "Alice", id:2 "Bob", id:3 "Carol", id:4 "Dave"
  _edges (PersonTag): person:1 -> tag:1, person:1 -> tag:2, person:2 -> tag:1, person:3 -> tag:3
  _tags (Tag): id:1 "engineer", id:2 "manager", id:3 "intern"

  RxAssocGraph "tagged" (precomputed, reactive):
    1 (Alice) -> {1, 2}
    2 (Bob)   -> {1}
    3 (Carol) -> {3}

The join monitors _edges, iterate feeds source IDs, and traverse resolves through the graph: O(1) lookup per source ID, incremental update on edge change, reference counted for duplicates.

Why This Solves the N+1 Problem for Real-Time Apps

The traditional N+1 query problem: you load a list of N items, then for each item you run a query to load related data. N+1 total queries. In a request-response system, you solve this with eager loading or batching.

In a reactive system, the N+1 problem is worse. Every standing query that traverses a relationship is continuously re-evaluated. Without precomputed indexes, a formula like bubble my_groups = iterate _users where account == @who traverse in_group would require:

  1. Find the user (table scan or index lookup -- O(1) with index)
  2. For that user, scan every group's member table to find matches
  3. Return matching groups

Step 2 is O(groups * avg_members_per_group). With 200 groups averaging 50 members each, that is 10,000 comparisons per viewer per state change. With 1,000 viewers, that is 10 million comparisons per mutation. And mutations happen constantly in a real-time application.

With graph indexing, step 2 becomes a hash map lookup. The RxAssocGraph maps user IDs to sets of group IDs. Finding my groups is O(my_group_count), not O(total_groups * total_members). The graph updates incrementally: when a member is added to a group, only that single edge is processed. No full recomputation.
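The difference can be made concrete with a Python sketch using the numbers from the text (200 groups of 50 members; member IDs and the example user are assumed):

```python
# Naive scan vs precomputed index. Group members get disjoint synthetic IDs,
# and user 7 is manually placed in groups 3 and 11.
groups = {g: {(g + 1) * 1000 + i for i in range(50)} for g in range(200)}
groups[3].add(7)
groups[11].add(7)
user_to_groups = {7: {3, 11}}   # what the graph index precomputes

def my_groups_scan(user_id):
    # O(groups * members): touches every nested member entry, every evaluation.
    comparisons = 0
    found = set()
    for gid, members in groups.items():
        for m in members:
            comparisons += 1
            if m == user_id:
                found.add(gid)
    return found, comparisons

def my_groups_indexed(user_id):
    # O(result count): a single hash lookup.
    return user_to_groups.get(user_id, set())
```

The scan performs roughly 10,000 comparisons for the same answer the index returns in one lookup.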

Nested Table Joins: The Killer Feature

The most powerful application of graph indexing is crossing into nested tables. Adama allows tables inside records -- a record in _groups can contain a table<Member> _members. This is natural for document store modeling but creates a query nightmare: finding which groups contain a specific user requires scanning every group and every nested members table.

Graph indexing solves this by joining through nested tables:

record User {
  public int id;
  public string name;
  public principal account;
}
table<User> _users;

record Member { int user_id; }

assoc<User, Group> _users_to_groups;

record Group {
  public int id;
  public string name;
  table<Member> _members;

  // Join through the NESTED table
  join _users_to_groups via _members[x] from x.user_id to id;
}
table<Group> _groups;

bubble my_groups = iterate _users where account == @who
                   traverse _users_to_groups;

Each Group record's join statement connects its nested _members table to the global association. The DifferentialEdgeTracker monitors every group's members table simultaneously. When you add a member to any group, only that edge is processed and added to the graph. The my_groups bubble resolves in constant time per result group.
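The differential idea can be sketched like this (DifferentialEdgeTracker is the runtime's name; this Python model is my own assumption, with reference counting omitted for brevity):

```python
# Only edge records that changed are (un)applied; the graph is never rebuilt.
class EdgeTracker:
    def __init__(self, graph):
        self.graph = graph     # from_id -> set of to_ids
        self.applied = {}      # edge record key -> (from_id, to_id) it contributed

    def on_edge_upsert(self, record_key, from_id, to_id):
        # Retract any prior version of this record, then apply the new edge.
        self.on_edge_delete(record_key)
        self.graph.setdefault(from_id, set()).add(to_id)
        self.applied[record_key] = (from_id, to_id)

    def on_edge_delete(self, record_key):
        prior = self.applied.pop(record_key, None)
        if prior:
            f, t = prior
            self.graph[f].discard(t)
```

Each nested _members table feeds the same tracker, so adding a member to any group touches exactly one edge.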

[Diagram: traditional N+1 vs reactive graph traversal]

Without graph index:

  1. Find the user
  2. For each of N groups: scan its _members table, check if user_id matches, include the group on a match

Cost: O(groups * members). Per mutation, per viewer: 200 groups x 50 members = 10,000 comparisons; x 1,000 viewers = 10,000,000 ops. Every mutation triggers a full recompute.

With graph index:

  1. Find the user
  2. Hash map lookup: graph[user_id] -> {group_ids}

Cost: O(result count). On an edge change (member added or removed): O(1) graph update, and only affected viewers recompute -- typically 3-5 ops per viewer instead of 10,000.

Dynamic Edges and Conditional Relationships

Edges are reactive. Add an edge record and the graph updates. Remove one and it updates again. No manual cache invalidation:

channel add_edge(AddEdge msg) {
  _edges <- {person_id: msg.person, tag_id: msg.tag};
  // all_tags formula automatically reflects the new edge
}

channel remove_edge(RemoveEdge msg) {
  (iterate _edges where id == msg.edge_id).delete();
  // all_tags formula automatically drops the removed edge
}

For conditional edges -- soft deletes, feature flags, approval workflows -- the @maybe expression makes from/to values optional:

join link via edges[x]
  from (x.enabled ? @maybe(x.from) : @maybe<int>)
  to (x.enabled ? @maybe(x.to) : @maybe<int>);

When enabled is false, the edge is excluded from the graph. Toggle enabled to true and the edge appears. This lets you model approval workflows where relationships exist but are not yet active, without creating or destroying edge records.
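A rough model of the @maybe semantics (assumed, not runtime code): the from/to extractors yield optional values, and an edge is contributed only when both endpoints are present.

```python
def edge_endpoints(edge):
    # Mirrors the conditional @maybe pair: a disabled edge yields no endpoints.
    if edge["enabled"]:
        return edge["from"], edge["to"]
    return None, None

def contribute(graph, edge):
    f, t = edge_endpoints(edge)
    if f is not None and t is not None:
        graph.setdefault(f, set()).add(t)

graph = {}
contribute(graph, {"from": 1, "to": 9, "enabled": False})  # excluded from graph
contribute(graph, {"from": 1, "to": 9, "enabled": True})   # edge appears
```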

Multiple Joins Per Association

An association can have multiple join statements from different source tables. The runtime uses reference counting internally, so if the same from-to pair comes from two different sources, the edge exists as long as at least one source contributes it:

join access via _direct[d] from d.user_id to d.resource_id;
join access via _group[g] from g.user_id to g.resource_id;

This naturally models "direct access OR group-based access" without union queries.
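The reference-counting behavior behind this can be shown in a few lines (an assumed Python model, with made-up user/resource IDs): the edge survives as long as at least one source contributes it.

```python
from collections import defaultdict

refs = defaultdict(int)   # (user_id, resource_id) -> number of contributing sources

def add(user, res):
    refs[(user, res)] += 1

def remove(user, res):
    refs[(user, res)] -= 1
    if refs[(user, res)] == 0:
        del refs[(user, res)]

add(7, 100)        # contributed by _direct
add(7, 100)        # contributed again by _group
remove(7, 100)     # direct access revoked
print((7, 100) in refs)   # → True: group access still grants the edge
```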

For bidirectional relationships (like "friends"), you declare two associations -- one for each direction -- and join both from the same edge table:

assoc<User, User> friends_of;
assoc<User, User> friends_with;

join friends_of via _friendships[f] from f.user_a to f.user_b;
join friends_with via _friendships[f] from f.user_b to f.user_a;

Associations also support a third type parameter for documenting the edge record type: assoc<User, Group, Membership> member_of;. This does not change behavior but makes the relationship's semantics explicit in the declaration.

Performance Characteristics and Tradeoffs

Here are the actual costs:

  • Edge insertion/deletion: O(1) per edge (hash map operations)
  • Traversal: O(source records + result records) -- proportional to input/output, not total edges
  • Memory: O(total edges) for the graph index
  • Updates: Differential -- only changed edge records are reprocessed each commit cycle

Graph recomputation happens during the document's commit cycle via __computeGraphs(), after all mutations in the transaction have been applied. This means you can add 50 edges in one message and the graph processes them all in one batch.
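The batching can be pictured with a small sketch (an assumed model of the commit behavior, not the actual __computeGraphs() implementation):

```python
# Mutations queue edge changes; a commit step applies them all in one batch,
# mirroring how graph maintenance runs after a transaction's mutations.
class Document:
    def __init__(self):
        self.graph = {}      # from_id -> set of to_ids
        self.pending = []    # queued (op, from_id, to_id) edge changes

    def queue_add(self, f, t):
        self.pending.append(("add", f, t))

    def queue_remove(self, f, t):
        self.pending.append(("remove", f, t))

    def commit(self):
        # One pass per commit cycle, however many mutations arrived.
        for op, f, t in self.pending:
            s = self.graph.setdefault(f, set())
            if op == "add":
                s.add(t)
            else:
                s.discard(t)
        self.pending.clear()
```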

The memory cost is the main tradeoff. If you have a million edges, that is real memory (roughly 16-24 bytes per edge for the hash map entries plus reference counts). For most applications this is negligible -- a million edges in 24MB is nothing on a server with 8GB+ RAM. But if you are modeling a social graph with billions of edges per document, this is not the right tool.

The other tradeoff is that graph indexing is an upfront investment. You pay on every write to keep the index current, even if no one is reading. For tables with frequent writes and rare reads, a plain iterate ... where ... with a field index might be cheaper. The break-even point: if you have more than a few connected viewers and the relationship involves more than ~100 edge records, graph indexing wins.

When to use which approach:

  • Small tables (< 100 records): manual lookups with a field index are fine
  • Large junction tables (1,000+ edges): graph indexing avoids repeated scans
  • Nested table relationships: graph indexing is the only efficient option
  • Reactive formulas with many viewers: graph indexing for incremental updates
  • One-off procedural lookups: manual lookups are simpler

I am prouder of this feature than almost anything else in the language, because it solves a problem that is genuinely painful without it. The N+1 query problem in a reactive system is not a performance annoyance -- it is a scaling wall. Graph indexing turns it into a data structure problem with known, bounded costs.