BUILD LOG // 02

Multi-agent orchestrator on MongoDB Change Streams

Next.js + MongoDB replica set + Change Streams + Server-Sent Events - how we dispatch 17 agent types in real time without a queue system.

2026-04-17 · 9 min read · AISO-DEV · Tags: build-log, mongodb, change-streams, sse

TL;DR

17 agent types dispatched from a Next.js + MongoDB replica-set app.
No Redis, no Kafka, no queue worker. MongoDB Change Streams → Server-Sent Events → Claude Code orchestrator skill.
Event-to-dispatch latency under 200 ms end-to-end.
The app is config + events only. Agent dispatch lives in a Claude Code skill that inherits all user MCPs and skills.
Trade-offs we’d defend and one we’d revisit.

The problem

We run a lot of agents. Research, copywriting, engineering, QA, operations - 17 distinct agent types, each with its own config (model, thinking level, tools, concurrency, thresholds). This is the live skeleton behind our AI agent development practice. We needed:

A web UI to configure agent types and watch activity live.
Real-time dispatch when new work arrives (a message, a task, a change).
A control plane that doesn’t spawn agents itself (because agents need access to skills, MCPs, and hooks the control plane doesn’t have).
Inter-agent messaging with threads, priorities, and cascading client/brand/project context.
Something cheap to run.

What we didn’t do

No message queue

We looked at Redis + BullMQ, Kafka, NATS. For our scale (hundreds of events per day, not per second), they’re overkill. They also introduce a separate process, a separate operational surface, and a separate failure mode.

No long polling

Agents need to react within a second or two of a new message. Long polling can meet that, but every delivered event costs a fresh request/response cycle, which adds traffic and eats compute on both ends.

No spawning agents from the app

This is the big one. The Next.js app runs in a stateless-ish environment. Agents need access to:

Claude Code’s Agent tool (to spawn subagents).
All user MCP servers.
All user skills.
File system, git, browsers, the whole toolbox.

If the app spawns agents, we either give the app the whole toolbox (massive attack surface and config drift) or we run agents as crippled versions of themselves. Neither is acceptable.

So the app does not spawn agents. Ever. That separation - control plane in code, capability plane in skills - is a default we now bake into every custom AI development engagement.

What we did

MongoDB replica set + Change Streams

MongoDB’s Change Streams expose a real-time feed of changes to a collection, backed by the oplog. Any insert, update, replace, or delete on a watched collection fires an event. Each event carries a resume token; a consumer that stores the last token it handled can reconnect and pick up where it left off.

Requirement: MongoDB must run as a replica set (a single-node replica set is enough), not as a standalone server, because Change Streams read from the oplog. That’s the only operational cost.

We watch three collections: messages, tasks, agent-types-config.
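A minimal sketch of how the watch side could be set up, assuming the official Node driver's `collection.watch(pipeline, options)` with its `fullDocument` and `resumeAfter` options. The `ChangeEvent` type is a trimmed stand-in for the driver's own type, and the event naming scheme is hypothetical apart from `message-created`, which is the only name the post shows.

```typescript
// Trimmed stand-in for the change-event fields used here. The driver's real
// change events carry more fields; this type is for illustration only.
interface ChangeEvent {
  _id: unknown; // the resume token for this event
  operationType: "insert" | "update" | "replace" | "delete";
  ns: { db: string; coll: string };
  fullDocument?: Record<string, unknown>;
}

interface WatchOptions {
  fullDocument: "updateLookup";
  resumeAfter?: unknown;
}

// The three collections named in the text.
const WATCHED_COLLECTIONS = ["messages", "tasks", "agent-types-config"];

// Options for collection.watch(): return the full document on updates and,
// when a token was stored earlier, resume right after it.
function watchOptions(resumeToken?: unknown): WatchOptions {
  const options: WatchOptions = { fullDocument: "updateLookup" };
  if (resumeToken !== undefined) options.resumeAfter = resumeToken;
  return options;
}

// Hypothetical naming scheme: "<noun>-<verb>".
const NOUNS: Record<string, string> = {
  messages: "message",
  tasks: "task",
  "agent-types-config": "agent-type-config",
};

const VERBS: Record<ChangeEvent["operationType"], string> = {
  insert: "created",
  update: "updated",
  replace: "updated",
  delete: "deleted",
};

// Map a change event to an SSE event name; null for unwatched collections.
function toEventName(change: ChangeEvent): string | null {
  const noun = NOUNS[change.ns.coll];
  return noun === undefined ? null : `${noun}-${VERBS[change.operationType]}`;
}
```

With the driver this would be used roughly as `db.collection(name).watch([], watchOptions(savedToken))` for each entry in `WATCHED_COLLECTIONS`, saving `change._id` once an event has been handled.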

Server-Sent Events from Next.js

A Next.js route handler exposes /api/events as a long-lived SSE stream. On connection, we subscribe to the relevant Change Streams. Each change gets pushed to the SSE client as a typed event:

event: message-created
data: {"id":"...","priority":"high","assignedTo":"marketing-team-lead"}

SSE is underrated. One connection, server-pushed, built into every browser via EventSource and consumable from any HTTP client that can read a streamed response. No websocket upgrade dance, no subprotocol negotiation.
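The wire format is simple enough to write by hand. A sketch of a serialiser that produces frames like the one above; the `SseEvent` type, the function name, and the optional `id` field are assumptions for illustration, not the app's actual code.

```typescript
// Assumed shape of one event before it is written to the stream.
interface SseEvent {
  event: string; // e.g. "message-created"
  data: unknown; // must be JSON-serialisable
  id?: string;   // optional; clients echo it back as Last-Event-ID on reconnect
}

// Response headers commonly sent for a text/event-stream endpoint.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
};

// Serialise one event in the text/event-stream format.
// A blank line terminates the event.
function formatSseEvent(e: SseEvent): string {
  const lines: string[] = [];
  if (e.id !== undefined) lines.push(`id: ${e.id}`);
  lines.push(`event: ${e.event}`);
  // Compact JSON escapes newlines inside strings, so one data line suffices.
  lines.push(`data: ${JSON.stringify(e.data)}`);
  return lines.join("\n") + "\n\n";
}
```

Setting `id` to something derived from the change event's resume token is one way to let a reconnecting client say where it stopped, though the post does not state whether the app does this.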

The Claude Code orchestrator skill

On the other end of the SSE stream sits a Claude Code session running the /orchestrator skill. It:

Subscribes to /api/events.
Reads each event.
Fetches the relevant context (agent config, message thread, project tree).
Dispatches the right agent via Claude Code’s Agent tool.

Because it runs inside Claude Code, the dispatched subagent inherits every skill, MCP server, and hook we’ve installed - file system, git, search, the messaging MCP, browser automation, all of it. The control plane is config + events; the skill is capability.
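The first two steps (subscribe, read each event) come down to parsing the SSE stream incrementally, since network chunks do not line up with event boundaries. A minimal sketch under some assumptions: the class and function names are hypothetical, the parser expects `\n` line endings as our own server would emit, and the actual dispatch through the Agent tool happens in the skill rather than in code like this.

```typescript
interface ParsedEvent {
  event: string;
  data: string;
  lastEventId: string | undefined;
}

// Parse one event block (the text between two blank lines).
function parseBlock(block: string): ParsedEvent | null {
  let event = "message"; // SSE default when no "event:" field is present
  let lastEventId: string | undefined;
  const data: string[] = [];
  for (const line of block.split("\n")) {
    if (line === "" || line.startsWith(":")) continue; // comment or keep-alive
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // one optional space
    if (field === "event") event = value;
    else if (field === "data") data.push(value);
    else if (field === "id") lastEventId = value;
  }
  // Blocks without data are not dispatched.
  return data.length === 0 ? null : { event, data: data.join("\n"), lastEventId };
}

// Buffers partial chunks and yields complete events as they arrive.
class SseParser {
  private buffer = "";

  push(chunk: string): ParsedEvent[] {
    this.buffer += chunk;
    const events: ParsedEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const parsed = parseBlock(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (parsed !== null) events.push(parsed);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}

// Pick the agent to dispatch from the payload's assignedTo field.
function dispatchTarget(e: ParsedEvent): string | null {
  const payload = JSON.parse(e.data) as { assignedTo?: string };
  return payload.assignedTo === undefined ? null : payload.assignedTo;
}
```

Keeping the parser separate from the routing decision makes each half easy to exercise with recorded stream chunks.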

The shape end to end

 ┌────────────┐   insert    ┌──────────────────┐
 │ User / API │────────────▶│ MongoDB replica  │
 └────────────┘             └────────┬─────────┘
                                     │ Change Stream
                                     ▼
                            ┌─────────────────┐
                            │ Next.js app     │
                            │  /api/events    │
                            └────────┬────────┘
                                     │ SSE
                                     ▼
                            ┌─────────────────┐
                            │ Claude Code     │
                            │ /orchestrator   │
                            └────────┬────────┘
                                     │ Agent tool
                                     ▼
                            ┌─────────────────┐
                            │ Subagent        │
                            │ full capability │
                            └─────────────────┘

End to end, latency is under 200 ms from db.messages.insertOne() to the subagent receiving the dispatch. This is the dispatch backbone we reuse on every agentic systems build where real-time matters.

Inter-agent messaging

The Agent Messaging MCP sits on top of the same MongoDB. Agents send messages, reply to threads, mark work done - all through MCP tools. Threads carry priority, assignee, and cascading client/brand/project tags. The orchestrator skill relays replies back to the originating agent, so a research agent can ask a copywriter a question and resume when the reply arrives.

Because it’s built on the same database, the dashboard shows every thread live. No separate pub/sub, no extra infrastructure.
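To make the thread model concrete, here is one possible shape for a message and a reply. Everything in it is an assumption for illustration: the field names, the priority levels, and in particular the reading of "cascading" as "a tag at a broader level covers every narrower scope beneath it". The real MCP schema may differ.

```typescript
type Priority = "low" | "normal" | "high";

// Hypothetical tag set: scoped to a client, optionally narrowed to a brand
// and then to a project.
interface Scope {
  client: string;
  brand?: string;
  project?: string;
}

interface ThreadMessage {
  threadId: string;
  from: string;       // sending agent
  assignedTo: string; // receiving agent
  priority: Priority;
  scope: Scope;
  body: string;
}

// A reply stays in the thread, keeps its priority and scope, and is addressed
// back to whoever sent the message being answered.
function replyTo(parent: ThreadMessage, body: string): ThreadMessage {
  return {
    threadId: parent.threadId,
    from: parent.assignedTo,
    assignedTo: parent.from,
    priority: parent.priority,
    scope: { ...parent.scope },
    body,
  };
}

// Cascading match: a filter set at the client level matches all of that
// client's brands and projects; a brand-level filter matches its projects.
function inScope(filter: Scope, scope: Scope): boolean {
  if (filter.client !== scope.client) return false;
  if (filter.brand !== undefined && filter.brand !== scope.brand) return false;
  if (filter.project !== undefined && filter.project !== scope.project) return false;
  return true;
}
```

In this sketch a research agent's question to a copywriter produces a reply whose `assignedTo` is the research agent, which is the routing the orchestrator needs in order to resume the waiting agent.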

Operational notes

Replica set of one is fine for dev. Three-node for production. Oplog window ≥ 24h.
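Resume tokens persist in MongoDB. A disconnected consumer reconnects from its last token without replaying the world, provided that token is still inside the oplog window.
SSE reconnection is part of the EventSource API in the HTML Standard, not of fetch. In the browser, EventSource reconnects on its own; a plain HTTP client gets no such behaviour, so Claude Code’s side uses a small reconnect loop.
Backpressure: SSE has no application-level flow control or acknowledgements, so a slow consumer is throttled only by TCP. We never saw it matter at our event rate. If you push 1000 events/s you’ll want a queue.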
Monitoring: the system-health endpoint exposes event rate, connected orchestrator count, and stale-record count. Cheap.
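The reconnect loop mentioned in the notes above can be quite small. A sketch with exponential backoff, assuming a hypothetical `connect` function that opens the stream (optionally from the last seen event id) and yields events until the connection drops; the 500 ms base and 30 s cap are illustrative values, not the ones we run with.

```typescript
interface StreamEvent {
  id?: string;
  event: string;
  data: string;
}

// Exponential backoff: base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Keep consuming the stream, reconnecting after errors or server-side closes.
// The failure counter resets whenever an event arrives, so only consecutive
// failures lengthen the delay.
async function consumeWithReconnect(
  connect: (lastEventId: string | undefined) => AsyncIterable<StreamEvent>,
  handle: (event: StreamEvent) => Promise<void>,
  maxConsecutiveFailures = Infinity,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<void> {
  let lastEventId: string | undefined;
  let failures = 0;
  while (failures < maxConsecutiveFailures) {
    try {
      for await (const event of connect(lastEventId)) {
        failures = 0;
        if (event.id !== undefined) lastEventId = event.id;
        await handle(event);
      }
    } catch (err) {
      // Connection dropped or handler failed; fall through to the backoff.
    }
    await sleep(backoffDelayMs(failures));
    failures += 1;
  }
}
```

Passing `lastEventId` back into `connect` only helps if the server maps it to a resume token; without that, the loop still restores the connection but may miss events sent while it was down.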

Trade-offs we’d defend

No queue. Change Streams + SSE gets us real-time dispatch without a second broker. At our scale, this is a win. At 10x scale, we’d reassess.
App doesn’t spawn agents. The separation is deliberate. Control plane ≠ capability plane.
MongoDB. We picked it specifically for Change Streams. If we weren’t using Change Streams we’d probably be on Postgres with LISTEN/NOTIFY, which has similar push ergonomics but no resume token: notifications sent while a listener is disconnected are lost.

One trade-off we’d revisit

The SSE /api/events endpoint is a single Next.js route. At our current load it’s fine, but Next.js edge runtimes limit how long a connection can stay open, which works against a long-lived stream. We either need to move that route to a dedicated Node process or pin it to the Node runtime (in a route handler, for example, with export const runtime = 'nodejs'). We’re currently on the second option; the first is cleaner.