Why event-driven architecture matters for agents
Agent platforms are naturally asynchronous: user requests, tool calls, model responses, retries, and long-running workflows all happen at different times. If you model this as direct request-response chains, throughput collapses under load. If you model it as event streams, each component can scale independently.
DSA lens: queues, heaps, and hashes in production
- Priority queues (heaps): schedule urgent conversations before low-priority background jobs.
- Hash maps: cache conversation state and tool outputs by deterministic keys.
- Sliding windows: apply rate limits per tenant using Redis sorted sets.
- Bloom filters: prevent duplicate processing for at-least-once delivery semantics.
Role of each system component
Kafka for durable streams
Use Kafka for immutable event logs: agent.requested, agent.tool_called, agent.completed. Keep retention long enough for replay and postmortems.
RabbitMQ for workflow orchestration
Use RabbitMQ when you need work queues with acknowledgements, delayed retries, and dead-letter exchanges for failed tool executions.
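The retry/dead-letter decision a consumer makes after a failed tool execution can be sketched as below. The attempt cap and backoff schedule are illustrative values; in RabbitMQ the delay would typically be implemented with a TTL'd retry queue whose dead-letter exchange routes expired messages back to the work queue.

```python
MAX_ATTEMPTS = 4  # illustrative cap, tune per tool

def route_failed_task(attempts: int) -> tuple[str, float]:
    """Decide where a failed tool task goes next.

    Returns (destination, delay_seconds): either a delayed retry
    with exponential backoff, or the dead-letter queue once the
    attempt budget is exhausted.
    """
    if attempts >= MAX_ATTEMPTS:
        return ("dead-letter", 0.0)       # park for manual inspection
    delay = min(2 ** attempts, 30)        # exponential backoff, capped
    return ("retry", float(delay))
```

Parking exhausted tasks instead of retrying forever is what keeps a single broken tool from clogging the work queue.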
Redis for fast state and locks
Store active session state, short-lived memories, token budgets, and distributed locks. Redis lets agents coordinate without waiting on primary databases.
Memcached for cheap hot-read caching
Keep read-heavy, disposable values in Memcached: prompt templates, static tool metadata, and feature flags resolved at the edge.
Reference flow
- Gateway emits agent.requested to Kafka.
- Planner consumes the event and enqueues tool tasks in RabbitMQ.
- Workers read task, fetch context from Redis, execute tools, and cache artifacts.
- Result events are published to Kafka and assembled by a response composer.
- Final response is stored, streamed to user, and indexed for retrieval.
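The flow above can be walked through end to end with plain in-memory queues standing in for the brokers. This is a toy, not a deployment sketch: deques play the roles of Kafka topics and RabbitMQ queues, a dict plays Redis, and the function names are illustrative.

```python
from collections import deque

events = deque()      # "Kafka": durable event stream
tasks = deque()       # "RabbitMQ": work queue for tool tasks
state = {}            # "Redis": session context and cached artifacts

def gateway(request_id: str, question: str) -> None:
    state[f"ctx:{request_id}"] = {"question": question}
    events.append({"type": "agent.requested", "id": request_id})

def planner() -> None:
    evt = events.popleft()                      # consume agent.requested
    tasks.append({"id": evt["id"], "tool": "search"})

def worker() -> None:
    task = tasks.popleft()
    ctx = state[f"ctx:{task['id']}"]            # fetch context from "Redis"
    artifact = f"{task['tool']}({ctx['question']})"
    state[f"artifact:{task['id']}"] = artifact  # cache the artifact
    events.append({"type": "agent.completed", "id": task["id"]})

def composer() -> str:
    evt = events.popleft()                      # consume agent.completed
    return state[f"artifact:{evt['id']}"]
```

Running `gateway("r1", "weather?")`, then `planner()`, `worker()`, and `composer()` traces one request through all four stages; the point is that each stage only touches a queue and shared state, never another stage directly.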
Common scaling pitfalls
- Unbounded fan-out: one request triggers too many downstream events.
- No idempotency keys: retries create duplicate side effects.
- Cache stampede: many workers recompute the same context at once.
- Single-tenant hot partitions: bad Kafka partition key skews load.
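The idempotency pitfall is the one most cheaply fixed in code: derive a deterministic key per logical message and skip side effects on redelivery. The seen-set below is an in-memory sketch; in production it could be a Redis set with a TTL, or the Bloom filter mentioned earlier when exact membership is too expensive.

```python
import hashlib

class IdempotentConsumer:
    """Drop duplicate deliveries under at-least-once semantics.

    Wraps a handler so that redeliveries of the same logical
    message (same id and type) do not repeat side effects.
    """

    def __init__(self, handler):
        self.handler = handler
        self._seen: set[str] = set()

    @staticmethod
    def key(message: dict) -> str:
        # Deterministic: the same logical message yields the
        # same key on every retry.
        raw = f"{message['id']}:{message['type']}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def consume(self, message: dict) -> bool:
        k = self.key(message)
        if k in self._seen:
            return False                  # duplicate: skip side effects
        self.handler(message)
        self._seen.add(k)
        return True
```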
Production checklist
- Define event contracts and version them.
- Add correlation IDs on every message.
- Enforce idempotency at consumer boundaries.
- Measure queue depth, lag, and retry rates per tenant.
- Run replay drills from Kafka topics monthly.
"Scalable agents are less about one giant model and more about reliable data structures and event contracts working together."
- Nikhil Rao