
📨 Messaging Systems - Kafka, RabbitMQ, SQS & Pub/Sub

The Senior Mindset: Don’t ask “which is better?” Ask “What are the delivery guarantees I need?” and “How do I want to handle state?” Choosing a broker is a trade-off between throughput, latency, and the complexity of the consumer logic.


🚦 Broker Architectures: Two Main Philosophies


1. Message Queues (Smart Broker, Dumb Consumer)


The broker tracks which messages are consumed and deletes them once acknowledged.

  • RabbitMQ: High feature set (routing, priorities). Best for complex workflows.
  • AWS SQS: Fully managed, infinitely scalable, but simpler routing logic.
  • Behavior: Usually Point-to-Point. One message is processed by exactly one consumer.
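The "smart broker" contract can be sketched with a toy in-memory queue (names like `SmartBroker` are illustrative, not a real client library): the broker keeps a message "in flight" after delivery and only deletes it once the consumer acknowledges, requeueing it on failure.

```python
import collections

class SmartBroker:
    """Toy point-to-point queue: the broker tracks which messages are
    in flight and deletes them only after the consumer ACKs."""

    def __init__(self):
        self._queue = collections.deque()
        self._in_flight = {}          # delivery_tag -> message
        self._next_tag = 0

    def publish(self, message):
        self._queue.append(message)

    def deliver(self):
        """Hand one message to one consumer; it stays in flight until ACKed."""
        if not self._queue:
            return None
        self._next_tag += 1
        self._in_flight[self._next_tag] = self._queue.popleft()
        return self._next_tag, self._in_flight[self._next_tag]

    def ack(self, tag):
        """Consumer confirms processing; the broker deletes the message."""
        self._in_flight.pop(tag)

    def nack(self, tag):
        """Processing failed; the broker requeues the message for redelivery."""
        self._queue.appendleft(self._in_flight.pop(tag))

broker = SmartBroker()
broker.publish("charge-order-42")
tag, msg = broker.deliver()
broker.nack(tag)                 # simulate a consumer crash: requeued
tag, msg = broker.deliver()      # redelivered (point-to-point: one consumer)
broker.ack(tag)                  # only now does the broker delete it
```

Note the consumer logic stays simple because the broker owns all delivery state — the defining trait of this philosophy.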

2. Log-based Streaming (Dumb Broker, Smart Consumer)


The broker is a distributed append-only log. It doesn’t track consumption; consumers track their own “offset” (position in the log).

  • Apache Kafka / Amazon MSK: Built for massive throughput and data retention.
  • Google Cloud Pub/Sub: A managed hybrid that scales globally.
  • Behavior: Fan-out. The same stream of events can be read by multiple different service groups simultaneously.
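The inversion is easiest to see in a toy append-only log (again an illustrative in-memory sketch, not Kafka's API): the broker stores everything and tracks nothing, while each consumer group keeps its own offset — which is what makes fan-out and replay cheap.

```python
class Log:
    """Toy append-only log: the broker retains all records; each
    consumer group tracks its own offset (read position)."""

    def __init__(self):
        self._records = []
        self._offsets = {}            # group name -> next offset to read

    def append(self, record):
        self._records.append(record)

    def poll(self, group):
        """Return this group's unread records and advance its offset."""
        offset = self._offsets.get(group, 0)
        batch = self._records[offset:]
        self._offsets[group] = len(self._records)
        return batch

log = Log()
log.append({"event": "user_signed_up"})
log.append({"event": "order_placed"})

# Fan-out: two independent services read the SAME stream of events.
assert log.poll("billing-service") == log.poll("analytics-service")

# Replay: rewind one group's offset to reprocess history from the start;
# the other group's position is unaffected.
log._offsets["analytics-service"] = 0
assert len(log.poll("analytics-service")) == 2
```

Because consumption never mutates the log, adding a new consumer group later costs nothing — it simply starts reading from offset 0.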

| Feature     | RabbitMQ                             | Apache Kafka                              | AWS SQS                                  |
| ----------- | ------------------------------------ | ----------------------------------------- | ---------------------------------------- |
| Model       | Push (broker pushes to consumer)     | Pull (consumer requests data)             | Pull (short/long polling)                |
| Persistence | Deleted after ACK                    | Persistent (retention policy)             | Deleted after ACK                        |
| Ordering    | Guaranteed within a queue            | Guaranteed within a partition             | Best-effort (or strict with FIFO queues) |
| Scaling     | Vertical / cluster-based             | Horizontal (adding partitions)            | Native / serverless                      |
| Best For    | Task queues, RPC, complex routing    | Log aggregation, event sourcing, big data | Decoupling microservices (cloud-native)  |

🛠️ Delivery Guarantees (The Senior Perspective)


You must choose which “lie” you can live with:

  1. At-Most-Once: Messages may be lost, but never duplicated. (Fastest, lowest overhead).
  2. At-Least-Once: Messages are never lost, but may be delivered more than once. (The industry standard). Requires consumers to be idempotent.
  3. Exactly-Once: Impossible to guarantee at the delivery level over an unreliable network, but "effectively-once" processing is achieved by Kafka through transactional writes and idempotent producers. (Highest overhead).
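Idempotency — the price of at-least-once — is simpler than it sounds: track which message IDs you have already applied. A minimal sketch (in production the `processed_ids` set would be a database table or Redis set, checked in the same transaction as the write):

```python
processed_ids = set()        # in production: a DB table or Redis set
balance = {"acct-1": 0}

def handle(message):
    """Idempotent consumer: a redelivered duplicate is a no-op."""
    if message["id"] in processed_ids:
        return               # already applied; safely ignore the duplicate
    balance[message["account"]] += message["amount"]
    processed_ids.add(message["id"])

deposit = {"id": "msg-001", "account": "acct-1", "amount": 100}
handle(deposit)
handle(deposit)                   # at-least-once redelivery of the same message
assert balance["acct-1"] == 100   # applied exactly once
```

With this in place, at-least-once delivery plus an idempotent consumer gives you effectively-once processing — without paying for broker-level transactions.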

When to Choose RabbitMQ

  • You need complex routing (e.g., using Header or Topic exchanges).
  • You need built-in Message Priority.
  • You are working with standard protocols like AMQP or MQTT.

When to Choose Kafka

  • You need to replay data (e.g., rebuilding a database from an event log).
  • You have massive throughput requirements (millions of events per second).
  • You are implementing Event Sourcing or CQRS.

When to Choose SQS

  • You are in the AWS ecosystem and want Zero Maintenance.
  • You need to handle huge spikes in volume without managing a cluster.
  • Your architecture is mostly "Fire and Forget" task processing.

The Poison Pill

A message that causes a consumer to crash every time it is read.

  • Solution: Use Dead Letter Queues (DLQ). After X failed retries, the broker moves the message to a separate queue for manual debugging.
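The DLQ pattern can be sketched in a few lines (the retry counter and `MAX_RETRIES` threshold are illustrative — real brokers track this via redelivery counts or receive counts):

```python
MAX_RETRIES = 3
main_queue = [{"id": "msg-7", "body": "corrupt-payload", "retries": 0}]
dead_letter_queue = []

def process(message):
    raise ValueError("cannot parse payload")   # simulated poison pill

while main_queue:
    message = main_queue.pop(0)
    try:
        process(message)
    except Exception:
        message["retries"] += 1
        if message["retries"] >= MAX_RETRIES:
            dead_letter_queue.append(message)  # park it for human debugging
        else:
            main_queue.append(message)         # requeue for another attempt

assert len(dead_letter_queue) == 1             # poisoned message isolated
```

The key property: one bad message can no longer block the queue or crash-loop the whole consumer fleet.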

Backpressure

If the producer is consistently faster than the consumer, the queue grows without bound.

  • RabbitMQ: Can run out of memory and crash.
  • Kafka: Disk space fills up, but doesn’t affect broker performance as much.
  • Strategy: Monitor Consumer Lag religiously. If lag increases, scale your consumer instances.
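Consumer lag is simply the gap between what producers have written and what the consumer group has committed, per partition. A toy calculation with made-up offsets:

```python
# Lag per partition = latest offset written (log end) minus the offset
# committed by the consumer group. Growing lag = consumers falling behind.
log_end_offsets = {0: 5000, 1: 4800, 2: 5100}      # producer side
committed_offsets = {0: 4990, 1: 3100, 2: 5095}    # consumer side

lag = {p: log_end_offsets[p] - committed_offsets[p] for p in log_end_offsets}
total_lag = sum(lag.values())

assert lag[1] == 1700          # partition 1 is the hot spot
if total_lag > 1000:           # threshold is illustrative; tune per workload
    print("ALERT: scale out consumers or throttle producers")
```

Watching lag per partition (not just the total) also exposes skew: one hot partition falling behind while the rest keep up points at a partition-key problem, not a capacity problem.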

💡 Seniority Note: A message broker is stateful infrastructure. It is much harder to maintain than a stateless API. Before adding Kafka to your stack, ask if a simple Redis Pub/Sub or even a Database-backed queue (like Postgres SKIP LOCKED) is enough for your current scale.


  • [[Event-Driven-Architecture]]
  • [[Architecture-Resilience-Patterns]]
  • [[Infrastructure-Cloud-Providers]]