Cloud providers like AWS allow free data transfer within an Availability Zone (AZ) but bill users when data moves between AZs. When the volume of data streamed through Kafka reaches big data scale (e.g. numeric data points or user activity tracking), the cost of cross-AZ traffic can add significantly to your monthly cloud spend. Because Kafka serves reads and writes only from partition leaders, a message sent to a topic with a replication factor of 3 can cross AZs up to four times: once when a producer writes the message to a broker in a different AZ, twice during Kafka replication, and once more during consumption. With careful design, we can eliminate the first and last of these crossings. We can also use the message compression options Kafka provides to reduce the cost of replication traffic. In this talk, we will discuss the architectural choices that allow us to ensure a Kafka message is produced and consumed within a single AZ, as well as an algorithm that lets consumers intelligently subscribe to partitions whose leaders are in the same AZ. We will also cover use cases in which cross-AZ message streaming is unavoidable due to design limitations. Talk outline: 1) A review of Kafka replication, 2) Cross-AZ traffic implications, 3) Architectural choices for AZ-aware message streaming, 4) Algorithms for AZ-aware producers and consumers, 5) Results, 6) Limitations, 7) Takeaways.
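
To make the crossing count and the same-AZ subscription idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the algorithm presented in the talk: the `PartitionInfo` type, the helper names, and the AZ labels are all hypothetical, and no Kafka client is involved. It assumes the AZ of each partition's leader and followers is already known (for example, from cluster metadata).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionInfo:
    """Hypothetical view of one partition: its id and the AZ of each replica."""
    partition: int
    leader_az: str
    follower_azs: tuple


def partitions_in_az(partitions, local_az):
    """Return the ids of partitions whose leader is in local_az."""
    return [p.partition for p in partitions if p.leader_az == local_az]


def cross_az_hops(p, producer_az, consumer_az):
    """Count how many times one message on partition p crosses an AZ boundary."""
    produce = int(producer_az != p.leader_az)
    # Each follower in a different AZ from the leader costs one crossing.
    replicate = sum(az != p.leader_az for az in p.follower_azs)
    consume = int(consumer_az != p.leader_az)
    return produce + replicate + consume


# Assumed layout: replication factor 3, one replica per AZ.
topic = [
    PartitionInfo(0, "us-east-1a", ("us-east-1b", "us-east-1c")),
    PartitionInfo(1, "us-east-1b", ("us-east-1a", "us-east-1c")),
    PartitionInfo(2, "us-east-1c", ("us-east-1a", "us-east-1b")),
]

# A consumer in us-east-1a would subscribe only to partition 0.
local = partitions_in_az(topic, "us-east-1a")

# Producer and consumer both outside the leader's AZ: 1 + 2 + 1 crossings.
worst = cross_az_hops(topic[0], "us-east-1b", "us-east-1c")

# Producer and consumer co-located with the leader: only replication remains.
best = cross_az_hops(topic[0], "us-east-1a", "us-east-1a")

print(local, worst, best)
```

With this layout the worst case is four crossings and the AZ-aware case is two, which matches the claim that only the replication traffic remains once producing and consuming stay within the leader's AZ.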