We recently learned about “Fault Tree Analysis” and decided to apply the technique to bulletproof our Apache Kafka deployments. In this talk, learn about fault tree analysis and what you should focus on to make your Apache Kafka clusters resilient. This talk should provide a framework that answers the following common questions a Kafka operator or user might have:
-What guarantees can I promise my users?
-What should my replication factor?
-What should the ISR setting be?
-Should I use RAID or not?
-Should I use external storage such as EBS or local disks?