
Kafka Summit San Francisco 2018

Streaming platforms at massive scale.

October 16-17, 2018 | San Francisco

Using Chaos Engineering to Level up Apache Kafka Skills

Session Level: Advanced
Video & Slides

At ZipRecruiter, we connect employers with jobseekers. To do that, we record and process millions of events every day using Apache Kafka. Kafka availability is critical to our business: when Kafka is down, we lose money.

In this talk, we describe the path we took after a production incident that occurred during a rolling upgrade. The incident led us to adopt chaos engineering so that our team would become more familiar with Kafka’s internals and with handling incidents. We will share some best practices for chaos engineering on Kafka and show how to overcome failures when they appear. We will also share our strategies for performing a cluster update (rolling upgrade vs. blue-green vs. active-passive), including the pros and cons of each method.

