At Shopify we manage multiple Apache Kafka clusters in multiple locations in Google’s cloud platform. We deploy our Kafka clusters as Kubernetes StatefulSets, and we use other K8s workloads to implement different tasks. Automating critical and repetitive operational tasks is one of our top priorities.
In this talk we’ll discuss how we leveraged Kubernetes Custom Resources and Controllers to automate some of the key cluster operational tasks, to detect clusters configuration changes and react to these changes with required actions. We will go through actual examples we implemented at Shopify, how we solved the problem of cluster discovery and how we automated topics creation across different clusters with zero human intervention and safety controls.