When a new project, Global Trip Record, was launched at American Express GBT, we went looking for robust, scalable, and fault-tolerant middleware to handle all of the project's orchestration and connectivity needs.
The existing solution was monolithic, and we wanted to move to a microservices framework. The biggest challenge was managing the growing number of external applications connected to the platform: any slow external application or partner system dragged down the entire platform. Partner systems also regularly need to go offline, or need an entire day's data resent, especially with a system like our data lake, where data volumes are huge.
After evaluating multiple solutions, we settled on Apache Kafka and started with a simple implementation of around 100,000 messages, just to decouple one partner system from the core platform.
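That decoupling idea can be sketched in miniature: the core platform publishes to a durable buffer at its own pace, and a slow or offline partner system catches up later without ever stalling the producer. The snippet below is a minimal Python illustration of the pattern, using an in-memory queue as a stand-in for a Kafka topic (real Kafka adds durability, partitioning, and replayable offsets); the message fields and the 1 ms consumer delay are purely illustrative.

```python
import queue
import threading
import time

# Stand-in for a Kafka topic: a buffer between producer and consumer.
# (Real Kafka adds durability, partitioning, and replayable offsets.)
topic = queue.Queue()

def producer(n_messages):
    """The core platform publishes events at its own pace."""
    for i in range(n_messages):
        topic.put({"trip_id": i, "status": "booked"})
    topic.put(None)  # sentinel marking end of stream

def slow_consumer(processed):
    """A partner system drains the buffer at whatever rate it can manage."""
    while True:
        msg = topic.get()
        if msg is None:
            break
        time.sleep(0.001)  # simulate a slow external system
        processed.append(msg)

processed = []
t_cons = threading.Thread(target=slow_consumer, args=(processed,))
t_prod = threading.Thread(target=producer, args=(100,))
t_cons.start()

start = time.monotonic()
t_prod.start()
t_prod.join()
produce_time = time.monotonic() - start  # producer is never blocked by the consumer

t_cons.join()
print(f"producer finished in {produce_time:.3f}s; consumer processed {len(processed)} messages")
```

The producer finishes almost immediately regardless of how slowly the consumer drains the buffer, which is exactly the property that kept one slow partner from dragging down the whole platform.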
Today, our Dockerized microservices run in OpenShift (Kubernetes): consuming Kafka streams, running real-time anomaly detection with Kafka Streams, powering our data lake through Kafka, feeding our distributed caching layer (Apache Ignite), and connecting all internal and external systems through Kafka. With more than 10 million messages per day, i.e., 1.5TB of data, on just a small three-node cluster, we have been one happy platform for over a year now. Given the stability, flexibility, and success of the project, several other teams have started with Kafka Streams and will soon be in production. The combination of Kafka and OpenShift has proven to be an easily scalable model, bringing great elasticity to the entire platform.
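To give a flavor of the real-time anomaly detection mentioned above: a Kafka Streams processor typically applies a small, stateful check to every record as it arrives. The Python sketch below shows one such check (flagging values far from a rolling mean) as a stand-in; our production logic runs as an actual Kafka Streams application in Java, and the window size, warm-up count, and threshold here are illustrative assumptions, not our tuned values.

```python
from collections import deque
import math

class StreamingAnomalyDetector:
    """Flags values far from the rolling mean -- the same per-record,
    stateful style of check a Kafka Streams processor applies."""

    def __init__(self, window=50, threshold=3.0):
        self.window = deque(maxlen=window)  # bounded history, like a windowed state store
        self.threshold = threshold          # how many std devs count as anomalous

    def process(self, value):
        """Return True if `value` is anomalous relative to recent history."""
        anomalous = False
        if len(self.window) >= 10:  # need a little history before judging
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var) or 1e-9  # avoid division by zero on flat streams
            anomalous = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return anomalous

detector = StreamingAnomalyDetector(window=50, threshold=3.0)
stream = [100.0] * 60 + [500.0] + [100.0] * 10  # one obvious spike at index 60
flags = [detector.process(v) for v in stream]
print("anomalies at indices:", [i for i, f in enumerate(flags) if f])
```

Because the detector keeps only a bounded window of state per key, the same pattern scales horizontally the way Kafka Streams does: each partition's records can be checked independently.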