As streaming platforms become central to data strategies, companies both small and large are re-thinking their architecture with real-time context at the forefront. Monoliths are evolving into Microservices. Datacenters are moving to the cloud. What was once a ‘batch’ mindset is quickly being replaced with stream processing as the demands of the business impose more and more real-time requirements on developers and architects.
This revolution is transforming industries.
What started at companies like LinkedIn, Uber, Netflix and Yelp has made its way to countless others in a variety of sectors. Today, thousands of companies across the globe build their businesses on top of Apache Kafka®. The developers responsible for this revolution need a place to share their experiences on this journey.
Kafka Summit is the premier event for data architects, engineers, devops professionals, and developers who want to learn about streaming data. It brings the Apache Kafka community together to share best practices, write code, and discuss the future of streaming technologies.
Welcome to Kafka Summit San Francisco 2019!
Established in 1999, the ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors. Our all-volunteer board oversees more than 350 leading Open Source projects, including Apache HTTP Server — the world’s most popular Web server software.
The ASF provides an established framework for intellectual property and financial contributions that simultaneously limits potential legal exposure for our project committers. Through the ASF’s meritocratic process known as “The Apache Way,” more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation’s official user conference, trainings, and expo.
Confluent, founded by the original creators of Apache Kafka®, pioneered the enterprise-ready event streaming platform. With Confluent, organizations benefit from the first event streaming platform built for the enterprise with the ease-of-use, scalability, security and flexibility required by the most discerning global companies to run their business in real time. Companies leading their respective industries have realized success with this new platform paradigm to transform their architectures to streaming from batch processing, spanning on-premises and multi-cloud environments. Backed by Benchmark, Index Ventures and Sequoia, Confluent is headquartered in Palo Alto and London with offices globally. To learn more, please visit www.confluent.io. Download Confluent Platform at www.confluent.io/download.
Accenture is a leading global professional services company, providing a broad range of services and solutions in strategy, consulting, digital, technology and operations. Combining unmatched experience and specialized skills across more than 40 industries and all business functions — underpinned by the world’s largest delivery network — Accenture works at the intersection of business and technology to help clients improve their performance and create sustainable value for their stakeholders. With 482,000 people serving clients in more than 120 countries, Accenture drives innovation to improve the way the world works and lives. Visit us at www.accenture.com
Google Cloud is widely recognized as a global leader in delivering a secure, open, intelligent and transformative enterprise cloud platform. Customers across more than 150 countries trust Google Cloud’s simply engineered set of tools and unparalleled technology to modernize their computing environment for today’s digital world.
Imply provides an enterprise-ready, real-time analytics solution built around Apache Druid. Druid is an open source database designed for high-speed ingestion and sub-second queries on event data. Large enterprises use Druid and Kafka to analyze clickstreams, user behavior, advertising data, network telemetry, application performance and more. Druid works out of the box with Kafka and provides exactly-once consumption from Kafka.
Founded in 1975, Microsoft (Nasdaq “MSFT”) is a Cloud first, Mobile first company delivering technologies that help businesses worldwide take advantage of mobile, enterprise social, and cloud computing trends to drive growth. Microsoft helps you move to the cloud on your terms, getting the most value from your existing IT investments while giving you the flexibility to respond quickly to changing business needs. www.Microsoft.com
Slower is a customer-driven consulting firm that helps our customers achieve successful outcomes in their businesses. Our team is a highly talented, unique group of thinkers, makers, and doers. Our offerings in Cloud, Data and A.I. are at the forefront of the people, process and technology opportunities our customers are facing. We believe transforming our customers’ businesses is as important as transforming the business of services delivery. Our team carries a continuous humility and drive to learn, adapt and evolve. All customer possibilities are achievable with Slower thinking.
Aiven provides Kafka as a Service along with 7 other managed services across 6 different clouds and their regions, making Aiven the largest provider of managed open source data systems in terms of number of clouds, services, and plan options.
Attunity, a division of Qlik, is a leader in modern data integration enabling enterprises to employ a DataOps management strategy to drive transformative insights for better business outcomes. Attunity’s Data Integration platform accelerates the discovery and availability of analytics-ready data by automating real-time data streaming, refinement, cataloging and publishing. Attunity empowers companies to lead with data, uncover revenue opportunities, improve customer services and further their overall data literacy. Trusted by Fortune 1000 enterprises, Attunity provides software directly and indirectly through partners including Amazon Web Services, Microsoft, Google, SAP, Oracle, IBM and Hewlett Packard Enterprise. For more information, visit www.attunity.com.
Bosch: Established in 1886 as a workshop for Precision Mechanics and Electrical Engineering, Bosch has developed into a multinational company with roughly 410,000 associates worldwide and revenues of 78.5 billion Euros. Bosch supplies technologies and services throughout the world for Mobility Solutions, Industrial Technology, Consumer Goods, and Energy Building & Technology. After 130 years the group still acts with the vision and values of its founder, Robert Bosch. With an eye on the future, Bosch aims to build reliable, robust machines that will be able to learn continuously and act intelligently. Bosch has been conducting AI research for many years, and will incorporate AI into all of its products within the next decade, making AI one of its core competencies. From autonomous cars to smart homes, AI will transform products and services and create greater value for Bosch customers. BCAI: The Bosch Center for Artificial Intelligence, founded in early 2017, deploys cutting-edge AI technologies to generate real-world impact across Bosch products and services. The center’s goal is to achieve a leading position for Bosch in AI by attracting top talent, conducting differentiating research, and applying AI for the transformation of Bosch towards an AI-driven IoT company.
Camunda builds software for workflow and decision automation. The company develops the popular open source Camunda platform that supports the BPMN and DMN standards. Many organizations world-wide use Camunda for mission-critical business process automation, including Allianz, AT&T, NASA, T-Mobile and Universal Music. Headquartered in Berlin, Camunda has local presences in San Francisco and Denver and official partnerships with more than 100 IT system integrators in more than 30 countries.
CIGNEX Datamatics, a subsidiary of Datamatics Global Services Ltd., is a Michigan-based global consulting company offering solutions, services and platforms on Open Source, Cloud and Automation tools & technologies. By leveraging multiple delivery models, we help organizations around the world achieve their business goals while significantly reducing their TCO. By leveraging Kafka, we have created solutions for Stream Processing, Website Activity Tracking, Metrics Collection and Monitoring, Log Aggregation and Event Sourcing. With our expertise, we help enterprises build Big Data & IoT applications using Apache Kafka for real-time data streaming and analysis. For more information, visit https://www.cignex.com.
CrowdStrike is the leader in cloud-delivered endpoint protection. Leveraging artificial intelligence (AI), the CrowdStrike Falcon platform offers instant visibility and protection across the enterprise and prevents attacks on endpoints on or off the network. CrowdStrike Falcon deploys in minutes to deliver actionable intelligence and real-time protection from Day One. It seamlessly unifies next-generation AV with best-in-class endpoint detection and response, backed by 24/7 managed hunting. Its cloud infrastructure and single-agent architecture take away complexity and add scalability, manageability, and speed. CrowdStrike Falcon protects customers against all cyber attack types, using sophisticated signatureless AI and Indicator-of-Attack (IOA) based threat prevention to stop known and unknown threats in real time. Powered by the CrowdStrike Threat Graph™, Falcon instantly correlates over 150 billion security events a day from across the globe to immediately prevent and detect threats. There’s much more to the story of how Falcon has redefined endpoint protection, but there’s only one thing to remember about CrowdStrike: We stop breaches.
Datadog is a monitoring and analytics platform for cloud-scale infrastructure and applications. Datadog provides full-stack observability by combining logs, infrastructure metrics and events, application performance metrics and end-to-end tracing. With flexible graphs and dashboards, sophisticated alerting, and machine learning functionality for anomaly and outlier detection, the platform provides actionable insight into dynamic, modern environments. Datadog features 250+ vendor-supported integrations, with simple configuration and built-in template dashboards.
DataStax delivers the only active everywhere hybrid cloud database built on Apache Cassandra™: DataStax Enterprise and DataStax Distribution of Apache Cassandra, a production-certified, 100% open source compatible distribution of Cassandra with expert support. The foundation for contextual, always-on, real-time, distributed applications at scale, DataStax makes it easy for enterprises to seamlessly build and deploy modern applications in hybrid cloud. DataStax also offers DataStax Managed Services, a fully managed, white-glove service with guaranteed uptime, end-to-end security, and 24x7x365 lights-out management provided by experts at handling enterprise applications at cloud scale. More than 400 of the world’s leading brands like Capital One, Cisco, Comcast, Delta Airlines, eBay, Macy’s, McDonald’s, Safeway, Sony, and Walmart use DataStax to build modern applications that can work across any cloud. For more information, visit www.DataStax.com and follow us on Twitter @DataStax.
Diamanti delivers the industry’s only purpose-built, fully integrated Kubernetes platform—giving platform architects, IT operations, and application owners the performance, simplicity, efficiency, and enterprise features they need to get cloud-native applications to market fast. Based in San Jose, California, Diamanti is backed by venture investors CRV, DFJ, GSR Ventures, Northgate Capital, Translink Capital, and Goldman Sachs. For more information visit www.diamanti.com or follow @DiamantiCom.
Expero develops custom software exclusively for domain-expert users like scientists, traders, engineers, healthcare professionals and government officials. We succeed not by being experts – though we are – but by quickly learning our clients’ domains and becoming true partners in their problem solving. With decades of combined experience in user experience design, architecture & development, and technology innovation, we build what others say can’t be done.
Hazelcast is the leading in-memory computing platform that enables organizations to leverage a highly resilient and elastic memory resource for data at rest and in motion. Our technology is behind many of today’s leading financial, e-commerce/retail, telecommunications, healthcare and government organizations. Whether it is real-time inventory and shipping information, lightning-quick fraud detection or gleaning insights that lead to product innovation, Hazelcast enables companies to achieve success in microseconds.
HVR is the leading independent provider of real-time data replication technology powered by log-based Change Data Capture (CDC). Log-based CDC enables customers to stream information from the many places it is stored within their organization, such as Oracle, SQL Server, SAP and more. This gives them the ability to fully optimize their use of Apache Kafka technology for a better business. Learn More: hvr-software.com
IBM Event Streams is an event-streaming platform based on the open-source Apache Kafka® project. Event Streams helps you build intelligent, responsive applications that react to events in real-time, to deliver more engaging experiences for your customers.
Digital transformation changes expectations: better service, faster delivery, with less cost. Businesses must transform to stay relevant and data holds the answers. As the world’s leader in Enterprise Cloud Data Management, we’re prepared to help you intelligently lead—in any sector, category or niche. Informatica provides you with the foresight to become more agile, realize new growth opportunities or create new inventions. With 100% focus on everything data, we offer the versatility needed to succeed. We invite you to explore all that Informatica has to offer—and unleash the power of data to drive your next intelligent disruption.
Based in Silicon Valley, and founded by the team that built the technology Facebook uses to understand the behavior of its 2B users, Interana provides the world’s most advanced enterprise platform for product analytics and behavioral analysis in a GDPR era. Used by companies such as Microsoft, Comcast, Goodyear, Uber, Bleacher Report and many others, Interana is the only solution that allows business users to analyze trillions of data points, iteratively and in real-time, to go beyond the static reports and dashboards of traditional BI and analytics tools, and surface business insights that would otherwise remain hidden. www.interana.com / @interanacorp
Lenses.io is a DataOps platform for streaming technologies like Apache Kafka. Lenses® enables a seamless experience for running your Data Platform on-prem, in the cloud or hybrid, and puts DataOps at the heart of your business operations. It provides self-service data-in-motion control and lets you build and monitor your data flows, whilst security, data governance and data ethics are treated as first-class citizens. As a streaming platform overlay technology, Lenses® integrates with Kubernetes and can run with any distribution of Apache Kafka including AWS MSK and Azure HDInsight. Wanna give it a try? Find more at https://lenses.io
Lyft was founded in 2012 by Logan Green and John Zimmer to improve people’s lives with the world’s best transportation, and is available to approximately 95 percent of the United States population as well as select cities in Canada. Lyft is committed to effecting positive change for our cities by offsetting carbon emissions from all rides, and by promoting transportation equity through shared rides, bikeshare systems, electric scooters, and public transit partnerships.
MemSQL envisions a world where every business can make decisions in real time and every experience is optimized through data. To do that, enterprises need to ingest, analyze, and act on massive volumes of rapidly changing data. This is why we built MemSQL, the No-Limits Database™. Tested and proven as the world’s fastest database for operational analytics, MemSQL gives businesses a platform purpose-built for breakthroughs. Global enterprises in every industry use the MemSQL distributed database to compete and win in today’s insight-driven economy.
mParticle is the customer data platform for every screen. Sophisticated marketers at companies like NBC Universal, Spotify and Airbnb use mParticle to integrate and orchestrate their entire growth stack, enabling them to win in key moments of the customer journey.
Neo4j is the leading graph database platform. The Neo4j Graph Platform helps organizations make sense of their data by revealing how people, processes, locations, and systems are interrelated. This connections-first approach powers applications tackling artificial intelligence, fraud detection, real-time recommendations, and master data. Neo4j boasts the world’s largest dedicated investment in graph technology, has amassed more than 20 million downloads and has a huge developer community deploying graph applications around the globe.
Redis Labs, home of Redis, the world’s most popular in-memory database, and provider of Redis Enterprise, delivers superior performance, reliability and flexibility for personalization, machine learning, IoT, search, ecommerce, social and metering solutions. Modern businesses depend on Redis Labs to deliver instant experiences, reliably and at scale. Redis Enterprise is trusted by three of the top five communication and healthcare companies, six of the top eight technology companies, and four of the top seven retailers. Redis has been voted the most loved database, rated the most popular database container, and #1 cloud database.
Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver reliable and high-performing Linux, hybrid cloud, container, and Kubernetes technologies. Red Hat helps customers develop cloud-native applications, integrate existing and new IT applications, and automate and manage complex environments. A trusted adviser to the Fortune 500, Red Hat provides award-winning support, training, and consulting services that bring the benefits of open innovation to any industry. Red Hat is a connective hub in a global network of enterprises, partners, and communities, helping organizations grow, transform, and prepare for the digital future.
Rockset is a serverless search and analytics engine that delivers millisecond-latency SQL over TBs of raw data, without any ETL. Rockset integrates with Kafka to continuously ingest event streams without requiring a schema, while providing full SQL support for filtering, aggregations and joining streaming data with other data sets. Rockset powers data-driven applications and interactive dashboards without requiring users to manage custom pipelines, servers or databases. Try Rockset, and go from useful data to useful applications in minutes, at rockset.com
Scylla is the real-time big data database, with scale-up performance of 1,000,000 OPS per node, scale-out to hundreds of nodes and 99P latency of <1 msec. Fully compatible with Apache Cassandra, Scylla embraces a shared-nothing approach that increases throughput and storage capacity to 10X that of Cassandra. From the team responsible for the KVM hypervisor, Scylla helps organizations realize order-of-magnitude performance improvements, reduce hardware costs and lessen administration. For more information: ScyllaDB.com
SignalFx, the only real-time cloud monitoring platform for infrastructure, microservices, and applications, collects and analyzes metrics and traces across every component in your cloud environment. Built on a massively scalable streaming architecture, SignalFx applies advanced analytics and data-science-directed troubleshooting to let operators find the root cause of issues in seconds. SignalFx is trusted by leading enterprises across nearly every industry sector.
Solace provides the only unified advanced event broker technology that supports publish/subscribe, queueing, request/reply, message replay and streaming using open APIs and protocols across hybrid cloud and IoT environments. The company’s smart data movement technologies rapidly and reliably route information between applications, devices and people, as well as across public and private clouds. Established enterprises such as SAP, Barclays and the Royal Bank of Canada, high-growth companies such as VoiceBase, and industry disruptors such as Jio use Solace to modernize legacy applications and successfully pursue analytics, hybrid cloud and IoT strategies.
StreamSets transforms how enterprises flow big and fast data from myriad sources into data centers and cloud analytics platforms. Its DataOps platform helps companies build and operate continuous dataflow topologies, combining award-winning open source data movement software with a cloud-native Control Hub. Enterprises use StreamSets to enable cloud analytics, data lakes, Apache Kafka, IoT and cybersecurity. For more information, visit www.streamsets.com.
TIBCO fuels digital business by enabling better decisions and faster, smarter actions through the TIBCO Connected Intelligence Cloud. From APIs and systems to devices and people, we interconnect everything, capture data in real time wherever it is, and augment the intelligence of your business through analytical insights. Thousands of customers around the globe rely on us to build compelling experiences, energize operations, and propel innovation. Learn how TIBCO makes digital smarter at www.tibco.com.
Tinder is the world’s leading app for meeting new people. Available in 190 countries and 40+ languages, Tinder is a top 5 grossing non-gaming app globally. Kafka at Tinder plays the following critical roles:
1. A central messaging system to power Tinder’s data pipeline that collects, aggregates and transforms billions of events each day for our BI and ML;
2. A robust event processing pipeline that powers critical applications such as payment processing, push notifications, user behavioral classification, abuse detection, and much more;
3. A streaming platform to provide consumable, real-time streaming events for change data capture. It enables our backend systems to move toward event-based processing and decouples inter-service dependencies;
4. A highly scaled messaging bus for collecting and transporting logs and observability metrics.
Unravel radically simplifies the way businesses understand and optimize the performance of their modern data applications – and the complex pipelines that power those applications. Providing a unified view across the entire stack, Unravel’s AI-powered data operations platform leverages AI, machine learning, and advanced analytics to offer actionable recommendations and automation for tuning, troubleshooting, and improving performance – both today and tomorrow.
Wavefront is a SaaS-based metrics monitoring and analytics platform that handles the high-scale requirements of modern cloud-native applications. Wavefront’s speed, scale and flexibility give DevOps and developer teams instant insight into the performance of their highly distributed cloud-native services. Wavefront’s analytics, query-driven alerts, interactive visualizations, open API, and integrations, all powered by a massively scalable time-series database, deliver “first pane of glass” visibility, helping DevOps teams detect performance anomalies while ensuring high availability of key cloud services. Developers can self-serve and adapt Wavefront analytics to the unique needs of their code while gaining visibility into its production behavior.