Kafka Summit San Francisco 2016

April 26, 2016 | San Francisco

About this Event

Discover the future of industries

With the arrival of streaming platforms, industries are being rethought with real-time context at the forefront. What was once a ‘batch’ manner of thinking about business is quickly being replaced by streams as real-world examples enter everyday life.

This revolution is transforming industries. What started at companies like LinkedIn, Uber, Netflix and Yelp has made its way to countless others. Today, thousands of companies across the globe build their businesses on top of Apache Kafka. The developers responsible for this revolution need a place to share their experiences on this journey.


Access Kafka Summit 2016 session recordings and slides



Kafka Summit Sponsors

Citus Data

Citus Data offers solutions that empower real-time big data by scaling out PostgreSQL. The Citus database powers both real-time operational and analytic workloads across billions of events and supports both structured and JSON data. Citus Data also offers cstore_fdw, an open source extension for creating columnar PostgreSQL tables for reduced storage footprint. Based in San Francisco, Citus Data is a Y Combinator alumnus and is backed by investors that include Khosla Ventures, Data Collective, and SV Angel. All Citus Data products are available for download at www.citusdata.com.



Datadog

Datadog is a monitoring and analytics platform for large-scale application infrastructure. Combining metrics from servers, databases, and applications, Datadog delivers sophisticated, actionable alerts, and provides real-time visibility of your entire infrastructure. Datadog includes 100+ vendor-supported, prebuilt integrations and monitors hundreds of thousands of hosts.

GE Digital

As part of GE’s transformative move to be the global leader of the Industrial Internet, GE Digital brings together all of the digital capabilities from across the company into one organization. GE Digital integrates GE’s Predix Platform teams, the expertise of GE’s global IT and commercial software teams, and the industrial security strength of Wurldtech to form the nexus where big data meets big machines.


MemSQL

MemSQL is the leader in real-time databases for transactions and analytics. As a purpose-built database for instant access to real-time and historical data, MemSQL uses a familiar SQL interface and a horizontally scalable distributed architecture that runs on commodity hardware or in the cloud. Innovative enterprises use MemSQL to better predict and react to opportunities by extracting previously untapped value in their data to drive new revenue. MemSQL is deployed across hundreds of nodes in high-velocity big data environments. Based in San Francisco, MemSQL is a Y Combinator company funded by prominent investors including Accel Partners, Khosla Ventures, First Round Capital and Data Collective. Follow us @MemSQL or visit www.memsql.com.


Heroku

Heroku, a salesforce.com company and industry pioneer in platform as a service, enables developers to build and run applications entirely in the cloud, without the need to purchase or maintain any servers or software. Over four million apps, including ones from Macy’s, Lutron and Lyft, run on Heroku. With support for the most popular languages, an enterprise-class database service, and an add-ons ecosystem featuring over 150 cloud application services, Heroku provides companies from startups to Fortune 500 enterprises with a faster and more effective way to create, deploy and manage apps.

Hewlett Packard Enterprise

HPE Vertica fits into a Kafka-powered analytical platform to manage and analyze massive volumes of structured and semi-structured data quickly and reliably. With Vertica’s support for Apache Kafka, developers and DBAs can share data between streaming analytics solutions like Spark and use Vertica to perform deep analytics on massive amounts of data. Vertica scales to handle the petabytes of data often present in log data analysis, fraud detection, customer engagement analytics, in-game analytics, and more. Users can perform analytics on data stored in the database or on Hadoop for maximum flexibility.


Salesforce

Salesforce is a trusted cloud platform that helps companies connect with their customers. We run 24×7 systems at massive scale, and give our customers innovative new capabilities with social, mobile and data science apps.


SignalFx

SignalFx is the most advanced monitoring solution for modern applications. Our mission is to help cloud-ready organizations drive high levels of availability in today’s elastic, agile, distributed environments. With SignalFx, development and operations teams gain a real-time view of, interact with, and take action on the infrastructure and application metrics that matter. We have enterprise customers including Yelp, Zenefits, Zuora, and Hubspot and thousands of users analyzing billions of metrics every day. SignalFx was founded in 2013 by former Facebook and VMware executives, launched in 2015, and is backed by Andreessen Horowitz and Charles River Ventures.


Striim

Striim (pronounced “stream”) is the only end-to-end, streaming integration + intelligence solution. The platform enables real-time data ingestion and Change Data Capture (CDC) across a wide variety of data sources including transactions from enterprise databases (e.g., Oracle, MySQL, HP NonStop), log files, message queues (Kafka), and IoT sensor data. With Striim, enterprises can aggregate and correlate streams, detect anomalies, identify and visualize events of interest, and trigger alerts and workflows – all in-memory, before loading the processed data to a wide variety of targets (e.g., Kafka, Hadoop, AWS). For more information, visit www.striim.com, read our blog at www.striim.com/blog/, or follow @striimteam.


LinkedIn

LinkedIn operates the world’s largest professional network on the Internet with more than 400 million members in over 200 countries and territories. Our highly structured dataset gives our data scientists and researchers the ability to conduct applied research that fuels LinkedIn’s data-driven products including our search, social graph and machine learning systems. As a members-first organization, LinkedIn keeps member privacy and security at the forefront in all our research.

LinkedIn is the birthplace of Apache Kafka, Apache Samza and many related open source projects. Our Kafka deployment processes more than 1.3 trillion messages per day.


MapR

MapR provides the industry’s only converged data platform that integrates the power of Hadoop and Spark with global event streaming, real-time database capabilities, and enterprise storage, enabling customers to harness the enormous power of their data. A majority of customers achieves payback in fewer than 12 months and realizes greater than 5X ROI. MapR ensures customer success through world-class professional services and with free on-demand training. Amazon, Cisco, Google, HP, Microsoft, SAP, and Teradata are part of the worldwide MapR partner ecosystem. Investors include Google Capital, Lightspeed Venture Partners, Mayfield Fund, NEA, Qualcomm Ventures and Redpoint Ventures.


Mesosphere

Mesosphere’s Datacenter Operating System (DCOS) makes it easy to build and run modern distributed applications in production at scale, by pooling resources across an entire datacenter or cloud. With DCOS you can orchestrate containers at scale with a rock-solid platform powering today’s production hyperscale datacenters, and easily install and manage big data frameworks like Kafka, Spark and Cassandra that power many of today’s Internet of Things and Big Data stacks.


Microsoft

Microsoft is a technology company whose mission is to empower every person and every organization on the planet to achieve more. Our strategy is to build best-in-class platforms and productivity services for a mobile-first, cloud-first world. To carry out our strategy, we are focused on reinventing productivity & business processes, building the intelligent cloud platform, and creating more personal computing.


OpsClarity

OpsClarity is a purpose-built monitoring solution for data-first applications that provides end-to-end performance monitoring for complex data processing pipelines, along with deep visibility into individual data frameworks such as Kafka, Spark, Elasticsearch, Storm, Solr, and Cassandra. OpsClarity completely automates metric and metadata collection, and leverages its deep domain expertise about the individual data frameworks to apply data science constructs such as anomaly detection and event correlation to rapidly troubleshoot issues.


Qubole

Qubole simplifies provisioning, managing and scaling of big data analytics workloads leveraging data stored on AWS, Google, or Azure Cloud environments. Once IT sets policies, any number of data analysts can be set free to collaboratively “click to query” with the power of Hive, Spark, Presto and many others from a growing list of data processing engines. Our platform, Qubole Data Service (QDS), delivers these best-in-class Apache tools integrated into a feature-rich enterprise platform optimized to run in the cloud at petabyte+ scale.
