What Is Apache Kafka?
Apache Kafka is an open-source distributed streaming platform capable of three main things: it can publish and subscribe to streams of records, store streams of records in a fault-tolerant, durable way, and process streams of records as they occur.
Apache Kafka was originally developed by LinkedIn to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Today, it both powers customer-facing applications and connects downstream systems with real-time data.
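To make the three capabilities described above more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not a Kafka client: the `ToyLog` class and its `publish`, `poll`, and `replay` methods are hypothetical names invented for illustration, and the sketch does not attempt to model distribution or fault tolerance. It only shows the general idea of an append-only log that several subscribers read at their own pace.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one stream of records (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._records = []                # append-only storage
        self._offsets = defaultdict(int)  # next offset to read, per subscriber

    def publish(self, record):
        """Append a record to the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber):
        """Return records this subscriber has not seen yet and advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:]
        self._offsets[subscriber] = len(self._records)
        return batch

    def replay(self, from_offset=0):
        """Re-read stored records; reading never removes them from the log."""
        return list(self._records[from_offset:])


log = ToyLog()
log.publish({"user": "a", "clicks": 2})
log.publish({"user": "b", "clicks": 5})

# Two independent subscribers each receive the full stream.
billing_batch = log.poll("billing")
analytics_batch = log.poll("analytics")

# Process records as they arrive: here, a simple running total.
total_clicks = sum(record["clicks"] for record in analytics_batch)
```

In this sketch, each subscriber tracks its own position in the log, which is why one reader consuming records does not affect what another reader sees, and why stored records can be read again later.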
Best Apache Kafka Books
Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale
This practical guide was written for software engineers who develop applications that use Kafka’s APIs. It’s also suitable for production engineers who install, configure, tune, and monitor Apache Kafka in production. One of its authors, Neha Narkhede, is co-founder and CTO at Confluent, and she was responsible for the streaming infrastructure built on top of Apache Kafka and Apache Samza when she worked at LinkedIn. That experience shows on every page of this book. If you’re looking for a quick yet detailed introduction to Apache Kafka, this is the book you should start with.
Apache Kafka 1.0 Cookbook
We firmly believe that all developers should strive to learn as much about the tools they work with as possible, but we also acknowledge that developing solutions at the speed of business sometimes means skipping the technical stuff and figuring things out as you go. If you’d like to see how Apache Kafka can be integrated with other important big data tools, you should add this book to your library because it contains over 100 practical recipes on using distributed enterprise messaging to handle real-time data.
Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
This practical book has been written for those who would like to explore streaming systems and learn how they are used by data engineers, data scientists, and developers to process event-time data. The book is conceptual and platform-agnostic, making it a great resource not only for Apache Kafka developers but also for everyone else.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems
From Facebook to Google to startups of many different kinds, data is everywhere today, and those who know how to leverage it to their benefit lead the pack. This book covers data-intensive applications and their design. It’s practical yet comprehensive, and its author, Martin Kleppmann, does a fantastic job helping the reader navigate the increasingly complex field of designing data-driven applications.
Streaming Architecture: New Designs Using Apache Kafka and MapR Streams
The authors of this book cover the key elements of good design for streaming analytics; new messaging technologies, including Apache Kafka and MapR Streams; technology choices for streaming analytics; and a lot more. The book is intended for developers and non-technical people alike, and we can wholeheartedly recommend it to anyone who would like to know how Apache Kafka fits into the broader stream processing landscape.
(This post contains affiliate links. It is a way for this site to earn advertising fees by advertising or linking to certain products and/or services.)

