Apache Kafka® is a distributed streaming platform. What exactly does that mean?
We think of a streaming platform as having three key capabilities:
- It lets you publish and subscribe to streams of records. In this respect, it is similar to a message queue or enterprise messaging system.
- It lets you store streams of records in a fault-tolerant way.
- It lets you process streams of records as they occur.
What is Apache Kafka good for?
It gets used for two broad classes of application:
- Building real-time streaming data pipelines that reliably get data between systems or applications
- Building real-time streaming applications that transform or react to the streams of data
To understand how Kafka does these things, let’s dive in and explore Kafka’s capabilities from the bottom up.
First, a few concepts:
- Kafka is run as a cluster on one or more servers.
- The Kafka cluster stores streams of records in categories called topics.
- Each record consists of a key, a value, and a timestamp.
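The concepts above can be illustrated with a toy model (this is not Kafka's real API; the class and method names here are made up for illustration): a topic as an append-only log of (key, value, timestamp) records, where each consumer group tracks its own read offset, so the same stored stream can be consumed independently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a topic as an append-only log: each record carries a key,
// a value, and a timestamp, and is assigned a sequential offset on append.
public class ToyTopic {
    record Record(String key, String value, long timestamp) {}

    private final List<Record> log = new ArrayList<>();
    private final Map<String, Integer> consumerOffsets = new HashMap<>();

    // Publish: append a record to the log and return its offset.
    long append(String key, String value) {
        log.add(new Record(key, value, System.currentTimeMillis()));
        return log.size() - 1;
    }

    // Subscribe: each consumer group reads from its own offset, so
    // consuming records does not remove them from the log.
    List<Record> poll(String group) {
        int from = consumerOffsets.getOrDefault(group, 0);
        List<Record> batch = new ArrayList<>(log.subList(from, log.size()));
        consumerOffsets.put(group, log.size());
        return batch;
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.append("user-1", "page-view");
        topic.append("user-2", "click");
        System.out.println(topic.poll("analytics").size()); // 2: both records
        System.out.println(topic.poll("analytics").size()); // 0: offset advanced
        System.out.println(topic.poll("audit").size());     // 2: independent group
    }
}
```

Because records stay in the log after being read, a second consumer group starting later still sees the full stream; that is the fault-tolerant storage property in miniature.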
Kafka has four core APIs:
- The Producer API allows an application to publish a stream of records to one or more Kafka topics.
- The Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.
- The Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams.
- The Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
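The stream-processor idea behind the Streams API (consume an input stream from a topic, transform it, produce an output stream) can be sketched in plain Java with the canonical word-count transformation. This sketch assumes no Kafka dependency: topics are modeled as in-memory lists, and `WordCountSketch` is a made-up name, not a Streams API class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative stream processor: consume a stream of text lines,
// split each line into words, and count occurrences, producing an
// output stream of (word, count) pairs.
public class WordCountSketch {
    static Map<String, Long> process(List<String> inputTopic) {
        return inputTopic.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("hello kafka", "hello streams")));
    }
}
```

In the real Streams API the input and output are Kafka topics rather than lists, and the counts would be maintained incrementally as records arrive, but the shape of the computation is the same.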
In Kafka, communication between clients and servers is done with a simple, high-performance, language-agnostic TCP protocol. This protocol is versioned and maintains backward compatibility with older versions. We provide a Java client for Kafka, but clients are available in many languages.
- [KAFKA-8952] – Vulnerabilities found for jackson-databind-2.9.9
- [KAFKA-8347] – Choose next record to process by timestamp
- [KAFKA-4893] – async topic deletion conflicts with max topic length
- [KAFKA-5998] – /.checkpoint.tmp Not found exception
- [KAFKA-6290] – Kafka Connect cast transformation should support logical types
- [KAFKA-6605] – Flatten SMT does not properly handle fields that are null
- [KAFKA-7157] – Connect TimestampConverter SMT doesn’t handle null values
- [KAFKA-7941] – Connect KafkaBasedLog work thread terminates when getting offsets fails because broker is unavailable
- [KAFKA-8229] – Connect Sink Task updates nextCommit when commitRequest is true
- [KAFKA-8290] – Streams Not Closing Fenced Producer On Task Close
- [KAFKA-8340] – ServiceLoader fails when used from isolated plugin path directory
- [KAFKA-8351] – Log cleaner must handle transactions spanning multiple segments
- [KAFKA-8379] – Flaky test KafkaAdminClientTest.testUnreachableBootstrapServer
- [KAFKA-8404] – Authorization header is not passed in Connect when forwarding REST requests
- [KAFKA-8412] – Still a nullpointer exception thrown on shutdown while flushing before closing producers
- [KAFKA-8523] – InsertField transformation fails when encountering tombstone event
- [KAFKA-8536] – Error creating ACL Alter Topic in 2.2
- [KAFKA-8550] – Connector validation fails with aliased converters
- [KAFKA-8564] – NullPointerException when loading logs at startup
- [KAFKA-8570] – Downconversion could fail when log contains out of order message formats
- [KAFKA-8586] – Source task producers silently fail to send records
- [KAFKA-8591] – NPE when reloading connector configuration using WorkerConfigTransformer
- [KAFKA-8602] – StreamThread Dies Because Restore Consumer is not Subscribed to Any Topic
- [KAFKA-8615] – Change to track partition time breaks TimestampExtractor
- [KAFKA-8620] – Race condition in StreamThread state change
- [KAFKA-8637] – WriteBatch objects leak off-heap memory
- [KAFKA-8649] – Error while rolling update from Kafka Streams 2.0.0 -> Kafka Streams 2.1.0
- [KAFKA-8678] – LeaveGroup request getErrorResponse is incorrect on throttle time and error setting
- [KAFKA-8774] – Connect REST API exposes plaintext secrets in tasks endpoint if config value contains additional characters
- [KAFKA-8816] – RecordCollector offsets updated indirectly by StreamTask
- [KAFKA-8819] – Plugin path for converters not working as intended
- [KAFKA-8861] – Fix flaky RegexSourceIntegrationTest.testMultipleConsumersCanReadFromPartitionedTopic
- [KAFKA-8896] – NoSuchElementException after coordinator move
- [KAFKA-8945] – Incorrect null check in the constructor for ConnectorHealth and AbstractState
- [KAFKA-8947] – Connect framework incorrectly instantiates TaskStates for REST extensions
- [KAFKA-8974] – Sink Connectors can’t handle topic list with whitespaces
- [KAFKA-9014] – AssertionError thrown by SourceRecordWriteCounter when SourceTask.poll returns an empty list
- [KAFKA-9053] – AssignmentInfo#encode hardcodes the LATEST_SUPPORTED_VERSION