I’ll cover Apache Kafka core terms, Confluent Platform extensions, and Confluent Cloud additions.
Kafka & Confluent Terminology – Complete Glossary
🔹 Core Kafka Concepts
- Kafka Cluster
A group of servers (called brokers) working together to store, process, and stream data.
- Broker
A single Kafka server that stores data and serves client requests (produce/consume).
- Producer
An application that sends data (messages) into Kafka topics.
- Consumer
An application that reads data (messages) from Kafka topics.
- Consumer Group
A group of consumers working together to read data from a topic. Kafka assigns each partition to only one consumer in the group, so each message is delivered to a single member of the group.
- Topic
A named channel where producers send messages and consumers read messages (like a folder or queue).
- Partition
A topic is divided into slices called partitions. Messages within a partition are ordered; there is no ordering guarantee across partitions. Partitions allow parallelism and scalability.
- Offset
The position of a message in a partition (like a bookmark). Consumers use offsets to track what they’ve read.
- Record / Message
A single unit of data in Kafka. It has:
  - Key (optional, used for partitioning and per-key ordering)
  - Value (the actual payload)
  - Headers (extra metadata)
  - Timestamp (when the record was created or appended)
- Log
A partition is stored as an append-only log (new messages are always written at the end).
- Replication
Kafka keeps copies of partitions across multiple brokers for fault tolerance.
- Leader & Follower
  - Leader: The replica of a partition that handles all writes and, by default, all reads.
  - Follower: Replicates the leader’s data and can take over as leader if the leader fails.
- ISR (In-Sync Replicas)
The set of replicas (including the leader) that are caught up with the leader within a configured lag limit.
- Retention Policy
Defines how long Kafka keeps data (e.g., 7 days, forever, or until a size limit is reached).
- Compaction
A cleanup policy that keeps at least the latest value per key and deletes older values for that key.
- Throughput
The rate at which Kafka processes data (messages or bytes per second).
- Latency
The time it takes for a message to travel from producer → broker → consumer.
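To make the Log, Offset, and Compaction entries concrete, here is a minimal in-memory sketch. It is not how a broker is implemented; `Record`, `PartitionLog`, and their methods are hypothetical names invented for this illustration, and the sketch assumes every record has a key (compaction works per key).

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str
    value: str
    offset: int  # assigned by the log when the record is appended


@dataclass
class PartitionLog:
    """Toy stand-in for a single partition's append-only log."""
    records: list = field(default_factory=list)
    next_offset: int = 0

    def append(self, key, value):
        # New records always go at the end and receive the next offset.
        record = Record(key, value, self.next_offset)
        self.records.append(record)
        self.next_offset += 1
        return record.offset

    def read_from(self, offset):
        # A consumer resumes from its saved offset (its "bookmark").
        return [r for r in self.records if r.offset >= offset]

    def compact(self):
        # Keep only the newest record per key; surviving offsets do not change.
        latest = {}
        for r in self.records:
            latest[r.key] = r
        self.records = sorted(latest.values(), key=lambda r: r.offset)


log = PartitionLog()
log.append("user-1", "email=a@example.com")
log.append("user-2", "email=b@example.com")
log.append("user-1", "email=c@example.com")

unread = [r.value for r in log.read_from(1)]   # consumer had already read offset 0
log.compact()
surviving_offsets = [r.offset for r in log.records]
```

After compaction the older `user-1` record is gone, but the remaining records keep their original offsets, which is why offsets in a compacted topic can have gaps.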
🔹 Kafka Internals
- ZooKeeper (Legacy)
Used in older Kafka versions to manage cluster metadata and leader election. Replaced by KRaft; ZooKeeper mode was removed in Kafka 4.0.
- KRaft (Kafka Raft Metadata mode)
The newer architecture where Kafka itself manages metadata using a Raft-based quorum, removing the need for ZooKeeper.
- Controller
The node responsible for cluster metadata, including electing partition leaders.
- Rebalancing
When consumers join or leave a group, Kafka redistributes partitions among the remaining members.
- Coordinator
The broker responsible for managing a consumer group (membership and committed offsets).
- ACL (Access Control List)
Security rules defining which user/app can access which topic or resource.
- Quotas
Limits on how much data a client can produce/consume to prevent abuse.
- Idempotent Producer
Ensures that producer retries do not write duplicate messages to a partition.
- Exactly-Once Semantics (EOS)
A guarantee that the effect of processing each message is applied only once, even during failures.
- Transactions
A way to group multiple messages into an atomic unit of work: either all of them become visible or none do.
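The Idempotent Producer entry above can be illustrated with a small sketch of sequence-number deduplication. This is a simplification under stated assumptions: `ToyBroker` is a hypothetical class, and it tracks one sequence counter per producer, whereas a real broker tracks sequences per producer and partition.

```python
class ToyBroker:
    """Toy broker that drops retried sends using sequence numbers."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, value):
        # A retry re-sends the same sequence number; reject it instead of
        # appending the same message twice.
        if seq <= self.last_seq.get(producer_id, -1):
            return False
        self.log.append(value)
        self.last_seq[producer_id] = seq
        return True


broker = ToyBroker()
results = [
    broker.append("producer-A", 0, "order-created"),
    broker.append("producer-A", 1, "order-paid"),
    broker.append("producer-A", 1, "order-paid"),  # retry after a lost acknowledgement
]
```

The third call is the retry: it is acknowledged as a duplicate and the log still contains each message once.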
🔹 Confluent-Specific Terms
- Confluent Platform
An enterprise distribution of Kafka with additional tools for management, monitoring, and integration.
- Confluent Cloud
A fully managed Kafka service hosted by Confluent on AWS, Azure, or GCP.
- Schema Registry
Stores and enforces schemas (data formats) for messages (e.g., Avro, JSON, Protobuf) to ensure compatibility.
- kSQL / ksqlDB
A SQL-like engine to query, process, and transform Kafka streams in real-time.
- Kafka Connect
A framework to move data in/out of Kafka using connectors (e.g., JDBC, S3, Elasticsearch).
- Connector
A plugin used with Kafka Connect to integrate Kafka with external systems.
  - Source Connector: Pulls data into Kafka.
  - Sink Connector: Pushes data out of Kafka.
- Confluent Hub
A marketplace of prebuilt Kafka connectors.
- Confluent Control Center
A GUI tool for monitoring Kafka clusters, topics, connectors, and schemas.
- Replicator
A Confluent tool to copy topics from one Kafka cluster to another (useful for multi-region).
- Confluent REST Proxy
Allows producing/consuming data using REST APIs instead of Kafka clients.
- Confluent RBAC (Role-Based Access Control)
Fine-grained access control for Kafka resources.
- Confluent CLI
A command-line tool for managing Confluent Cloud and Confluent Platform resources such as clusters, topics, and connectors.
- Tiered Storage
A Confluent feature that offloads older Kafka data to cheaper cloud storage (e.g., S3, GCS).
- Cluster Linking
A Confluent feature (available in Confluent Cloud and Confluent Platform) that links clusters across regions/clouds for data replication.
- Confluent Cloud Metrics API
Provides usage and performance metrics for monitoring clusters.
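As a rough illustration of what "ensure compatibility" means in the Schema Registry entry above, the sketch below checks one simplified backward-compatibility rule: a new schema can still read old data if every field it adds has a default and no shared field changes type. The function name and the dictionary-based schema shape are assumptions made for this example; they are not the Schema Registry API, and real compatibility checks cover more cases (type promotion, nested records, and so on).

```python
def is_backward_compatible(old_fields, new_fields):
    """Each argument maps a field name to {"type": ..., optional "default": ...}."""
    for name, spec in new_fields.items():
        if name not in old_fields:
            # A field added in the new schema needs a default so that old
            # records, which lack the field, can still be read.
            if "default" not in spec:
                return False
        elif old_fields[name]["type"] != spec["type"]:
            # Simplification: any type change on a shared field is rejected.
            return False
    return True


v1 = {"id": {"type": "long"}}
v2_with_default = {"id": {"type": "long"}, "email": {"type": "string", "default": ""}}
v2_without_default = {"id": {"type": "long"}, "email": {"type": "string"}}

ok = is_backward_compatible(v1, v2_with_default)
rejected = is_backward_compatible(v1, v2_without_default)
```

Here `ok` is `True` and `rejected` is `False`: adding `email` is only safe when old records can fall back to a default value.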
🔹 Stream Processing Terms
- Kafka Streams
A Java library for building real-time streaming applications on top of Kafka.
- Stream
A continuous, unbounded flow of data records in Kafka.
- Stream Processor
An application that transforms or processes Kafka data in real-time.
- Topology
The workflow (graph of processors) that defines how streams are processed.
- State Store
Local storage used by stream processing apps to maintain state (e.g., counts, aggregations).
- Global Store
A replicated state store available to all stream tasks.
- Windowing
Grouping data by time intervals (e.g., 5-minute sales totals).
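The "5-minute sales totals" example in the Windowing entry can be sketched in plain Python as a tumbling (fixed, non-overlapping) window. This is not the Kafka Streams API; `tumbling_window_totals` and the `(timestamp_seconds, amount_cents)` event shape are assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60


def tumbling_window_totals(events, window_seconds=WINDOW_SECONDS):
    """Sum amounts per fixed window, keyed by the window's start time."""
    totals = defaultdict(int)
    for timestamp, amount in events:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        totals[window_start] += amount
    return dict(totals)


# (timestamp in seconds, sale amount in cents)
sales = [(0, 1000), (120, 500), (299, 100), (300, 750), (610, 200)]
totals = tumbling_window_totals(sales)
```

The first three sales fall into the window starting at 0, the fourth into the window starting at 300, and the last into the window starting at 600. The running `totals` dictionary plays the role of a state store in this toy version.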
🔹 Advanced Kafka Concepts
- Reassignment
Moving partitions across brokers for load balancing.
- Throttling
Slowing down producers/consumers to avoid overwhelming the cluster.
- Backpressure
When consumers can’t keep up with producers, causing slowdowns.
- Dead Letter Queue (DLQ)
A special topic where failed or invalid messages are sent for later debugging.
- MirrorMaker 2.0
Kafka’s built-in tool for replicating data across clusters (open-source equivalent of Confluent Replicator).
- Metrics & JMX
Kafka exposes metrics via JMX for monitoring cluster health.
- Log Segment
Each partition’s log is broken into smaller files called log segments.
- Message Key Partitioning
The method Kafka uses to decide which partition a message goes to: the key is hashed and mapped to a partition, so messages with the same key land in the same partition.
- Rack Awareness
Kafka spreads replicas across different racks or availability zones for reliability.
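A minimal sketch of the idea behind Message Key Partitioning: hash the key, then take the result modulo the number of partitions. The hash here is an assumption chosen for convenience. CRC32 from the Python standard library stands in for the murmur2 hash that Kafka's default Java partitioner uses, so the partition numbers below will not match a real cluster. `partition_for` is a hypothetical helper name.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in hash: CRC32 is deterministic and in the standard library.
    # Kafka's default partitioner uses murmur2 instead.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 6
keys = [b"user-1", b"user-2", b"user-1", b"user-3", b"user-1"]
assignments = [(key, partition_for(key, NUM_PARTITIONS)) for key in keys]

# Every occurrence of the same key maps to the same partition.
user_1_partitions = {p for key, p in assignments if key == b"user-1"}
```

Because the mapping depends on the partition count, adding partitions to a topic changes where existing keys land, which is worth keeping in mind when per-key ordering matters.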