How To Install Apache Kafka On Ubuntu 20.04 LTS

In this article we will discuss how to install Apache Kafka on Ubuntu 20.04 LTS and how to access it via Cluster Manager for Apache Kafka (CMAK).

Introduction

If we are going to build an application whose domain is related to event streaming, Apache Kafka is worth trying. Why? Because Apache Kafka is an event streaming platform designed to handle high-performance data pipelines and streaming analytics applications.

As mentioned on the official Apache Kafka website, event streaming is the digital equivalent of the human body’s central nervous system. It is the technological foundation for the ‘always-on’ world where businesses are increasingly software-defined and automated, and where the user of software is more software. Event streaming ensures a continuous flow and interpretation of data so that the right information is at the right place, at the right time.

Apache Kafka is an open-source distributed event streaming platform written in Scala and Java. It is used for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Kafka was originally developed by LinkedIn and was open sourced in early 2011 under the Apache Software Foundation. In this article we will discuss how to install Apache Kafka on Ubuntu 20.04 LTS.

Apache Kafka Installation on Ubuntu 20.04 LTS

Before we start the installation, we have to confirm that our system environment supports the Apache Kafka installation process. The installation process will consist of several stages, namely:

  1. Prerequisite
  2. Download And Extract Apache Kafka Binary Source
  3. Creating Kafka and Zookeeper Systemd Unit Files
  4. Installing and Configuring Cluster Manager for Apache Kafka (CMAK)
  5. Starting And Accessing Kafka services
  6. Creating Cluster and Topic (An Example)
    • Command line
    • CMAK Interface

Prerequisite

Before installing Apache Kafka on Ubuntu 20.04 LTS, we have to prepare the environment first.

  • Ubuntu 20.04 LTS System
  • root or ordinary account with sudo privilege
  • Sufficient disk space
  • Java installed (Oracle JDK or OpenJDK). We can verify the Java installation on our system by running java --version, as shown below.
ramans@otodiginet:~$ java --version 
openjdk 11.0.8 2020-07-14 
OpenJDK Runtime Environment (build 11.0.8+10-post-Ubuntu-0ubuntu120.04) 
OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Ubuntu-0ubuntu120.04, mixed mode, sharing)
java --version: verifying the Java version for the Kafka installation

If your system still doesn't have Java installed, install Java first. Articles related to Java installation can be found at https://otodiginet.com/software/how-to-install-java-openjdk-11-on-centos-8/ for OpenJDK or https://otodiginet.com/software/how-to-install-oracle-java-12-in-linux-ubuntu-18-04/ for Oracle Java.
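
On Ubuntu 20.04 itself, OpenJDK 11 can also be installed directly from the default repositories (a minimal sketch; the package name assumes the standard Ubuntu archive):

ramans@otodiginet:~$ sudo apt update
ramans@otodiginet:~$ sudo apt install -y openjdk-11-jdk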

Download And Extract Apache Kafka Binary File

After all requirements are fulfilled, we download and extract the Apache Kafka binary file. For reference, check the latest version of Kafka on the https://kafka.apache.org/downloads web page. The newest stable version of Apache Kafka at the time of writing is version 2.6.0, which was released on August 03, 2020. We will download the archive and extract it to the /usr/local/kafka-server/ directory.

1. Download Kafka source

ramans@otodiginet:~/Desktop$ cd ~
ramans@otodiginet:~$ wget https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz
--2020-09-20 18:33:53-- https://downloads.apache.org/kafka/2.6.0/kafka_2.13-2.6.0.tgz
Resolving downloads.apache.org (downloads.apache.org)… 88.99.95.219, 2a01:4f8:10a:201a::2
Connecting to downloads.apache.org (downloads.apache.org)|88.99.95.219|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 65537909 (63M) [application/x-gzip]
Saving to: ‘kafka_2.13-2.6.0.tgz’
kafka_2.13-2.6.0.tgz 100%[=================================================>] 62.50M 369KB/s in 3m 48s
2020-09-20 18:37:45 (281 KB/s) - ‘kafka_2.13-2.6.0.tgz’ saved [65537909/65537909]
download Kafka binary file

2. Create /usr/local/kafka-server/ directory and extract To It

The /usr/local/kafka-server/ directory will be used for storing the extracted Apache Kafka binary files.

ramans@otodiginet:~$ sudo mkdir /usr/local/kafka-server && cd /usr/local/kafka-server
[sudo] password for ramans:
ramans@otodiginet:/usr/local/kafka-server$ sudo tar -xvzf ~/kafka_2.13-2.6.0.tgz --strip 1
kafka_2.13-2.6.0/LICENSE
kafka_2.13-2.6.0/NOTICE
kafka_2.13-2.6.0/bin/
kafka_2.13-2.6.0/bin/kafka-delete-records.sh
kafka_2.13-2.6.0/bin/trogdor.sh
kafka_2.13-2.6.0/bin/kafka-preferred-replica-election.sh
kafka_2.13-2.6.0/bin/connect-mirror-maker.sh
Extract Kafka binary files
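
As an optional sanity check, we can list the target directory to confirm the extraction landed where we expect; for Kafka 2.6.0 it should contain at least the bin, config, and libs directories:

ramans@otodiginet:/usr/local/kafka-server$ ls /usr/local/kafka-server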

Creating Kafka and Zookeeper Systemd Unit Files

To make the Kafka and Zookeeper daemons easy to manage, we will create systemd unit files for them. With systemd unit files, the services can be started, stopped, and restarted in the same way as other services, which is convenient and consistent. For this purpose we have to create two files: zookeeper.service and kafka.service.

1. /etc/systemd/system/zookeeper.service file

ramans@otodiginet:/usr/local/kafka-server$ sudo vi /etc/systemd/system/zookeeper.service
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
zookeeper.service file

2. /etc/systemd/system/kafka.service file

ramans@otodiginet:/usr/local/kafka-server$ sudo vi /etc/systemd/system/kafka.service
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
kafka.service file

3. Reload Services

After adding these two configuration files, we should reload the systemd daemon to apply the changes and then start the services. We can then check whether the services are running properly. For this purpose we run the commands below:

ramans@otodiginet:/usr/local/kafka-server$ sudo systemctl daemon-reload
ramans@otodiginet:/usr/local/kafka-server$ sudo systemctl enable --now zookeeper
ramans@otodiginet:/usr/local/kafka-server$ sudo systemctl enable --now kafka
ramans@otodiginet:/usr/local/kafka-server$ sudo systemctl status kafka zookeeper

The output should look like this:

● kafka.service - Apache Kafka Server
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2020-09-20 19:08:33 PDT; 10s ago
Docs: http://kafka.apache.org/documentation.html
Main PID: 14694 (java)
Tasks: 68 (limit: 4624)
Memory: 328.8M
CGroup: /system.slice/kafka.service
└─14694 /usr/lib/jvm/java-11-openjdk-amd64/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMilli>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,525] INFO [Transaction Marker Channel Man>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,643] INFO [ExpirationReaper-0-AlterAcls]:>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,726] INFO [/config/changes-event-process->
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,770] INFO [SocketServer brokerId=0] Start>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,799] INFO [SocketServer brokerId=0] Start>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,800] INFO [SocketServer brokerId=0] Start>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,827] INFO Kafka version: 2.6.0 (org.apach>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,828] INFO Kafka commitId: 62abe01bee03965>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,828] INFO Kafka startTimeMs: 160065411880>
Sep 20 19:08:38 otodiginet kafka-server-start.sh[14694]: [2020-09-20 19:08:38,831] INFO [KafkaServer id=0] started (kaf>
● zookeeper.service - Apache Zookeeper Server
Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2020-09-20 19:08:20 PDT; 23s ago
Main PID: 14287 (java)
Tasks: 37 (limit: 4624)
Memory: 73.6M
CGroup: /system.slice/zookeeper.service
└─14287 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPer>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,737] INFO maxSessionTimeout set to 60>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,738] INFO Created server with tickTim>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,763] INFO Using org.apache.zookeeper.>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,772] INFO Configuring NIO connection >
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,787] INFO binding to port 0.0.0.0/0.0>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,828] INFO zookeeper.snapshotSizeFacto>
Sep 20 19:08:22 otodiginet zookeeper-server-start.sh[14287]: [2020-09-20 19:08:22,834] INFO Snapshotting: 0x0 to /tmp/z>
Zookeeper and Kafka Services
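
As an optional quick check (a sketch, assuming the default ports of 2181 for Zookeeper and 9092 for Kafka), we can also confirm that both services are listening. Both ports should show up in the LISTEN state:

ramans@otodiginet:~$ ss -ltn | grep -E ':2181|:9092'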

Installing and Configuring Cluster Manager for Apache Kafka (CMAK)

Kafka needs a tool for monitoring and managing its services. For this purpose there is CMAK (previously known as Kafka Manager), an open-source tool for managing Apache Kafka clusters which was developed by Yahoo. We will use this tool here, so we will install it on our system.
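
Cloning the repository requires git; if it is not installed yet (it may already be present on your system), it can be installed from the standard Ubuntu repositories:

ramans@otodiginet:~$ sudo apt install -y git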

We will clone the Cluster Manager for Apache Kafka (CMAK) files from GitHub by running the commands below:

ramans@otodiginet:~$ cd ~
ramans@otodiginet:~$ git clone https://github.com/yahoo/CMAK.git
CMAK files

We will point CMAK at our Zookeeper host. For this purpose we edit the CMAK configuration file, which is located at ~/CMAK/conf/application.conf. In this tutorial the Zookeeper host is localhost.

ramans@otodiginet:~$ vi ~/CMAK/conf/application.conf

# Settings prefixed with 'kafka-manager.' will be deprecated, use 'cmak.' instead.
# https://github.com/yahoo/CMAK/issues/713
kafka-manager.zkhosts="kafka-manager-zookeeper:2181"
kafka-manager.zkhosts=${?ZK_HOSTS}
cmak.zkhosts="kafka-manager-zookeeper:2181"
cmak.zkhosts=${?ZK_HOSTS}
cmak.zkhosts="localhost:2181"

After updating the file above, we deploy the application by running the commands below.

ramans@otodiginet:~$ cd ~/CMAK/
ramans@otodiginet:~/CMAK$ ./sbt clean dist

It will take some time until the process completes. The process is finished when we see output like the following:

[info] Compilation completed in 18.304s.
model contains 640 documentable templates
[info] Main Scala API documentation successful.
[success] All package validations passed
[info] Your package is ready in /home/ramans/CMAK/target/universal/cmak-3.0.0.5.zip
[success] Total time: 363 s (06:03), completed Sep 20, 2020, 8:25:09 PM
CMAK compiling done

The CMAK package has been successfully created at /home/ramans/CMAK/target/universal/cmak-3.0.0.5.zip. Unzip it, then start and access CMAK via a web browser. Here are the commands for unzipping the file:

ramans@otodiginet:~/CMAK$ cd ~/CMAK/target/universal
ramans@otodiginet:~/CMAK/target/universal$ unzip cmak-3.0.0.5.zip
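
If the unzip command is not available (it is not always installed by default on a minimal Ubuntu system), it can be installed first and the command above re-run:

ramans@otodiginet:~/CMAK/target/universal$ sudo apt install -y unzip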

Starting And Accessing CMAK

1. Starting CMAK services

We will start the CMAK daemon by running the commands below.

ramans@otodiginet:~/CMAK/target/universal$ cd cmak-3.0.0.5
ramans@otodiginet:~/CMAK/target/universal/cmak-3.0.0.5$ bin/cmak
CMAK starting

At this point, our Kafka installation is almost complete. We will access the Kafka service through the web-based dashboard provided by CMAK. To access it, we just open the URL with the IP address or hostname that was configured above. The default port for this service is 9000. In this tutorial we can open http://otodiginet:9000.
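
If port 9000 is already in use, the CMAK start script accepts an alternative HTTP port and an explicit configuration file as Java system properties (as described in the CMAK README); for example:

ramans@otodiginet:~/CMAK/target/universal/cmak-3.0.0.5$ bin/cmak -Dconfig.file=conf/application.conf -Dhttp.port=8080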

Cluster Manager for Apache Kafka (CMAK) dashboard

Creating Cluster and Topic (An Example)

So far, so good. We will now try to add a new cluster and a new topic to our Kafka installation. In our scenario we will create a cluster called ‘Otodiginet_cluster1‘ and a topic called ‘Ramansah_Topic1‘. This can be done from the command line, as shown in the sketch below, or through the CMAK interface (screenshots below).
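
For the command-line route, we can use the kafka-topics.sh tool that ships with Kafka; a minimal sketch, assuming the broker is listening on the default localhost:9092:

ramans@otodiginet:~$ /usr/local/kafka-server/bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 --partitions 1 \
  --topic Ramansah_Topic1
ramans@otodiginet:~$ /usr/local/kafka-server/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Note that the ‘cluster’ added in CMAK is only a registration of the Zookeeper connection on the CMAK side; the topic itself lives in Kafka.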

Kafka | Add Cluster
Cluster was added information
Cluster view
Creating Topic
Topic Created
Kafka Dashboard
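
As an optional end-to-end check from the command line (a sketch, assuming the topic created above and the default broker address), we can write a test message with the console producer and read it back with the console consumer; press Ctrl+C to stop either tool:

ramans@otodiginet:~$ /usr/local/kafka-server/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic Ramansah_Topic1
ramans@otodiginet:~$ /usr/local/kafka-server/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic Ramansah_Topic1 --from-beginning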

Conclusion

In this article we have installed Apache Kafka on an Ubuntu 20.04 LTS system. We have finally completed this simple tutorial. Have a nice day, stay at home and stay safe.
