How to install Apache Drill on Ubuntu 22.04

In this article, we will learn how to install Apache Drill on the Ubuntu 22.04 LTS operating system.

Introduction

In the era of big data, organizations face the challenge of efficiently querying and analyzing vast amounts of data stored in various formats and across different data sources. Apache Drill, an open-source SQL query engine, addresses this challenge by letting you query diverse data sources with standard SQL. In this article, we look at Apache Drill's key features and then install it step by step on Ubuntu 22.04.

What is Apache Drill?

Apache Drill is a schema-free SQL query engine designed to perform interactive analysis on large-scale datasets. It enables users to query and analyze data from various sources, including NoSQL databases, cloud storage, Hadoop Distributed File System (HDFS), and more, using familiar SQL syntax.

Key Features of Apache Drill

  • Schema-Free Querying: Apache Drill allows you to query data without requiring a predefined schema. This flexibility is particularly beneficial when dealing with semi-structured or nested data formats like JSON, Parquet, and Avro.
  • Distributed Query Execution: Apache Drill is designed to distribute query processing across multiple nodes, enabling parallel execution and optimizing performance for large datasets.
  • Query Federation: Drill supports querying multiple data sources with a single query. It can seamlessly join data from different sources, providing a unified view for analysis.
  • JSON Data Model: Apache Drill treats JSON data as a first-class citizen, allowing you to query complex nested data structures directly using SQL.
  • ANSI SQL Compatibility: Drill supports a wide range of SQL operations and functions, making it easy for users familiar with SQL to work with big data.
  • Extensibility: Drill can be extended to support additional data formats, data sources, and custom functions, providing flexibility to meet diverse data analysis needs.

Installing Apache Drill On Ubuntu 22.04

Installing Apache Drill on Ubuntu is a relatively straightforward process. Apache Drill is distributed as a tar.gz archive that you download and extract; only its Java dependency is installed through the package manager. Here are the steps to install Apache Drill on Ubuntu:

Step 1: Update System Packages

Before installing any new software, it’s a good practice to update the system packages. Open a terminal and run the following command:

$ sudo apt update

Step 2: Install Java

Apache Drill requires Java to run. You can install OpenJDK using the following command:

$ sudo apt install default-jdk

This will install the default JDK version available in the Ubuntu repositories. Then verify the installation by checking the Java version:

$ java -version

Output :

ramansah@infodiginet:~$ java -version 
openjdk version "11.0.19" 2023-04-18 
OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.04.1) 
OpenJDK 64-Bit Server VM (build 11.0.19+7-post-Ubuntu-0ubuntu122.04.1, mixed mode, sharing)
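
If you want to check the Java version from a script, the sketch below extracts the major version number from the first line of the java -version output. The function name java_major_version is a hypothetical helper, not part of Java or Drill, and the parsing assumes the quoted version string format shown in the output above. Which Java versions a given Drill release supports is not covered here; check the Drill release notes for that.

```shell
# Hypothetical helper: print the Java major version found in a
# `java -version` banner line passed as $1.
java_major_version() {
  # Pull out the quoted version string, e.g. 11.0.19 or 1.8.0_362.
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case $version in
    '') return 1 ;;                      # no version string found
    1.*) version=${version#1.} ;;        # legacy scheme: 1.8.x means Java 8
  esac
  # Keep only the leading digits.
  printf '%s\n' "${version%%[!0-9]*}"
}

# Example (java -version writes to stderr, hence 2>&1):
# java_major_version "$(java -version 2>&1 | head -n 1)"
```

With the output shown above, the function would print 11.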

Step 3: Download and Extract Apache Drill

Visit the official Apache Drill download page (https://drill.apache.org/download/) and choose the appropriate distribution package. For Ubuntu, we select the “tar.gz” package. Copy the download link of the package. For this tutorial, we will use Apache Drill version 1.20.2.

In the terminal, navigate to the directory where you want to install Apache Drill and download the package with the wget command. If you chose a different version, replace the URL below with the download link you copied.

$ wget https://archive.apache.org/dist/drill/1.20.2/apache-drill-1.20.2.tar.gz

Output :

ramansah@infodiginet:~$ wget https://archive.apache.org/dist/drill/1.20.2/apache-drill-1.20.2.tar.gz
--2023-08-10 10:20:53--  https://archive.apache.org/dist/drill/1.20.2/apache-drill-1.20.2.tar.gz
Resolving archive.apache.org (archive.apache.org)... 65.108.204.189, 2a01:4f9:1a:a084::2
Connecting to archive.apache.org (archive.apache.org)|65.108.204.189|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 419598928 (400M) [application/x-gzip]
Saving to: ‘apache-drill-1.20.2.tar.gz’

apache-drill-1.20.2.tar.gz    0%[                                                 ]   2,47M   520KB/s    eta 16m 59s

When the download finishes, the archive is in the current directory:

ramansah@infodiginet:~$ ls -ltr apache-drill-1.20*
-rw-rw-r-- 1 ramansah ramansah 419598928 Agu 1 2022 apache-drill-1.20.2.tar.gz
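
Before extracting a 400 MB download, it can be worth confirming that the file arrived intact. The sketch below compares the SHA-512 digest of a file with an expected value. It assumes that a .sha512 checksum is published next to the archive on the Apache server and that you copy the hex digest from it by hand, since the layout of such checksum files varies. The function name verify_sha512 is a hypothetical helper; sha512sum comes from GNU coreutils on Ubuntu.

```shell
# Hypothetical helper: succeed only if the SHA-512 digest of file $1
# equals the expected hex digest $2 (compared case-insensitively).
verify_sha512() {
  actual=$(sha512sum "$1" 2>/dev/null | cut -d ' ' -f 1)
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  # An empty digest means the file could not be read.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example (paste the digest published for your release in place of the placeholder):
# verify_sha512 apache-drill-1.20.2.tar.gz "<expected-digest>" && echo "checksum OK"
```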

Then extract it with the following command:

$ tar -xvzf apache-drill-1.20.2.tar.gz

Output :

ramansah@infodiginet:~$ tar -xvzf apache-drill-1.20.2.tar.gz
apache-drill-1.20.2/KEYS
apache-drill-1.20.2/LICENSE
apache-drill-1.20.2/README.md
apache-drill-1.20.2/NOTICE
. . .
apache-drill-1.20.2/jars/ext/zookeeper-jute-3.5.7.jar
apache-drill-1.20.2/jars/3rdparty/linux/netty-tcnative-2.0.48.Final-linux-x86_64.jar
apache-drill-1.20.2/jars/3rdparty/fedora/netty-tcnative-2.0.48.Final-linux-x86_64-fedora.jar
apache-drill-1.20.2/jars/3rdparty/windows/netty-tcnative-2.0.36.Final-windows-x86_64.jar
apache-drill-1.20.2/jars/3rdparty/osx/netty-tcnative-2.0.48.Final-osx-x86_64.jar
ramansah@infodiginet:~/$ cd apache-drill-1.20.2
ramansah@infodiginet:~/apache-drill-1.20.2$ pwd
/home/ramansah/apache-drill-1.20.2
ramansah@infodiginet:~/apache-drill-1.20.2$ ls -ltr
total 168
drwxr-xr-x 6 ramansah ramansah  4096 Jan  1  1970 sample-data
-rw-r--r-- 1 ramansah ramansah  1301 Jan  1  1970 README.md
-rw-r--r-- 1 ramansah ramansah   238 Jan  1  1970 NOTICE
-rw-r--r-- 1 ramansah ramansah 63105 Jan  1  1970 LICENSE
-rw-r--r-- 1 ramansah ramansah 71848 Jan  1  1970 KEYS
-rw-r--r-- 1 ramansah ramansah   930 Jan  1  1970 git.properties
drwxrwxr-x 2 ramansah ramansah  4096 Agu 14 10:26 conf
drwxrwxr-x 2 ramansah ramansah  4096 Agu 14 10:26 bin
drwxrwxr-x 3 ramansah ramansah  4096 Agu 14 10:26 winutils
drwxrwxr-x 7 ramansah ramansah  4096 Agu 14 10:26 jars
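
A quick sanity check on the extracted tree can catch an incomplete extraction before you try to start Drill. The sketch below tests for the bin, conf, and jars entries visible in the listing above, plus the bin/drill-embedded launcher used in the next step. The function name looks_like_drill_home is a hypothetical helper, and the check is only a heuristic, not an official validation.

```shell
# Hypothetical helper: succeed if directory $1 has the layout seen in
# the listing above (launcher script plus conf and jars directories).
looks_like_drill_home() {
  [ -x "$1/bin/drill-embedded" ] && [ -d "$1/conf" ] && [ -d "$1/jars" ]
}

# Example:
# looks_like_drill_home "$HOME/apache-drill-1.20.2" && echo "layout looks complete"
```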

Step 4: Start Apache Drill

From the Apache Drill installation directory (apache-drill-1.20.2 in this example), start Drill in embedded mode using the following command:

$ bin/drill-embedded

Output :

ramansah@infodiginet:~/apache-drill-1.20.2$ bin/drill-embedded
Apache Drill 1.20.2
"Drill, baby, Drill."
apache drill> 

Step 5: Access the Web Interface

In this step, we open the Apache Drill Web UI in a web browser. With Drill still running, navigate to http://<hostname_or_ipaddress>:8047. From here, you can run SQL queries and manage your Drill instance.

Figure: Apache Drill Web UI
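
The same port also serves Drill's REST API, so queries can be submitted from a script as well as from the browser. The sketch below builds the JSON body for a POST to /query.json and sends it with curl. The function names drill_query_payload and drill_query are hypothetical helpers, and the escaping covers only backslashes and double quotes, so it assumes single-line SQL and an instance without authentication enabled.

```shell
# Hypothetical helper: build the JSON body for Drill's REST query endpoint.
# Escapes backslashes and double quotes only, so it suits single-line SQL.
drill_query_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"queryType":"SQL","query":"%s"}\n' "$escaped"
}

# Hypothetical helper: POST a SQL string to a Drill instance.
# $1: host name or IP address, $2: SQL text. 8047 is the Web UI port.
drill_query() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$(drill_query_payload "$2")" "http://$1:8047/query.json"
}

# Example (requires a running Drill instance; employee.json is sample
# data that Drill ships on its classpath):
# drill_query localhost 'SELECT * FROM cp.`employee.json` LIMIT 3'
```

Keeping the payload builder separate from the curl call makes it easy to inspect the request body before sending anything to the server.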

At this point, we have successfully installed Apache Drill on the Ubuntu 22.04 LTS operating system.

Conclusion

We have successfully installed Apache Drill on Ubuntu 22.04. Apache Drill is now ready to be used for querying and analyzing data from various sources using standard SQL queries. Whether you’re dealing with structured or semi-structured data, Apache Drill provides a powerful tool for interactive analysis and exploration.
