Single Node Installation
RHEL 8 Setup
Prerequisites
Install Python 3.8 or later together with the development tools needed to build Airflow's Python dependencies. This documentation uses Python 3.8, but Apache Airflow is compatible with Python 3.8 and higher.
sudo dnf module enable python38 -y
sudo dnf install -y python38
sudo yum install python38-devel -y
sudo yum -y groupinstall "Development Tools"
Before proceeding with the Airflow installation, check that the Python 3.8 executable is present in both locations, as shown:
ll /usr/local/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Mar 10 11:20 /usr/local/bin/python3.8*
ll /usr/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Feb 4 16:02 /usr/bin/python3.8*
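The version requirement can also be checked from inside the interpreter. The following is a minimal sketch, not part of Airflow: the helper name meets_minimum is an assumption made for this example, and the 3.8 minimum is taken from the prerequisites above.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this guide


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this script.
status = "OK" if meets_minimum() else "too old for this guide"
print("Python", sys.version.split()[0], status)
```

Running it with python3.8 (for example, python3.8 check_python.py, where the file name is arbitrary) reports on the interpreter that Airflow would use.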
Database Setup
To test drive Airflow properly, choose a robust database backend such as PostgreSQL or MySQL. By default, Airflow uses SQLite, which is intended for development purposes only.
Airflow supports only specific versions of database engines. Verify that your database version is compatible as older versions might not support certain SQL features:
- PostgreSQL: Versions 12, 13, 14, 15, 16
- MySQL: Version 8.0, Innovation
- MSSQL (experimental, support ending in version 2.9.0): 2017, 2019
- SQLite: Version 3.15.0 or later
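As an illustration, the list above can be turned into a small compatibility check. This is only a sketch under stated assumptions: the function names are hypothetical, MySQL is treated as "8.0 or higher" (Innovation releases are not modeled separately), and MSSQL is left out because its support is experimental.

```python
import re

SUPPORTED_POSTGRES_MAJORS = {12, 13, 14, 15, 16}
MIN_MYSQL = (8, 0)
MIN_SQLITE = (3, 15, 0)


def parse_version(text):
    """Extract the first dotted number from a version string as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_supported(engine, version_text):
    """Check a reported version string against the list in this guide."""
    version = parse_version(version_text)
    if engine == "postgresql":
        return version[0] in SUPPORTED_POSTGRES_MAJORS
    if engine == "mysql":
        return version[:2] >= MIN_MYSQL
    if engine == "sqlite":
        # Pad short versions such as "3.15" to three components before comparing.
        return (version + (0, 0, 0))[:3] >= MIN_SQLITE
    raise ValueError(f"unknown engine: {engine!r}")
```

The version text can be whatever the server reports, for example the output of SELECT version(); in either database shell.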
Before installing Apache Airflow, set up a compatible database. Choose PostgreSQL or MySQL based on your needs and ensure that your database version meets the following minimum requirements:
- MySQL: Version 8.0 or higher
- PostgreSQL: Version 12 or higher
Refer to: Set up a Database Backend (Apache Airflow documentation)
Follow the respective instructions below for your chosen database system to initialize and configure it for use with Apache Airflow.
PostgreSQL Database Setup
To integrate PostgreSQL with Apache Airflow, complete the following steps to install and configure it:
- Install PostgreSQL:
# Install the repository RPM:
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# Disable the built-in PostgreSQL module:
sudo dnf -qy module disable postgresql
# Install PostgreSQL:
sudo dnf install -y postgresql12-server
- Initialize and Start PostgreSQL:
# Optionally initialize the database and enable automatic start:
sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
sudo systemctl enable postgresql-12
sudo systemctl start postgresql-12
- Create PostgreSQL Database and User for Airflow:
To set up the database and user for Apache Airflow in PostgreSQL, follow these steps:
Access the PostgreSQL Shell:
sudo -u postgres psql
Inside the PostgreSQL Shell, execute the following commands:
-- Create the Airflow database:
CREATE DATABASE airflow;
-- Create the Airflow user with a password:
CREATE USER airflow WITH PASSWORD 'airflow';
-- Set client encoding, default transaction isolation, and timezone for the Airflow user:
ALTER ROLE airflow SET client_encoding TO 'utf8';
ALTER ROLE airflow SET default_transaction_isolation TO 'read committed';
ALTER ROLE airflow SET timezone TO 'UTC';
-- Grant all privileges on the Airflow database to the Airflow user:
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
-- Exit the PostgreSQL shell:
\q
The PostgreSQL database airflow and the user airflow have now been created with the specified settings and privileges. Proceed with the next steps to configure Apache Airflow with this PostgreSQL database.
- Configure PostgreSQL Settings for Airflow: After creating the Airflow database and user in PostgreSQL, modify the PostgreSQL configuration to allow connections from the Apache Airflow server. Follow these steps:
Open the PostgreSQL Configuration File:
vi /var/lib/pgsql/12/data/postgresql.conf
Inside the File, Modify the Following Settings:
# Change and uncomment the listen_addresses to '*':
listen_addresses = '*'
# Uncomment the port setting (remove the leading '#') so that it reads:
port = 5432
Save and close the file.

Open the pg_hba.conf File:
vi /var/lib/pgsql/12/data/pg_hba.conf
# Add this line at the end of the file:
host airflow airflow {host_IP}/32 md5
Replace {host_IP} with the actual IP address of the machine running Apache Airflow.
Save and close the file.
- Restart PostgreSQL to Apply Changes:
sudo systemctl restart postgresql-12
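The pg_hba.conf entry above is easy to mistype, so it can help to generate it. The sketch below is a hypothetical helper (pg_hba_entry is not a PostgreSQL or Airflow function) that validates the address with Python's ipaddress module and builds a single-host rule in the same format as the line shown above.

```python
import ipaddress


def pg_hba_entry(host_ip, database="airflow", user="airflow", method="md5"):
    """Build a pg_hba.conf host rule for one client address."""
    address = ipaddress.ip_address(host_ip)  # raises ValueError if malformed
    prefix = 32 if address.version == 4 else 128  # mask that matches one host
    return f"host {database} {user} {address}/{prefix} {method}"


# Example with a placeholder address; substitute the Airflow server's IP.
print(pg_hba_entry("10.0.0.5"))
```

Validating the address first means a typo is caught before PostgreSQL is restarted with a malformed rule.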
MySQL Database Setup for Airflow
To configure MySQL as the database backend for Apache Airflow, follow these steps:
- Install MySQL Server:
sudo yum install mysql-server
sudo yum install mysql-devel -y
# or, if you use the MariaDB development libraries instead:
sudo yum install mariadb-devel -y
- Install the mysqlclient Python package:
pip3.8 install mysqlclient
- Start the MySQL service:
sudo systemctl start mysqld
- Secure MySQL Installation (Optional but Recommended):
sudo mysql_secure_installation
Follow the prompts to secure the MySQL installation, including setting a root password.
- Create Database and User for Airflow:
sudo mysql -u root -p
Enter the root password when prompted. Inside the MySQL shell:
CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow'@'%' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'%';
FLUSH PRIVILEGES;
EXIT;
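Airflow also requires explicit_defaults_for_timestamp to be enabled on MySQL. The following example session checks the variable and turns it on. Note that SET GLOBAL does not survive a server restart; to make the setting permanent, add explicit_defaults_for_timestamp=1 under the [mysqld] section of my.cnf.
Welcome to the MySQL monitor. Commands end with ; or \g.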
Your MySQL connection id is 102
Server version: 5.7.42 MySQL Community Server (GPL)
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> SHOW GLOBAL VARIABLES LIKE '%timestamp%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| explicit_defaults_for_timestamp | OFF |
| log_timestamps | UTC |
+---------------------------------+-------+
2 rows in set (0.01 sec)
mysql> SET GLOBAL explicit_defaults_for_timestamp = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW GLOBAL VARIABLES LIKE '%timestamp%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| explicit_defaults_for_timestamp | ON |
| log_timestamps | UTC |
+---------------------------------+-------+
2 rows in set (0.00 sec)
mysql> exit;
Bye
- Restart MySQL to Apply Changes:
sudo systemctl restart mysqld
Now, the MySQL database is set up with a database named airflow and a user named airflow with the necessary privileges. Proceed to configure Apache Airflow to use this MySQL database as its backend.
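Airflow reaches the backend through a SQLAlchemy connection string (its sql_alchemy_conn setting). The following sketch shows how such a string could be assembled for either database created above. The helper name and the default ports are assumptions of this example; the driver prefixes correspond to the psycopg2 and mysqlclient packages.

```python
from urllib.parse import quote_plus

# SQLAlchemy dialect+driver prefixes for psycopg2 and mysqlclient.
DRIVERS = {"postgresql": "postgresql+psycopg2", "mysql": "mysql+mysqldb"}
DEFAULT_PORTS = {"postgresql": 5432, "mysql": 3306}


def sql_alchemy_conn(db_type, user, password, host, database, port=None):
    """Build a SQLAlchemy URL, percent-encoding the user name and password."""
    scheme = DRIVERS[db_type]  # KeyError for an unsupported database type
    port = port or DEFAULT_PORTS[db_type]
    return (
        f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example with a placeholder host address.
print(sql_alchemy_conn("mysql", "airflow", "airflow", "10.0.0.5", "airflow"))
```

Percent-encoding matters as soon as the password contains characters such as @ or /, which would otherwise be read as URL delimiters.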
CentOS 7 Setup
Prerequisites
Install Dependencies:
# Install development tools and dependencies
yum install gcc openssl-devel wget bzip2-devel libffi-devel zlib-devel -y
# Install SQLite:
# Download SQLite source (quote the URL so the shell does not interpret '?',
# and save the file directly as sqlite.tar.gz)
cd /opt
wget -O sqlite.tar.gz "https://www.sqlite.org/src/tarball/sqlite.tar.gz?r=release" --no-check-certificate
tar xzf sqlite.tar.gz
# Navigate to the SQLite directory
cd sqlite/
# Configure SQLite
./configure --prefix=/usr
# Install Tcl (a dependency for SQLite)
sudo yum install tcl -y
# Build and install SQLite
make
sudo make install
# Check SQLite version
sqlite3 --version
# Print the current PATH
echo $PATH
# Change to the /opt directory
cd /opt
# Install required dependencies
sudo yum install gcc openssl-devel bzip2-devel libffi-devel zlib-devel -y
# Download Python 3.8.12 source tarball
curl -O https://www.python.org/ftp/python/3.8.12/Python-3.8.12.tgz
# Extract the tarball
tar -zxvf Python-3.8.12.tgz
# Change into the Python source directory
cd Python-3.8.12/
# Configure the build, enabling shared libraries
./configure --enable-shared
# Build Python
make
# Install Python
sudo make install
# Copy libpython3.8.so to /lib64/
sudo cp --no-clobber ./libpython3.8.so* /lib64/
# Set the correct permissions for libpython3.8.so
sudo chmod 755 /lib64/libpython3.8.so*
# Add the path to the shared libraries to LD_LIBRARY_PATH in .bashrc
# (single quotes keep $LD_LIBRARY_PATH from being expanded until .bashrc is read)
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/' >> ~/.bashrc
# Reload .bashrc to apply changes
source ~/.bashrc
# Create a symbolic link at /usr/bin/python3.8 that points to /usr/local/bin/python3.8
sudo ln -s /usr/local/bin/python3.8 /usr/bin/python3.8
# Set permissions for the Python library directory
sudo chmod -R 755 /usr/local/lib/python3.8
# Check Python and SQLite versions
python3.8 --version
sqlite3 --version
# Run ldconfig to update the system library cache
sudo ldconfig
# Check the SQLite version using Python 3.8
python3.8 -c "import sqlite3; print(sqlite3.sqlite_version)"
# Install additional development tools
sudo yum -y groupinstall "Development Tools"
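The SQLite version that matters is the one the new Python build is linked against, which is why the commands above print sqlite3.sqlite_version. A minimal sketch of turning that printout into a pass/fail check follows; the function name is hypothetical, and the 3.15.0 minimum is the one listed in this guide.

```python
import sqlite3

MIN_SQLITE = (3, 15, 0)  # minimum SQLite version listed in this guide


def sqlite_is_new_enough(version_info=sqlite3.sqlite_version_info,
                         minimum=MIN_SQLITE):
    """Return True when the linked SQLite library is at least `minimum`."""
    return tuple(version_info) >= tuple(minimum)


# Report on the SQLite library linked into the running interpreter.
verdict = "meets" if sqlite_is_new_enough() else "is below"
print("SQLite", sqlite3.sqlite_version, verdict, "the 3.15.0 minimum")
```

If the check fails even though sqlite3 --version reports a newer release, Python is probably still loading the older system library, and the library path and ldconfig steps above are the first things to revisit.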
Before proceeding with the Airflow installation, check that the Python 3.8 executable is present in both locations, as shown:
ll /usr/local/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Mar 10 11:20 /usr/local/bin/python3.8*
ll /usr/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Feb 4 16:02 /usr/bin/python3.8*
Database Setup
Follow the respective instructions below for your chosen database system to initialize and configure it for use with Apache Airflow.
PostgreSQL Database Setup
To use PostgreSQL with Apache Airflow, follow these steps to install and configure it:
- Install the psycopg2-binary Python package:
pip3.8 install psycopg2-binary
- Install PostgreSQL:
# Install the repository RPM:
sudo yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# Install EPEL and yum-utils, then enable the PostgreSQL 12 repository:
sudo yum -y install epel-release yum-utils
sudo yum-config-manager --enable pgdg12
# Install PostgreSQL:
sudo yum install postgresql12-server postgresql12 -y
- Initialize and Start PostgreSQL:
# Optionally initialize the database and enable automatic start:
sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
sudo systemctl enable --now postgresql-12
sudo systemctl start postgresql-12
- Create PostgreSQL Database and User for Airflow:
To set up the database and user for Apache Airflow in PostgreSQL, follow these steps:
Access the PostgreSQL Shell:
sudo -u postgres psql
Inside the PostgreSQL Shell, execute the following commands:
-- Create the Airflow database:
CREATE DATABASE airflow;
-- Create the Airflow user with a password:
CREATE USER airflow WITH PASSWORD 'airflow';
-- Set client encoding, default transaction isolation, and timezone for the Airflow user:
ALTER ROLE airflow SET client_encoding TO 'utf8';
ALTER ROLE airflow SET default_transaction_isolation TO 'read committed';
ALTER ROLE airflow SET timezone TO 'UTC';
-- Grant all privileges on the Airflow database to the Airflow user:
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
-- Exit the PostgreSQL shell:
\q
The PostgreSQL database airflow and the user airflow have now been created with the specified settings and privileges. Proceed with the next steps to configure Apache Airflow with this PostgreSQL database.
- Configure PostgreSQL Settings for Airflow:
After creating the Airflow database and user in PostgreSQL, modify the PostgreSQL configuration to allow connections from the Apache Airflow server. Follow these steps:
Open the PostgreSQL Configuration File:
vi /var/lib/pgsql/12/data/postgresql.conf
Inside the File, Modify the Following Settings:
# Change and uncomment the listen_addresses to '*':
listen_addresses = '*'
# Uncomment the port setting (remove the leading '#') so that it reads:
port = 5432
Save and close the file.

Open the pg_hba.conf File:
vi /var/lib/pgsql/12/data/pg_hba.conf
# Add this line at the end of the file:
host airflow airflow {host_IP}/32 md5
Replace {host_IP} with the actual IP address of the machine running Apache Airflow.
Save and close the file.
- Restart PostgreSQL to Apply Changes:
sudo systemctl restart postgresql-12
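When the postgresql.conf change has to be repeated on several hosts, the manual edit can be scripted. The sketch below is one possible approach, not an official tool: set_conf_value is a hypothetical helper that operates on the file's text, activating a commented-out or existing key = value line and appending the setting if no such line exists.

```python
import re


def set_conf_value(conf_text, key, value):
    """Return conf_text with an active `key = value` line.

    The first line that sets `key` (commented out or not) is replaced;
    if there is none, the setting is appended at the end.
    """
    pattern = re.compile(
        rf"^[ \t]*#?[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key} = {value}"
    if pattern.search(conf_text):
        # A function replacement avoids backslash interpretation in new_line.
        return pattern.sub(lambda _match: new_line, conf_text, count=1)
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"


sample = "#listen_addresses = 'localhost'\n#port = 5432\n"
updated = set_conf_value(sample, "listen_addresses", "'*'")
updated = set_conf_value(updated, "port", "5432")
print(updated)
```

To apply it, read /var/lib/pgsql/12/data/postgresql.conf, pass its contents through the function, and write the result back before restarting PostgreSQL. Keeping a backup of the original file is advisable, since the replacement also drops any trailing comment on the matched line.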
MySQL Database Setup for Airflow
To set up MySQL as the database backend for Apache Airflow, follow these steps:
- Install MySQL Server:
sudo yum install mysql-server
sudo yum install mysql-devel -y
# or, if you use the MariaDB development libraries instead:
sudo yum install mariadb-devel -y
- Install the mysqlclient Python package:
pip3.8 install mysqlclient
- Start the MySQL service:
sudo systemctl start mysqld
- Install MySQL Connector for Python:
pip3.8 install mysql-connector-python
pip3.8 install pymysql
- Secure MySQL Installation (Optional but Recommended):
sudo mysql_secure_installation
Follow the prompts to secure the MySQL installation, including setting a root password.
- Create Database and User for Airflow:
sudo mysql -u root -p
Enter the root password when prompted. Inside the MySQL shell:
CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow'@'%' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'%';
FLUSH PRIVILEGES;
EXIT;
- Restart MySQL to Apply Changes:
sudo systemctl restart mysqld
Now, the MySQL database is set up with a database named airflow and a user named airflow with the necessary privileges. Proceed to configure Apache Airflow to use this MySQL database as its backend.
Ubuntu 20.04 Setup
Prerequisites
sudo apt update
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install -y python3.8
sudo apt install -y python3.8-venv
sudo apt install -y python3.8-dev
python3.8 --version
pip3.8 --version
Before proceeding with the Airflow installation, check that the Python 3.8 executable is present in both locations, as shown:
ll /usr/local/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Mar 10 11:20 /usr/local/bin/python3.8*
ll /usr/bin/python3.8*
-rwxr-xr-x 1 root root 5490520 Feb 4 16:02 /usr/bin/python3.8*
Database Setup
Follow the respective instructions below for your chosen database system to initialize and configure it for use with Apache Airflow.
PostgreSQL Database Setup
To use PostgreSQL with Apache Airflow, follow these steps to install and configure it:
- Install the psycopg2-binary Python package:
pip3.8 install psycopg2-binary
- Install PostgreSQL:
sudo apt install postgresql-12 postgresql-client-12
- Create PostgreSQL Database and User for Airflow:
To set up the database and user for Apache Airflow in PostgreSQL, follow these steps:
Access the PostgreSQL Shell:
sudo -u postgres psql
Inside the PostgreSQL Shell, execute the following commands:
-- Create the Airflow database:
CREATE DATABASE airflow;
-- Create the Airflow user with a password:
CREATE USER airflow WITH PASSWORD 'airflow';
-- Set client encoding, default transaction isolation, and timezone for the Airflow user:
ALTER ROLE airflow SET client_encoding TO 'utf8';
ALTER ROLE airflow SET default_transaction_isolation TO 'read committed';
ALTER ROLE airflow SET timezone TO 'UTC';
-- Grant all privileges on the Airflow database to the Airflow user:
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
-- Exit the PostgreSQL shell:
\q
The PostgreSQL database airflow and the user airflow have now been created with the specified settings and privileges. Proceed with the next steps to configure Apache Airflow with this PostgreSQL database.
- Configure PostgreSQL Settings for Airflow:
After creating the Airflow database and user in PostgreSQL, modify the PostgreSQL configuration to allow connections from the Apache Airflow server. Follow these steps:
Open the PostgreSQL Configuration File:
vi /etc/postgresql/12/main/postgresql.conf
Inside the File, Modify the Following Settings:
# Change and uncomment the listen_addresses to '*':
listen_addresses = '*'
# Uncomment the port setting (remove the leading '#') so that it reads:
port = 5432
Save and close the file.

Open the pg_hba.conf File:
vi /etc/postgresql/12/main/pg_hba.conf
# Add this line at the end of the file:
host airflow airflow {host_IP}/32 md5
Replace {host_IP} with the actual IP address of the machine running Apache Airflow.
Save and close the file.
- Restart PostgreSQL to Apply Changes:
sudo systemctl restart postgresql.service
MySQL Database Setup for Airflow (Optional)
To set up MySQL as the database backend for Apache Airflow, follow these steps:
- Install MySQL Server:
wget http://repo.mysql.com/mysql-apt-config_0.8.12-1_all.deb
sudo dpkg -i mysql-apt-config_0.8.12-1_all.deb
sudo dpkg-reconfigure mysql-apt-config
sudo apt update
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B7B3B788A8D3785C
sudo apt update
sudo apt-cache policy mysql-server
sudo apt install -f mysql-client=5.7*
sudo apt install -f mysql-community-server=5.7*
sudo apt install -f mysql-server=5.7*
wget https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-j_8.0.32-1ubuntu20.04_all.deb
sudo dpkg -i mysql-connector-j_8.0.32-1ubuntu20.04_all.deb
ls /usr/share/java/mysql-connector-java.jar
chmod 644 /usr/share/java/mysql-connector-j-8.0.32.jar
- Install the mysqlclient Python package:
pip3.8 install mysqlclient
- Start MySQL Service:
sudo systemctl start mysql
- Install MySQL Connector for Python:
pip3.8 install mysql-connector-python
pip3.8 install pymysql
- Secure MySQL Installation (Optional but Recommended):
sudo mysql_secure_installation
- Create Database and User for Airflow:
sudo mysql -u root -p
Enter the root password when prompted. Inside the MySQL shell:
CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow'@'%' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'%';
FLUSH PRIVILEGES;
EXIT;
- Restart MySQL to Apply Changes:
sudo systemctl restart mysql
Now, the MySQL database is set up with a database named airflow and a user named airflow with the necessary privileges. Proceed to configure Apache Airflow to use this MySQL database as its backend.
Apache Airflow Installation using Mpack on Ambari
Create symbolic links for Python to use Python 3.8:
sudo ln -sf /usr/bin/python3.8 /usr/bin/python3
sudo ln -sf /usr/bin/pip3.8 /usr/bin/pip3
The following steps install and set up Apache Airflow using a Management Pack (Mpack) on an Ambari-managed cluster.
Install and Configure Mpack:
- Install Mpack:
ambari-server install-mpack --mpack=ambari-mpacks-airflow-2.8.1.tar.gz --verbose
- Uninstall a Previous Mpack (only if one is already installed and must be replaced):
ambari-server uninstall-mpack --mpack-name=airflow-ambari-mpack
- Change Symlinks:
- Navigate to the services directory and update the Airflow symlink for each service version:
cd /var/lib/ambari-server/resources/stacks/ODP/3.0/services
unlink AIRFLOW
ln -s /var/lib/ambari-server/resources/mpacks/airflow-ambari-mpack-2.8.1/common-services/AIRFLOW/2.8.1 AIRFLOW
cd /var/lib/ambari-server/resources/stacks/ODP/3.1/services
unlink AIRFLOW
ln -s /var/lib/ambari-server/resources/mpacks/airflow-ambari-mpack-2.8.1/common-services/AIRFLOW/2.8.1 AIRFLOW
cd /var/lib/ambari-server/resources/stacks/ODP/3.2/services
unlink AIRFLOW
ln -s /var/lib/ambari-server/resources/mpacks/airflow-ambari-mpack-2.8.1/common-services/AIRFLOW/2.8.1 AIRFLOW
cd /var/lib/ambari-server/resources/stacks/ODP/3.3/services
unlink AIRFLOW
ln -s /var/lib/ambari-server/resources/mpacks/airflow-ambari-mpack-2.8.1/common-services/AIRFLOW/2.8.1 AIRFLOW
- Restart Ambari Server:
ambari-server restart
The Airflow Mpack is now installed on your Ambari-managed cluster, and Airflow can be added as a service from the Ambari UI.
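The symlink changes above repeat the same two commands for every stack version, so they can be expressed as a loop. The following is a sketch only: relink_airflow is a hypothetical helper, the stack versions and paths are the ones used in the commands above, and it must be run by a user who can write to the Ambari resources directory.

```python
from pathlib import Path

STACK_VERSIONS = ["3.0", "3.1", "3.2", "3.3"]


def relink_airflow(stacks_root, target, versions=STACK_VERSIONS):
    """Point <stacks_root>/<version>/services/AIRFLOW at `target`."""
    updated = []
    for version in versions:
        link = Path(stacks_root) / version / "services" / "AIRFLOW"
        if link.is_symlink():
            link.unlink()  # same effect as `unlink AIRFLOW`
        # Same effect as `ln -s <target> AIRFLOW`; raises FileExistsError
        # if AIRFLOW exists and is a real directory rather than a symlink.
        link.symlink_to(target)
        updated.append(link)
    return updated


# Example call with the paths used in this guide (left commented out so the
# sketch can be imported or run without touching the system):
# relink_airflow(
#     "/var/lib/ambari-server/resources/stacks/ODP",
#     "/var/lib/ambari-server/resources/mpacks/airflow-ambari-mpack-2.8.1"
#     "/common-services/AIRFLOW/2.8.1",
# )
```

Because only symlinks are removed, the helper refuses to overwrite a real AIRFLOW directory. Restart the Ambari server afterwards, as in the manual procedure.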
Steps to install Apache Airflow from the Ambari UI

- Add the Airflow service from the Ambari UI.

- Specify the host details for the Airflow Scheduler and Airflow Webserver.

- Choose the slave client configuration.

- Modify or customize the fields as needed.

Database Options:
Choose either MySQL or PostgreSQL as the backend database.
This step configures the Airflow backend database connection string and the Celery settings. You enter the database name, username, password, database type (MySQL or PostgreSQL), and host IP, and the provided script automatically generates the database connection string and Celery configuration from these values.
Enter the database information in the Ambari UI:
- Database Name
- Password
- Username
- Database Type: Choose between mysql or postgresql.
- Host IP
If you are using RabbitMQ, set it up first and then add the RabbitMQ configuration:
- RabbitMQ Username
- RabbitMQ Password
- RabbitMQ virtual host
- Celery Broker
Once you have provided all the necessary details, click on the Next button.
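The script that the Mpack uses to generate these values is not shown in this guide. Purely as an illustration of what the generated Celery settings could look like, the sketch below derives a RabbitMQ broker URL and a database result backend from the values entered above. The function name, the rabbit_host parameter, and the default port 5672 are assumptions of this example; the db+ prefix follows Celery's convention for SQLAlchemy result backends.

```python
from urllib.parse import quote


def celery_settings(sql_alchemy_conn, rabbit_user, rabbit_password,
                    rabbit_host, vhost="/", port=5672):
    """Derive Celery broker and result-backend URLs from the entered values."""
    broker_url = "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
        user=quote(rabbit_user, safe=""),
        password=quote(rabbit_password, safe=""),
        host=rabbit_host,
        port=port,
        vhost=quote(vhost, safe=""),  # the default vhost "/" becomes %2F
    )
    return {
        "broker_url": broker_url,
        "result_backend": "db+" + sql_alchemy_conn,
    }


# Example with placeholder hosts and credentials.
settings = celery_settings(
    "postgresql+psycopg2://airflow:airflow@10.0.0.5:5432/airflow",
    "airflow", "airflow", "10.0.0.6", vhost="airflow_vhost",
)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Compare the output with the values that Ambari actually writes to the Airflow configuration; the real script may use different defaults.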
- Deploy the Airflow service.

- This step will install all the necessary components and initiate the service.

- The Airflow webserver is now up and running. To access the UI, you need a username and password. To create the admin user, run the initdb command from the Ambari UI.

This command creates an admin user named "airflow" with the password "airflow".

Once the database initialization completes, open the Airflow webserver UI and log in with these credentials (username: "airflow", password: "airflow").

