Apache Airflow Logging Guide
Overview
Apache Airflow 2.8.1 writes logs for each of its components, including the Scheduler, Webserver, and Worker processes. These logs are the primary source of information for debugging and monitoring the system.
Log Storage Location
The Airflow logs are stored under the following directory:
/var/log/airflow/logs/
To list the logs available, use:
ls /var/log/airflow/logs/
This directory contains the per-DAG task logs (in dag_id=<dag_id> subdirectories), the scheduler logs, and the webserver logs.
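Once the directory grows, a plain ls becomes hard to scan. The following is a minimal sketch for listing only the log files modified within the last N minutes. The function name recent_airflow_logs and its defaults are illustrative and not part of Airflow.

```shell
# Hypothetical helper: list *.log files changed in the last N minutes.
# $1 = log base directory (defaults to the path used in this guide)
# $2 = age window in minutes (defaults to 60)
recent_airflow_logs() {
    find "${1:-/var/log/airflow/logs}" -type f -name '*.log' \
        -mmin "-${2:-60}" | sort
}
```

For example, recent_airflow_logs /var/log/airflow/logs 15 would print the log files touched in the last 15 minutes, one path per line.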
Scheduler Logs
The Airflow scheduler logs can be found in:
/var/log/airflow/logs/scheduler/
The logs are grouped into one subdirectory per date (YYYY-MM-DD), with the most recent logs also available under:
/var/log/airflow/logs/scheduler/latest/
To see which log files exist for a specific date, list that date's directory:
ls /var/log/airflow/logs/scheduler/2025-02-19
Example output:
clean_up.py.log
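To scan every scheduler log file for one date rather than opening them one by one, a recursive grep can help. This is a sketch that assumes the log level name (such as ERROR) or the word Traceback appears literally in the log lines. The function name scheduler_errors is hypothetical.

```shell
# Hypothetical helper: print lines mentioning ERROR or Traceback in the
# scheduler logs for one date, prefixed with file name and line number.
# $1 = date directory name, e.g. 2025-02-19
# $2 = scheduler log directory (defaults to the path used in this guide)
scheduler_errors() {
    grep -rHnE 'ERROR|Traceback' \
        "${2:-/var/log/airflow/logs/scheduler}/$1"
}
```

Note that grep exits with a non-zero status when nothing matches, so an empty result here means no matching lines were found for that date.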
Webserver Logs
The Airflow webserver logs can be found in:
/var/log/airflow/logs/webserver-access.log
/var/log/airflow/logs/webserver-error.log
To view the last 100 lines of the webserver access log, use:
tail -n 100 /var/log/airflow/logs/webserver-access.log
To view the last 100 lines of the webserver error log, use:
tail -n 100 /var/log/airflow/logs/webserver-error.log
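Beyond tailing the files, a quick count of HTTP status codes can show whether the webserver is returning errors. The sketch below assumes the access log uses the common log format, in which the status code is the 9th whitespace-separated field. Verify this against your own log lines before relying on it. The function name access_status_counts is hypothetical.

```shell
# Hypothetical helper: count requests per HTTP status code.
# Assumes common log format, where the status code is the 9th
# whitespace-separated field of each line.
# $1 = access log path (defaults to the path used in this guide)
access_status_counts() {
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' \
        "${1:-/var/log/airflow/logs/webserver-access.log}" | sort
}
```

Each output line has the form "<status> <count>", sorted by status code.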
DAG Logs
Each task attempt in a DAG run writes its log to:
/var/log/airflow/logs/dag_id=<dag_id>/run_id=<run_id>/task_id=<task_id>/attempt=<attempt>.log
Example:
ls /var/log/airflow/logs/dag_id=dataset_consumes_1/run_id=manual__2025-02-14T08:04:06.983781+00:00/task_id=consuming_1/
Output:
attempt=1.log
To view the logs of a task attempt:
cat /var/log/airflow/logs/dag_id=dataset_consumes_1/run_id=manual__2025-02-14T08:04:06.983781+00:00/task_id=consuming_1/attempt=1.log
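Because these paths are long and easy to mistype, it can help to build them from their parts. The following is a small sketch that follows the path layout shown above. The function name task_log_path and its argument order are illustrative only.

```shell
# Hypothetical helper: print the log path for one task attempt.
# $1 = dag_id, $2 = run_id, $3 = task_id
# $4 = attempt number (defaults to 1)
# $5 = log base directory (defaults to the path used in this guide)
task_log_path() {
    printf '%s/dag_id=%s/run_id=%s/task_id=%s/attempt=%s.log\n' \
        "${5:-/var/log/airflow/logs}" "$1" "$2" "$3" "${4:-1}"
}

# Usage (reads the last 50 lines of the first attempt):
#   tail -n 50 "$(task_log_path dataset_consumes_1 \
#       'manual__2025-02-14T08:04:06.983781+00:00' consuming_1)"
```

Quoting the run_id is a good habit, since run IDs contain characters such as ":" and "+".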
Debugging with Systemd
Since the Airflow components are managed by systemd, you can check their status and logs using:
Webserver
Check the webserver status:
systemctl status airflow-webserver
View the last 100 log entries:
journalctl -u airflow-webserver -n 100 --no-pager
Scheduler
Check the scheduler status:
systemctl status airflow-scheduler
View the last 100 log entries:
journalctl -u airflow-scheduler -n 100 --no-pager
Worker (if applicable)
If using Celery workers, check their status:
systemctl status airflow-worker
View the last 100 log entries:
journalctl -u airflow-worker -n 100 --no-pager
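To get a one-line summary per component instead of the full status output, systemctl is-active can be called in a loop. This is a minimal sketch. The function name airflow_unit_states is hypothetical, and the default unit names (including airflow-scheduler for the scheduler) are assumptions that should be adjusted to match your installation.

```shell
# Hypothetical helper: print "<unit>: <state>" for each Airflow unit.
# Pass unit names as arguments to override the assumed default list.
airflow_unit_states() {
    local unit state
    if [ "$#" -eq 0 ]; then
        set -- airflow-webserver airflow-scheduler airflow-worker
    fi
    for unit in "$@"; do
        # is-active prints the state (active, inactive, failed, ...) and
        # exits non-zero when the unit is not active, hence "|| true".
        state=$(systemctl is-active "$unit" 2>/dev/null || true)
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}
```

Any unit that is not reported as active is a good candidate for a closer look with the journalctl commands above.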