Apache Airflow Logging Guide
Overview
Apache Airflow 2.8.1 generates logs for various components such as the Scheduler, Webserver, and Worker processes. These logs are essential for debugging and monitoring the system.
Log Storage Location
The Airflow logs are stored under the following directory:
    /var/log/airflow/logs/
To list the logs available, use:
    ls /var/log/airflow/logs/
This directory contains logs for various DAGs, the scheduler, and the webserver.
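As a rough sketch of how this directory can be inspected from a script, the function below prints the IDs of the DAGs that have written task logs. It assumes the dag_id=<dag_id> directory naming shown in the DAG Logs section. The function name list_logged_dags and the AIRFLOW_LOG_DIR variable are illustrative and not part of Airflow.

```shell
#!/usr/bin/env bash
# Base log directory from this guide; pass another path as the first
# argument to inspect a different location.
AIRFLOW_LOG_DIR="/var/log/airflow/logs"

# Print the IDs of all DAGs that have a log directory, one per line.
# Assumption: task logs live in directories named dag_id=<dag_id>.
list_logged_dags() {
    local base="${1:-$AIRFLOW_LOG_DIR}"
    local dir
    for dir in "$base"/dag_id=*/; do
        # Skip the unexpanded glob pattern when no DAG has logged yet.
        [ -d "$dir" ] || continue
        dir="${dir%/}"                      # drop the trailing slash
        printf '%s\n' "${dir##*/dag_id=}"   # keep only the DAG ID
    done
}
```

Calling list_logged_dags with no argument reads /var/log/airflow/logs, while list_logged_dags /some/other/dir inspects a different base directory.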
Scheduler Logs
The Airflow scheduler logs can be found in:
    /var/log/airflow/logs/scheduler/
The logs are organized by date, with the latest logs available under:
    /var/log/airflow/logs/scheduler/latest/
To inspect a specific log file, navigate to the corresponding date:
    ls /var/log/airflow/logs/scheduler/2025-02-19
Example output:
    clean_up.py.log
Webserver Logs
The Airflow webserver logs can be found in:
    /var/log/airflow/logs/webserver-access.log
    /var/log/airflow/logs/webserver-error.log
To check the last 100 lines of the webserver logs, use:
    tail -n 100 /var/log/airflow/logs/webserver-access.log
For errors, check:
    tail -n 100 /var/log/airflow/logs/webserver-error.log
DAG Logs
Each task attempt in a DAG run writes its log to:
    /var/log/airflow/logs/dag_id=<dag_id>/run_id=<run_id>/task_id=<task_id>/attempt=<attempt>.log
Example:
    ls /var/log/airflow/logs/dag_id=dataset_consumes_1/run_id=manual__2025-02-14T08:04:06.983781+00:00/task_id=consuming_1/
Output:
    attempt=1.log
To view the logs of a task attempt:
    cat /var/log/airflow/logs/dag_id=dataset_consumes_1/run_id=manual__2025-02-14T08:04:06.983781+00:00/task_id=consuming_1/attempt=1.log
Debugging with Systemd
Since the Airflow components are managed by systemd, you can check each service's status with systemctl and read its logs with journalctl.
Webserver
Check the webserver status:
    systemctl status airflow-webserver
View the last 100 log entries:
    journalctl -u airflow-webserver -n 100 --no-pager
Scheduler
Check the scheduler status:
    systemctl status airflow-scheduler
View the last 100 log entries:
    journalctl -u airflow-scheduler -n 100 --no-pager
Worker (if applicable)
If using Celery workers, check their status:
    systemctl status airflow-worker
View the last 100 log entries:
    journalctl -u airflow-worker -n 100 --no-pager
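The per-service checks above could be combined into one helper, sketched below. It relies on systemctl is-active, which prints the unit state and exits non-zero when the unit is not active. The function name check_airflow_units is hypothetical, and the airflow-scheduler unit name is an assumption made by analogy with the other two units, so adjust the list to match your deployment.

```shell
#!/usr/bin/env bash
# Unit names as used in this guide. airflow-scheduler is assumed by analogy
# with airflow-webserver and airflow-worker.
AIRFLOW_UNITS="airflow-webserver airflow-scheduler airflow-worker"

# Print "<unit>: <state>" for each unit given as an argument (or for the
# default list) and return non-zero if any of them is not active.
check_airflow_units() {
    local rc=0 unit state
    if [ "$#" -eq 0 ]; then
        # Intentionally unquoted so the list splits into separate unit names.
        set -- $AIRFLOW_UNITS
    fi
    for unit in "$@"; do
        # is-active prints the state and exits non-zero unless it is "active".
        state="$(systemctl is-active "$unit" 2>/dev/null)" || rc=1
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
    return "$rc"
}
```

For any unit not reported as active, journalctl -u <unit> -n 100 --no-pager shows its recent log entries.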