System Requirements
To run the converter, you must have Python 3.8 or later installed.
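A quick check on each node, assuming python3 is the interpreter on your PATH:

python3 --version   # should report Python 3.8 or later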
Install the Oozie-to-Airflow (O2A) Converter
Install the required pip packages on all Airflow workers and on the node where you're converting workflows to DAGs.
source /usr/odp/3.3.6.x-x/airflow/bin/activate
pip install acceldata-o2a
pip install acceldata-o2a-lib
pip install --upgrade wrapt

If you're installing the Oozie-to-Airflow converter in an air-gapped environment, use the provided tarballs to install the required pip packages.
pip install https://mirror.odp.acceldata.dev/v2/standalone_tarballs/o2a/1.0.0/acceldata_o2a-1.0.0.tar.gz
pip install https://mirror.odp.acceldata.dev/v2/standalone_tarballs/o2a/1.0.0/acceldata_o2a_lib-1.0.0.tar.gz
pip install --upgrade wrapt
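To confirm the packages are available in the Airflow virtual environment, you can query pip for the package names used above:

pip show acceldata-o2a
pip show acceldata-o2a-lib
pip show wrapt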
Below is the comprehensive usage guide for the Oozie-to-Airflow (O2A) converter.
(airflow) [root@airflowdemonode01 o2a]# ./bin/o2a -help
usage: o2a [-h] -i INPUT_DIRECTORY_PATH -o OUTPUT_DIRECTORY_PATH [-n DAG_NAME] [-u USER] [-s START_DAYS_AGO]
           [-x SCHEMA_VERSION] [-skv SKIP_VALIDATION] [-v SCHEDULE_INTERVAL] [-d]

Convert Apache Oozie workflows to Apache Airflow workflows.

options:
  -h, --help            show this help message and exit
  -i INPUT_DIRECTORY_PATH, --input-directory-path INPUT_DIRECTORY_PATH
                        Path to input directory
  -o OUTPUT_DIRECTORY_PATH, --output-directory-path OUTPUT_DIRECTORY_PATH
                        Desired output directory
  -n DAG_NAME, --dag-name DAG_NAME
                        Desired DAG name [defaults to input directory name]
  -u USER, --user USER  The user to be used in place of all ${user.name} [defaults to user who ran the conversion]
  -s START_DAYS_AGO, --start-days-ago START_DAYS_AGO
                        Desired DAG start as number of days ago
  -x SCHEMA_VERSION, --schema-version SCHEMA_VERSION
                        Desired Oozie all schema version. [1.0, 0.4]
  -skv SKIP_VALIDATION, --skip-validation SKIP_VALIDATION
                        skip validation
  -v SCHEDULE_INTERVAL, --schedule-interval SCHEDULE_INTERVAL
                        Desired DAG schedule interval as number of days
  -d, --dot             Renders workflow files in DOT format
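As a sketch of how the optional flags combine — the input path, output path, and DAG name below are placeholders — a run that names the DAG, replaces ${user.name} with a specific user, and starts the DAG three days ago might look like:

./bin/o2a -i ../oozie_sample/ -o airflow_sample/ -n sample_dag -u airflow -s 3 -x 1.0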
Application Folder Structure for Workflows

The input application directory must follow the structure shown below.
<APPLICATION>/
|- job.properties                    - job properties that are used to run the job
|- hdfs                              - folder with application - should be copied to HDFS
|  |- workflow.xml                   - Oozie workflow xml (1.0 schema)
|  |- ...                            - additional folders required to be copied to HDFS
|- configuration.template.properties - template of configuration values used during conversion
|- configuration.properties          - generated properties for configuration values
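The hdfs folder must reach HDFS before the workflow can run. As a sketch, assuming the application directory is named oozie_sample and the hypothetical HDFS target is /user/airflow/apps/oozie_sample:

hdfs dfs -mkdir -p /user/airflow/apps/oozie_sample
hdfs dfs -put oozie_sample/hdfs/* /user/airflow/apps/oozie_sample/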
Once the Oozie-to-Airflow (o2a) converter is installed, you can begin converting Oozie workflows to Airflow DAGs.

./bin/o2a -i ../oozie_sample/ -o airflow_sample/ -x 0.4 -skv true
[2025-04-09T15:45:42.663+0530] {workflow_xml_parser.py:242} INFO - Parsed EmailError as Action Node of type email.
[2025-04-09T15:45:42.663+0530] {workflow_xml_parser.py:81} INFO - Parsed fail as Kill Node.
[2025-04-09T15:45:42.664+0530] {workflow_xml_parser.py:81} INFO - Parsed Kill as Kill Node.
[2025-04-09T15:45:42.664+0530] {workflow_xml_parser.py:94} INFO - Parsed End as End Node.
[2025-04-09T15:45:42.664+0530] {oozie_converter.py:189} INFO - Applying pre-convert transformers
[2025-04-09T15:45:42.664+0530] {oozie_converter.py:125} INFO - Converting nodes to tasks and inner relations
[2025-04-09T15:45:42.693+0530] {oozie_converter.py:194} INFO - Applying post-convert transformers
[2025-04-09T15:45:42.693+0530] {oozie_converter.py:173} INFO - Adding error handlers
[2025-04-09T15:45:42.693+0530] {oozie_converter.py:155} INFO - Converting relations between tasks groups.
[2025-04-09T15:45:42.693+0530] {oozie_converter.py:150} INFO - Converting dependencies.
[2025-04-09T15:45:42.694+0530] {renderers.py:104} INFO - Saving to file: dishtioriginaljob2/.py
Fixing /root/o2a/dishtioriginaljob2/.py

Upon successful conversion, the generated Airflow DAG will be located in the specified output directory.
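To deploy the result, copy the generated Python file into your Airflow DAGs folder and confirm it imports cleanly. This is a sketch; the DAGs folder path below is a placeholder for your environment:

# Placeholder path; use your Airflow DAGs folder.
cp airflow_sample/*.py /usr/odp/3.3.6.x-x/airflow/dags/
# Check that the new DAG imports without errors and is registered.
airflow dags list-import-errors
airflow dags list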