Known Limitations (O2A)
The Oozie-To-Airflow converter has a few known limitations. It is not possible to write a converter that handles every complex Oozie workflow, because some Oozie functionality does not map easily onto existing Airflow Operators.
Many of those limitations are not blockers - the workflows will still be converted to Python DAGs and it should be possible to manually (or automatically) post-process the DAGs to add custom functionality. So even with those limitations in place you can still save a ton of work when converting many Oozie workflows.
Issue 1: Exit status not available (shell actions)
In Oozie, the output (STDOUT) of a Shell job can be made available to the workflow job after the Shell job ends, which can be utilized within decision nodes. The reference is here in the Oozie documentation.
Currently, Airflow's BashOperator is used, which, when do_xcom_push is set to True, stores only the last line of the job's output in an XCom. This line often pertains to the Dataproc job submission status rather than the actual result of the Shell action, rendering it less useful for subsequent tasks.
This limitation has been discussed in the GitHub issue: Finalize shell mapper.
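To make the gap concrete, the sketch below contrasts the two behaviours on a sample of captured output. It uses only the standard library and does not reproduce O2A or Airflow internals: `last_line` approximates a last-line-only XCom, while `parse_key_values` approximates the key=value style output that an Oozie decision node would typically consume. Both helper names and the sample output are hypothetical.

```python
def last_line(stdout: str) -> str:
    """Return the last non-empty line, approximating a last-line-only XCom."""
    lines = [line for line in stdout.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def parse_key_values(stdout: str) -> dict:
    """Parse simple key=value lines, skipping blanks, '#' comments and other text."""
    result = {}
    for line in stdout.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


# Hypothetical output: two result values followed by a job status line.
sample_stdout = "status=OK\nrow_count=42\nJob finished\n"

print(last_line(sample_stdout))         # only the trailing status line survives
print(parse_key_values(sample_stdout))  # the values a decision node would need
```

With output like this, only the trailing status line would reach downstream tasks, so any values the script printed earlier would have to be recovered by a manual post-processing step.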
Issue 2: Not all global configuration methods are supported
Oozie offers multiple methods for passing configuration parameters to actions. However, the following existing configuration options are not supported in the Oozie-to-Airflow (O2A) conversion process (though they can be added if needed):
- The config-default.xml file
- Parameters section of workflow.xml
- Global configuration properties
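Until these options are supported, one possible manual workaround is to read the defaults yourself and merge them into the job properties before conversion. The sketch below is not part of O2A: it assumes config-default.xml uses the usual Hadoop-style `<configuration>/<property>/<name>/<value>` layout and that explicitly supplied job properties should take precedence over defaults. The function names and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_config_properties(xml_text: str) -> dict:
    """Collect <property><name>/<value> pairs from a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def merge_with_defaults(defaults: dict, job_properties: dict) -> dict:
    """Return a new dict where job properties override the defaults."""
    merged = dict(defaults)
    merged.update(job_properties)
    return merged


# Hypothetical content of a config-default.xml file.
SAMPLE_CONFIG_DEFAULT = """<configuration>
    <property>
        <name>queueName</name>
        <value>default</value>
    </property>
    <property>
        <name>inputDir</name>
        <value>/data/in</value>
    </property>
</configuration>"""

defaults = read_config_properties(SAMPLE_CONFIG_DEFAULT)
merged = merge_with_defaults(defaults, {"inputDir": "/data/override"})
print(merged)
```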
Issue 3: No Shell launcher configuration (shell actions)
Shell launcher configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements. The reference is here in the Oozie documentation.
Currently, there is no mechanism to specify the shell launcher configuration; it is disregarded.
This limitation has been discussed in the GitHub issue: Shell Launcher Configuration.
Issue 4: Custom messages missing for Kill Node
The Kill Node may have a custom log message defined. Currently, this feature is not implemented but is planned for future enhancements. The reference is here in the Oozie docs.
Issue 5: Capturing output is not supported
Capturing output from tasks is not currently implemented. The reference is here in the Example Oozie docs
Issue 6: Only the connection to the local Hive instance is supported.
Issue 7: Not all elements are supported
For Hive, these elements are not supported: job-tracker and name-node.
For Hive2, these elements are not supported: job-tracker, name-node, jdbc-url, and password.
The limitation has been discussed in the GitHub issue: Hive connection configuration and other elements.
Instead, you can create a custom Hive connection in Airflow.
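Besides the Airflow UI, a connection can also be supplied through an environment variable named `AIRFLOW_CONN_<CONN_ID>` that holds a connection URI. The sketch below only builds such a URI with the standard library and exports it. The connection ID `my_hive`, the host, the credentials, and the `hiveserver2` connection type are assumptions for illustration, and the helper name is hypothetical. How a generated DAG picks up the connection ID depends on the operator it uses.

```python
import os
from urllib.parse import quote


def build_connection_uri(conn_type, host, port, login="", password="", schema=""):
    """Build a connection URI, percent-encoding the credentials and schema."""
    credentials = ""
    if login:
        credentials = quote(login, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"{conn_type}://{credentials}{host}:{port}/{quote(schema, safe='')}"


# All values below are placeholders, not settings required by O2A.
uri = build_connection_uri(
    "hiveserver2",
    "hive.example.com",
    10000,
    login="etl_user",
    password="p@ss/word",
    schema="default",
)

# Airflow reads this variable as the connection with ID "my_hive".
os.environ["AIRFLOW_CONN_MY_HIVE"] = uri
print(uri)
```

Percent-encoding matters here because characters such as `@` or `/` in a password would otherwise be misread as URI delimiters.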

Issue 8: Update the Hive Script Variables
You may need to update your Hive script variables. Variables such as ${INPUT}, when passed through hiveconf, might not always resolve correctly unless prefixed with hiveconf:, especially during Tez execution.
The example is as follows.
CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}';
INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test;
These variables will be converted to:
CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${hiveconf:INPUT}';
INSERT OVERWRITE DIRECTORY '${hiveconf:OUTPUT}' SELECT * FROM test;
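If you need to apply the same rewrite to scripts by hand, a regular expression is usually enough. The sketch below is one possible approach, not the converter's actual implementation: it assumes variables are written as `${NAME}` with plain identifier names, and it leaves references that already carry a namespace (such as `${hiveconf:INPUT}`) untouched. The function name is hypothetical.

```python
import re

# Matches ${NAME} where NAME is a plain identifier with no namespace prefix.
_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def prefix_hiveconf(script: str) -> str:
    """Rewrite ${NAME} to ${hiveconf:NAME}; already-prefixed references do not match."""
    return _VARIABLE.sub(r"${hiveconf:\1}", script)


original = (
    "CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}';\n"
    "INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test;"
)

converted = prefix_hiveconf(original)
print(converted)
```

Because the pattern excludes the colon, running the function twice on the same script leaves the result unchanged.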