AAI Integration for Airflow

AAI for Airflow consolidates the runtime information of your DAG executions from different Airflow instances into a single view. This means that you can see running (and possibly failed) executions of more than one Airflow instance in a single interface, without having to switch from one to another.

The AAI integration for Airflow supports Apache Airflow and Google Cloud Composer.

This topic gives you an overview of the AAI integration for Airflow. For instructions on how to set up and configure your Airflow Connector on Windows and UNIX, see Setting Up the Airflow Connector.


Overview


Graphic overview of the connection between AAI and Airflow

The Airflow Connector establishes the communication between the supported Airflow providers and AAI. It is a stand-alone component. As such, it runs in its own process space, has its own installer and writes its own log files. It consists of two main parts:

  1. The universal connector framework, which handles the communication between the Airflow Connector and AAI. It also determines when the Connector fetches job definitions and events (executions) from Airflow.
  2. A mapper, which extracts job definitions and events (executions) from Airflow and translates them into a format that AAI can process.

Because the Connector communicates directly with Airflow to extract this data, it is recommended to install it close to the Airflow installation.

The Connector uses the Airflow REST API to periodically extract job definitions and executions (current and historical runs) from Airflow and imports them into AAI.
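
As an illustration only, the following Python sketch shows the kind of calls against the Airflow stable REST API (Airflow 2.x) that return DAG definitions and runs. It is not the Connector's implementation; the host, credentials, and time window are placeholders, and it assumes that basic authentication is enabled for the Airflow API.

```python
import requests

# Illustrative only: the kind of data the Connector reads via the Airflow
# stable REST API. Host, credentials, and time window are placeholders.
BASE_URL = "http://airflow.example.com:8080/api/v1"
AUTH = ("aai_reader", "secret")  # assumes the basic_auth API backend is enabled

def fetch_dags():
    """Return the DAG definitions known to this Airflow instance."""
    response = requests.get(f"{BASE_URL}/dags", auth=AUTH, params={"limit": 100})
    response.raise_for_status()
    return response.json()["dags"]

def fetch_dag_runs(dag_id, start_date_gte):
    """Return the runs (executions) of one DAG since a given point in time."""
    response = requests.get(
        f"{BASE_URL}/dags/{dag_id}/dagRuns",
        auth=AUTH,
        params={"start_date_gte": start_date_gte, "limit": 100},
    )
    response.raise_for_status()
    return response.json()["dag_runs"]

if __name__ == "__main__":
    for dag in fetch_dags():
        for run in fetch_dag_runs(dag["dag_id"], "2024-01-01T00:00:00Z"):
            # The mapper translates records like these into the format AAI expects.
            print(dag["dag_id"], run["dag_run_id"], run["state"])
```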

The Connector has several settings that can be modified at any time. For example, you can define the following behaviors:

  • The interval at which job definitions and events (executions) are fetched. You can set the interval for each individually.
  • How far back the Connector looks the first time it fetches information after starting.
  • How far forward the Connector looks, thus gathering information on planned start times (see the conceptual sketch after this list).
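
The following Python sketch is purely conceptual: the parameter names (FETCH_INTERVAL, INITIAL_LOOKBACK, LOOKFORWARD) are hypothetical and are not Connector settings. It only illustrates how such settings determine the time window that each fetch covers.

```python
from datetime import datetime, timedelta, timezone
import time

# Conceptual sketch only; these names are hypothetical, not Connector settings.
FETCH_INTERVAL = timedelta(minutes=5)    # how often events (executions) are fetched
INITIAL_LOOKBACK = timedelta(days=7)     # how far back to look on the first fetch
LOOKFORWARD = timedelta(hours=12)        # how far ahead to look for planned start times

def fetch_window(last_fetch):
    """Return the (start, end) window to query, based on the settings above."""
    now = datetime.now(timezone.utc)
    start = (now - INITIAL_LOOKBACK) if last_fetch is None else last_fetch
    end = now + LOOKFORWARD
    return start, end

last_fetch = None
while True:
    start, end = fetch_window(last_fetch)
    print(f"fetching executions between {start} and {end}")  # fetch and forward to AAI here
    last_fetch = datetime.now(timezone.utc)
    time.sleep(FETCH_INTERVAL.total_seconds())
```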

Once the Airflow Connector is installed and running for the first time, it reaches out to AAI and registers itself. It also reads job definitions and events (executions) from Airflow, translates them into the relevant format, and passes the information to the framework, which in turn forwards it to AAI.

Regardless of the Airflow provider that you want to integrate with AAI, you require only one Airflow Connector. Each instance of the respective Airflow provider requires its own scheduler. Several schedulers can be linked to one Connector. However, only one Connector can be linked to an AAI environment. For more information, see Adding/Editing/Deleting Airflow Schedulers.

Java Requirements

Make sure that Java JDK 1.8.x is installed on the AAI application server and configure the JAVA_HOME environment variable accordingly.

Note:

A full JDK installation is required to run the AAI server. The JRE (Java Runtime Environment) is not sufficient.
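
If you want to verify the Java setup before installing, the following Python sketch (not part of AAI) checks that JAVA_HOME is set and that it points to a full JDK rather than a JRE by looking for the javac compiler.

```python
import os
import shutil
from pathlib import Path

# Optional sanity check (not part of AAI): verify that JAVA_HOME is set and
# points to a full JDK rather than a JRE, i.e. that javac is present.
java_home = os.environ.get("JAVA_HOME")
if not java_home:
    raise SystemExit("JAVA_HOME is not set")

javac = Path(java_home) / "bin" / ("javac.exe" if os.name == "nt" else "javac")
if javac.exists() or shutil.which("javac"):
    print(f"JAVA_HOME={java_home} looks like a full JDK (javac found)")
else:
    raise SystemExit(f"JAVA_HOME={java_home} appears to be a JRE: javac not found")
```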

For more information, see Compatibility Information.
