A JVM SDK for Apache Airflow. You can use any JVM-compatible language to write workflow bundles, and have Airflow consume the result.
The SDK and execution-time logic are implemented in Kotlin. A bundled example shows how the SDK can be used from Java.
Build the project:

    ./gradlew build

To run the bundled example:

- Put the DAG with stub tasks somewhere Airflow can find it.
- Ensure the java command is available in the same environment as the Airflow task worker.
- Package the example and its dependencies into JARs in ./example/build/install/example/lib:

      ./gradlew :example:installDist

- Configure Airflow to route tasks in the java queue to be run with Java:

      export AIRFLOW__SDK__COORDINATORS='{
        "java": {
          "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
          "kwargs": {"jars_root": ["/opt/airflow/java-sdk/example/build/install/example/lib"]}
        }
      }'
      export AIRFLOW__SDK__QUEUE_TO_COORDINATOR='{"java": "java"}'

- Ensure the Connection and Variable needed by the example DAG are available:

      export AIRFLOW_CONN_TEST_HTTP='{
        "conn_type": "http",
        "login": "user",
        "password": "pass",
        "host": "example.com",
        "port": 1234,
        "extra": {"param1": "val1", "param2": "val2"}
      }'
      export AIRFLOW_VAR_MY_VARIABLE=123
Users write a Java application with the SDK. The application implements task methods and declares metadata that maps each method to the DAG and task it should run for.
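To make the idea of mapping a DAG and task to a method concrete, here is a minimal sketch of such a registry. It is not the SDK's API: `TaskRegistry`, `register`, and `find` are hypothetical names invented for this illustration, and the SDK may well express the same metadata differently (for example through annotations).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, NOT the SDK's API: maps a (DAG id, task id) pair
// to the code that should run for it.
final class TaskRegistry {
    // Outer key: DAG id. Inner key: task id.
    private final Map<String, Map<String, Runnable>> tasksByDag = new HashMap<>();

    /** Associates a task implementation with the DAG and task it should be used for. */
    TaskRegistry register(String dagId, String taskId, Runnable task) {
        tasksByDag.computeIfAbsent(dagId, id -> new HashMap<>()).put(taskId, task);
        return this;
    }

    /** Looks up the implementation for a DAG and task, if one was registered. */
    Optional<Runnable> find(String dagId, String taskId) {
        Map<String, Runnable> tasks = tasksByDag.get(dagId);
        return tasks == null ? Optional.empty() : Optional.ofNullable(tasks.get(taskId));
    }
}
```

With this shape, `new TaskRegistry().register("example_dag", "extract", () -> System.out.println("extracting"))` would make the lambda discoverable under the DAG id `example_dag` and task id `extract`; both ids are made up for the example.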
When the Airflow Supervisor identifies a task should be run with Java, it
launches the Java application as a subprocess. The Java application accepts
flags --comm and --logs from the command line to identify TCP sockets it
should connect to, and communicates with the Supervisor through these channels
during execution.
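The flag handling could look roughly like the sketch below. The source only states that --comm and --logs identify TCP sockets; the `--flag=host:port` value format, the `SocketFlags` class, and its method names are assumptions made for this example.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the "--flag=host:port" format is an assumption,
// not taken from the SDK.
final class SocketFlags {
    final InetSocketAddress comm;
    final InetSocketAddress logs;

    private SocketFlags(InetSocketAddress comm, InetSocketAddress logs) {
        this.comm = comm;
        this.logs = logs;
    }

    /** Parses "--comm=host:port" and "--logs=host:port" from the command line. */
    static SocketFlags parse(String[] args) {
        Map<String, String> values = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("--") && eq > 2) {
                values.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return new SocketFlags(address(values, "comm"), address(values, "logs"));
    }

    private static InetSocketAddress address(Map<String, String> values, String name) {
        String value = values.get(name);
        if (value == null) {
            throw new IllegalArgumentException("missing required flag --" + name);
        }
        int colon = value.lastIndexOf(':');
        if (colon < 1) {
            throw new IllegalArgumentException("expected host:port for --" + name);
        }
        String host = value.substring(0, colon);
        int port = Integer.parseInt(value.substring(colon + 1));
        // Unresolved on purpose: parsing should not trigger a DNS lookup.
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

A caller would then open each connection with something like `new Socket(flags.comm.getHostString(), flags.comm.getPort())`, which resolves the host at connect time.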
- On connection, the Supervisor immediately sends a StartupDetails message through the comm socket.
- The Java application finds and executes the relevant method.
- During execution, the Java application uses the comm socket to retrieve information (e.g. Variable) from, and send data (e.g. XCom) to Airflow.
- The Java application reports the task's terminal state to the Supervisor through the comm socket.
- The Java application exits.
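The steps above can be sketched as a single driver method. Everything here is illustrative: `TaskLifecycle`, `CommChannel`, `TaskLookup`, the two-field `StartupDetails` stand-in, and the state strings "success" and "failed" are assumptions for this example, not the SDK's types or the actual message contents.

```java
// Hypothetical sketch of the lifecycle; these types are illustrative, not the SDK's API.
final class TaskLifecycle {
    /** Minimal stand-in for the StartupDetails message; the real message likely carries more. */
    static final class StartupDetails {
        final String dagId;
        final String taskId;

        StartupDetails(String dagId, String taskId) {
            this.dagId = dagId;
            this.taskId = taskId;
        }
    }

    /** Stand-in for the comm socket. */
    interface CommChannel {
        StartupDetails receiveStartupDetails();

        void sendTerminalState(String state);
    }

    /** Resolves the method for a DAG/task pair; returns null when none is registered. */
    interface TaskLookup {
        Runnable find(String dagId, String taskId);
    }

    /** Runs one task and returns the terminal state that was reported. */
    static String run(CommChannel comm, TaskLookup lookup) {
        // Step 1: the Supervisor sends StartupDetails right after connection.
        StartupDetails details = comm.receiveStartupDetails();
        String state;
        try {
            // Step 2: find the relevant method.
            Runnable task = lookup.find(details.dagId, details.taskId);
            if (task == null) {
                throw new IllegalStateException(
                    "no task registered for " + details.dagId + "." + details.taskId);
            }
            // Steps 2-3: execute it; a real task would talk to Airflow over comm here.
            task.run();
            state = "success";
        } catch (RuntimeException e) {
            state = "failed";
        }
        // Step 4: report the terminal state to the Supervisor.
        comm.sendTerminalState(state);
        // Step 5: the caller exits the process after this returns.
        return state;
    }
}
```

Keeping the channel behind an interface means the sequence can be exercised in a unit test with an in-memory fake, without opening a socket.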
During the Java application's lifetime, it also sends log messages generated by the SDK (not user code) through the logs socket, so the Supervisor can append them to Airflow logs.
Communication uses the same formats as the Python-based processes.
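As a rough picture of how log records might be forwarded to the logs socket, the sketch below plugs a custom handler into java.util.logging. It rests on several assumptions: `SocketLogHandler` is a hypothetical class, the SDK may use a different logging framework, and the line-delimited JSON layout with `level` and `event` keys is invented for illustration rather than being the actual wire format.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.ErrorManager;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

// Hypothetical sketch: forwards log records to a stream such as the logs
// socket's output. The JSON-lines layout is an assumption, not the real format.
final class SocketLogHandler extends Handler {
    private final OutputStream out;

    SocketLogHandler(OutputStream out) {
        this.out = out;
    }

    @Override
    public synchronized void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return;
        }
        String message = record.getMessage() == null ? "" : record.getMessage();
        String line = "{\"level\":\"" + escape(record.getLevel().getName())
            + "\",\"event\":\"" + escape(message) + "\"}\n";
        try {
            out.write(line.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            reportError("failed to forward log record", e, ErrorManager.WRITE_FAILURE);
        }
    }

    @Override
    public synchronized void flush() {
        try {
            out.flush();
        } catch (IOException e) {
            reportError("failed to flush log stream", e, ErrorManager.FLUSH_FAILURE);
        }
    }

    @Override
    public void close() {
        flush();
    }

    /** Escapes quotes, backslashes, and control characters for a JSON string value. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c == '\n') {
                sb.append("\\n");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Attaching it would look like `logger.addHandler(new SocketLogHandler(logsSocket.getOutputStream()))`, where `logsSocket` is the connection opened from the --logs flag.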
See Architectural Design Records in the adr directory to learn more.