Cloud Composer

What is Cloud Composer?

Cloud Composer is a fully managed workflow orchestration service that helps you create, schedule, monitor, and manage workflows across services and environments.

Cloud Composer is based on Apache Airflow, an open-source workflow orchestration tool.

Advantages

  • Fully managed: Google Cloud manages the infrastructure, so you can focus on your workflows.
  • Scalable: You can easily scale your workflows up and down.
  • Portable: You write your workflows in Python, and because Cloud Composer is built on open-source Apache Airflow, they can orchestrate jobs on other engines (Apache Flink, Apache Spark, etc.) and run on other Airflow deployments.
  • Observable: You can monitor your workflows with the Cloud Composer monitoring interface.

Airflow

Apache Airflow is an open-source workflow orchestration tool that allows you to create, schedule, and monitor workflows.

Airflow works with DAGs

A DAG in Airflow represents your workflow as a directed acyclic graph:

  • Directed: Tasks run in a specific order (one direction).
  • Acyclic: Tasks cannot create loops (no cycles).
  • Graph: A collection of tasks and the dependencies between them.
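
The three properties above can be illustrated without Airflow itself. The following is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+); the task names and the `execution_order` helper are hypothetical and are not part of the Airflow API. It shows that a directed graph without loops yields a valid run order, while adding a loop makes ordering impossible.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one valid run order, or raise CycleError if the graph has a loop."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)  # ['extract', 'transform', 'load', 'report']

# Making "extract" depend on "report" closes a loop, so no order exists.
cyclic = dict(dependencies, extract={"report"})
try:
    execution_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```

Airflow enforces the same constraint on real DAG files: a set of tasks whose dependencies form a loop cannot be scheduled.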

Cloud Composer Environments

Cloud Composer environments are composed of the following resources:

  • GKE cluster: The GKE cluster runs the Airflow scheduler, web server, and workers.
  • Airflow web server: The Airflow web server provides the user interface for Airflow.
  • Airflow database: The Airflow database stores metadata for Airflow.
  • Cloud Storage bucket: The Cloud Storage bucket stores DAGs, logs, and plugins.

You can create multiple Cloud Composer environments in the same project, and each environment can contain one or more DAGs.

Cloud Composer Components

Workflow Scheduling

There are two types of scheduling in Cloud Composer:

  • Periodic: You can schedule your workflow to run at specific intervals (every day, every hour, etc.).
  • Triggered: You can trigger your workflow to run when a specific event occurs (file uploaded, message received, etc.), for example with Cloud Functions.
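
The difference between the two scheduling types can be sketched in plain Python. This is an illustration only, not Cloud Composer or Airflow code: `periodic_runs`, `on_file_uploaded`, the `process_upload` DAG name, and the event fields are all hypothetical. A periodic workflow has run times that are known in advance, while a triggered workflow starts from an event and usually receives details about that event as run configuration.

```python
from datetime import datetime, timedelta


def periodic_runs(start, interval, count):
    """Return the first `count` run times of a workflow scheduled every `interval`."""
    return [start + i * interval for i in range(count)]


def on_file_uploaded(event):
    """Hypothetical event handler: turn a storage event into a run request.

    In a real setup, a function like this would forward the request to the
    Airflow environment; here it only builds the request.
    """
    return {
        "dag_id": "process_upload",  # hypothetical DAG name
        "conf": {"bucket": event["bucket"], "name": event["name"]},
    }


# Periodic: an hourly workflow starting at midnight.
runs = periodic_runs(datetime(2024, 1, 1), timedelta(hours=1), 3)
for run in runs:
    print(run.isoformat())

# Triggered: a run request created only when a file arrives.
request = on_file_uploaded({"bucket": "incoming-data", "name": "sales.csv"})
print(request)
```

In practice the two types are often combined: a periodic DAG handles routine batch work, and an event-triggered DAG handles data that arrives at unpredictable times.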