Use Case: Delivering Data Pipelines to Production Using Multi-Cloud Orchestration

This use case breaks down the basic components of a tangible, real-life scenario, bringing together heterogeneous functions (integration packages, Jobs, Connection objects, authentication requirements) deployed in a single Workflow to meet the needs of a cloud-based operation. The use case examines various core mechanisms (storage, data transformation, workflow management, and container technology, with a focus on authentication) and showcases Automic Automation's orchestration capabilities. It provides a deep dive into several cloud integrations (Kubernetes, Azure, Google, AWS) and demonstrates how easily Automic Automation federates and streamlines these disparate capabilities through its orchestration mechanisms.

We explain this use case in detail in a playlist and in a course on Broadcom Software Academy. Click the links below to open the use case either from the Academy or from our YouTube channel:

Both the playlist and the course provide the same content, although the course is enriched with some additional comments. This is a short summary of the content of this use case:

  1. Overview

    This module/video describes the use case (the Workflow) and its basic requirements. The Workflow extracts data from an on-premises application. The data is uploaded to the cloud before undergoing various transformation processes. Then it is consolidated and sent to a containerized inventory platform.

    • The Workflow extracts data from on-premises applications.

    • On-premises applications (Oracle, SAP) generate and consolidate this data and send the result to Kubernetes.

    • Cloud applications (Azure and Google Cloud Platform) transform this data.

    • A containerized application (Kubernetes) generates aggregated inventory data.

  2. Installing, Configuring and Starting Cloud Integration Agents

    This module/video explains how to download the Agents from the Marketplace, and how to install and start them.

  3. Uploading Local Data to the Cloud: Azure Blob Integration

    This module/video focuses on Azure Blob Storage. The Workflow contains an Azure Blob Storage Job that uploads the data provided by the on-premises application to the cloud.

    1. Purpose of Azure Blob Storage and role of the integration in the Workflow.

    2. Explanation of the Agent file system.

    3. Description of the template objects that are created when the Agent connects to the JCP: The Connection object and the various Azure Blob Storage Job templates (Copy, Delete, Download, Exist, Monitor, Upload).

    4. Introduction to the authentication mechanisms (a dedicated module in this course/playlist explains them in detail).
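An Upload Job of this kind ultimately issues a Put Blob request against the Azure Blob Storage REST API. The sketch below builds such a request URL and its characteristic header; the storage account, container, and blob names are hypothetical placeholders, not values from the use case.

```python
# Sketch: the REST endpoint and header an Azure Blob Storage "Upload" Job
# ultimately targets. Account/container/blob names are hypothetical.
from urllib.parse import quote

def put_blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the Put Blob endpoint of the Azure Blob Storage REST API."""
    return (f"https://{account}.blob.core.windows.net/"
            f"{quote(container)}/{quote(blob_name)}")

# Put Blob requires the blob type to be declared in a header.
headers = {"x-ms-blob-type": "BlockBlob"}

url = put_blob_url("inventorydata", "raw-extracts", "onprem/export.csv")
print(url)
# https://inventorydata.blob.core.windows.net/raw-extracts/onprem/export.csv
```

The Job template fills in these details from the Connection object, so you never build the request by hand; the sketch only shows what the Agent does on your behalf.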

  4. Transforming the Data: Integrating Databricks

    This module/video focuses on Databricks.

    The data has been uploaded to Azure Blob Storage and must now be transformed and prepared so that it can later be analyzed for inventory purposes. This video explains the Databricks integration:

    1. Purpose of Databricks and the role of the integration in the Workflow.

    2. Explanation of the Agent file system.

    3. Description of the template objects that are created when the Agent connects to the JCP: The Connection object and the Databricks Job templates (Run and Start or Stop Cluster).
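Under the hood, a Run Job of this kind submits a request to the Databricks Jobs API ("run-now" endpoint). The sketch below shows the shape of such a request body; the job ID and parameter names are hypothetical examples, not values from the use case.

```python
# Hedged sketch: a request body for the Databricks Jobs API 2.1 "run-now"
# endpoint, as a "Run" Job would submit it. job_id and params are hypothetical.
import json

endpoint = "/api/2.1/jobs/run-now"
payload = {
    "job_id": 42,                        # ID of a pre-defined Databricks job
    "notebook_params": {                 # passed to the notebook task at run time
        "input_container": "raw-extracts",
        "output_table": "inventory_staging",
    },
}
body = json.dumps(payload)
```

The Job template exposes these fields as form inputs, so the Workflow designer configures them declaratively rather than assembling JSON.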

  5. Understanding Authentication Methods: Azure

    This module/video focuses on the authentication methods that you must configure so that Broadcom's Azure Integration Agent can communicate with the target Azure environments. It explains the following for Azure Blob Storage:

    1. Service Principal.

    2. OAuth Token.

    3. Token from File.
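With the Service Principal method, the Agent obtains a token through the OAuth 2.0 client-credentials flow against the Microsoft identity platform. The sketch below builds such a token request; the tenant ID, client ID, and secret are placeholders that you would take from your app registration.

```python
# Sketch of the OAuth 2.0 client-credentials request behind Service Principal
# authentication. Tenant, client_id, and client_secret are placeholders.
from urllib.parse import urlencode

tenant_id = "contoso-tenant-id"
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"

form = urlencode({
    "grant_type": "client_credentials",
    "client_id": "app-registration-id",
    "client_secret": "app-secret",
    "scope": "https://storage.azure.com/.default",  # Azure Storage resource scope
})
```

The OAuth Token and Token from File methods skip this exchange: they hand the Agent an already-issued token, either directly or read from a file.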

  6. Uploading Local Data to the Cloud: Google Cloud Storage/AWS S3 Integrations

    This module/video focuses on two Agent integrations that are almost identical: Google Cloud Storage and AWS S3.

    The generated data has to be uploaded to the cloud for various data transformation processes in BigQuery. This video explains how to set up the Google Cloud Storage/AWS S3 Agent Integrations, how to configure the Connection objects, and how to configure the Jobs.
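The similarity of the two integrations mirrors the object stores themselves: both expose near-identical upload semantics over HTTP. The helpers below build the respective upload URLs; the bucket, region, and object names are hypothetical.

```python
# Sketch: upload endpoints for the two object stores. GCS uses the JSON API's
# media upload; S3 uses a virtual-hosted-style PUT Object URL. Names are hypothetical.
from urllib.parse import quote

def gcs_upload_url(bucket: str, name: str) -> str:
    """Google Cloud Storage JSON API simple (media) upload."""
    return (f"https://storage.googleapis.com/upload/storage/v1/b/{bucket}/o"
            f"?uploadType=media&name={quote(name, safe='')}")

def s3_put_url(bucket: str, region: str, key: str) -> str:
    """Amazon S3 virtual-hosted-style PUT Object URL."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"

print(s3_put_url("raw-extracts", "eu-west-1", "export.csv"))
# https://raw-extracts.s3.eu-west-1.amazonaws.com/export.csv
```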

  7. Triggering Auxiliary Processes: Google Cloud Composer / Airflow Integrations

    This module/video focuses on yet another two Agent integrations that are almost identical: Google Cloud Composer and Airflow.

    The data has been uploaded and we must now trigger an auxiliary process in Google Cloud Composer. This video explains how to set up the Airflow/Google Cloud Composer integration, how to configure the Connection object, and the most important settings of a Run DAG Job.
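A Run DAG Job corresponds to a request against the Airflow stable REST API, which Google Cloud Composer also exposes; this is why the two integrations are almost identical. The sketch below shows the endpoint and body of such a trigger request; the DAG ID and configuration keys are hypothetical.

```python
# Hedged sketch: triggering a DAG run via the Airflow stable REST API
# (POST /api/v1/dags/{dag_id}/dagRuns). DAG ID and conf are hypothetical.
import json

dag_id = "prepare_inventory_feed"
endpoint = f"/api/v1/dags/{dag_id}/dagRuns"
payload = json.dumps({
    "conf": {"source_bucket": "raw-extracts"},  # run-time configuration for the DAG
})
```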

  8. Transforming the Data: BigQuery Integration

    This module/video focuses on BigQuery. The Airflow process has completed, and the data must now be restructured to make it consumable by the successor tasks. BigQuery transforms our data. This video explains how to set up the Google BigQuery Agent Integration, how to configure the Connection object, and how to configure the Jobs.
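The transformation itself is expressed as SQL that the integration submits to the BigQuery REST API. The sketch below shows a request to the synchronous query endpoint; the project name and SQL statement are hypothetical examples.

```python
# Sketch: a synchronous query request to the BigQuery REST API
# (projects.queries endpoint). Project and table names are hypothetical.
import json

project = "my-inventory-project"
endpoint = f"https://bigquery.googleapis.com/bigquery/v2/projects/{project}/queries"
request = {
    "query": "SELECT sku, SUM(qty) AS qty FROM staging.inventory GROUP BY sku",
    "useLegacySql": False,  # run as standard SQL
}
body = json.dumps(request)
```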

  9. Understanding Authentication Methods: Google

    This module/video describes in detail the available authentication methods for the Google integrations, with a particular focus on the Service Account Key, which is central to authenticating with Google Cloud applications.
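A Service Account Key is a JSON file that Google Cloud generates for a service account. The sketch below shows its characteristic fields with placeholder values; the project and account names are hypothetical, and the elided values stand in for the real key material.

```python
# Shape of a Google service account key file (field names are real; the
# values here are hypothetical placeholders, with key material elided).
key = {
    "type": "service_account",
    "project_id": "my-inventory-project",
    "private_key_id": "<key-id>",
    "private_key": "-----BEGIN PRIVATE KEY-----\n<elided>\n-----END PRIVATE KEY-----\n",
    "client_email": "automic-agent@my-inventory-project.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
}
```

The Agent uses the private key to sign a token request against `token_uri`, so the file must be stored securely and referenced from the Connection object.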

  10. Understanding Authentication Methods: AWS S3

    This module/video describes in detail the available authentication methods for AWS S3.
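Access-key authentication for AWS S3 rests on Signature Version 4, where the secret key is never sent on the wire; instead, a signing key is derived from it by chained HMACs. The sketch below shows that derivation with example credentials and region values.

```python
# Hedged sketch of the AWS Signature Version 4 signing-key derivation that
# underlies S3 access-key authentication. Secret/date/region are examples.
import hashlib
import hmac

def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the per-day SigV4 key: secret -> date -> region -> service -> aws4_request."""
    k_date = _sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _sign(k_date, region)
    k_service = _sign(k_region, service)
    return _sign(k_service, "aws4_request")

key = signing_key("example-secret-key", "20240101", "eu-west-1", "s3")
```

The Agent performs this signing internally; you only supply the access key ID and secret in the Connection object.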

  11. Triggering Processes in Clusters: Kubernetes Integration

    This module/video explains the Kubernetes integration. This part of the Workflow builds a gateway into the Kubernetes environment so as to trigger processes in the clusters.
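Triggering a process in a cluster typically means submitting a batch Job manifest to the Kubernetes API. The sketch below shows a minimal manifest of that kind; the Job name, image, and arguments are hypothetical stand-ins for the inventory workload.

```python
# Hedged sketch: a minimal Kubernetes batch Job manifest, as it would be
# submitted to the cluster. Name, image, and args are hypothetical.
job_manifest = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "aggregate-inventory"},
    "spec": {
        "template": {
            "spec": {
                "containers": [{
                    "name": "aggregator",
                    "image": "example.com/inventory-aggregator:latest",
                    "args": ["--input", "/data/transformed"],
                }],
                "restartPolicy": "Never",  # batch Jobs must not restart in place
            }
        },
        "backoffLimit": 2,  # retry a failed pod at most twice
    },
}
```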

  12. Orchestrating Cloud Processes and Monitoring the Output

    All the pieces are now in place, so we can execute the Workflow. This module/video demonstrates the execution, explains its progress, and describes the reports and Agent logs.

Useful Links

For a short overview of the available Automic Automation Cloud Integrations and links to the corresponding product documentation, see Cloud Integrations.
