airflowhdi.hooks package

class airflowhdi.hooks.AzureHDInsightHook(azure_conn_id='azure_default')

Bases: airflow.hooks.base_hook.BaseHook

Uses the HDInsightManagementClient from the HDInsight SDK for Python to expose several operations on an HDInsight cluster: get cluster state, create, delete.

Example HDInsight connection

Parameters: azure_conn_id (string) – connection ID of the Azure HDInsight cluster. See the example above.
create_cluster(cluster_create_properties: azure.mgmt.hdinsight.models._models_py3.ClusterCreateProperties, cluster_name)

Creates an HDInsight cluster

This operation only starts the deployment, which proceeds asynchronously in Azure. Call get_cluster_state() to poll its provisioning status.
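The polling described here could be sketched roughly as follows. This is a minimal illustration, not part of the package: wait_for_cluster is a hypothetical helper, and it assumes the object returned by get_cluster_state() exposes a provisioning_state attribute that compares equal to the strings "Succeeded", "Failed" and "Canceled". Check the ClusterGetProperties model in your installed SDK version before relying on these names.

```python
import time


def wait_for_cluster(hook, cluster_name, poll_interval=30.0, timeout=3600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll hook.get_cluster_state() until provisioning finishes.

    `hook` is any object with a get_cluster_state(cluster_name) method,
    for example an AzureHDInsightHook. The `provisioning_state` attribute
    and its string values are assumptions about the returned object.
    """
    deadline = clock() + timeout
    while True:
        state = hook.get_cluster_state(cluster_name)
        provisioning = getattr(state, "provisioning_state", None)
        if provisioning == "Succeeded":
            return state
        if provisioning in ("Failed", "Canceled"):
            raise RuntimeError(
                f"Cluster {cluster_name!r} ended in state {provisioning!r}"
            )
        if clock() >= deadline:
            raise TimeoutError(
                f"Cluster {cluster_name!r} was not ready after {timeout} seconds"
            )
        # Still provisioning: wait before asking again.
        sleep(poll_interval)
```

A typical call would be wait_for_cluster(hook, cluster_name) right after hook.create_cluster(cluster_create_properties, cluster_name). The sleep and clock parameters exist only so the loop can be exercised in tests without real delays.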

Note

This operation is idempotent. If the cluster already exists, this call simply ignores that fact, so it can be used as a “create if not exists” call.

Parameters:
  • cluster_create_properties (ClusterCreateProperties) –

    The ClusterCreateProperties object representing the HDI cluster spec. You can explore some sample specs here. This Python object follows the same structure as the HDInsight ARM template.

    Example ClusterCreateProperties

  • cluster_name (string) – The full cluster name. This is the unique deployment identifier of an HDI cluster in Azure, and is used for fetching its state or submitting jobs to it. HDI cluster names have the following restrictions.
delete_cluster(cluster_name)

Deletes an HDInsight cluster

Parameters: cluster_name (string) – the name of the cluster to delete
get_cluster_state(cluster_name) → azure.mgmt.hdinsight.models._models_py3.ClusterGetProperties

Gets the cluster state.

get_conn() → azure.mgmt.hdinsight._hd_insight_management_client.HDInsightManagementClient

Returns an HDInsight Management client from the Azure Python SDK for HDInsight

This hook requires a service principal in order to work. You can create a service principal from the az CLI like so:

az ad sp create-for-rbac --name localtest-sp-rbac --skip-assignment \
  --sdk-auth > local-sp.json
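Before wiring the generated file into a connection, it can help to check that it contains the credentials a service principal login needs. The sketch below is an illustration only: load_service_principal is a hypothetical helper, and it assumes the --sdk-auth output is a JSON object with the keys clientId, clientSecret, subscriptionId and tenantId. How these values are then stored in the Airflow connection is not covered here.

```python
import json

# Keys assumed to be present in the `--sdk-auth` JSON output.
REQUIRED_KEYS = ("clientId", "clientSecret", "subscriptionId", "tenantId")


def load_service_principal(path="local-sp.json"):
    """Read the service principal file and return only the required fields.

    Raises ValueError when any of the assumed keys is missing or empty.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError(
            f"{path} is missing required keys: {', '.join(missing)}"
        )
    return {key: data[key] for key in REQUIRED_KEYS}
```

Because the file holds a client secret, keep it out of version control and delete it once the values have been stored securely.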