airflowhdi.hooks package
-
class airflowhdi.hooks.AzureHDInsightHook(azure_conn_id='azure_default')
Bases: airflow.hooks.base_hook.BaseHook
Uses the HDInsightManagementClient from the HDInsight SDK for Python to expose several operations on an HDInsight cluster: get cluster state, create, delete.
Parameters: azure_conn_id (string) – connection ID of the Azure HDInsight cluster. See example above.
-
create_cluster(cluster_create_properties: azure.mgmt.hdinsight.models._models_py3.ClusterCreateProperties, cluster_name)
Creates an HDInsight cluster.
This operation only starts the deployment, which then proceeds asynchronously in Azure. Call get_cluster_state() to poll its provisioning status.
Note
This operation is idempotent. If the cluster already exists, the call simply ignores that fact, so it can be used as a “create if not exists” call.
Parameters:
- cluster_create_properties (ClusterCreateProperties) – the ClusterCreateProperties representing the HDI cluster spec. You can explore some sample specs here. This Python object follows the same structure as the HDInsight ARM template.
- cluster_name (string) – the full cluster name. This is the unique deployment identifier of an HDI cluster in Azure and is used for fetching its state or submitting jobs to it. HDI cluster names have the following restrictions.
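The create-then-poll pattern described above can be sketched as follows. This is a minimal illustration, not part of the package: create_and_wait is a hypothetical helper, and it assumes the object returned by get_cluster_state() exposes a provisioning_state attribute whose value is "InProgress", "Succeeded", "Failed" or "Canceled" (either as a plain string or as an enum with a matching .value). The hook argument is duck-typed, so any object offering create_cluster() and get_cluster_state() with the signatures documented here should work.

```python
import time


def create_and_wait(hook, cluster_name, cluster_create_properties,
                    poll_interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Start an HDInsight deployment and poll until provisioning finishes.

    Hypothetical helper: `hook` is expected to behave like AzureHDInsightHook.
    """
    # Idempotent per the docs above: safe even if the cluster already exists.
    hook.create_cluster(cluster_create_properties, cluster_name)

    waited = 0.0
    while True:
        state = hook.get_cluster_state(cluster_name)
        # Assumption: `provisioning_state` is a string or a str-valued enum.
        raw = state.provisioning_state
        provisioning = getattr(raw, "value", raw)

        if provisioning == "Succeeded":
            return state
        if provisioning in ("Failed", "Canceled"):
            raise RuntimeError(
                f"Cluster {cluster_name!r} ended in state {provisioning!r}"
            )
        if waited >= timeout:
            raise TimeoutError(
                f"Cluster {cluster_name!r} still {provisioning!r} "
                f"after {waited:.0f}s"
            )
        # Still in progress: wait before asking again.
        sleep(poll_interval)
        waited += poll_interval
```

With a real hook this would be called as create_and_wait(AzureHDInsightHook(), "my-cluster", properties), where "my-cluster" and properties are placeholders. The sleep parameter exists only so the wait can be replaced in tests.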
-
delete_cluster(cluster_name)
Deletes an HDInsight cluster.
Parameters: cluster_name (string) – the name of the cluster to delete
-
get_cluster_state(cluster_name) → azure.mgmt.hdinsight.models._models_py3.ClusterGetProperties
Gets the cluster state.
-
get_conn() → azure.mgmt.hdinsight._hd_insight_management_client.HDInsightManagementClient
Returns an HDInsight Management client from the Azure Python SDK for HDInsight.
This hook requires a service principal in order to work. You can create one with the az CLI like so:
az ad sp create-for-rbac --name localtest-sp-rbac --skip-assignment \
    --sdk-auth > local-sp.json
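The local-sp.json file written by that command is plain JSON, so its fields can be inspected before they are entered into an Airflow connection. The sketch below is a hypothetical helper, load_sdk_auth, that reads the file and returns the four values a service principal login usually needs. It assumes the key names clientId, clientSecret, tenantId and subscriptionId that the --sdk-auth output format uses. How these values map onto the fields of the azure_conn_id connection is not covered here.

```python
import json

# Key names as emitted by `az ad sp create-for-rbac --sdk-auth` (assumed).
REQUIRED_KEYS = ("clientId", "clientSecret", "tenantId", "subscriptionId")


def load_sdk_auth(path):
    """Read an --sdk-auth JSON file and return the service principal fields.

    Hypothetical helper; not part of airflowhdi.
    """
    with open(path, encoding="utf-8") as f:
        creds = json.load(f)

    # Fail early with a clear message if the file is not in the expected shape.
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise ValueError(f"{path} is missing keys: {', '.join(missing)}")

    # Drop endpoint URLs and other extras; keep only the identity fields.
    return {key: creds[key] for key in REQUIRED_KEYS}
```

Calling load_sdk_auth("local-sp.json") returns a dict with those four keys, which keeps the secret out of shell history while you copy it into the connection.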
-