airflowhdi.operators package

class airflowhdi.operators.AzureHDInsightSshOperator(cluster_name, azure_conn_id='azure_hdinsight_default', *args, **kwargs)

Bases: airflow.contrib.operators.ssh_operator.SSHOperator

Uses the AzureHDInsightHook and SSHHook to run an SSH command on the master node of the given HDInsight cluster

The SSH username and password are fetched from the azure connection passed. The SSH endpoint and port of the cluster is also fetched using the airflowhdi.hooks.AzureHDInsightHook.get_cluster_state() method.

Parameters:
  • cluster_name (str) – Unique cluster name of the HDInsight cluster
  • azure_conn_id – connection ID of the Azure HDInsight cluster.
  • command (str) – command to execute on remote host. (templated)
  • timeout (int) – timeout (in seconds) for executing the command.
  • environment (dict) – a dict of shell environment variables. Note that the server will reject them silently if AcceptEnv is not set in SSH config.
  • do_xcom_push (bool) – return the stdout which also get set in xcom by airflow platform
  • get_pty (bool) – request a pseudo-terminal from the server. Set to True to have the remote process killed upon task timeout. The default is False but note that get_pty is forced to True when the command starts with sudo.
execute(context)
Raises:AirflowException – when the SSH endpoint of the HDI cluster cannot be found
class airflowhdi.operators.AzureHDInsightCreateClusterOperator(cluster_name, cluster_params: azure.mgmt.hdinsight.models._models_py3.ClusterCreateProperties, azure_conn_id='azure_hdinsight_default', *args, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

See also

See the documentation of airflowhdi.hooks.AzureHDInsightHook for explanation on the parameters of this operator

Parameters:
  • azure_conn_id (string) – connection ID of the Azure HDInsight cluster.
  • cluster_name (str) – Unique cluster name of the HDInsight cluster
  • cluster_params (ClusterCreateProperties) –

    the azure.mgmt.hdinsight.models.ClusterCreateProperties representing the HDI cluster spec. You can explore some sample specs here. This python object follows the same structure as the HDInsight arm template.

    Example ClusterCreateProperties

execute(context)

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

template_fields = ['cluster_params']
class airflowhdi.operators.ConnectedAzureHDInsightCreateClusterOperator(azure_conn_id=None, hdi_conn_id=None, *args, **kwargs)

Bases: airflowhdi.operators.azure_hdinsight_create_cluster_operator.AzureHDInsightCreateClusterOperator

An extension of the AzureHDInsightCreateClusterOperator which allows getting credentials and other common properties for azure.mgmt.hdinsight.models.ClusterCreateProperties from a connection

Parameters:
  • azure_conn_id (string) – connection ID of the Azure HDInsight cluster.
  • hdi_conn_id (str) – connection ID of the connection that contains a azure.mgmt.hdinsight.models.ClusterCreateProperties object in its extra field
  • cluster_params (ClusterCreateProperties) – cluster creation spec
  • cluster_name (str) – Unique cluster name of the HDInsight cluster
execute(context)

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

param_field_types = [<enum 'OSType'>, <enum 'Tier'>, <class 'azure.mgmt.hdinsight.models._models_py3.ClusterDefinition'>, <class 'azure.mgmt.hdinsight.models._models_py3.ComputeProfile'>, <class 'azure.mgmt.hdinsight.models._models_py3.Role'>, <class 'azure.mgmt.hdinsight.models._models_py3.HardwareProfile'>, <class 'azure.mgmt.hdinsight.models._models_py3.LinuxOperatingSystemProfile'>, <class 'azure.mgmt.hdinsight.models._models_py3.OsProfile'>, <class 'azure.mgmt.hdinsight.models._models_py3.StorageProfile'>, <class 'azure.mgmt.hdinsight.models._models_py3.StorageAccount'>]
class airflowhdi.operators.AzureHDInsightDeleteClusterOperator(cluster_name, azure_conn_id='azure_hdinsight_default', *args, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

See also

See the documentation of airflowhdi.hooks.AzureHDInsightHook for explanation on the parameters of this operator

execute(context)

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.