airflowhdi.sensors package

class airflowhdi.sensors.AzureHDInsightClusterSensor(cluster_name, azure_conn_id='azure_hdinsight_default', provisioning_only=False, poke_interval=20, mode='poke', *args, **kwargs)

Bases: airflow.sensors.base_sensor_operator.BaseSensorOperator

Asks for the state of the HDInsight cluster until it achieves the desired state: provisioning or running If it fails the sensor errors, failing the task.

See also

See the documentation of airflowhdi.hooks.AzureHDInsightHook for explanation on the parameters of this operator

Parameters:
  • cluster_name (str) – name of the cluster to check the state of
  • azure_conn_id (str) – azure connection to get config from
  • provisioning_only (bool) – poke up till provisioning only if True, else poke till the cluster hasn’t achieved a terminal state
FAILED_STATE = [<HDInsightClusterProvisioningState.failed: 'Failed'>, <HDInsightClusterProvisioningState.canceled: 'Canceled'>]
NON_TERMINAL_STATES = [<HDInsightClusterProvisioningState.in_progress: 'InProgress'>, <HDInsightClusterProvisioningState.deleting: 'Deleting'>, <HDInsightClusterProvisioningState.succeeded: 'Succeeded'>]
PROV_ONLY_TERMINAL_STATES = [<HDInsightClusterProvisioningState.in_progress: 'InProgress'>, <HDInsightClusterProvisioningState.deleting: 'Deleting'>]
poke(context)

Function that the sensors defined while deriving this class should override.

template_fields = ['cluster_name']
class airflowhdi.sensors.AzureDataLakeStorageGen1WebHdfsSensor(glob_path, azure_data_lake_conn_id='azure_data_lake_default', *args, **kwargs)

Bases: airflow.operators.sensors.BaseSensorOperator

Waits for blobs matching a wildcard prefix to arrive on Azure Data Lake Storage.

Parameters:
  • glob_path – glob path, allows wildcards
  • azure_data_lake_conn_id – connection reference to ADLS
poke(context)

Function that the sensors defined while deriving this class should override.

template_fields = ('glob_path',)
ui_color = '#901dd2'
class airflowhdi.sensors.WasbWildcardPrefixSensor(container_name, wildcard_prefix, wasb_conn_id='wasb_default', check_options=None, *args, **kwargs)

Bases: airflow.operators.sensors.BaseSensorOperator

Waits for blobs matching a wildcard prefix to arrive on Azure Blob Storage.

Parameters:
  • container_name (str) – Name of the container.
  • wildcard_prefix (str) – Prefix of the blob. Allows wildcards.
  • wasb_conn_id (str) – Reference to the wasb connection.
  • check_options (dict) – Optional keyword arguments that WasbHook.check_for_prefix() takes.

See also

See the documentation of airflow.contrib.sensors.wasb_sensor.WasbBlobSensor

poke(context)

Function that the sensors defined while deriving this class should override.

template_fields = ('container_name', 'wildcard_prefix')