August 15, 2023

tAzureStorageGet – Docs for ESB 6.x

tAzureStorageGet

Retrieves blobs from a given container of an Azure storage account according to
the specified filters applied on the virtual hierarchy of the blobs, and then writes
the selected blobs to a local folder.

tAzureStorageGet Standard properties

These properties are used to configure tAzureStorageGet
running in the Standard Job framework.

The Standard tAzureStorageGet component belongs to the Cloud family.

The component in this framework is generally available.

Basic settings

Property Type

Select the way the connection details
will be set.

  • Built-In: The connection details will be set
    locally for this component. You need to specify the values for all
    related connection properties manually.

  • Repository: The connection details stored
    centrally in Repository > Metadata will be reused by this component. You need to click
    the […] button next to it and in the pop-up
    Repository Content dialog box, select the
    connection details to be reused, and all related connection
    properties will be automatically filled in.

This property is not available when another connection component is selected
from the Connection Component drop-down list.

Connection Component

Select the component whose connection details will be
used to set up the connection to Azure storage from the drop-down list.

Account Name

Enter the name of the storage account you need to access. A storage account
name can be found in the Storage accounts dashboard of the Microsoft Azure Storage
system to be used. Ensure that the administrator of the system has granted you the
appropriate access permissions to this storage account.

Account Key

Enter the key associated with the storage account you need to access. Two
keys are available for each account and by default, either of them can be used for
this access.

Protocol

Select the protocol for this connection to be created.

Use Azure Shared Access Signature

Select this check box to use a shared access signature (SAS) to access the
storage resources without the account key. For more information,
see Using Shared Access Signatures (SAS).

In the Azure Shared Access Signature field displayed,
enter your account SAS URL between double quotation marks. You can get the
SAS URL for each allowed service on the Microsoft Azure portal after generating
the SAS. The SAS URL format is
https://<$storagename>.<$service>.core.windows.net/<$sastoken>,
where <$storagename> is the storage account name,
<$service> is the allowed service name (blob, file, queue or table), and
<$sastoken> is the SAS token value. For more
information, see Constructing the Account SAS URI.

Note that a SAS has a validity period. When generating the SAS, you can set the
start time at which it becomes valid and the expiry time after which it is no
longer valid. Make sure that your SAS is still valid when you run your Job.
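
The SAS URL format and validity window described above can be checked before a Job runs. The following is a minimal sketch, not part of the component: it assumes the SAS token is the query string of the URL and carries the standard st (start) and se (expiry) parameters as full UTC timestamps. The class name SasUrlCheck, the sample URL and the PLACEHOLDER signature are hypothetical.

```java
import java.net.URI;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

class SasUrlCheck {

    /** Splits the query part of a SAS URL into its parameters. */
    static Map<String, String> sasParameters(String sasUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(sasUrl).getQuery(); // percent-escapes are decoded here
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Returns the service name (blob, file, queue or table) taken from the host name. */
    static String service(String sasUrl) {
        // Host layout assumed: <storagename>.<service>.core.windows.net
        String[] hostParts = URI.create(sasUrl).getHost().split("\\.");
        return hostParts[1];
    }

    /**
     * True if 'now' lies inside the validity window.
     * Assumption: st and se are full timestamps such as 2023-08-15T00:00:00Z;
     * a missing st is treated as "valid immediately", a missing se as invalid.
     */
    static boolean isValidAt(String sasUrl, Instant now) {
        Map<String, String> params = sasParameters(sasUrl);
        String start = params.get("st");
        String expiry = params.get("se");
        if (expiry == null) {
            return false;
        }
        boolean started = start == null || !now.isBefore(Instant.parse(start));
        return started && now.isBefore(Instant.parse(expiry));
    }

    public static void main(String[] args) {
        // Hypothetical SAS URL; the signature is a placeholder, not a working token.
        String sasUrl = "https://talendstorage.blob.core.windows.net/"
                + "?sv=2022-11-02&ss=b&srt=sco&sp=rl"
                + "&st=2023-08-15T00:00:00Z&se=2023-08-16T00:00:00Z"
                + "&spr=https&sig=PLACEHOLDER";
        System.out.println("Service: " + service(sasUrl));
        System.out.println("Valid now: " + isValidAt(sasUrl, Instant.now()));
    }
}
```

A check of this kind only reads the timestamps embedded in the URL. It does not contact Azure, so it cannot detect a revoked key or a wrong signature.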

Container

Enter the name of the container you need to retrieve
blobs from.

Local folder

Enter the path, or browse to the folder in which you need
to store the retrieved blobs.

Blobs

Complete this table to select the blobs to be retrieved.
The parameters to be provided are:

  • Prefix: enter the
    common prefix of the names of the blobs you need to
    retrieve. This prefix allows you to filter the blobs which
    have the specified prefix in their names in the given
    container.

    A blob name contains the virtual hierarchy of the blob itself. This
    hierarchy is a virtual path to that blob and is relative to the container where that
    blob is stored. For example, in a container named photos, the
    name of a photo blob might be 2014/US/Oakland/Talend.jpg.

    For this reason, when you define a prefix, you are actually designating a
    directory level as the blob filter, for example, 2014/ or 2014/US/.

    If you want to select the blobs stored directly beneath the container
    level, that is to say, the blobs without virtual path in their names, remove
    quotation marks and enter null.

  • Include sub-directories: select this check box to
    retrieve all of the sub-folders and the blobs in those
    folders beneath the directory level designated in the
    Prefix column. If you leave this check box clear, tAzureStorageGet returns
    only the blobs directly beneath that directory level.

  • Create parent directories: select this check box to
    replicate the virtual directory of the retrieved blobs in
    the local folder.

    Note that if you leave this check box clear,
    there must be the same directory in the local folder as the
    retrieved blobs have in the container; otherwise, those
    blobs cannot be retrieved.
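
The selection rules of the Prefix and Include sub-directories columns can be illustrated with plain string handling. This is a sketch of the behavior as described above, not the component's actual implementation. The class name BlobPrefixFilter and the blob names readme.txt and logo.png are hypothetical, and treating a null prefix combined with Include sub-directories as "select everything" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BlobPrefixFilter {

    /**
     * Selects blob names by prefix.
     * A null prefix stands for the container level itself.
     */
    static List<String> select(List<String> blobNames, String prefix,
                               boolean includeSubDirectories) {
        String effectivePrefix = prefix == null ? "" : prefix;
        List<String> selected = new ArrayList<>();
        for (String name : blobNames) {
            if (!name.startsWith(effectivePrefix)) {
                continue; // outside the designated directory level
            }
            // What is left after the prefix; a "/" means a deeper virtual folder.
            String remainder = name.substring(effectivePrefix.length());
            boolean directlyBeneath = !remainder.contains("/");
            if (includeSubDirectories || directlyBeneath) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> blobs = Arrays.asList(
                "2014/US/Oakland/Talend.jpg",
                "2014/readme.txt",   // hypothetical blob
                "logo.png");         // hypothetical blob at container level

        System.out.println(select(blobs, "2014/", false)); // [2014/readme.txt]
        System.out.println(select(blobs, "2014/", true));  // [2014/US/Oakland/Talend.jpg, 2014/readme.txt]
        System.out.println(select(blobs, null, false));    // [logo.png]
    }
}
```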

Die on error

Select the check box to stop the execution of the Job when an error
occurs.

Clear the check box to skip any rows on error and complete the process for
error-free rows. When errors are skipped, you can collect the rows on error using a Row > Reject link.

Advanced settings

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Global Variables

CONTAINER

The name of the blob container. This is an After variable and it
returns a string.

LOCAL_FOLDER

The local directory used in this component. This is an After variable
and it returns a string.

ERROR_MESSAGE

The error message generated by the component when an error occurs. This
is an After variable and it returns a string.
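
After variables are typically read from the Job's globalMap in a component that runs after this one, for example a tJava. The sketch below uses an ordinary HashMap as a stand-in for globalMap so that it runs outside the Studio. The key pattern <component name>_<VARIABLE> follows the tAzureStorageList_1_CURRENT_BLOB key used in the scenario of this document; the instance name tAzureStorageGet_1, the helper summarize, and the assumption that ERROR_MESSAGE is absent when no error occurred are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

class AfterVariablesDemo {

    /** Builds a one-line summary from the After variables of one component instance. */
    static String summarize(Map<String, Object> globalMap, String componentName) {
        String container = (String) globalMap.get(componentName + "_CONTAINER");
        String localFolder = (String) globalMap.get(componentName + "_LOCAL_FOLDER");
        String error = (String) globalMap.get(componentName + "_ERROR_MESSAGE");
        if (error != null) {
            return componentName + " failed: " + error;
        }
        return componentName + " copied blobs from " + container + " to " + localFolder;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that the Studio provides inside a Job.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tAzureStorageGet_1_CONTAINER", "talendcontainer");
        globalMap.put("tAzureStorageGet_1_LOCAL_FOLDER", "E:/screenshots");

        System.out.println(summarize(globalMap, "tAzureStorageGet_1"));
    }
}
```

Inside a real tJava, only the body of summarize would be needed, because globalMap already exists in the generated Job code.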

Usage

Usage rule

This component can be used as a standalone component of a Job or Subjob.

Prerequisites

Knowledge about Microsoft Azure Storage is required.

Scenario: Retrieving files from an Azure Storage container

In this scenario, a five-component Job uses Azure Storage components to write files in
a given Azure Storage system and then retrieve selected files (blobs in terms of Azure
Storage) from that system.

use_case-tazurestorageget1.png

Before replicating this scenario, you must have appropriate rights and permissions to
read and write files in the Azure storage account to be used. For further information,
see Microsoft’s documentation for Azure Storage: http://azure.microsoft.com/en-us/documentation/services/storage/.

The talendcontainer container used in this scenario
was created using tAzureStorageContainerCreate in the
scenario Scenario: Creating a container in Azure Storage.

Linking the components

  1. In the Integration perspective of the Studio, create an empty Job, named
    azureTalend for example, from the Job Designs node in the Repository tree view.
  2. Drop tAzureStorageConnection, tAzureStoragePut, tAzureStorageList, tJava and tAzureStorageGet
    onto the workspace.
  3. Connect the Azure Storage components using the Trigger > OnSubjobOk link, and connect tAzureStorageList to tJava using the Row > Iterate link.

Connecting to an Azure storage account

  1. Double-click tAzureStorageConnection to open its Component view.

    use_case-tazurestoragecreate2.png

  2. In the Account name field, enter the name of the storage account to be connected to. In this
    example, it is talendstorage, an account that has been created for demonstration purposes.
  3. In the Account key field, paste the
    primary or the secondary key associated with the storage account to be used.
    These keys can be found in the Manage Access Key dashboard in the Azure
    Storage system to be connected to.
  4. From the Protocol list, select the protocol for the endpoint of the storage account to be used. In this
    example, it is HTTPS.
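
The three values entered in these steps (account name, account key and protocol) are the same pieces that make up a standard Azure Storage connection string and the blob endpoint of the account. The sketch below only assembles those strings. Whether the component builds them this way internally is an assumption, and the class name ConnectionStringSketch and the key value are hypothetical.

```java
import java.util.Locale;

class ConnectionStringSketch {

    /** Assembles a connection string in the standard Azure Storage format. */
    static String build(String protocol, String accountName, String accountKey) {
        return "DefaultEndpointsProtocol=" + protocol.toLowerCase(Locale.ROOT)
                + ";AccountName=" + accountName
                + ";AccountKey=" + accountKey;
    }

    /** Returns the default blob service endpoint of a storage account. */
    static String blobEndpoint(String protocol, String accountName) {
        return protocol.toLowerCase(Locale.ROOT) + "://" + accountName
                + ".blob.core.windows.net";
    }

    public static void main(String[] args) {
        // "<account-key>" is a placeholder for the primary or the secondary key.
        System.out.println(build("HTTPS", "talendstorage", "<account-key>"));
        System.out.println(blobEndpoint("HTTPS", "talendstorage"));
        // prints https://talendstorage.blob.core.windows.net
    }
}
```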

Writing files in Azure Storage

  1. Double-click tAzureStoragePut to open its Component view.

    use_case-tazurestorageget3.png

  2. Select the component whose connection details will be used to set up the Azure storage
    connection. In this example, it is tAzureStorageConnection_1.
  3. In the Container name field, enter the name of the container you need to write files in. In this
    example, it is talendcontainer, a container created in the scenario
    Scenario: Creating a container in Azure Storage.
  4. In the Local folder field, enter the path, or browse, to the directory where the files to be used
    are stored. In this scenario, they are pictures illustrating technical processes,
    stored locally in E:/photos. Therefore, enter E:/photos;
    this allows tAzureStoragePut to upload all the files of this folder and its sub-folders into the
    talendcontainer container.

    For demonstration purposes, the example photos are organized as follows in
    the E:/photos folder.

    • Directly beneath the E:/photos level:

      components-use_case_triakinput_1.png

      components-use_case_triakinput_2.png

      components-use_case_triakinput_3.png

      components-use_case_triakinput_4.png

    • In the E:/photos/mongodb/step1 directory:

      components-use_case_tmongodbbulkload_1.png

      components-use_case_tmongodbbulkload_2.png

      components-use_case_tmongodbbulkload_3.png

      components-use_case_tmongodbbulkload_4.png

    • In the E:/photos/mongodb/step2 directory:

      components-use_case_tmongodbbulkload_5.png

      components-use_case_tmongodbbulkload_6.png

      components-use_case_tmongodbbulkload_7.png

      components-use_case_tmongodbbulkload_8.png

  5. In the Azure Storage folder field, enter the directory where you want to write files.
    This directory will be created in the container to be used if it does not exist.
    In this example, enter photos.
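
With these settings, each local file presumably ends up as a blob whose name is the Azure Storage folder followed by the file's path relative to the local folder, which matches the photos/mongodb/ prefix used later in this scenario. The following sketch derives such names. The class name BlobNameMapping is hypothetical and the exact naming performed by tAzureStoragePut is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class BlobNameMapping {

    /**
     * Derives a blob name from a local file: the Azure Storage folder,
     * then the file's path relative to the local folder, joined with "/".
     */
    static String blobName(Path localFolder, Path file, String azureFolder) {
        Path relative = localFolder.relativize(file);
        StringBuilder name = new StringBuilder(azureFolder);
        for (Path part : relative) {
            if (name.length() > 0) {
                name.append('/'); // blob names always use forward slashes
            }
            name.append(part.toString());
        }
        return name.toString();
    }

    public static void main(String[] args) {
        Path localFolder = Paths.get("E:/photos");
        Path file = Paths.get(
                "E:/photos/mongodb/step1/components-use_case_tmongodbbulkload_1.png");
        System.out.println(blobName(localFolder, file, "photos"));
        // prints photos/mongodb/step1/components-use_case_tmongodbbulkload_1.png
    }
}
```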

Verifying the file transfer

Configuring tAzureStorageList

  1. Double-click tAzureStorageList to open its Component view.

    use_case-tazurestorageget2.png

  2. Select the component whose connection details will be used to set up the Azure storage
    connection. In this example, it is tAzureStorageConnection_1.
  3. In the Container name field, enter the name of the container in which you need to check whether the
    given files exist. In this scenario, it is talendcontainer.
  4. Under the Blob filter table, click the
    [+] button to add one row in the
    table.
  5. In the Prefix column, enter the common prefix of the names of the files (blobs) to be checked. This
    prefix represents a virtual directory level you designate as the starting point
    down from which files (blobs) are checked. In this example, it is photos/.

    For further information about blob names, see http://msdn.microsoft.com/en-us/library/dd135715.aspx.

  6. In the Include sub-directories column,
    select the check box in the newly added row. This allows tAzureStorageList to check all the files at any
    hierarchical level beneath the designated starting point.

Configuring tJava

  1. Double-click tJava to
    open its Component view.

    use_case-tazurestorageget4.png

  2. In the Code field, enter
    System.out.println();
  3. In the Outline panel, which by default is on the left side of the Component view,
    expand the tAzureStorageList node.

    use_case-tazurestorageget5.png

  4. From the Outline panel, drop the
    CURRENT_BLOB global variable into the
    parentheses in the code in the Component
    view so that the code reads:
    System.out.println(((String)globalMap.get("tAzureStorageList_1_CURRENT_BLOB")));

Retrieving selected files

  1. Double-click tAzureStorageGet to open its Component view.

    use_case-tazurestorageget6.png

  2. Select the component whose connection details will be used to set up the Azure storage
    connection. In this example, it is tAzureStorageConnection_1.
  3. In the Container name field, enter the name of the container from which you need to retrieve files. In
    this scenario, it is talendcontainer.
  4. In the Local folder field, enter the path, or browse, to the directory where you want to put the
    retrieved files. In this example, it is E:/screenshots.
  5. Under the Blob table, click the [+] button to add one row in the table.
  6. In the Prefix column, enter the common name prefix of the files (blobs) to be retrieved. In this
    example, it is photos/mongodb/.
  7. In the Include sub-directories column, select the check box in the newly added
    row. This allows tAzureStorageGet to retrieve all the files (blobs) beneath the
    photos/mongodb/ level.
  8. In the Create parent directories column, select the check box in the newly added row
    to create the same directory in the specified local folder as the retrieved
    blobs have in the container.

    Note that having this same directory is necessary for successfully retrieving
    blobs. If you leave this check box clear, then you need to create the same
    directory yourself in the target local folder.
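
The effect of the Create parent directories option on the local side can be sketched as follows. This is an illustration of the rule stated above, not the component's code: the class name LocalTargetSketch and the method prepareTarget are hypothetical, and the sketch only prepares the target location without downloading anything.

```java
import java.io.File;

class LocalTargetSketch {

    /**
     * Computes where a blob would be written under the local folder and
     * makes sure the parent directory exists, or fails if it must not be created.
     */
    static File prepareTarget(File localFolder, String blobName,
                              boolean createParentDirectories) {
        // Each "/" in the blob name becomes a nested local directory.
        File target = new File(localFolder, blobName);
        File parent = target.getParentFile();
        if (createParentDirectories) {
            parent.mkdirs(); // does nothing if the directories already exist
        }
        if (!parent.isDirectory()) {
            // Mirrors the note above: without the directory the blob cannot be written.
            throw new IllegalStateException("Missing local directory: " + parent);
        }
        return target;
    }

    public static void main(String[] args) {
        // A temporary folder stands in for E:/screenshots.
        File localFolder = new File(System.getProperty("java.io.tmpdir"),
                "screenshots-" + System.nanoTime());
        File target = prepareTarget(localFolder,
                "photos/mongodb/step1/components-use_case_tmongodbbulkload_1.png", true);
        System.out.println("Blob would be written to: " + target);
    }
}
```

Calling prepareTarget with createParentDirectories set to false on a folder that lacks photos/mongodb/step1 throws the exception, which corresponds to the case where the directory has to be created by hand first.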

Executing the Job

  1. Press F6 to run this Job.
  2. Check the execution result on the Run console.

    use_case-tazurestorageget7.png

    The console shows that the Job returns the list of the blobs with
    the photos/ prefix in the container.

  3. Double-check the result in the web console of the Azure storage account.

    use_case-tazurestorageget8.png

  4. Check the retrieved files in the specified local folder.

    use_case-tazurestorageget9.png

    You can see that the blobs with the photos/mongodb/ prefix
    have been retrieved and that their prefixes have been converted into local directories.


Document get from Talend https://help.talend.com