July 30, 2023

tGSPut – Docs for ESB 7.x

tGSPut

Uploads files from a local directory to Google Cloud Storage so that you can manage
them with Google Cloud Storage.

tGSPut Standard properties

These properties are used to configure tGSPut running in the Standard Job framework.

The Standard tGSPut component belongs to the Big Data and the Cloud families.

The component in this framework is available in all Talend products.

Basic settings

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse
the connection details you already defined.

Access Key and Secret Key

Type in the authentication information obtained from Google for
making requests to Google Cloud Storage.

You can find these keys in the Interoperable Access tab, under the
Google Cloud Storage tab of the project in the Google APIs Console.

To enter the secret key, click the […] button next to
the secret key field, and then in the pop-up dialog box enter the secret key between double
quotes and click OK to save the settings.

For more information about the access key and secret key, go to
https://developers.google.com/storage/docs/reference/v1/getting-startedv1?hl=en/
and see the description about developer keys.

The Access Key and Secret Key fields will be available
only if you do not select the Use an existing connection check box.

Bucket name

Type in the name of the bucket into which you want to upload
files.

Local directory

Type in the full path of or browse to the local directory where
the files to be uploaded are located.

Google Storage directory

Type in the Google Storage directory to which you want to upload
files.

Use files list

Select this check box and complete the Files table.

  • Filemask: enter the filename, or a filemask that uses
    wildcard characters (*) or regular expressions.

  • New name: enter the new name to give the file once it
    has been uploaded.
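
To illustrate how a wildcard filemask selects files, the following minimal sketch uses Java's built-in glob matcher. It is not the component's own matching code: the class name FilemaskSketch and the mask *.txt are assumptions made for illustration, and the two file names are borrowed from the scenario later on this page.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class FilemaskSketch {

    // Returns true when fileName matches the wildcard mask (glob syntax).
    // Assumption: glob semantics stand in for the component's own matching.
    static boolean matches(String mask, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + mask);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String[] files = {"computer_01.txt", "computer_03.csv"};
        for (String file : files) {
            // Print whether each file would be selected by the mask *.txt.
            System.out.println(file + " matches *.txt: " + matches("*.txt", file));
        }
    }
}
```

With a mask such as computer_*, both sample files would be selected, whereas *.txt selects only the text file.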

Die on error

This check box is cleared by default, which means that a row in
error is skipped and the process completes for the error-free rows.

Advanced settings

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the
Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows read by an input component or
transferred to an output component. This is an After variable and it returns an
integer.

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable is available only when the component has a Die on error check box and that
check box is cleared.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space
to access the variable list and choose the variable to use from it.

For further information about variables, see the Talend Studio User Guide.
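
As a rough illustration of how After variables might be read in Java code once the component has finished, the sketch below looks them up in a map. It rests on several assumptions: a plain HashMap stands in for the globalMap that a Job provides at run time, the variables are assumed to be stored under keys of the form component label plus variable name, and the label tGSPut_1 and the value 3 are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class AfterVariablesSketch {

    // Builds a one-line summary from the After variables of one component.
    // Assumption: variables are stored under "<component label>_<VARIABLE>".
    static String summarize(Map<String, Object> globalMap, String component) {
        Integer nbLine = (Integer) globalMap.get(component + "_NB_LINE");
        String error = (String) globalMap.get(component + "_ERROR_MESSAGE");
        if (error != null) {
            return component + " failed: " + error;
        }
        return component + " processed " + nbLine + " row(s)";
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that a Job provides at run time.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tGSPut_1_NB_LINE", 3); // hypothetical value
        System.out.println(summarize(globalMap, "tGSPut_1"));
    }
}
```

Because both variables are After variables, a lookup like this is only meaningful in a component that runs after the one that sets them.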

Usage

Usage rule

This component can be used together with other components,
particularly the tGSGet component.

Managing files with Google Cloud Storage

The scenario describes a Job which uploads files from the local directory to a
bucket in Google Cloud Storage, then performs copy, move and delete operations on those
files, and finally lists and displays the files in relevant buckets on the console.

tGSPut_1.png

Prerequisites: You have purchased a Google Cloud
Storage account and created three buckets under the same Google Storage directory. In this
example, the buckets created are bighouse, bed_room, and study_room.

tGSPut_2.png

Dropping and linking the components

To design the Job, proceed as follows:

  1. Drop the following components from the Palette onto the design workspace: one tGSConnection component, one tGSPut component, two tGSCopy components, one tGSDelete component, one tGSList component, one tIterateToFlow component, one tLogRow component and one tGSClose component.
  2. Connect tGSConnection to tGSPut using a Trigger > On Subjob Ok link.
  3. Connect tGSPut to the first tGSCopy using a Trigger > On Subjob Ok link.
  4. Do the same to connect the first tGSCopy
    to the second tGSCopy, connect the second
    tGSCopy to tGSDelete, connect tGSDelete to tGSList, and
    connect tGSList to tGSClose.
  5. Connect tGSList to tIterateToFlow using a Row > Iterate link.
  6. Connect tIterateToFlow to tLogRow using a Row > Main link.

Configuring the components

Opening a connection to Google Cloud Storage

  1. Double-click the tGSConnection component
    to open its Basic settings view in the
    Component tab.

    tGSPut_3.png

  2. Navigate to the Google APIs Console in your web browser to access the
    Google project hosting the Cloud Storage services you need to use.
  3. Click Google Cloud Storage > Interoperable Access to open its view, and
    copy the access key and secret key.
  4. In the Component view of the Studio,
    paste the access key and the secret key into their corresponding fields.

Uploading files to Google Cloud Storage

  1. Double-click the tGSPut component to open
    its Basic settings view in the Component tab.

    tGSPut_4.png

  2. Select the Use an existing connection
    check box and then select the connection you have configured earlier.
  3. In the Bucket name field, enter the name
    of the bucket into which you want to upload files, bighouse in this example.
  4. In the Local directory field, browse to
    the directory from which the files will be uploaded, D:/Input/House in this example.

    The files under this directory are shown below:
    tGSPut_5.png

  5. Leave other settings as they are.

Copying all files from one bucket to another bucket

  1. Double-click the first tGSCopy component
    to open its Basic settings view in the
    Component tab.

    tGSPut_6.png

  2. Select the Use an existing connection
    check box and then select the connection you have configured earlier.
  3. In the Source bucket name field, enter
    the name of the bucket from which you want to copy files, bighouse in this example.
  4. Select the Source is a folder check box.
    All files from the bucket bighouse will
    be copied.
  5. In the Target bucket name field, enter
    the name of the bucket into which you want to copy files, bed_room in this example.
  6. Select Copy from the Action list.

Moving a file from one bucket to another bucket and renaming it

  1. Double-click the second tGSCopy component
    to open its Basic settings view in the
    Component tab.

    tGSPut_7.png

  2. Select the Use an existing connection
    check box and then select the connection you have configured earlier.
  3. In the Source bucket name field, enter
    the name of the bucket from which you want to move files, bighouse in this example.
  4. In the Source object key field, enter the
    key of the object to be moved, computer_01.txt in this example.
  5. In the Target bucket name field, enter
    the name of the bucket into which you want to move files, study_room in this example.
  6. Select Move from the Action list. The specified source file computer_01.txt will be moved from the bucket
    bighouse to study_room.
  7. Select the Rename check box. In the
    New name field, enter a new name for
    the moved file. In this example, the new name is laptop.txt.
  8. Leave other settings as they are.

Deleting a file in one bucket

  1. Double-click the tGSDelete component to
    open its Basic settings view in the
    Component tab.

    tGSPut_8.png

  2. Select the Use an existing connection
    check box and then select the connection you have configured earlier.
  3. Select the Delete object from bucket list
    check box. Fill in the Bucket table with
    the information of the file that you want to delete.

    In this example, the file computer_03.csv will be deleted from the bucket bed_room, whose files were copied from the bucket
    bighouse.

Listing all files in the three buckets

  1. Double-click the tGSList component to
    open its Basic settings view in the
    Component tab.

    tGSPut_9.png

  2. Select the Use an existing connection
    check box and then select the connection you have configured earlier.
  3. Select the List objects in bucket list
    check box. In the Bucket table, enter the
    names of the three buckets in the Bucket name column:
    bighouse, study_room, and bed_room.
  4. Double-click the tIterateToFlow component
    to open its Basic settings view in the
    Component tab.

    tGSPut_10.png

  5. Click Edit schema to define the data to
    pass on to tLogRow.

    In this example, add two columns bucketName and key, and
    set their types to Object.
    tGSPut_11.png

  6. The Mapping table will be populated with
    the defined columns automatically.

    In the Value column, enter globalMap.get("tGSList_2_CURRENT_BUCKET") for
    the bucketName column and globalMap.get("tGSList_2_CURRENT_KEY") for the
    key column. You can also press
    Ctrl + Space and then choose the
    appropriate variable.
  7. Double-click the tLogRow component to
    open its Basic settings view in the
    Component tab.
  8. Select Table (print values in cells of a table)
    for a better view of the results.
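
The mapping configured in tIterateToFlow can be pictured with the minimal sketch below, which simulates one iteration per listed object and turns the two variables into a row. It is only an approximation of what happens at run time: a HashMap stands in for the Job's globalMap, the class IterateToFlowSketch and its methods are hypothetical, and the sample listing is chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IterateToFlowSketch {

    // Turns one iteration's variables into a row with the two schema columns.
    static Object[] toRow(Map<String, Object> globalMap) {
        Object bucketName = globalMap.get("tGSList_2_CURRENT_BUCKET");
        Object key = globalMap.get("tGSList_2_CURRENT_KEY");
        return new Object[] {bucketName, key};
    }

    // Simulates the Iterate link: one pass per listed object.
    // Each element of listedObjects is a {bucket, key} pair.
    static List<Object[]> iterate(String[][] listedObjects) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for the Job's globalMap
        List<Object[]> rows = new ArrayList<>();
        for (String[] object : listedObjects) {
            globalMap.put("tGSList_2_CURRENT_BUCKET", object[0]);
            globalMap.put("tGSList_2_CURRENT_KEY", object[1]);
            rows.add(toRow(globalMap));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample listing chosen for illustration.
        String[][] listed = {{"study_room", "laptop.txt"}, {"bed_room", "computer_01.txt"}};
        for (Object[] row : iterate(listed)) {
            System.out.println(row[0] + "|" + row[1]);
        }
    }
}
```

The columns are kept as Object here to mirror the schema types chosen in step 5, so no cast is needed on the values returned by globalMap.get.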

Closing the connection to Google Cloud Storage

  1. Double-click the tGSClose component to
    open its Basic settings view in the
    Component tab.
  2. Select the connection you want to close from the Component List.

Saving and executing the Job

  1. Press Ctrl+S to save your Job.
  2. Execute the Job by pressing F6 or
    clicking Run on the Run tab.

    tGSPut_12.png

    The files in the three buckets are displayed. As expected, the
    files from the bucket bighouse are first copied
    to the bucket bed_room. The file
    computer_01.txt is then moved from the bucket
    bighouse to the bucket
    study_room and renamed
    laptop.txt. Finally, the file
    computer_03.csv is deleted from the
    bucket bed_room.
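
To make the order of operations easier to follow, the sketch below replays the copy, move-and-rename, and delete steps on in-memory sets instead of real buckets. It does not call Google Cloud Storage at all. The class BucketScenarioSketch is hypothetical, and only the two file names mentioned in this scenario are used, although the local directory may contain more files.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class BucketScenarioSketch {

    // Runs the copy, move-and-rename, and delete steps on in-memory buckets.
    static Map<String, Set<String>> run(Set<String> uploaded) {
        Map<String, Set<String>> buckets = new LinkedHashMap<>();
        buckets.put("bighouse", new TreeSet<>(uploaded)); // state after tGSPut
        buckets.put("bed_room", new TreeSet<>());
        buckets.put("study_room", new TreeSet<>());

        // First tGSCopy: copy every object from bighouse to bed_room.
        buckets.get("bed_room").addAll(buckets.get("bighouse"));

        // Second tGSCopy: move computer_01.txt to study_room as laptop.txt.
        if (buckets.get("bighouse").remove("computer_01.txt")) {
            buckets.get("study_room").add("laptop.txt");
        }

        // tGSDelete: delete computer_03.csv from bed_room.
        buckets.get("bed_room").remove("computer_03.csv");
        return buckets;
    }

    public static void main(String[] args) {
        Set<String> uploaded = new TreeSet<>();
        uploaded.add("computer_01.txt");
        uploaded.add("computer_03.csv");
        // Print the content of each bucket after all steps have run.
        run(uploaded).forEach((bucket, keys) -> System.out.println(bucket + ": " + keys));
    }
}
```

Note that bed_room keeps its copy of computer_01.txt: the copy runs before the move, and the move only removes the file from bighouse.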

Document obtained from Talend: https://help.talend.com