July 30, 2023

tSnowflakeRow – Docs for ESB 7.x

tSnowflakeRow

Executes the stated SQL command on a specified Snowflake database.

tSnowflakeRow Standard properties

These properties are used to configure tSnowflakeRow running in
the Standard Job framework.

The Standard tSnowflakeRow component belongs to the Cloud family.

Note: This component is a specific version of a dynamic database
connector. The properties related to database settings vary depending on your database
type selection. For more information about dynamic database connectors, see Dynamic database components.

Basic settings

Database

Select a type of database from the list and click
Apply.

Property Type

Select the way the connection details
will be set.

  • Built-In: The connection details will be set
    locally for this component. You need to specify the values for all
    related connection properties manually.

  • Repository: The connection details stored
    centrally in Repository > Metadata will be reused by this component. You need to click
    the […] button next to it and in the pop-up
    Repository Content dialog box, select the
    connection details to be reused, and all related connection
    properties will be automatically filled in.

This property is not available when another connection component is selected
from the Connection Component drop-down list.

Connection Component

Select the component that opens the database connection to be reused by this
component.

Account

In the Account field, enter, in double quotation marks, the account name
that has been assigned to you by Snowflake.

User Id and Password

Enter, in double quotation marks, your authentication
information to log in to Snowflake.

  • In the User ID field, enter, in double quotation
    marks, your login name as defined in Snowflake by the LOGIN_NAME parameter.
    For details, ask the administrator of your Snowflake system.

  • To enter the password, click the […] button next to the
    password field, and then in the pop-up dialog box enter the password between double quotes
    and click OK to save the settings.

Warehouse

Enter, in double quotation marks, the name of the
Snowflake warehouse to be used. This name is case-sensitive and is normally upper
case in Snowflake.

Schema

Enter, within double quotation marks, the name of the
database schema to be used. This name is case-sensitive and is normally upper case
in Snowflake.

Database

Enter, in double quotation marks, the name of the
Snowflake database to be used. This name is case-sensitive and is normally upper
case in Snowflake.

Table

Click the […] button and in the displayed wizard, select the Snowflake
table to be used.

Schema and Edit Schema

A schema is a row description. It defines the number of fields
(columns) to be processed and passed on to the next component. When you create a Spark
Job, avoid the reserved word line when naming the
fields.

Built-In: You create and store the schema locally for this component
only.

Repository: You have already created the schema and stored it in the
Repository. You can reuse it in various projects and Job designs.

If the Snowflake data type to
be handled is VARIANT, OBJECT or ARRAY, while defining the schema in the
component, select String for the
corresponding data in the Type
column of the schema editor wizard.
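
Because VARIANT, OBJECT and ARRAY values are typed as String in the schema, downstream logic usually has to parse the text itself. The following Python sketch illustrates the idea outside of Talend. It assumes the value arrives as JSON text; the sample value and the parse_semi_structured helper are hypothetical.

```python
import json

# Hypothetical value of a VARIANT column as it might arrive in a String field.
raw_variant = '{"id": "1", "name": "Ann", "tags": ["a", "b"]}'


def parse_semi_structured(value):
    """Parse JSON text into Python objects; pass through missing values as None."""
    if value is None or value == "":
        return None
    return json.loads(value)


parsed = parse_semi_structured(raw_variant)
print(parsed["name"], parsed["tags"])
```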

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

This component offers the advantage of the dynamic schema feature. This allows you to
retrieve unknown columns from source files or to copy batches of columns from a source
without mapping each column individually. For further information about dynamic schemas,
see Talend Studio User Guide.

The dynamic schema feature is designed for retrieving unknown columns of a
table and should be used for this purpose only; it is not recommended for
creating tables.

Guess Query

Click the button to generate the query which corresponds to the table and the
schema in the Query field.

Query

Specify the SQL command to be executed.

For more information about Snowflake SQL commands, see SQL Command Reference.

Die on error

Select the check box to stop the execution of the Job when an error
occurs.

Clear the check box to skip any rows on error and complete the
process for error-free rows.

When errors are skipped, you can collect the rows on error
using a Row > Reject connection.
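
The two behaviours of Die on error can be illustrated with a small Python sketch. It uses the standard-library sqlite3 module as a stand-in for Snowflake, and the run_rows helper, the table and the sample rows are all hypothetical; the component's internal logic may differ.

```python
import sqlite3


def run_rows(conn, sql, rows, die_on_error=True):
    """Execute sql once per row and return (processed_count, rejects)."""
    processed, rejects = 0, []
    for row in rows:
        try:
            conn.execute(sql, row)
            processed += 1
        except sqlite3.Error as exc:
            if die_on_error:
                raise  # stop the whole run at the first failing row
            rejects.append((row, str(exc)))  # keep the row for a reject flow
    return processed, rejects


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The second row reuses id 1 and therefore violates the primary key.
rows = [(1, "Ann"), (1, "Bob"), (2, "Cy")]
processed, rejects = run_rows(
    conn, "INSERT INTO t VALUES (?, ?)", rows, die_on_error=False
)
print(processed, [row for row, _ in rejects])
```

With die_on_error=True the same input raises at the second row instead of collecting it.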

Advanced settings

Additional JDBC Parameters

Specify additional connection properties for the database connection you are
creating. The properties are separated by semicolons and each property is a key-value
pair, for example, encryption=1;clientname=Talend.

This field is available only when you
select Use this Component from the Connection Component drop-down list and select
Internal from the Storage drop-down list in the Basic settings view.
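
To make the expected format concrete, here is a minimal Python sketch that splits such a string into key-value pairs. The parse_jdbc_parameters helper is hypothetical and only illustrates the semicolon-separated key=value layout; it is not how the component itself reads the field.

```python
def parse_jdbc_parameters(text):
    """Split 'key=value;key=value' into a dict, rejecting malformed items."""
    params = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a key=value pair: {item!r}")
        params[key.strip()] = value.strip()
    return params


print(parse_jdbc_parameters("encryption=1;clientname=Talend"))
```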

Use Custom Snowflake Region

Select this check box to specify a custom
Snowflake region. This option is available only when you select Use This Component
from the Connection Component drop-down list in the Basic settings view.

  • Region ID: enter a
    region ID in double quotation marks, for example eu-west-1 or east-us-2.azure.

For more information on Snowflake Region
ID, see Supported Regions.
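
As a rough illustration of what a region ID influences, Snowflake host names that use an account name plus region take the form account.region.snowflakecomputing.com. The Python sketch below builds such a host name; the snowflake_host helper and the account name myaccount are hypothetical, and how the component assembles its connection URL internally is an assumption.

```python
def snowflake_host(account, region_id=None):
    """Return a Snowflake host name, adding the region segment when given."""
    parts = [account]
    if region_id:
        parts.append(region_id)
    return ".".join(parts) + ".snowflakecomputing.com"


print(snowflake_host("myaccount", "eu-west-1"))
print(snowflake_host("myaccount", "east-us-2.azure"))
```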

Login Timeout

Specify the timeout period (in minutes)
for Snowflake login attempts. An error is generated if no response is received
within this period.

Tracing

Select the log level for the Snowflake JDBC driver. If
enabled, a standard Java log is generated.

Role

Enter, in double quotation marks, the default access
control role to use to initiate the Snowflake session.

This role must already exist and have been granted to the
user ID you are using to connect to Snowflake. If this field is left empty, the
PUBLIC role is automatically granted. For information about the Snowflake access control
model, see Understanding the Access Control Model.

Propagate QUERY's recordset

Select this check box to propagate the result of the SELECT query to the output
flow.

Use PreparedStatement

Select this check box if you want to query the database using a prepared statement. In
the Set PreparedStatement Parameters table displayed, specify the
value for each parameter represented by a question mark ? in the
SQL statement defined in the Query field.

  • Parameter Index: the position of the parameter in the SQL
    statement.

  • Parameter Type: the data type of the parameter.

  • Parameter Value: the value of the parameter.

For a related use case of this property, see Using PreparedStatement objects to query data.
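
The relationship between the question marks in the query and the rows of the parameter table can be sketched in Python. This example uses the standard-library sqlite3 module, which also uses ? placeholders, as a stand-in for Snowflake; the table, the data, the type names and the bind_values helper are hypothetical.

```python
import sqlite3

# Rows of a hypothetical parameter table: (index, type name, value).
parameter_table = [
    (2, "String", "Paris"),
    (1, "Int", 10),
]

# Hypothetical mapping from type names to Python conversions.
CASTS = {"Int": int, "String": str}


def bind_values(table):
    """Order parameters by 1-based index and convert each to its declared type."""
    ordered = sorted(table, key=lambda entry: entry[0])
    indexes = [entry[0] for entry in ordered]
    if indexes != list(range(1, len(ordered) + 1)):
        raise ValueError("parameter indexes must run from 1 to n without gaps")
    return tuple(CASTS[type_name](value) for _, type_name, value in ordered)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(5, "Ann", "Paris"), (12, "Bob", "Paris"), (20, "Cy", "Rome")],
)

# The first ? takes parameter 1, the second ? takes parameter 2.
query = "SELECT name FROM people WHERE id > ? AND city = ?"
result = conn.execute(query, bind_values(parameter_table)).fetchall()
print(result)
```

Binding values through placeholders rather than concatenating them into the SQL text keeps the statement fixed while only the parameter values change.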

Commit every

Specify the number of rows to be processed before committing
batches of rows together into the database.
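
The effect of a commit interval can be sketched as follows. The Python example uses sqlite3 as a stand-in for Snowflake, and the insert_with_commit_every helper and the table are hypothetical; it only shows how rows are grouped into commits, with a final commit for any remainder.

```python
import sqlite3


def insert_with_commit_every(conn, rows, commit_every):
    """Insert rows one by one, committing after every commit_every rows."""
    if commit_every < 1:
        raise ValueError("commit_every must be at least 1")
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO t VALUES (?)", row)
        pending += 1
        if pending == commit_every:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the last, partially filled batch
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
commits = insert_with_commit_every(conn, [(i,) for i in range(25)], commit_every=10)
print(commits)
```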

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Dynamic settings

Dynamic settings

Click the [+] button to add a row in the table
and fill the Code field with a context
variable to choose your database connection dynamically from multiple
connections planned in your Job. This feature is useful when you need to
access database tables having the same data structure but in different
databases, especially when you are working in an environment where you
cannot change your Job settings, for example, when your Job has to be
deployed and executed independent of Talend Studio.

For examples on using dynamic parameters, see Reading data from databases through context-based dynamic connections and Reading data from different MySQL databases using dynamically loaded connection parameters. For more information on Dynamic settings and context variables, see Talend Studio User Guide.
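
Conceptually, a dynamic connection setting resolves a context variable to one of several connections planned in the Job. The Python sketch below models that lookup only; the connection names, the database names, the context key and the select_connection helper are hypothetical and do not reflect Talend's generated code.

```python
# Hypothetical connections planned in a Job, keyed by connection component name.
CONNECTIONS = {
    "tDBConnection_1": {"database": "SALES_DEV"},
    "tDBConnection_2": {"database": "SALES_PROD"},
}


def select_connection(context, connections=CONNECTIONS):
    """Return the connection named by the context variable 'connection'."""
    name = context["connection"]
    try:
        return connections[name]
    except KeyError:
        raise ValueError(f"unknown connection component: {name!r}") from None


# The same logic runs unchanged; only the context value differs per environment.
print(select_connection({"connection": "tDBConnection_2"})["database"])
```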

Global Variables

NB_LINE

The number of rows processed. This is an After variable and it returns an integer.

ERROR_MESSAGE

The error message generated by the component when an error occurs. This is an After
variable and it returns a string.

Usage

Usage rule

This component offers the flexibility of the database query and
covers all possible SQL queries.

Querying data in a cloud file through a Snowflake external table and a materialized view

Data in Snowflake is maintained in databases. You can query this data by using:

  • External tables, which reference data files located in cloud storage. These
    tables store file-level metadata (such as the filename, a version
    identifier, and other properties) about a data file stored in an external
    stage, thus providing users with a database table interface for querying the data
    in the file. For information about the Snowflake external table feature, see
    https://docs.snowflake.net/manuals/user-guide/tables-external-intro.html.

  • Materialized views, which store pre-computed data derived by a query. Since the
    data is pre-computed, querying a materialized view is faster than executing the
    original query. For information about the Snowflake materialized view feature,
    see https://docs.snowflake.net/manuals/user-guide/views-materialized.html.

This scenario describes how to query data in a file stored in an AWS S3 bucket
through a Snowflake external table and a materialized view. It assumes that:

  • You have a valid Amazon S3 user account.

  • The data file (log1.json in this example) is in the logs folder under
    your S3 bucket named S3://my-bucket.

  • You have a valid Snowflake user account.

Querying data in a cloud file through a Snowflake external table

This example describes how to query data stored in a cloud file through a Snowflake
external table.

In this example, the file contains the following records.

Creating the Job for querying data through a Snowflake external table

  1. Create a standard Job.
  2. Drop the components listed in the following table onto the
    design workspace.

    A component is assigned a default name automatically in the
    format of <component name>_<sequence number> when it is dropped onto
    the design workspace. This scenario refers to the components in the Job using
    their default names. The following table also lists the default component names.

    Component        Default component name
    tDBConnection    tDBConnection_1
    tDBRow           tDBRow_1
    tDBRow           tDBRow_2
    tDBRow           tDBRow_3
    tDBInput         tDBInput_1
    tLogRow          tLogRow_1
    tDBClose         tDBClose_1
  3. Connect the components:

    1. tDBConnection_1
      to tDBRow_1 using a Trigger > On Subjob OK connection
    2. tDBRow_1 to
      tDBRow_2 using a Trigger > On Subjob OK connection
    3. tDBRow_2 to
      tDBRow_3 using a Trigger > On Subjob OK connection
    4. tDBRow_3 to
      tDBInput_1 using a
      Trigger > On Subjob OK connection
    5. tDBInput_1 to tLogRow_1 using
      a Row > Main connection
    6. tDBInput_1 to
      tDBClose_1 using a Trigger > On Subjob OK connection
    tSnowflakeRow_1.png

Configuring the Snowflake external table Job

  1. Configure tDBConnection_1 to establish a connection to
    Snowflake. In the Basic settings view of the
    component:

    1. Select Snowflake from the
      Database list and click
      Apply.
    2. Enter the following Snowflake credential items in the
      remaining fields:

      • Snowflake account name in the Account field
      • Snowflake region
      • Snowflake user ID in the User Id field
      • Snowflake account password in the Password field
      • Snowflake warehouse
      • Snowflake schema
      • Snowflake database
  2. Configure tDBRow_1 to create a stage referencing the
    file S3://my-bucket/logs/log1.json. In the Basic
    settings
    view of the component:

    1. Select Snowflake from the
      Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection
      Component
      list;
    3. Enter the following code in double quotation marks in the
      Query field.

    4. Leave other options as they are.
  3. Configure tDBRow_2 to create an external table for the
    stage. In the Basic settings view of the component:

    1. Select Snowflake from the
      Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection
      Component
      list;
    3. Enter the following code in double quotation marks in the
      Query field.

    4. Leave other options as they are.
  4. Configure tDBRow_3 to refresh the external table using
    the S3://my-bucket/logs/log1.json file. In the Basic settings
    view of the component:

    1. Select Snowflake from the
      Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection
      Component
      list;
    3. Enter the following code in double quotation marks in the
      Query field.

    4. Leave other options as they are.
  5. Configure tDBInput_1 to query the external table. In
    the Basic settings view of the component:

    1. Select Snowflake from the
      Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection
      Component
      list;
    3. Enter the following code in double quotation marks in the
      Query field.

    4. Click the three-dot button to the right of Edit schema. Add the
      following three columns and click OK to propagate the schema.

      • ID, type String and Db Column ID
      • Name, type String and Db Column NAME
      • City, type String and Db Column CITY
      tSnowflakeRow_2.png

    5. Leave other options as they are.
  6. Configure tLogRow_1 to specify the output layout. In the
    Basic settings view of the component, select a
    preferred mode for the output.
  7. Configure tDBClose_1
    to close the connection to Snowflake. In the Basic
    settings
    view of the component:

    1. Select Snowflake
      from the Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection Component list.
  8. Press Ctrl + S to save the Job.
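
The statements entered in the Query fields are not reproduced in the steps above. As a rough guide only, the Python sketch below lists the kind of Snowflake statement each component could hold and shows how a statement becomes a double-quoted Query field value. All object names (mystage, logs), the JSON field names, the credential placeholders and the as_query_literal helper are hypothetical, not the statements of the original scenario.

```python
import re

# Hypothetical statements, one per component of the Job.
STATEMENTS = {
    # Stage that points at the folder holding log1.json.
    "tDBRow_1": """
        CREATE OR REPLACE STAGE mystage
          URL = 's3://my-bucket/logs/'
          CREDENTIALS = (AWS_KEY_ID = '<key id>' AWS_SECRET_KEY = '<secret key>')
          FILE_FORMAT = (TYPE = JSON)
    """,
    # External table exposing three fields of each JSON record as columns.
    "tDBRow_2": """
        CREATE OR REPLACE EXTERNAL TABLE logs (
          ID   VARCHAR AS (value:ID::VARCHAR),
          NAME VARCHAR AS (value:Name::VARCHAR),
          CITY VARCHAR AS (value:City::VARCHAR)
        )
        LOCATION = @mystage
        AUTO_REFRESH = FALSE
        FILE_FORMAT = (TYPE = JSON)
    """,
    # Refresh of the external table metadata.
    "tDBRow_3": "ALTER EXTERNAL TABLE logs REFRESH",
    # Query whose columns match the ID, Name and City schema columns.
    "tDBInput_1": "SELECT ID, NAME, CITY FROM logs",
}


def as_query_literal(sql):
    """Collapse whitespace, escape backslashes and quotes, wrap in double quotes."""
    collapsed = re.sub(r"\s+", " ", sql).strip()
    escaped = collapsed.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


for component, sql in STATEMENTS.items():
    print(component, as_query_literal(sql))
```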

Executing the Snowflake external table Job and checking the result

Press F6 to run the Job and check the result.
The data in the cloud file is listed in columns.

tSnowflakeRow_3.png

Querying data in a cloud file through a Snowflake materialized view

This example describes how to query data from a
cloud file through a Snowflake materialized view. It is based on the Job described in
the previous example.

Updating the Job for querying data through a Snowflake materialized view

  1. In the Job described in the previous example, add a tDBRow component (default name: tDBRow_4).
  2. Remove the connection between tDBRow_3 and
    tDBInput_1.
  3. Connect tDBRow_3 to
    tDBRow_4 using a Trigger > On Subjob OK connection.
  4. Connect tDBRow_4 to
    tDBInput_1 using a Trigger > On Subjob OK connection.

    tSnowflakeRow_4.png

Configuring the Snowflake materialized view Job

  1. Configure tDBRow_4 to
    create a materialized view. In the Basic
    settings
    view of the component:

    1. Select Snowflake from
      the Database list and click
      Apply.
    2. Enter the following code in double quotation marks in
      the Query field.

    3. Leave other options as they are.
  2. Configure tDBInput_1 to query the external table through
    the materialized view. In the Basic settings view of the
    component:

    1. Select Snowflake from the
      Database list and click
      Apply;
    2. Select tDBConnection_1 from the Connection
      Component
      list;
    3. Enter the following code in double quotation marks in the
      Query field.

    4. Click the three-dot button to the right of Edit schema. Add the following columns
      and click OK to propagate the
      schema.

      • Name, type String and Db Column NAME
      • City, type String and Db Column CITY
      tSnowflakeRow_5.png

    5. Leave other options as they are.
  3. Press Ctrl + S to save the Job.
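
In Snowflake, a materialized view is created with a statement of the form CREATE MATERIALIZED VIEW <name> AS <select>. The Python sketch below emulates the idea with sqlite3, which has no materialized views, by storing the result of a filtering query as a table and then reading the Name and City columns from it. The table names, the sample rows and the filter on city are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the external table, with hypothetical records.
conn.execute("CREATE TABLE logs (id TEXT, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [("1", "Ann", "Paris"), ("2", "Bob", "Rome"), ("3", "Cy", "Paris")],
)

# Emulation of a materialized view: the query result is computed once and stored.
conn.execute(
    "CREATE TABLE mv_logs AS SELECT name, city FROM logs WHERE city = 'Paris'"
)

# Later reads use the stored result instead of re-running the filter.
rows = conn.execute("SELECT name, city FROM mv_logs ORDER BY name").fetchall()
print(rows)
```

Unlike this emulation, a real materialized view is kept up to date by the database when the underlying data changes.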

Executing the Snowflake materialized view Job and checking the result

Press F6 to run the Job and check the result.
The data in the cloud file, filtered by the materialized view, is listed in columns.

tSnowflakeRow_6.png

Document obtained from Talend: https://help.talend.com