
tSQLDWHBulkExec

Loads data into an Azure SQL Data Warehouse table from either Azure Blob Storage or
Azure Data Lake Storage.

For more information about loading data into Azure SQL Data Warehouse, see Designing Extract, Load, and Transform (ELT) for Azure SQL Data Warehouse.

tSQLDWHBulkExec Standard properties

These properties are used to configure tSQLDWHBulkExec running
in the Standard Job framework.

The Standard tSQLDWHBulkExec component belongs to two families: Cloud and Databases.

The component in this framework is available in all Talend products.

Basic settings

Property Type

Select the way the connection details
will be set.

  • Built-In: The connection details will be set
    locally for this component. You need to specify the values for all
    related connection properties manually.

  • Repository: The connection details stored
    centrally in Repository > Metadata will be reused by this component. Click
    the […] button next to the field and, in the pop-up
    Repository Content dialog box, select the
    connection details to be reused; all related connection
    properties are then filled in automatically.

Use an existing connection

Select this check box and in the Component List click the relevant connection component to
reuse the connection details you already defined.

When a Job contains a parent Job and a child Job, if you need to
share an existing connection between the two levels, for example, to share the
connection created by the parent Job with the child Job, do the following:

  1. In the parent level, register the database connection to be shared
    in the Basic settings view of the connection
    component that creates the database connection.

  2. In the child level, use a dedicated connection component to read
    that registered database connection.

For an example of how to share a database connection across Job
levels, see the Talend Studio User Guide.

JDBC Provider

Select the provider of the JDBC driver to be used.

Host

Specify the IP address or hostname of the Azure SQL Data Warehouse server to be used.

Port

Specify the listening port number of the Azure SQL Data Warehouse server to be used.

Schema

Enter the name of the Azure SQL Data Warehouse schema.

Database

Specify the name of the Azure SQL Data Warehouse to be used.

Username and Password

Enter the user authentication data to access the Azure SQL Data Warehouse.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Additional JDBC Parameters

Specify additional connection properties for the database connection you are
creating. Properties are separated by semicolons, and each property is a
key-value pair, for example,
encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;
for an Azure SQL database connection.
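
For illustration, with the Microsoft JDBC Driver for SQL Server, these parameters
are appended to the connection URL built from the Host,
Port, and Database values, along the
following lines (the server and database names are placeholders, and the exact
URL the component generates may differ):

    jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=mydwh;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;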

Table

Specify the name of the SQL Data Warehouse table into which data will be
loaded.

Action on table

Select an operation to be performed on the table defined. A T-SQL sketch of
these actions follows the list below.

  • None: No operation is carried out.

  • Drop and create table: The table is removed
    and created again.

  • Create table: The table does not exist and
    gets created.

  • Create table if not exists: The table is created
    if it does not exist.

  • Drop table if exists and create: The table is removed
    if it already exists and created again.

  • Clear table: The table content is deleted. You can
    roll back this operation.

  • Truncate table: The table content is deleted. This
    operation cannot be rolled back.
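
As a rough guide, the table actions correspond to T-SQL statements such as the
following sketch, in which the table dbo.my_table and its columns are
placeholders; the exact statements the component generates may differ:

    -- Drop table if exists and create
    IF OBJECT_ID('dbo.my_table') IS NOT NULL
        DROP TABLE dbo.my_table;
    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100));

    -- Clear table: removes all rows, can be rolled back
    DELETE FROM dbo.my_table;

    -- Truncate table: removes all rows, cannot be rolled back
    TRUNCATE TABLE dbo.my_table;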

Schema and Edit schema

A schema is a row description. It defines the number of fields
(columns) to be processed and passed on to the next component. When you create a Spark
Job, avoid the reserved word line when naming the
fields.

  • Built-In: You create and store the schema locally for this component
    only.

  • Repository: You have already created the schema and stored it in the
    Repository. You can reuse it in various projects and Job designs.

Click Edit schema to make changes to the schema.

Note: If you make changes, the schema automatically becomes built-in.

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

Azure Storage

Select the type of the Azure Storage from which data will be loaded, either
Blob Storage or Data Lake Storage.

Account Name

Enter the account name for your Azure Blob Storage or Azure Data Lake Storage to
be accessed.

Access key

Enter the key associated with the storage account you need to access. Two
keys are available for each account and by default, either of them can be used for
this access.

This property is available when Blob Storage is selected
from the Azure Storage drop-down list.

Container

Enter the name of the blob container.

This property is available when Blob Storage is selected
from the Azure Storage drop-down list.

Authentication key

Enter the authentication key needed to access your Azure Data Lake Storage.

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

Client Id

Enter your application ID (also called client ID).

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

OAuth 2.0 token endpoint

In the Token endpoint field, paste the
OAuth 2.0 token endpoint, which you can obtain from the
Endpoints list on the
App registrations page of your Azure portal.

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

Azure Storage Location

Specify the location where your Azure Blob Storage or Azure Data Lake Storage
account is created.

Advanced settings

File format

Select the file format that defines the external data stored in your Azure Blob
Storage or Azure Data Lake Storage: Delimited Text,
Hive RCFile, Hive ORC, or
Parquet.

For more information about the file formats, see CREATE EXTERNAL FILE FORMAT.

Field separator

Specify the character(s) that indicate the end of each field in the delimited
text file.

This property is available when Delimited Text is selected
from the File format drop-down list.

Enclosed by

Select this check box and in the field next to it, specify the character that
encloses the string in the delimited file.

This property is available when Delimited Text is selected
from the File format drop-down list.

Date format

Select this check box and in the field next to it, specify the custom format
for all date and time data in the delimited file. For more information about
the date format, see CREATE EXTERNAL FILE FORMAT.

This property is available when Delimited Text is selected
from the File format drop-down list.

Use type default

Select this check box to store each missing value using the default value of
the data type of the corresponding column.

Clear this check box to store each missing value in the delimited file as
NULL.

This property is available when Delimited Text is selected
from the File format drop-down list.
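
For instance, a delimited-text definition combining the Field
separator, Enclosed by, Date
format, and Use type default settings
corresponds to T-SQL along these lines (the format name is a placeholder, and
the SQL the component generates may differ):

    CREATE EXTERNAL FILE FORMAT my_text_format
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (
            FIELD_TERMINATOR = ';',      -- Field separator
            STRING_DELIMITER = '"',      -- Enclosed by
            DATE_FORMAT = 'yyyy-MM-dd',  -- Date format
            USE_TYPE_DEFAULT = TRUE      -- Use type default
        )
    );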

Serde Method

Select a Hive serializer and deserializer method.

This property is available when Hive RCFile is selected
from the File format drop-down list.
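
As an illustration, a Hive RCFile definition with an explicit
serializer/deserializer looks like this in T-SQL (the format name is a
placeholder):

    CREATE EXTERNAL FILE FORMAT my_rcfile_format
    WITH (
        FORMAT_TYPE = RCFILE,
        SERDE_METHOD = 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe'
    );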

Compressed by

Select this check box if external data is compressed, and from the drop-down
list displayed next to it, select the compression method.

Data import reject options

Select this check box to specify the following reject options.

  • Reject type: Specify how rejected rows are
    handled.

    • Value: If the number of rejected rows exceeds
      the value specified in the Reject value field,
      the load fails.
    • Percentage: If the percentage of rejected rows
      exceeds the value specified in the Reject value
      field, the load fails.
  • Reject value: The reject value according to the
    reject type. For Percentage, enter the percentage
    as a number without the % symbol.

  • Reject sample value: The number of rows to attempt
    to load before the percentage of rejected rows is recalculated.

For more information about the reject options, see CREATE EXTERNAL TABLE.
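
For example, a percentage-based reject policy maps to external table options
along these lines (the table, data source, and file format names are
placeholders):

    CREATE EXTERNAL TABLE ext.my_table (id INT, name NVARCHAR(100))
    WITH (
        LOCATION = '/data/',
        DATA_SOURCE = my_data_source,
        FILE_FORMAT = my_text_format,
        REJECT_TYPE = PERCENTAGE,    -- Reject type
        REJECT_VALUE = 5,            -- Reject value: fail above 5 %
        REJECT_SAMPLE_VALUE = 1000   -- Reject sample value
    );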

Distribution Option

Select the sharding pattern used to distribute data in the table:
Round Robin, Hash, or
Replicate. For more information about the sharding
patterns supported by Azure SQL Data Warehouse, see Azure SQL Data Warehouse –
Massively parallel processing (MPP) architecture.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.

Distribution Column Name

The name of the distribution column for a hash-distributed table.

This property is available when Hash is selected from
the Distribution Option drop-down list.
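
For example, choosing Hash with distribution column
id corresponds to T-SQL of this form (the table name and
columns are placeholders):

    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100))
    WITH (DISTRIBUTION = HASH(id));
    -- Round Robin: DISTRIBUTION = ROUND_ROBIN
    -- Replicate:   DISTRIBUTION = REPLICATE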

Table Option

Select the index type of the table: Clustered Columnstore
Index, Heap, or
Clustered Index. For more information, see Indexing
tables in SQL Data Warehouse.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.

Index column(s)

Specify the name of one or more key columns in the index. If multiple columns
are specified, separate them with commas.

This property is available when Clustered Index is
selected from the Table Option drop-down list.
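
For instance, choosing Clustered Index with index columns
id and name corresponds to T-SQL of this
form (names are placeholders):

    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100))
    WITH (CLUSTERED INDEX (id, name));
    -- Clustered Columnstore Index: WITH (CLUSTERED COLUMNSTORE INDEX)
    -- Heap:                        WITH (HEAP)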

Partition

Select this check box to specify the following partition options:

  • Partition column name: Specify the name of the
    column used to partition the table.

  • Range: Specify the side of the partition to which
    each boundary value belongs.

    • Left: Each boundary value is the upper,
      inclusive bound of the partition on its left.

    • Right: Each boundary value is the lower,
      inclusive bound of the partition on its right.

  • Partition For Values: Specify the boundary values
    (separated by commas) used to partition the table.

For more information about table partitioning, see Partitioning tables in SQL
Data Warehouse.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.
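
For example, partitioning on a date column with Right
boundaries corresponds to T-SQL along these lines (the table name, columns, and
boundary values are placeholders):

    CREATE TABLE dbo.my_table (id INT, order_date DATE)
    WITH (
        DISTRIBUTION = ROUND_ROBIN,
        PARTITION (order_date RANGE RIGHT FOR VALUES
            ('2023-01-01', '2023-07-01'))
    );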

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Global Variables

ERROR_MESSAGE

The error message generated by the component when an error occurs. This is an After
variable and it returns a string.

NB_LINE_INSERTED

The number of rows inserted. This is an After variable and it returns an integer.

Usage

Usage rule

This component can be used as a standalone component of a Job or
subJob.

Limitation

Note that some features that are supported by
other databases are not supported by Azure SQL Data Warehouse. For more information,
see Unsupported table features.

Related scenario

No scenario is available for this component yet.

