
tSQLDWHBulkExec

Loads data into an Azure SQL Data Warehouse table from either Azure Blob Storage or
Azure Data Lake Storage.

For more information about loading data into Azure SQL Data Warehouse, see Designing Extract, Load, and Transform (ELT) for Azure SQL Data Warehouse.

tSQLDWHBulkExec Standard properties

These properties are used to configure tSQLDWHBulkExec running
in the Standard Job framework.

The Standard tSQLDWHBulkExec component belongs to two families: Cloud and Databases.

The component in this framework is available in all Talend products.

Basic settings

Property Type

Select the way the connection details
will be set.

  • Built-In: The connection details will be set
    locally for this component. You need to specify the values for all
    related connection properties manually.

  • Repository: The connection details stored
    centrally in Repository > Metadata will be reused by this component. Click
    the […] button next to the field and, in the pop-up
    Repository Content dialog box, select the
    connection details to be reused; all related connection
    properties are then filled in automatically.

Use an existing connection

Select this check box and in the Component List click the relevant connection component to
reuse the connection details you already defined.

When a Job contains a parent Job and a child Job, if you need to
share an existing connection between the two levels, for example, to share the
connection created by the parent Job with the child Job, do the following:

  1. In the parent level, register the database connection to be shared
    in the Basic settings view of the connection
    component that creates the database connection.

  2. In the child level, use a dedicated connection component to read
    that registered database connection.

For an example of how to share a database connection across Job
levels, see the Talend Studio User Guide.

JDBC Provider

Select the provider of the JDBC driver to be used.

Host

Specify the IP address or hostname of the Azure SQL Data Warehouse server to be used.

Port

Specify the listening port number of the Azure SQL Data Warehouse server to be used.

Schema

Enter the name of the Azure SQL Data Warehouse schema.

Database

Specify the name of the Azure SQL Data Warehouse to be used.

Username and Password

Enter the user authentication data to access the Azure SQL Data Warehouse.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Additional JDBC Parameters

Specify additional connection properties for the database connection you are
creating. Properties are separated by semicolons, and each property is a
key-value pair, for example,
encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;
for an Azure SQL database connection.
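
For illustration, with the Microsoft JDBC Driver for SQL Server, these parameters
are appended to the connection URL built from the Host,
Port, and Database values, along the
following lines (the server and database names are placeholders, and the exact
URL the component generates may differ):

    jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=mydwh;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;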

Table

Specify the name of the SQL Data Warehouse table into which data will be
loaded.

Action on table

Select an operation to be performed on the table defined. A T-SQL sketch of
these actions follows the list below.

  • None: No operation is carried out.

  • Drop and create table: The table is removed
    and created again.

  • Create table: The table does not exist and
    gets created.

  • Create table if not exists: The table is created
    if it does not exist.

  • Drop table if exists and create: The table is removed
    if it already exists and created again.

  • Clear table: The table content is deleted. You can
    roll back this operation.

  • Truncate table: The table content is deleted. This
    operation cannot be rolled back.
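
As a rough guide, the table actions correspond to T-SQL statements such as the
following sketch, in which the table dbo.my_table and its columns are
placeholders; the exact statements the component generates may differ:

    -- Drop table if exists and create
    IF OBJECT_ID('dbo.my_table') IS NOT NULL
        DROP TABLE dbo.my_table;
    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100));

    -- Clear table: removes all rows, can be rolled back
    DELETE FROM dbo.my_table;

    -- Truncate table: removes all rows, cannot be rolled back
    TRUNCATE TABLE dbo.my_table;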

Schema and Edit schema

A schema is a row description. It defines the number of fields
(columns) to be processed and passed on to the next component. When you create a Spark
Job, avoid the reserved word line when naming the
fields.

  • Built-In: You create and store the schema locally for this component
    only.

  • Repository: You have already created the schema and stored it in the
    Repository. You can reuse it in various projects and Job designs.

Click Edit schema to make changes to the schema.

Note: If you make changes, the schema automatically becomes built-in.

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

Azure Storage

Select the type of the Azure Storage from which data will be loaded, either
Blob Storage or Data Lake Storage.

Account Name

Enter the account name for your Azure Blob Storage or Azure Data Lake Storage to
be accessed.

Access key

Enter the key associated with the storage account you need to access. Two
keys are available for each account and by default, either of them can be used for
this access.

This property is available when Blob Storage is selected
from the Azure Storage drop-down list.

Container

Enter the name of the blob container.

This property is available when Blob Storage is selected
from the Azure Storage drop-down list.

Authentication key

Enter the authentication key needed to access your Azure Data Lake Storage.

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

Client Id

Enter your application ID (also called client ID).

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

OAuth 2.0 token endpoint

In the Token endpoint field, paste the
OAuth 2.0 token endpoint, which you can obtain from the
Endpoints list on the
App registrations page of your Azure portal.

This property is available
when Data Lake Storage is selected from the
Azure Storage drop-down list.

Azure Storage Location

Specify the location where your Azure Blob Storage or Azure Data Lake Storage
account is created.

Advanced settings

File format

Select the file format that defines the external data stored in your Azure Blob
Storage or Azure Data Lake Storage: Delimited Text,
Hive RCFile, Hive ORC, or
Parquet.

For more information about the file formats, see CREATE EXTERNAL FILE FORMAT.

Field separator

Specify the character(s) that indicate the end of each field in the delimited
text file.

This property is available when Delimited Text is selected
from the File format drop-down list.

Enclosed by

Select this check box and in the field next to it, specify the character that
encloses the string in the delimited file.

This property is available when Delimited Text is selected
from the File format drop-down list.

Date format

Select this check box and in the field next to it, specify the custom format
for all date and time data in the delimited file. For more information about
the date format, see CREATE EXTERNAL FILE FORMAT.

This property is available when Delimited Text is selected
from the File format drop-down list.

Use type default

Select this check box to store each missing value using the default value of
the data type of the corresponding column.

Clear this check box to store each missing value in the delimited file as
NULL.

This property is available when Delimited Text is selected
from the File format drop-down list.
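
For instance, a delimited-text definition combining the Field
separator, Enclosed by, Date
format, and Use type default settings
corresponds to T-SQL along these lines (the format name is a placeholder, and
the SQL the component generates may differ):

    CREATE EXTERNAL FILE FORMAT my_text_format
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (
            FIELD_TERMINATOR = ';',      -- Field separator
            STRING_DELIMITER = '"',      -- Enclosed by
            DATE_FORMAT = 'yyyy-MM-dd',  -- Date format
            USE_TYPE_DEFAULT = TRUE      -- Use type default
        )
    );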

Serde Method

Select a Hive serializer and deserializer method.

This property is available when Hive RCFile is selected
from the File format drop-down list.
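
As an illustration, a Hive RCFile definition with an explicit
serializer/deserializer looks like this in T-SQL (the format name is a
placeholder):

    CREATE EXTERNAL FILE FORMAT my_rcfile_format
    WITH (
        FORMAT_TYPE = RCFILE,
        SERDE_METHOD = 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe'
    );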

Compressed by

Select this check box if external data is compressed, and from the drop-down
list displayed next to it, select the compression method.

Data import reject options

Select this check box to specify the following reject options.

  • Reject type: Specify how rejected rows are
    handled.

    • Value: If the number of rejected rows exceeds
      the value specified in the Reject value field,
      the load fails.
    • Percentage: If the percentage of rejected rows
      exceeds the value specified in the Reject value
      field, the load fails.
  • Reject value: The reject value according to the
    reject type. For Percentage, enter the percentage
    as a number without the % symbol.

  • Reject sample value: The number of rows to attempt
    to load before the percentage of rejected rows is recalculated.

For more information about the reject options, see CREATE EXTERNAL TABLE.
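
For example, a percentage-based reject policy maps to external table options
along these lines (the table, data source, and file format names are
placeholders):

    CREATE EXTERNAL TABLE ext.my_table (id INT, name NVARCHAR(100))
    WITH (
        LOCATION = '/data/',
        DATA_SOURCE = my_data_source,
        FILE_FORMAT = my_text_format,
        REJECT_TYPE = PERCENTAGE,    -- Reject type
        REJECT_VALUE = 5,            -- Reject value: fail above 5 %
        REJECT_SAMPLE_VALUE = 1000   -- Reject sample value
    );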

Distribution Option

Select the sharding pattern used to distribute data in the table:
Round Robin, Hash, or
Replicate. For more information about the sharding
patterns supported by Azure SQL Data Warehouse, see Azure SQL Data Warehouse –
Massively parallel processing (MPP) architecture.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.

Distribution Column Name

The name of the distribution column for a hash-distributed table.

This property is available when Hash is selected from
the Distribution Option drop-down list.
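
For example, choosing Hash with distribution column
id corresponds to T-SQL of this form (the table name and
columns are placeholders):

    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100))
    WITH (DISTRIBUTION = HASH(id));
    -- Round Robin: DISTRIBUTION = ROUND_ROBIN
    -- Replicate:   DISTRIBUTION = REPLICATE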

Table Option

Select the index type of the table: Clustered Columnstore
Index, Heap, or
Clustered Index. For more information, see Indexing
tables in SQL Data Warehouse.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.

Index column(s)

Specify the name of one or more key columns in the index. If multiple columns
are specified, separate them with commas.

This property is available when Clustered Index is
selected from the Table Option drop-down list.
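
For instance, choosing Clustered Index with index columns
id and name corresponds to T-SQL of this
form (names are placeholders):

    CREATE TABLE dbo.my_table (id INT, name NVARCHAR(100))
    WITH (CLUSTERED INDEX (id, name));
    -- Clustered Columnstore Index: WITH (CLUSTERED COLUMNSTORE INDEX)
    -- Heap:                        WITH (HEAP)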

Partition

Select this check box to specify the following partition options:

  • Partition column name: Specify the name of the
    column used to partition the table.

  • Range: Specify the side of the partition to which
    each boundary value belongs.

    • Left: Each boundary value is the upper,
      inclusive bound of the partition on its left.

    • Right: Each boundary value is the lower,
      inclusive bound of the partition on its right.

  • Partition For Values: Specify the boundary values
    (separated by commas) used to partition the table.

For more information about table partitioning, see Partitioning tables in SQL
Data Warehouse.

This property is available when any option related to table creation is
selected from the Action on table drop-down list.
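
For example, partitioning on a date column with Right
boundaries corresponds to T-SQL along these lines (the table name, columns, and
boundary values are placeholders):

    CREATE TABLE dbo.my_table (id INT, order_date DATE)
    WITH (
        DISTRIBUTION = ROUND_ROBIN,
        PARTITION (order_date RANGE RIGHT FOR VALUES
            ('2023-01-01', '2023-07-01'))
    );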

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Global Variables

ERROR_MESSAGE

The error message generated by the component when an error occurs. This is an After
variable and it returns a string.

NB_LINE_INSERTED

The number of rows inserted. This is an After variable and it returns an integer.

Usage

Usage rule

This component can be used as a standalone component of a Job or
subJob.

Limitation

Note that some features that are supported by
other databases are not supported by Azure SQL Data Warehouse. For more information,
see Unsupported table features.

Related scenario

No scenario is available for this component yet.

