August 15, 2023

tEXABulkExec – Docs for ESB 6.x

tEXABulkExec

Imports data into an EXASolution database table using the IMPORT command provided
by the EXASolution database in a fast way.

The import will be cancelled after a configurable number
of records fail to import. Erroneous records can be sent to a log table in the same database
or to a local log file.

tEXABulkExec Standard properties

These properties are used to configure tEXABulkExec running in the Standard Job framework.

The Standard
tEXABulkExec component belongs to the Databases family.

The component in this framework is generally available.

Basic settings

Use an existing connection

Select this check box and from the list displayed select the relevant connection
component to reuse the connection details you have already defined.

Note:

When a Job contains the parent Job and the child Job, if you need to share an
existing connection between the two levels, for example, to share the connection created by
the parent Job with the child Job, you have to:

  1. In the parent level, register the database connection to be shared
    in the Basic settings view of the
    connection component which creates that very database connection.

  2. In the child level, use a dedicated connection component to read
    that registered database connection.

For an example about how to share a database connection across Job levels, see


Talend Studio
User Guide
.

Property Type

Either Built-In or Repository.

  • Built-In: No property data
    stored centrally.

  • Repository: Select the
    repository file in which the properties are stored. The database
    connection fields that follow are completed automatically using
    the data retrieved.

Host

Enter the host or host list of the EXASol database servers. EXASol can run in a cluster
environment. The valid value can be a simple IP address (for example, 172.16.173.128), an IP range list (for example, 172.16.173.128..130 that represents three servers 172.16.173.128, 172.16.173.129, and 172.16.173.130), or a
comma-separated host list (for example, server1,server2,server3) of the EXASolution database cluster.

Port

Enter the listening port number of the EXASolution database cluster.

Schema

Enter the name of the schema you want to use.

User and Password

Enter the user authentication data to access the EXASolution database.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Table

Enter the name of the table to be written.

Note:

Typically the table names are stored in upper case. If you need
mixed case identifiers, you have to enter the name in double quotes.
For example, “”TEST_data_LOAD””.

Action on table

On the table defined, you can perform one of the following operations
before running the import:

  • None: No operation is carried
    out.

  • Drop and create table: The
    table is removed and created again.

  • Create table: The table does
    not exist and gets created.

  • Create table if not exists:
    The table is created if it does not exist.

  • Truncate table: The table
    content is deleted. You do not have the possibility to rollback
    the operation.

Schema and Edit
schema

A schema is a row description. It defines the number of fields (columns) to
be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

  • Built-In: You create and store the
    schema locally for this component only. Related topic: see
    Talend Studio

    User Guide.

  • Repository: You have already created
    the schema and stored it in the Repository. You can reuse it in various projects and
    Job designs. Related topic: see
    Talend Studio

    User Guide.

Note:

The columns in the schema must be in the same order as they are in
the CSV file. It is not necessary to fill all columns of the defined
table unless the use case or table definition expects that.

 

Click Edit schema to make changes to the schema.
If the current schema is of the Repository type, three
options are available:

  • View schema: choose this option to view the
    schema only.

  • Change to built-in property: choose this
    option to change the schema to Built-in for
    local changes.

  • Update repository connection: choose this
    option to change the schema stored in the repository and decide whether to propagate
    the changes to all the Jobs upon completion. If you just want to propagate the
    changes to the current Job, you can select No
    upon completion and choose this schema metadata again in the [Repository Content] window.

Advanced settings

Additional JDBC Parameters

Specify additional connection properties for the database connection you are
creating. The properties are separated by semicolon and each property is a key-value
pair, for example, encryption=1;clientname=Talend.

This field is not available if the Use an existing
connection
check box is selected.

Column Formats

Specify the format for Date and numeric columns if the default can not
be applied.

  • Column: The cells in this
    column are automatically filled with the defined schema column
    names.

  • Has Thousand Delimiters:
    Select this check box if the value of the corresponding numeric
    column (only for numeric column) in the file contains thousand
    separators.

  • Alternative Format: Specify
    the necessary format as String value if a special format is
    expected. The necessary format will be created from the schema
    column length and precision. For more information about format
    models, see EXASolution User Manual.

Source table columns

If the source is a database, configure the mapping between the source
columns and the target columns in this table.

Specifically configuring the mapping is optional. If you set nothing
here, it is assumed that the source table has the same structure as the
target table.

  • Column: The schema column
    in the target table.

  • Source column name: The
    name of the column in the source table.

Column Separator

Enter the separator for the columns of a row in the local file.

Column Delimiter

Enter the delimiter that encapsulates the field content in the local
file.

Row Separator

Enter the char used to separate the rows in the local file.

Null representation

Enter the string that represents a NULL value in the local file. If
not specified, NULL values are represented as the empty string.

Skip rows

Enter the number of rows (for example, header or any other prefix
rows) to be omitted.

Encoding

Enter the character set used in the local file. By default, it is
UTF8.

Trim column values

Specify whether spaces are deleted at the border of CSV
columns.

  • No trim: no spaces are
    trimmed.

  • Trim: spaces from both left
    and right sides are trimmed.

  • Trim only left: spaces from
    only the left side are trimmed.

  • Trim only right: spaces from
    only the right side are trimmed.

Default Date Format

Specify the format for datetime values. By default, it is YYYY-MM-DD.

Default Timestamp Format

Specify the timestamp format used. By default, it is YYYY-MM-DD HH24:MI:SS.FF3.

Thousands Separator

Specify the character used to separate thousand groups in a numeric
text value. In the numeric format, the character will be applied to the
placeholder G. If the text values
contain this char, you have to configure it also in the Column Formats table.

Note that this setting affects the connection property NLS_NUMERIC_CHARACTERS that defines the
decimal and group characters used for representing numbers.

Decimal Separator

Specify the character used to separate the integer part of a number
from the fraction. In the numeric format, the character will be applied
to the placeholder D.

Note that this setting affects the connection property NLS_NUMERIC_CHARACTERS that defines the
decimal and group characters used for representing numbers.

Minimal number errors to reject the
transfer

Specify the maximum number of invalid rows allowed during the data
loading process. For example, the value 2 means the loading process will stop if the third error
occurs.

Log Error Destination

Specify the location where error messages will be stored.

  • No Logging: error messages
    will not be saved.

  • Local Log File: error
    messages will be stored in a specified local file.

    • Local Error Log File:
      specify the path to the local file that stores error
      messages.

    • Add current timestamp to log
      file name (before extension)
      : select this
      check box to add the current timestamp before the
      extension of the file name for identification reasons in
      case you use the same file multiple times.

  • Logging Table: error messages
    will be stored in a specified table. The table will be created
    if it does not exist.

    • Error Log Table:
      enter the name of the table that stores error
      messages.

    • Use current timestamp to build
      log table
      : select this check box to use
      the current timestamp to build the log table for
      identification reasons in case you use the same table
      multiple times.

Transfer files secure

Select this check box to transfer the file over HTTPS instead of
HTTP.

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Global Variables

Global Variables

NB_LINE_INSERTED: the number of rows inserted. This is an
After variable and it returns an integer.

NB_LINE_DELETED: the number of rows deleted. This is an
After variable and it returns an integer.

FILENAME: the name of the file processed. This is an
After variable and it returns a string.

ERROR_LOG_FILE: the path to the local log file. This is
an After variable and it returns a string.

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl +
Space
to access the variable list and choose the variable to use from it.

For further information about variables, see
Talend Studio

User Guide.

Usage

Usage rule

This component is usually used as a standalone component.

Dynamic settings

Click the [+] button to add a
row in the table and fill the Code field
with a context variable to choose your database connection dynamically from
multiple connections planned in your Job. This feature is useful when you
need to access database tables having the same data structure but in
different databases, especially when you are working in an environment where
you cannot change your Job settings, for example, when your Job has to be
deployed and executed independent of
Talend Studio
.

The Dynamic settings table is
available only when the Use an existing
connection
check box is selected in the Basic settings view. Once a dynamic parameter is
defined, the Component List box in the
Basic settings view becomes unusable.

For examples on using dynamic parameters, see Scenario: Reading data from databases through context-based dynamic connections and Scenario: Reading data from different MySQL databases using dynamically loaded connection parameters. For more information on Dynamic
settings
and context variables, see
Talend Studio User Guide
.

Settings for different sources of import data

The settings for this component change depending on the source of your import
data.

The component handles data coming from any of the following sources:

  • Local file

  • Remote file

  • EXASol database

  • Oracle database

  • JDBC-compliant database

Local file

The local file is not transferred by uploading the file. Instead, the driver
establishes a (secure) web service that sends the URL to the database, and the
database retrieves the file from this local web service. Because the port of this
service cannot be explicitly defined, this method requires a transparent network
between the local Talend Job and the EXASolution database.

File name

 Specify the path to the local file to be imported.

Remote file

This method works with a file that is accessible on a server through the following
protocols: SCP, SFTP, FTP, HTTP, or HTTPS.

Use predefined
connection

It is possible, via the SQL interface, to set up a named
connection in the EXASolution database itself. Select this
option if you want to use such a connection, and provide its
name.

To know what connections are available, look at the table
SYS.EXA_DBA_CONNECTIONS in the database.

The connection must contain a URL with one of the following
protocols: SCP, SFTP, FTP, HTTP, or HTTPS.

The URL must not contain the file name. The file name is
always dynamic and must be provided by the component
configuration.

Remote file server URL

Specify the URL to the file server, without the file name
itself.

File name

Specify the name of the file you want to fetch from the
server.

Query parameters

If the web service depends on query parameters, specify them
here.

For example, if you want to get a file from an HDFS file
system via the web service, you need to add some additional
parameters such as open=true.

Use user
authentication

Select this check box if you want to use Basic Authentication
when connecting to the web server.

Remote user and Remote users password

Enter the user name and password need to access the web
server.

EXASol database

An EXASolution database can also serve as a remote source for the data. The source
can be a table or a specific query.

Use predefined connection

It is possible, via the SQL interface, to set up a named
connection in the EXASolution database itself. Select this
option if you want to use such a connection, and provide its
name.

To know what connections are available, look at the table
SYS.EXA_DBA_CONNECTIONS in the database.

The username and password must by provided by the component
and not as part of the predefined connection.

EXASol database host

Specify the host of the remote EXASolution database.

This field can also be used to access a cluster.

Use self defined
query

Select this check box if you want to use a specific query to
get the data.

This method is preferred if, for example, your data needs to
be filtered (using a where
condition), joined or converted.

Source query

If you want to use a specific query, enter the query in this
field.

Database or schema

If you are not using a specific query, enter the schema name
for the source table in this field.

Source table

If you are not using a specific query, enter the table name in
this field.

The mapping between the source table columns and the target
table columns (schema columns) can be set in the advanced
settings.

Use user
authentication

Select this check box if you want to use Basic Authentication
when connecting to the source database.

Remote user and Remote users password

Enter the user name and password needed to access the source
database.

Oracle database

An Oracle database can also serve as remote source for the data. Access to an
Oracle database requires an Enterprise license for the EXASolution database and does
not work with the free edition. The source can be a table or a specific
query.

Use predefined connection

It is possible, via the SQL interface, to set up a named
connection in the EXASolution database itself. Select this
option if you want to use such a connection, and provide its
name.

To know what connections are available, look at the table
SYS.EXA_DBA_CONNECTIONS in the database.

The username and password must by provided by the component
and not as part of the predefined connection.

Oracle database URL

Specify the JDBC URL to the Oracle database.

Use self defined
query

Select this check box if you want to use a specific query to
get the data.

This method is preferred if, for example, your data needs to
be filtered (using a where
condition), joined or converted.

Source query

If you want to use a specific query, enter the query in this
field.

Database or schema

If you are not using a specific query, enter the schema name
for the source table in this field.

Source table

If you are not using a specific query, enter the table name in
this field.

The mapping between the source table columns and the target
table columns (schema columns) can be set in the advanced
settings.

Use user
authentication

Select this check box if you want to use Basic Authentication
when connecting to the source database.

Remote user and Remote users password

Enter the user name and password needed to access the source
database.

JDBC-compliant database

The free edition
of the EXASolution database supports MySQL and PostgreSQL databases, and others are
available in the Enterprise edition. The source can be table or a self defined
query.

Nearly all enterprise-grade databases provide a JDBC interface.

Use predefined connection

It is possible, via the SQL interface, to set up a named
connection in the EXASolution database itself. Select this
option if you want to use such a connection, and provide its
name.

To know what connections are available, look at the table
SYS.EXA_DBA_CONNECTIONS in the database.

The username and password must by provided by the component
and not as part of the predefined connection.

JDBC database URL

Specify the JDBC URL to the source database.

Use self defined query

Select this check box if you want to use a specific query to
get the data.

This method is preferred if, for example, your data needs to
be filtered (using a where
condition), joined or converted.

Source query

If you want to use a specific query, enter the query in this
field.

Database or schema

If you are not using a specific query, enter the schema name
for the source table in this field.

Source table

If you are not using a specific query, enter the table name in
this field.

The mapping between the source table columns and the target
table columns (schema columns) can be set in the advanced
settings.

Use user authentication

Select this check box if you want to use Basic Authentication
when connecting to the source database.

Remote user and Remote users password

Enter the user name and password needed to access the source
database.

Scenario: Importing data into an EXASolution database table from a local CSV file

This scenario describes a Job that writes
employee information into a CSV file, then loads the data from this local file into a newly
created EXASolution database table using the tEXABulkExec
component, and finally retrieves the data from the table and displays it on the
console.

components-texabulkexec_s1_job.png

Dropping and linking the components

  1. Create a new Job and add the following components by typing their names in the
    design workspace or dropping them from the Palette: a tFixedFlowInput
    component, a tFileOutputDelimited component, a
    tEXABulkExec component, a tEXAInput component, and a tLogRow component.
  2. Connect the tFixedFlowInput component to the
    tFileOutputDelimited component using a
    Row > Main connection.
  3. Do the same to connect the tEXAInput
    component to the tLogRow component.
  4. Connect the tFixedFlowInput component to the
    tEXABulkExec component using a Trigger > On Subjob
    Ok
    connection.
  5. Do the same to connect the tEXABulkExec
    component to the tEXAInput component.

Configuring the components

Preparing the source data

  1. Double-click the tFixedFlowInput component to
    open its Basic settings view.

    components-texabulkexec_s1_tfixedflowinput.png

  2. Click the […] button next to Edit schema to open the [Schema] dialog box.

    components-texabulkexec_s1_tfixedflowinput_schema.png

  3. Click the [+] button to add six columns:
    EmployeeID of the Integer type, EmployeeName,
    OrgTeam and JobTitle of the String type,
    OnboardDate of the Data type with the yyyy-MM-dd
    date pattern, and MonthSalary of the
    Double type.
  4. Click OK to close the dialog box and accept
    schema propagation to the next component.
  5. In the Mode area, select Use Inline Content (delimited file) and enter the
    following employee data in the Content
    field.

  6. Double-click the tFileOutputDelimited
    component to open its Basic settings
    view.

    components-texabulkexec_s1_tfileoutputdelimited.png

  7. In the File Name field, specify the file into
    which the input data will be written. In this example, it is “E:/employee.csv”.
  8. Click Advanced settings to open the Advanced settings view of the tFileOutputDelimited component.

    components-texabulkexec_s1_tfileoutputdelimited_advanced.png

  9. Select the Advanced separator (for numbers)
    check box and in the Thousands separator and
    Decimal separator fields displayed, specify
    the separators for thousands and decimal. In this example, the default values
    “,” and “.” are used.

Loading the source data into a newly created EXASolution database table

  1. Double-click the tEXABulkExec component to
    open its Basic settings view.

    components-texabulkexec_s1_texabulkexec.png

  2. Fill in the Host, Port, Schema, User and Password
    fields with your EXASolution database connection details.
  3. In the Table field, enter the name of the
    table into which the source data will be written. In this example, the target
    database table is named “employee” and it
    does not exist.
  4. Select Create table from the Action on table list to create the specified
    table.
  5. In the Source area, select Local file as the source for the input data, and then
    specify the file that contains the source data. In this example, it is “E:/employee.csv”.
  6. Click the […] button next to Edit schema to open the [Schema] dialog box and define the schema, which should be the
    same as that of the tFixedFlowInput
    component.

    Then click OK to validate these changes and
    close the dialog box.
  7. Click Advanced settings to open the Advanced settings view of the tEXABulkExec component.

    components-texabulkexec_s1_texabulkexec_advanced.png

  8. In the Column Formats table, for the two
    numeric fields EmployeeID and MonthSalary, select the corresponding check boxes
    in the Has Thousand Delimiters column, and then
    define their format model strings in the corresponding fields of the Alternative Format column. In this example, “99G999” for EmployeeID and “99G999D99”
    for MonthSalary.
  9. Make sure that the Thousands Separator and
    Decimal Separator fields have values
    identical to those of the tFileOutputDelimited
    component and keep the default settings for the other options.

Retrieving data from the EXASolution database table

  1. Double-click the tEXAInput component to open
    its Basic settings view.

    components-texabulkexec_s1_texainput.png

  2. Fill in the Host name, Port, Schema name, Username and Password fields with your EXASolution database connection
    details.
  3. In the Table Name field, enter the name of
    the table from which the data will be retrieved. In this example, it is
    “employee”.
  4. Click the […] button next to Edit schema to open the [Schema] dialog box and define the schema, which should be the
    same as that of the tFixedFlowInput
    component.

    Then click OK to close the dialog box and
    accept schema propagation to the next component.
  5. Click the Guess Query button to fill the
    Query field with the following
    auto-generated SQL statement that will be executed on the specified
    table.

  6. Double-click the tLogRow component to open
    its Basic settings view.

    components-texabulkexec_s1_tlogrow.png

  7. In the Mode area, select the Table (print values in cells of a table) option for
    better readability of the output.

Saving and executing the Job

  1. Press Ctrl + S to save the Job.
  2. Press F6 to execute the Job.

    components-texabulkexec_s1_result.png

    As shown above, the employee data is written into the specified EXASolution
    database table and is then retrieved and displayed on the console.

Document get from Talend https://help.talend.com
Thank you for watching.
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x