August 17, 2023

tNetezzaNzLoad – Docs for ESB 5.x

tNetezzaNzLoad


This component invokes Netezza’s nzload utility to insert records into a Netezza database.
It can be used either in standalone mode, loading from an existing data file, or
connected to an input row to load data from the connected component.

tNetezzaNzLoad properties

Component family

Databases/Netezza

 

Function

tNetezzaNzLoad inserts data into
a Netezza database table using Netezza’s nzload utility.

Purpose

To bulk load data into a Netezza table from an existing
data file, from an input flow, or directly from a data flow in streaming
mode through a named-pipe.

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

 

 

Built-in: No property data stored
centrally.

 

 

Repository: Select the repository
file in which the properties are stored. The fields that follow are
completed automatically using the data retrieved.

 

Host

Database server IP address.

 

Port

Listening port number of the DB server.

 

Database

Name of the Netezza database.

 

Username and
Password

DB user authentication data.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

 

Table

Name of the table into which the data is to be inserted.

 

Action on table

On the table defined, you can perform one of the following
operations before loading the data:

None: No operation is carried
out.

Drop and create a table: The table
is removed and created again.

Create a table: The table does not
exist and gets created.

Create table if not exists: The
table is created if it does not exist.

Drop table if exists and create:
The table is removed if it already exists and created again.

Clear table: The table content is
deleted before the data is loaded.

Truncate table: executes a truncate
statement prior to loading the data to clear the entire content of
the table.

 

Schema and Edit
Schema

A schema is a row description. It defines the number of fields to be processed and passed on
to the next component. The schema is either Built-In or
stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

 

 

Built-In: You create and store the schema locally for this
component only. Related topic: see Talend Studio
User Guide.

 

 

Repository: You have already created the schema and
stored it in the Repository. You can reuse it in various projects and Job designs. Related
topic: see Talend Studio User Guide.

   

Click Edit schema to make changes to the schema. If the
current schema is of the Repository type, three options are
available:

  • View schema: choose this option to view the
    schema only.

  • Change to built-in property: choose this option
    to change the schema to Built-in for local
    changes.

  • Update repository connection: choose this option to change
    the schema stored in the repository and decide whether to propagate the changes to
    all the Jobs upon completion. If you just want to propagate the changes to the
    current Job, you can select No upon completion and
    choose this schema metadata again in the [Repository Content] window.

 

Data file

Full path to the data file to be used. If this component is used
on its own (not connected to another component with an input flow),
this is the name of an existing data file to be loaded into the
database. If it is connected to another component with an input flow,
this is the name of the file to be generated and written
with the incoming data, which nzload then loads into the
database.

 

Use named-pipe

Select this check box to use a named-pipe instead of a data file.
This option can only be used when the component is connected with an
input flow to another component. When the check box is selected, no
data file is generated and the data is transferred to nzload through
a named-pipe. This option greatly improves performance in both Linux
and Windows.

Note

In named-pipe mode, this component uses a JNI interface to
create and write to a named-pipe on any Windows platform.
Therefore the path to the associated JNI DLL must be configured
inside the Java library path. The component comes with two DLLs,
one for 32-bit and one for 64-bit operating systems, which are automatically
provided in the Studio with the component.

 

Named-pipe name

Specify a name for the named-pipe to be used. Ensure that the name
entered is valid.
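
As a rough illustration of the named-pipe idea, the sketch below streams delimited rows through a POSIX FIFO while a second thread drains it. It is not the component's implementation: the reader thread merely stands in for nzload, the pipe name `nzload_demo_pipe` and the function name are hypothetical, and `os.mkfifo` is only available on POSIX systems (the Windows JNI path described above is not covered).

```python
import os
import tempfile
import threading


def stream_rows_through_fifo(rows, delim="\t"):
    """Write rows into a POSIX named pipe while a reader thread drains it."""
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical pipe name, chosen only for this demonstration.
        pipe_path = os.path.join(tmp, "nzload_demo_pipe")
        os.mkfifo(pipe_path)

        def reader():
            # Stand-in for the consumer (nzload in the real setup).
            with open(pipe_path, "r", encoding="utf-8") as src:
                received.extend(line.rstrip("\n") for line in src)

        consumer = threading.Thread(target=reader)
        consumer.start()
        # Opening the pipe for writing blocks until the reader has opened it.
        with open(pipe_path, "w", encoding="utf-8") as sink:
            for row in rows:
                sink.write(delim.join(row) + "\n")
        consumer.join()
    return received
```

No intermediate data file is written in this sketch; the rows only ever exist in the pipe, which is the property the check box description relies on.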


Advanced settings

Use existing control file

Select this check box to provide a control file to be used with
the nzload utility instead of specifying all the options explicitly
in the component. When this check box is selected, Data file and the other nzload related
options no longer apply. Please refer to Netezza’s nzload manual for
details on creating a control file.

 

Control file

Enter the path to the control file to be used, between double
quotation marks, or click […] and
browse to the control file. This option is passed on to the nzload
utility via the -cf argument.

 

Field separator

Character, string or regular expression used to separate
fields.

Warning

This is nzload’s delim argument. If you
do not use the Wrap quotes around fields
option, you must make sure that the
delimiter is not included in the data that is inserted into the
database. The default value is \t (TAB). To improve
performance, use the default value.

 

Wrap quotes around fields

This option is only applied to columns of
String, Byte,
Byte[], Char, and
Object types. Select either:

None: do not wrap column values in
quotation marks.

Single quote: wrap column values in
single quotation marks.

Double quote: wrap column values in
double quotation marks.

Warning

If using the Single quote or Double quote
option, it is necessary to use \ as the
Escape char.
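
To make the interaction between the field separator, quote wrapping and the escape character concrete, here is a small sketch of how one row could be serialized. It assumes a backslash as the escape character and treats only Python `str` values as "string-like" columns; the function names are hypothetical and the component's actual writer may differ.

```python
def format_field(value, quote=None, escape="\\"):
    """Render one field, wrapping string values in quotes when requested."""
    if quote is None or not isinstance(value, str):
        # Non-string columns are written as-is, mirroring the rule that
        # wrapping only applies to string-like types.
        return str(value)
    # Escape the escape character first, then any embedded quote character.
    text = value.replace(escape, escape + escape).replace(quote, escape + quote)
    return quote + text + quote


def format_row(values, delim="\t", quote=None):
    """Join the rendered fields with the field separator."""
    return delim.join(format_field(v, quote) for v in values)
```

With quote wrapping enabled, a delimiter that appears inside a string value stays inside the quotes, which is why the delimiter restriction applies only when no wrapping is used.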

 

Advanced options

Set the nzload arguments in the corresponding table. Click
[+] as many times as required
to add arguments to the table. Click the Parameter field and choose among the arguments from
the list. Then click the corresponding Value field and enter a value between quotation
marks.

Parameter

-lf

Name of the log file to generate. The logs will be appended if the
log file already exists. If the parameter is not specified, the
default name for the log file is
<table_name>.<db_name>.nzlog,
and it is generated under the current working directory where the job
is running.

 

-bf

Name of the bad file to generate. The bad file contains all the
records that could not be loaded due to an internal Netezza error.
The records will be appended if the bad file already exists. If the
parameter is not specified, the default name for the bad file is
<table_name>.<db_name>.nzbad,
and it is generated under the current working directory where the job
is running.

 

-outputDir

Directory path where the log and the bad files are generated. If
the parameter is not specified, the files are generated under the
current directory where the job is running.
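
The default naming rules described above can be summarized in a few lines. This is only a sketch of the documented defaults (`<table_name>.<db_name>.nzlog` and `<table_name>.<db_name>.nzbad`, placed in the working directory unless an output directory is given); the function and parameter names are hypothetical.

```python
import os


def default_load_files(table, database, output_dir=None):
    """Return the default (log file, bad file) paths for a load."""
    # Fall back to the current working directory when no directory is given.
    base_dir = output_dir if output_dir is not None else os.getcwd()
    stem = f"{table}.{database}"
    return (
        os.path.join(base_dir, stem + ".nzlog"),
        os.path.join(base_dir, stem + ".nzbad"),
    )
```

Knowing these paths in advance is handy when a later step of the Job needs to inspect the bad file after the load.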

 

-logFileSize

Maximum size for the log file. The value is in MB. The default
value is 2000 or 2GB. To save hard disk space, specify a smaller
amount if your job runs often.

 

-compress

Specify this option if the data file is compressed. Valid values
are “TRUE” or “FALSE”. The default value is “FALSE”.

Note

This option is only valid if this component is used by itself
and not connected to another component via an input flow.

 

-skipRows <n>

Number of rows to skip from the beginning of the data file. Set
the value to “1” to skip the header row of the data
file. The default value is “0”.

Note

This option should only be used if this component is used by
itself and not connected to another component via an input
flow.

 

-maxRows <n>

Maximum number of rows to load from the data file.

Note

This option should only be used if this component is used by
itself and not connected to another component via an input
flow.
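
The combined effect of -skipRows and -maxRows on an existing data file can be pictured as a slice over its lines. The sketch below assumes that the maximum is counted after the skipped rows, which the description above does not state explicitly; the function name is hypothetical.

```python
from itertools import islice


def select_rows(lines, skip_rows=0, max_rows=None):
    """Return the lines that remain after skipping and capping."""
    # Assumption: max_rows counts rows after the skipped ones.
    stop = None if max_rows is None else skip_rows + max_rows
    return list(islice(lines, skip_rows, stop))
```

For example, `select_rows(lines, skip_rows=1)` drops a header row and keeps everything else.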

 

-maxErrors

Maximum number of error records to allow before terminating the
load process. The default value is “1”.

 

-ignoreZero

Binary zero bytes in the input data will generate errors. Set this
option to “NO” to generate error or to “YES” to ignore zero bytes.
The default value is “NO”.

 

-requireQuotes

This option requires all the values to be wrapped in quotes. The
default value is “FALSE”.

Note

This option currently does not work with input flow. Use this
option only in standalone mode with an existing file.

 

-nullValue <token>

Specify the token to indicate a null value in the data file. The
default value is “NULL”. To slightly improve performance, you can set
this value to an empty field by specifying the value as a pair of single
quotes: "''".

 

-fillRecord

Treat missing trailing input fields as null. You do not need to
specify a value for this option in the value field of the table.
This option is not turned on by default, therefore input fields must
match exactly all the columns of the table by default.

Note

Trailing input fields must be nullable in the database.
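
The following sketch illustrates the difference between the default strict matching and the -fillRecord behaviour on a single parsed record. It models a database null as Python `None`; the function name and the `fill` flag are hypothetical and only mirror the description above.

```python
def fill_record(fields, column_count, fill=True):
    """Pad missing trailing fields with None (null) when filling is enabled."""
    if len(fields) > column_count:
        raise ValueError("more input fields than table columns")
    if len(fields) < column_count and not fill:
        # Default behaviour: input fields must match the columns exactly.
        raise ValueError("missing trailing fields and filling is disabled")
    return list(fields) + [None] * (column_count - len(fields))
```

Because the padded positions become nulls, the corresponding trailing columns have to be nullable, as the note above points out.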

 

-ctrlChar

Accept control chars in char/varchar fields (must escape NUL, CR
and LF). You do not need to specify a value for this option in the
value field of the table. This option is turned off by
default.

 

-crInString

Accept un-escaped CR in char/varchar fields (LF becomes only end
of row). You do not need to specify a value for this option in the
value field of the table. This option is turned off by
default.

 

-truncString

Truncate any string value that exceeds its declared char/varchar
storage. You do not need to specify a value for this option in the
value field of the table. This option is turned off by
default.

 

-dateStyle

Specify the date format in which the input data is written.
Valid values are: “YMD”, “Y2MD”, “DMY”, “DMY2”, “MDY”, “MDY2”,
“MONDY”, “MONDY2”. The default value is “YMD”.

Note

The date format of the column in the component’s schema must
match the value specified here. For example, to load
a DATE column, specify the date format in the component schema
as “yyyy-MM-dd” and the -dateStyle option as “YMD”.

For more information on loading date and time fields, see Loading DATE, TIME and TIMESTAMP columns.

 

-dateDelim

Delimiter character between date parts. The default value is “-”
for all date styles except “MONDY[2]”, for which it is “ ” (a single
space).

Note

The date format of the column in the component’s schema must
match the value specified here.

 

-y2Base

First year expressible using two digit year (Y2) dateStyle.
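
One common reading of a two-digit-year base is a 100-year window that starts at the base year. The sketch below implements that reading only as an illustration; it is an interpretation of the one-line description above, not a statement of how nzload resolves two-digit years, and the function name is hypothetical.

```python
def expand_two_digit_year(two_digit_year, y2_base):
    """Map a two-digit year into the 100-year window starting at y2_base."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a year between 0 and 99")
    century_start = y2_base - (y2_base % 100)
    year = century_start + two_digit_year
    # Years that would fall before the base roll over into the next century.
    if year < y2_base:
        year += 100
    return year
```

Under this reading, a base of 1950 would place "50" at 1950 and "49" at 2049.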

 

-timeStyle

Specify the time format in which the input data is written.
Valid values are: “24HOUR” and “12HOUR”. The default value is
“24HOUR”. For slightly better performance, keep the
default value.

Note

The time format of the column in the component’s schema must
match the value specified here. For example, to load
a TIME column, specify the time format in the component schema
as “HH:mm:ss” and the -timeStyle option as “24HOUR”.

For more information on loading date and time fields, see Loading DATE, TIME and TIMESTAMP columns.

 

-timeDelim

Delimiter character between time parts. The default value is “:”.

Note

The time format of the column in the component’s schema must
match the value specified here.

 

-timeRoundNanos

Allow, but round, non-zero digits beyond microsecond
resolution.

 

-boolStyle

Specify the format in which Boolean data is written in the data file.
The valid values are: “1_0”, “T_F”, “Y_N”, “TRUE_FALSE”, “YES_NO”. The
default value is “1_0”. For slightly better performance, keep the
default value.

 

-allowRelay

Allow the load to continue after one or more SPUs are reset or failed over.
By default, this is not allowed.

 

-allowReplay <n>

Specify the number of allowable continuations of a load. The default value
is “1”.
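
Conceptually, the Parameter/Value rows of the Advanced options table turn into a flat list of command-line arguments, where options such as -fillRecord or -truncString carry no value. The sketch below shows that mapping under the assumption that each row is a `(parameter, value)` pair with `None` for value-less options; the function name is hypothetical, and how the component itself assembles the nzload call is not described here.

```python
def build_option_args(options):
    """Flatten (parameter, value) pairs into a command-line argument list."""
    args = []
    for name, value in options:
        if not name.startswith("-"):
            raise ValueError(f"not an option name: {name!r}")
        args.append(name)
        # Switch-style options are passed without a value.
        if value is not None:
            args.append(str(value))
    return args
```

A call such as `build_option_args([("-maxErrors", 10), ("-fillRecord", None)])` yields a list that could be appended to the rest of a command line.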

 

Encoding

Select the encoding type from the list.

 

Specify nzload path

Select this check box to specify the full path to the nzload
executable. You must check this option if the nzload path is not
specified in the PATH environment variable.

 

Full path to nzload executable

Full path to the nzload executable on the machine in use. It is
advisable to specify the nzload path in the PATH environment
variable instead of selecting this option.
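
Before running a Job, it can be useful to check whether nzload is reachable through PATH or whether an explicit path is needed. A minimal sketch, assuming the executable is simply named `nzload`; the function name and the example path are hypothetical.

```python
import shutil


def resolve_nzload(explicit_path=None):
    """Return the explicit path if given, otherwise look nzload up on PATH."""
    if explicit_path:
        return explicit_path
    # shutil.which returns None when the executable is not found on PATH.
    return shutil.which("nzload")
```

If `resolve_nzload()` returns `None`, the executable is not on PATH and the check box above would have to be selected with a full path such as a hypothetical `/opt/nz/bin/nzload`.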

 

tStatCatcher Statistics

Select this check box to collect log data at the component
level.

 

Enable parallel execution

Select this check box to perform high-speed data processing by handling multiple data flows
simultaneously. Note that this feature depends on the ability of the database or the application to
handle multiple inserts in parallel, as well as on the number of CPUs assigned. In the Number of parallel executions field, either:

  • Enter the number of parallel executions desired.

  • Press Ctrl + Space and select the appropriate
    context variable from the list. For further information, see Talend Studio
    User Guide.


Warning

  • The Action on table
    field is not available with the parallelization function. Therefore, you
    must use a tCreateTable component if you
    want to create a table.

  • When parallel execution is enabled, it is not possible to use global
    variables to retrieve return values in a subjob.

Global Variables

NB_LINE: the number of rows processed. This is an After
variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space
to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio
User Guide.

Usage

This component is mainly used when no particular transformation
is required on the data to be loaded into the database.

This component can be used as a standalone or an output
component.

Log4j

The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User
Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not
provided. You can install the missing JARs for this particular component by clicking the
Install button on the Component tab view. You can also find and add all missing JARs on
the Modules tab in the Integration perspective
of your studio. For details, see https://help.talend.com/display/KB/How+to+install+external+modules+in+the+Talend+products
or the section describing how to configure the Studio in the Talend Installation and Upgrade
Guide.

Loading DATE, TIME and TIMESTAMP columns

When this component is used with an input flow, the date format specified inside
the component’s schema must match the values specified for the -dateStyle, -dateDelim,
-timeStyle, and -timeDelim options. Refer to the following examples:

DB Type    Schema date format      -dateStyle  -dateDelim  -timeStyle  -timeDelim

DATE       “yyyy-MM-dd”            “YMD”       “-”         n/a         n/a

TIME       “HH:mm:ss”              n/a         n/a         “24HOUR”    “:”

TIMESTAMP  “yyyy-MM-dd HH:mm:ss”   “YMD”       “-”         “24HOUR”    “:”
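
The three example rows above can be expressed as a small helper that derives the schema pattern expected for a column type from the delimiter options. The sketch deliberately covers only the “YMD” date style and the “24HOUR” time style, since those are the only combinations shown in the examples; the function name is hypothetical.

```python
def expected_schema_pattern(db_type, date_delim="-", time_delim=":"):
    """Return the schema date pattern matching YMD / 24HOUR options."""
    date_part = date_delim.join(["yyyy", "MM", "dd"])  # -dateStyle "YMD"
    time_part = time_delim.join(["HH", "mm", "ss"])    # -timeStyle "24HOUR"
    patterns = {
        "DATE": date_part,
        "TIME": time_part,
        "TIMESTAMP": date_part + " " + time_part,
    }
    if db_type not in patterns:
        raise ValueError(f"unsupported column type: {db_type!r}")
    return patterns[db_type]
```

Comparing the result with the pattern actually set in the component schema is a quick way to catch a mismatch before the load runs.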

Related scenario

For a related use case, see Scenario: Inserting data in MySQL database.


Document get from Talend https://help.talend.com