tNetezzaNzLoad

Inserts data into a Netezza database table using Netezza’s nzload
utility.

tNetezzaNzLoad bulk loads data into a Netezza table either from an existing data file, from an input flow written to an intermediate data file, or directly from the input flow in streaming mode through a named pipe.

tNetezzaNzLoad Standard properties

These properties are used to configure tNetezzaNzLoad running in the Standard Job framework.

The Standard tNetezzaNzLoad component belongs to the Databases family.

The component in this framework is available in all Talend products.

Basic settings

Property type

Either Built-in or Repository.

 

Built-in: No property data stored
centrally.

 

Repository: Select the repository
file in which the properties are stored. The fields that follow are
completed automatically using the data retrieved.

Host

Database server IP address.

Port

Listening port number of the DB server.

Database

Name of the Netezza database.

Username and
Password

DB user authentication data.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Table

Name of the table into which the data is to be inserted.

Action on table

On the table defined, you can perform one of the following
operations before loading the data:

None: No operation is carried
out.

Drop and create a table: The table
is removed and created again.

Create a table: The table does not exist and is created.

Create table if not exists: The
table is created if it does not exist.

Drop table if exists and create:
The table is removed if it already exists and created again.

Clear table: The table content is
deleted before the data is loaded.

Truncate table: A truncate statement is executed prior to loading the data to clear the entire content of the table.

Schema and Edit
Schema

A schema is a row description. It defines the number of fields
(columns) to be processed and passed on to the next component. When you create a Spark
Job, avoid the reserved word line when naming the
fields.

 

Built-In: You create and store the schema locally for this component
only.

 

Repository: You have already created the schema and stored it in the
Repository. You can reuse it in various projects and Job designs.

 

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

Data file

Full path to the data file to be used. If this component is used on its own (not connected to another component by an input flow), this is the name of an existing data file to be loaded into the database. If it is connected to another component by an input flow, this is the name of the file to be generated and filled with the incoming data, which nzload then loads into the database.
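
As a rough illustration of this file-based mode, the following minimal Java sketch writes a few incoming rows to a tab-delimited data file and then calls the nzload executable on it. The host, user, database, table and file path values are placeholders, and the flag names used here (-host, -u, -pw, -db, -t, -df, -delim) are the usual nzload options rather than settings quoted from this page; this is not the code the component actually generates.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.List;

    public class NzLoadFileSketch {
        public static void main(String[] args) throws IOException, InterruptedException {
            // 1) Write the incoming rows to the delimited data file
            //    (what the component does when it receives an input flow).
            Path dataFile = Paths.get("/tmp/customers.dat");  // hypothetical path
            List<String> rows = List.of("1\tAlice\t2023-07-30", "2\tBob\t2023-07-29");
            Files.write(dataFile, rows, StandardCharsets.UTF_8);

            // 2) Call nzload on the file that was just generated.
            Process p = new ProcessBuilder(
                    "nzload",
                    "-host", "nzserver", "-u", "admin", "-pw", "secret",  // placeholder connection settings
                    "-db", "SALES", "-t", "CUSTOMERS",                    // placeholder database and table
                    "-df", dataFile.toString(),                           // data file to load
                    "-delim", "\t")                                       // field separator
                    .inheritIO()
                    .start();
            System.exit(p.waitFor());  // a non-zero exit code signals a load failure
        }
    }

When the component is used standalone, only step 2 applies: the data file already exists and nzload is simply pointed at it.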

Use named-pipe

Select this check box to use a named pipe instead of a data file. This option can only be used when the component is connected to another component by an input flow. When the check box is selected, no data file is generated and the data is transferred to nzload through a named pipe. This option greatly improves performance on both Linux and Windows.

Note:

In named-pipe mode, this component uses a JNI interface to create and write to a named pipe on Windows platforms. The path to the associated JNI DLL must therefore be configured in the Java library path. The component ships with two DLLs, for 32-bit and 64-bit operating systems, which are automatically provided in the Studio with the component.

Named-pipe name

Specify a name for the named-pipe to be used. Ensure that the name
entered is valid.
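
To make the streaming mode more concrete, here is a minimal Java sketch of the same idea on Linux (where no DLL is involved): a FIFO is created, nzload is started with the FIFO as its data file, and the rows are written straight into the pipe while the load is running. Pipe name, connection values and flags are placeholders and assumptions, as in the previous sketch; this is not the component's generated code.

    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class NzLoadPipeSketch {
        public static void main(String[] args) throws IOException, InterruptedException {
            Path pipe = Paths.get("/tmp/nzload_pipe");  // hypothetical pipe name

            // 1) Create the named pipe (FIFO) on Linux.
            new ProcessBuilder("mkfifo", pipe.toString()).inheritIO().start().waitFor();

            // 2) Start nzload reading from the pipe; it blocks until data arrives.
            Process nzload = new ProcessBuilder(
                    "nzload", "-host", "nzserver", "-u", "admin", "-pw", "secret",
                    "-db", "SALES", "-t", "CUSTOMERS",
                    "-df", pipe.toString(), "-delim", "\t")
                    .inheritIO().start();

            // 3) Stream the rows directly into the pipe -- no intermediate file is written to disk.
            try (BufferedWriter out =
                    Files.newBufferedWriter(pipe, StandardCharsets.UTF_8, StandardOpenOption.WRITE)) {
                out.write("1\tAlice\t2023-07-30");
                out.newLine();
                out.write("2\tBob\t2023-07-29");
                out.newLine();
            }  // closing the writer ends the stream and lets nzload finish

            System.exit(nzload.waitFor());
        }
    }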

Advanced settings

Additional JDBC Parameters

Specify additional JDBC parameters for the
database connection created.

Use existing control file

Select this check box to provide a control file to be
used with the nzload utility instead of specifying all the options
explicitly in the component. When this check box is selected, Data file and the other nzload related
options no longer apply. Please refer to Netezza’s nzload manual for
details on creating a control file.

Note:

The NB_LINE global variable is not supported when a control file is used.

Control file

Enter the path to the control file to be used, between double
quotation marks, or click […] and
browse to the control file. This option is passed on to the nzload
utility via the -cf argument.

Field separator

Character, string or regular expression used to separate
fields.

Warning:

This is nzload's -delim argument. If you do not use the Wrap quotes around fields option, you must make sure that the delimiter character does not appear in the data being inserted into the database. The default value is "\t" (TAB). To improve performance, use the default value.

Wrap quotes around fields

This option only applies to columns of String, Byte, Byte[], Char, and Object types. Select one of the following:

None: do not wrap column values in
quotation marks.

Single quote: wrap column values in
single quotation marks.

Double quote: wrap column values in
double quotation marks.

Warning:

If you use the Single quote or Double quote option, you must use "\" as the Escape char.
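
As a rough sketch of how quoting and escaping interact (an assumption about the formatting rule, not the component's generated code), the hypothetical helper below wraps a value in double quotation marks and backslash-escapes any embedded quotation marks or backslashes, so that a field separator inside the value no longer needs to be avoided:

    public class QuoteFieldSketch {
        /** Wrap a value in double quotes, backslash-escaping embedded quotes and backslashes. */
        static String quote(String value) {
            String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }

        public static void main(String[] args) {
            // A value containing both the field separator (a comma here) and a quotation mark
            // stays unambiguous once it is quoted and escaped.
            System.out.println(quote("He said \"hi\", then left"));
            // => "He said \"hi\", then left"
        }
    }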

Advanced options

Set the nzload arguments in the corresponding table. Click
[+] as many times as required
to add arguments to the table. Click the Parameter field and choose among the arguments from
the list. Then click the corresponding Value field and enter a value between quotation
marks.

For details about the available parameters, see Parameters.
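
Conceptually, each Parameter/Value pair entered in this table simply contributes one more flag to the nzload command line. Below is a minimal sketch of that mapping; the parameter names come from the Parameters table further down, and everything else (database, table, values) is illustrative.

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public class AdvancedOptionsSketch {
        public static void main(String[] args) {
            // Parameter/Value pairs as they might appear in the Advanced options table.
            Map<String, String> advancedOptions = new LinkedHashMap<>();
            advancedOptions.put("-maxErrors", "10");
            advancedOptions.put("-skipRows", "1");

            // Each pair becomes one more "-flag value" on the nzload command line.
            List<String> command = new ArrayList<>(List.of("nzload", "-db", "SALES", "-t", "CUSTOMERS"));
            advancedOptions.forEach((flag, value) -> { command.add(flag); command.add(value); });

            System.out.println(String.join(" ", command));
            // => nzload -db SALES -t CUSTOMERS -maxErrors 10 -skipRows 1
        }
    }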

Encoding

Select the encoding type from the list.

Specify nzload path

Select this check box to specify the full path to the nzload executable. You must select this option if the nzload path is not specified in the PATH environment variable.

Full path to nzload executable

Full path to the nzload executable on the machine in use. It is
advisable to specify the nzload path in the PATH environment
variable instead of selecting this option.

tStatCatcher Statistics

Select this check box to collect log data at the component
level.

Enable parallel execution

Select this check box to perform high-speed data processing by treating multiple data flows simultaneously. Note that this feature depends on the ability of the database or the application to handle multiple inserts in parallel, as well as on the number of CPUs allocated. In the Number of parallel executions field, either:

  • Enter the number of parallel executions desired.
  • Press Ctrl + Space and select the appropriate context variable from the list. For further information, see the Talend Studio User Guide.

Warning:

  • The Action on table field is not available with the parallelization function. Therefore, you must use a tCreateTable component if you want to create a table.
  • When parallel execution is enabled, it is not possible to use global variables to retrieve return values in a subjob.

Global Variables

NB_LINE: the number of rows processed. This is an After
variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.
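
In generated Job code, these return values are read from the Job-level globalMap, keyed by the component label (tNetezzaNzLoad_1 below is just an example label). The self-contained sketch below simulates the globalMap so that the lookup pattern can be run on its own; inside a real Job, only the two get(...) lines would appear, for instance in a tJava placed after this component.

    import java.util.HashMap;
    import java.util.Map;

    public class GlobalVariablesSketch {
        public static void main(String[] args) {
            // Simulated globalMap; in a Talend Job this map is provided by the generated code.
            Map<String, Object> globalMap = new HashMap<>();
            globalMap.put("tNetezzaNzLoad_1_NB_LINE", 1250);  // example value

            // Lookup pattern used after the component has run:
            Integer nbLine = (Integer) globalMap.get("tNetezzaNzLoad_1_NB_LINE");
            String errorMessage = (String) globalMap.get("tNetezzaNzLoad_1_ERROR_MESSAGE");

            System.out.println("Rows loaded: " + nbLine + ", last error: " + errorMessage);
        }
    }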

For further information about variables, see the Talend Studio User Guide.

Usage

Usage rule

This component is mainly used when no particular transformation is required on the data to be loaded into the database.

This component can be used as a standalone component or as an output component.

Parameters

The following table lists the parameters you can use in the Advanced options table in the Advanced settings tab.

-lf

Name of the log file to generate. The logs are appended if the log file already exists. If the parameter is not specified, the default name for the log file is <table_name>.<db_name>.nzlog and it is generated under the current working directory where the Job is running.

-bf

Name of the bad file to generate. The bad file contains all the records that could not be loaded due to an internal Netezza error. The records are appended if the bad file already exists. If the parameter is not specified, the default name for the bad file is <table_name>.<db_name>.nzbad and it is generated under the current working directory where the Job is running.

-outputDir

Directory path where the log file and the bad file are generated. If the parameter is not specified, the files are generated under the current working directory where the Job is running.

-logFileSize

Maximum size of the log file, in MB. The default value is 2000 (2 GB). To save hard disk space, specify a smaller value if your Job runs often.

-compress

Specify this option if the data file is compressed. Valid values are "TRUE" and "FALSE". The default value is "FALSE".

This option is only valid if this component is used by itself
and not connected to another component via an input flow.

-skipRows <n>

Number of rows to skip from the beginning of the data file. Set the value to "1" if you want to skip the header row of the data file. The default value is "0".

This option should only be used if this component is used by
itself and not connected to another component via an input
flow.

-maxRows <n>

Maximum number of rows to load from the data file.

This option should only be used if this component is used by
itself and not connected to another component via an input
flow.

-maxErrors

Maximum number of error records to allow before terminating the
load process. The default value is “1”.

-ignoreZero

Binary zero bytes in the input data generate errors. Set this option to "NO" to generate an error or to "YES" to ignore zero bytes. The default value is "NO".

-requireQuotes

This option requires all values to be wrapped in quotes. The default value is "FALSE".

This option currently does not work with an input flow. Use it only in standalone mode with an existing file.

-nullValue <token>

Specify the token that indicates a null value in the data file. The default value is "NULL". To slightly improve performance, you can set this to an empty value by specifying two single quotation marks ('').

-fillRecord

Treat missing trailing input fields as null. You do not need to specify a value for this option in the value field of the table. This option is turned off by default; therefore, by default the input fields must exactly match all the columns of the table.

Trailing input fields must be nullable in the database.

-ctrlChar

Accept control chars in char/varchar fields (must escape NUL, CR
and LF). You do not need to specify a value for this option in the
value field of the table. This option is turned off by
default.

-crInString

Accept unescaped CR in char/varchar fields (LF then becomes the only end-of-row marker). You do not need to specify a value for this option in the value field of the table. This option is turned off by default.

-truncString

Truncate any string value that exceeds its declared char/varchar
storage. You do not need to specify a value for this option in the
value field of the table. This option is turned off by
default.

-dateStyle

Specify the date format in which the input data is written. Valid values are: "YMD", "Y2MD", "DMY", "DMY2", "MDY", "MDY2", "MONDY", "MONDY2". The default value is "YMD".

The date format of the column in the component’s schema must
match the value specified here. For example if you want to load
a DATE column, specify the date format in the component schema
as “yyyy-MM-dd” and the -dateStyle option as “YMD”.

For more description on loading date and time fields, see Loading DATE, TIME and TIMESTAMP columns.

-dateDelim

Delimiter character between date parts. The default value is "-" for all date styles except "MONDY[2]", for which it is " " (a space).

The date format of the column in the component’s schema must
match the value specified here.

-y2Base

First year expressible using a two-digit year (Y2) dateStyle.

-timeStyle

Specify the time format in which the input data is written. Valid values are: "24HOUR" and "12HOUR". The default value is "24HOUR". For slightly better performance, keep the default value.

The time format of the column in the component’s schema must
match the value specified here. For example if you want to load
a TIME column, specify the date format in the component schema
as “HH:mm:ss” and the -timeStyle option as “24HOUR”.

For more description on loading date and time fields, see Loading DATE, TIME and TIMESTAMP columns.

-timeDelim

Delimiter character between time parts. The default value is “:”.

Note:

The time format of the column in the component’s schema must
match the value specified here.

-timeRoundNanos

Allow, but round, non-zero digits with smaller than microsecond resolution.

-boolStyle

Specify the format in which Boolean data is written in the data file. The valid values are: "1_0", "T_F", "Y_N", "TRUE_FALSE", "YES_NO". The default value is "1_0". For slightly better performance, keep the default value.

-allowRelay

Allow the load to continue after one or more SPUs reset or fail over. By default, this is not allowed.

-allowRelay <n>

Specify the number of allowed continuations of a load. The default value is "1".

Loading DATE, TIME and TIMESTAMP columns

When this component is used with an input flow, the date format specified inside the component's schema must match the values specified for the -dateStyle, -dateDelim, -timeStyle, and -timeDelim options, as summarized in the table below.

DB Type     Schema date format       -dateStyle   -dateDelim   -timeStyle   -timeDelim
DATE        "yyyy-MM-dd"             "YMD"        "-"          n/a          n/a
TIME        "HH:mm:ss"               n/a          n/a          "24HOUR"     ":"
TIMESTAMP   "yyyy-MM-dd HH:mm:ss"    "YMD"        "-"          "24HOUR"     ":"
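
The pairing in the table can be checked with a small sketch: formatting a value with the schema pattern from the TIMESTAMP row produces exactly the text that nzload expects when it is called with -dateStyle "YMD", -dateDelim "-", -timeStyle "24HOUR" and -timeDelim ":". The patterns come from the table above; the rest of the sketch is illustrative.

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class DateStyleSketch {
        public static void main(String[] args) {
            // Schema pattern from the TIMESTAMP row of the table above.
            SimpleDateFormat timestampFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

            // A value written with this pattern, e.g. "2023-07-30 14:05:09", is what nzload
            // parses when called with:
            //   -dateStyle "YMD" -dateDelim "-" -timeStyle "24HOUR" -timeDelim ":"
            System.out.println(timestampFormat.format(new Date()));
        }
    }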

Related scenario

For a related use case, see Inserting data in bulk in MySQL database.

