July 30, 2023

tVerticaOutput – Docs for ESB 7.x

tVerticaOutput

Inserts, updates, deletes, or copies data from an incoming flow into a Vertica
database table.

tVerticaOutput Standard properties

These properties are used to configure tVerticaOutput running
in the Standard Job framework.

The Standard tVerticaOutput component belongs to the Databases family.

The component in this framework is available in all Talend products.

Note: This component is a specific version of a dynamic database
connector. The properties related to database settings vary depending on your database
type selection. For more information about dynamic database connectors, see Dynamic database components.

Basic settings

Database

Select a type of database from the list and click
Apply.

Property Type

Select the way the connection details
will be set.

  • Built-In: The connection details will be set
    locally for this component. You need to specify the values for all
    related connection properties manually.

  • Repository: The connection details stored
    centrally in Repository > Metadata will be reused by this component. You need to click
    the […] button next to it and in the pop-up
    Repository Content dialog box, select the
    connection details to be reused, and all related connection
    properties will be automatically filled in.

[Icon: tVerticaOutput_1.png]

Click the icon to open a database connection wizard and store the database connection
parameters you set in the component.

For more information about setting up and storing database connection parameters, see
Talend Studio User Guide.

DB Version

Select the version of the database.

Use an existing connection

Select this check box and in the Component List click the relevant connection component to
reuse the connection details you already defined.

When a Job contains a parent Job and a child Job and you need to
share an existing connection between the two levels, for example, to share the
connection created by the parent Job with the child Job, you have to:

  1. In the parent level, register the database connection to be shared
    in the Basic settings view of the connection
    component which creates that very database connection.

  2. In the child level, use a dedicated connection component to read
    that registered database connection.

For an example of how to share a database connection across Job
levels, see Talend Studio User Guide.

Host

The IP address or hostname of the database.

Port

The listening port number of the database.

Database

The name of the database.

Schema

The schema of the database.

Username and Password

The database user authentication data.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Table

The name of the table into
which data will be written.

Action on table

Select an operation to be performed on the table defined.

  • Default: No operation is carried out.

  • Drop and create table: The table is removed
    and created again.

  • Create table: The table does not exist and
    gets created.

  • Create table if does not exist: The table is
    created if it does not exist.

  • Drop table if exist and create: The table is
    removed if it already exists and created again.

  • Clear table: The table content is
    deleted. You can roll back the operation.

This property is not
available when the Enable parallel execution check
box on the Advanced settings view is selected.

Use “drop cascade”

Select this check box to remove all objects related to
the table which will be dropped.

This property is available only when a table drop related
option is selected from the Action on table list.
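
As a rough illustration of what each Action on table option implies, the following Python sketch maps the option labels above to plain SQL. The function name table_action_sql, the create_ddl argument, and the exact statements are assumptions made for this example; the SQL that the component actually generates is not shown in this document and may differ.

```python
def table_action_sql(action, table, create_ddl, drop_cascade=False):
    """Return the SQL statements an 'Action on table' option roughly implies.

    Hypothetical helper for illustration only; not the component's own code.
    """
    # 'Use "drop cascade"' only affects the options that drop the table.
    cascade = " CASCADE" if drop_cascade else ""
    actions = {
        "Default": [],
        "Drop and create table": [f"DROP TABLE {table}{cascade}", create_ddl],
        "Create table": [create_ddl],
        "Create table if does not exist": [
            create_ddl.replace("CREATE TABLE", "CREATE TABLE IF NOT EXISTS", 1)
        ],
        "Drop table if exist and create": [
            f"DROP TABLE IF EXISTS {table}{cascade}",
            create_ddl,
        ],
        # DELETE (rather than TRUNCATE) is assumed because the operation
        # is described as one that can be rolled back.
        "Clear table": [f"DELETE FROM {table}"],
    }
    return actions[action]
```

For example, calling table_action_sql("Drop table if exist and create", "public.customers", ddl, drop_cascade=True) yields a DROP TABLE IF EXISTS ... CASCADE statement followed by the CREATE TABLE statement passed in as ddl.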

Action on data

Select an action to be performed on data of the table defined.

  • Insert: Add new entries to the table. If
    duplicates are found, the Job stops.

  • Update: Make changes to existing
    entries.

  • Insert or update: Insert a new record. If
    a record with the given reference already exists, it is updated instead.

  • Update or insert: Update the record with the
    given reference. If the record does not exist, a new record is inserted.

  • Delete: Remove entries corresponding to
    the input flow.

  • Copy: Read
    data from a text file and insert tuples of entries into the WOS
    (Write Optimized Store) or directly into the ROS (Read Optimized
    Store). This option is ideal for bulk loading. For further
    information, see Vertica SQL Reference Manual.

At least one column must be set as a primary key, on which the Update and
Delete operations are based. To do so, click Edit Schema and select the
check box next to each column you want to set as a primary key. For
advanced use, open the Advanced settings view, where you can simultaneously
define primary keys for the Update and Delete operations: select the
Use field options check box, then in the Update Key column, select the
check boxes next to the column names you want to use as a base for the
Update operation. Do the same in the Deletion key column for the
Delete operation.
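
The difference between Insert or update and Update or insert is easiest to see in code. The sketch below is one plausible reading of the two descriptions above, not the component's actual implementation: the first looks the key up before deciding, the second attempts the update and falls back to an insert. It uses an in-memory SQLite database as a stand-in for Vertica, and the customers table, its columns, and all function names are hypothetical.

```python
import sqlite3


def insert_or_update(conn, key, name):
    """Insert or update: look the key up first, then insert or update."""
    found = conn.execute(
        "SELECT COUNT(*) FROM customers WHERE id = ?", (key,)
    ).fetchone()[0]
    if found == 0:
        conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (key, name))
        return "inserted"
    conn.execute("UPDATE customers SET name = ? WHERE id = ?", (name, key))
    return "updated"


def update_or_insert(conn, key, name):
    """Update or insert: try the update first, insert if no row was changed."""
    cursor = conn.execute("UPDATE customers SET name = ? WHERE id = ?", (name, key))
    if cursor.rowcount == 0:
        conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (key, name))
        return "inserted"
    return "updated"


def demo():
    # In-memory SQLite stands in for Vertica so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    results = [
        insert_or_update(conn, 1, "Ada"),
        insert_or_update(conn, 1, "Ada L."),
        update_or_insert(conn, 2, "Grace"),
        update_or_insert(conn, 2, "Grace H."),
    ]
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    conn.close()
    return results, rows
```

Under this reading, both actions produce the same final table. The choice mainly affects how many statements are issued per row, so the action whose first step succeeds most often for your data is likely to be the cheaper one.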

Schema and Edit schema

A schema is a row description. It defines the number of fields
(columns) to be processed and passed on to the next component. When you create a Spark
Job, avoid the reserved word line when naming the
fields.

  • Built-In: You create and store the schema locally for this component
    only.

  • Repository: You have already created the schema and stored it in the
    Repository. You can reuse it in various projects and Job designs.

When the schema to be reused has default values that are
integers or functions, ensure that these default values are not enclosed within
quotation marks. If they are, you must remove the quotation marks manually.

You can find more details about how to
verify default values in retrieved schema in Talend Help Center (https://help.talend.com).

Click Edit schema to make changes to the schema.

Note: If you make changes, the schema automatically becomes built-in.

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

This component offers the advantage of the dynamic schema feature. This allows you to
retrieve unknown columns from source files or to copy batches of columns from a source
without mapping each column individually. For further information about dynamic schemas,
see Talend Studio User Guide.

This dynamic schema feature is designed for the purpose of retrieving unknown columns of a
table and is recommended to be used for this purpose only; it is not recommended for
creating tables.

Die on error

Select the check box to stop the execution of the Job when an error
occurs.

Clear the check box to skip any rows on error and complete the
process for error-free rows.

When errors are skipped, you can collect the rows on error
using a Row > Reject connection.

Advanced settings

Use alternate schema

Select this option to use a schema other than
the one specified by the component that establishes the database connection (that is,
the component selected from the Component list
drop-down list in Basic settings view). After
selecting this option, provide the name of the desired schema in the Schema field.

This option is available when Use an existing connection
is selected in Basic settings view.

Additional JDBC Parameters

Specify additional JDBC parameters for the
database connection created.

This property is not available when the Use an existing connection
check box in the Basic settings view is selected.
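
To show how the Host, Port, Database, and Additional JDBC Parameters values fit together, here is a minimal Python sketch that assembles a Vertica JDBC URL of the form jdbc:vertica://host:port/database?property=value. The helper name vertica_jdbc_url is hypothetical, and the assumption that the component appends the additional parameters as a query string in exactly this way is ours. The property names in the usage note are examples only; check the Vertica JDBC driver documentation for the properties your driver version supports.

```python
from urllib.parse import urlencode


def vertica_jdbc_url(host, port, database, extra_params=None):
    """Build a Vertica JDBC URL (hypothetical helper, for illustration)."""
    base = f"jdbc:vertica://{host}:{port}/{database}"
    if not extra_params:
        return base
    # Additional parameters are appended as property=value pairs joined by '&'.
    return f"{base}?{urlencode(extra_params)}"
```

For instance, vertica_jdbc_url("db.example.com", 5433, "analytics", {"ConnectionLoadBalance": "1", "LoginTimeout": "30"}) produces a URL ending in /analytics?ConnectionLoadBalance=1&LoginTimeout=30.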

Abort on error

Select this check box to stop the Copy operation
if any row is rejected and roll back the operation. Thus no data is
loaded.

This property is available only when COPY is selected from the
Action on data drop-down list.

Maximum rejects

Type in a number to set the REJECTMAX option used by Vertica, which
indicates the upper limit on the number of logical records to be
rejected before a load fails. If not specified or if the value is 0, an
unlimited number of rejections is allowed.

This property is available only when COPY is selected from the
Action on data drop-down list.

No commit

Select this check box to prevent the current transaction from committing
automatically.

This property is available only when COPY is selected from the
Action on data drop-down list.

Exception file

Type in the path to, or browse to the file in which messages are written
indicating the input line number and the reason for each rejected data
record.

This property is available only when COPY is selected from the
Action on data drop-down list.

Exception file node

Type in the node of the exception file. If not specified, operations
default to the query’s initiator node.

This property is available only when COPY is selected from the
Action on data drop-down list.

Rejected data file

Type in the path to, or browse to the file in which to write rejected
rows. This file can then be edited to resolve problems and reloaded.

This property is available only when COPY is selected from the
Action on data drop-down list.

Rejected data file node

Type in the node of the rejected data file. If not specified, operations
default to the query’s initiator node.

This property is available only when COPY is selected from the
Action on data drop-down list.
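
The Copy-only properties above correspond to clauses of the Vertica COPY statement. The following Python sketch shows how they might combine into a single statement. The function build_copy_statement, the FROM STDIN source, and the clause order are assumptions for illustration; the statement that the component really issues is not shown in this document.

```python
def build_copy_statement(table, abort_on_error=False, reject_max=0,
                         exceptions_file=None, exceptions_node=None,
                         rejected_file=None, rejected_node=None,
                         no_commit=False):
    """Assemble a Vertica COPY statement from the Copy-related settings.

    Hypothetical helper; clause order and FROM STDIN are assumptions.
    """
    parts = [f"COPY {table} FROM STDIN"]
    if abort_on_error:  # Abort on error
        parts.append("ABORT ON ERROR")
    if exceptions_file:  # Exception file / Exception file node
        clause = f"EXCEPTIONS '{exceptions_file}'"
        if exceptions_node:
            clause += f" ON {exceptions_node}"
        parts.append(clause)
    if rejected_file:  # Rejected data file / Rejected data file node
        clause = f"REJECTED DATA '{rejected_file}'"
        if rejected_node:
            clause += f" ON {rejected_node}"
        parts.append(clause)
    if reject_max > 0:  # Maximum rejects; 0 means no limit, so no clause
        parts.append(f"REJECTMAX {reject_max}")
    if no_commit:  # No commit
        parts.append("NO COMMIT")
    return " ".join(parts)
```

Leaving every option at its default returns just COPY public.sales FROM STDIN for a table named public.sales, which matches the description of an unlimited number of rejections when Maximum rejects is 0 or unset.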

Commit every

Specify the number of rows to be processed before committing
batches of rows together into the database.

This option ensures transaction quality (but not rollback) and, above all, better
performance at executions.

Use batch mode

Select this check box to activate the batch mode for data processing, and in the
Batch size field displayed, specify the number of records to be
processed in each batch.

This property is available only when Insert,
Update, Delete or
Copy is selected from the Action on data
drop-down list.
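
Commit every and Batch size are independent counters, which can be easier to follow with a small simulation. The Python sketch below records after which row a batch would be sent and after which row a commit would be issued. The function simulate_load is hypothetical, and the assumption that leftover rows are flushed and committed when the flow ends is ours, not something stated in this document.

```python
def simulate_load(row_count, batch_size, commit_every):
    """Return the row numbers at which batches are sent and commits issued."""
    batch_flushes, commits = [], []
    for row_number in range(1, row_count + 1):
        if row_number % batch_size == 0:
            batch_flushes.append(row_number)
        if row_number % commit_every == 0:
            commits.append(row_number)
    # Assumed behavior: flush and commit whatever is left once the flow ends.
    if row_count > 0 and row_count % batch_size != 0:
        batch_flushes.append(row_count)
    if row_count > 0 and row_count % commit_every != 0:
        commits.append(row_count)
    return batch_flushes, commits
```

With 25 rows, a batch size of 10, and a commit every 20 rows, this gives batches after rows 10, 20, and 25 and commits after rows 20 and 25.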

Additional Columns

This option allows you to call SQL functions to perform actions on columns that are
not insert, update, or delete actions, or actions that require particular
preprocessing. It is not available if you create a database table (with or without
dropping it first).

  • Name: Type in the name of the schema column to be altered or
    inserted as new column.

  • DataType: Type in the data type for the new column.

  • SQL expression: Type in the SQL statement to be executed in
    order to alter or insert the relevant column data.

  • Position: Select Before,
    Replace or After following the
    action to be performed on the reference column.

  • Reference column: Select a column of reference that the
    component can use to place or replace the new or altered column.
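
The effect of Position and Reference column on the resulting column list can be sketched as follows. This Python example only illustrates the placement rule described above; the function apply_additional_column and the sample column names are hypothetical, and the sketch does not model the SQL expression itself.

```python
def apply_additional_column(columns, name, position, reference):
    """Place a new column Before/After the reference column, or Replace it."""
    index = columns.index(reference)  # raises ValueError if reference is unknown
    result = list(columns)  # work on a copy; the input list is left unchanged
    if position == "Before":
        result.insert(index, name)
    elif position == "After":
        result.insert(index + 1, name)
    elif position == "Replace":
        result[index] = name
    else:
        raise ValueError(f"Unknown position: {position}")
    return result
```

For the columns id, name, created, adding name_upper with position Replace and reference column name gives id, name_upper, created.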

Use field options

Select the check box for the corresponding column to customize a request,
particularly if multiple actions are being carried out on the data.

  • Update Key: Select the check box for the
    corresponding column based on which the data is updated.
  • Deletion Key: Select the check box for the
    corresponding column based on which the data is deleted.
  • Updatable: Select the check box if the data in
    the corresponding column can be updated.

  • Insertable: Select the check box if the data
    in the corresponding column can be inserted.

Debug query mode

Select this check box to display each step during processing entries
in a database.

Support null in “SQL WHERE” statement

Select this check box to validate the Null value in the “SQL WHERE” statement.
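
The reason such an option matters is that in SQL a comparison with NULL using = never evaluates to true, so a key column holding NULL never matches in a plain WHERE clause. The Python sketch below demonstrates the difference using an in-memory SQLite database as a stand-in for Vertica. The null-safe predicate shown is an assumed form chosen for illustration, not necessarily the one the component generates, and the table and function names are hypothetical.

```python
import sqlite3


def count_matches(null_safe):
    """Count rows matched by a NULL key, with or without a null-safe predicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (region TEXT, name TEXT)")
    conn.execute("INSERT INTO customers VALUES (NULL, 'Ada')")
    key = None  # The incoming row carries NULL in the key column.
    if null_safe:
        where = "(region = ? OR (region IS NULL AND ? IS NULL))"
        params = (key, key)
    else:
        where = "region = ?"
        params = (key,)
    count = conn.execute(
        f"SELECT COUNT(*) FROM customers WHERE {where}", params
    ).fetchone()[0]
    conn.close()
    return count
```

Here count_matches(False) finds no row because region = NULL is never true, while count_matches(True) finds the row.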

Create projection when create table

Select this check box to create a projection for a table to be
created.

This check box is available only when the table creation related option
is selected from the Action on table drop-down
list.

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Enable parallel execution

Select this check box to perform high-speed data processing by treating
multiple data flows simultaneously. Note that this feature depends on the ability of
the database or the application to handle multiple inserts in parallel, as well as on
the number of CPUs assigned. In the Number of parallel executions
field, either:

  • Enter the number of parallel executions desired.
  • Press Ctrl + Space and select the
    appropriate context variable from the list. For further information, see
    Talend Studio User Guide.

Note that:

  • The Action on table field is not available with the
    parallelization function. Therefore, you must use a tCreateTable component if you
    want to create a table.
  • When parallel execution is enabled, it is not
    possible to use global variables to retrieve return values in a
    subjob.

Global Variables

NB_LINE

The number of rows processed. This is an After variable and it returns an integer.

NB_LINE_COPIED

The number of rows copied. This is an After variable and it returns an integer.

NB_LINE_DELETED

The number of rows deleted. This is an After variable and it returns an integer.

NB_LINE_INSERTED

The number of rows inserted. This is an After variable and it returns an integer.

NB_LINE_REJECTED

The number of rows rejected. This is an After variable and it returns an integer.

NB_LINE_UPDATED

The number of rows updated. This is an After variable and it returns an integer.

ERROR_MESSAGE

The error message generated by the component when an error occurs. This is an After
variable and it returns a string.

Usage

Usage rule

This component is usually used as an output component.
It allows you to carry out actions on a table or on data of a table in a
Vertica database. It also allows you to create a reject flow using a
Row > Rejects link to filter data in error. For an example of
tMysqlOutput in use, see
Retrieving data in error with a Reject link.

Together, Talend Studio and the Vertica database can be used to build fast and
affordable data warehouse and data mart applications.
For more information about how to configure Talend Studio
to connect to Vertica, see Talend and HP Vertica Tips and Techniques.

Dynamic settings

Click the [+] button to add a row in the table
and fill the Code field with a context
variable to choose your database connection dynamically from multiple
connections planned in your Job. This feature is useful when you need to
access database tables having the same data structure but in different
databases, especially when you are working in an environment where you
cannot change your Job settings, for example, when your Job has to be
deployed and executed independent of Talend Studio.

The Dynamic settings table is
available only when the Use an existing connection
check box is selected in the Basic settings view. Once a dynamic parameter is
defined, the Component List box in the
Basic settings view becomes unusable.

For examples on using dynamic parameters, see Reading data from databases through
context-based dynamic connections and Reading data from different MySQL databases
using dynamically loaded connection parameters. For more information on
Dynamic settings and context variables, see Talend Studio User Guide.

Related scenarios

For tVerticaOutput related topics, see:


Source: Talend Help Center, https://help.talend.com