tGreenplumBulkExec
Improves performance when loading data into a Greenplum database.
The tGreenplumOutputBulk and tGreenplumBulkExec components are used
together in a two-step process. In the first step, an output file is generated. In the
second step, this file is used in the INSERT statement that feeds the database. The
two steps are combined in the tGreenplumOutputBulkExec component, detailed in a separate section. The
advantage of the two-step process is that the data can be transformed
before it is loaded into the database.
tGreenplumBulkExec performs an Insert
action on the data.
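The two steps above can be sketched outside the Studio as follows. This is a minimal illustration, not the code the components generate: the helper names, the sample rows, and the `public.customers` table are hypothetical, and the load step uses a PostgreSQL-style `COPY ... FROM` statement as an assumption, because this section does not show the statement the component actually issues.

```python
import csv
import os
import tempfile


def quote_literal(value):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def write_bulk_file(rows, path, delimiter=";"):
    """Step 1: write already-transformed rows to a delimited file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter=delimiter).writerows(rows)
    return path


def build_load_statement(table, path, delimiter=";"):
    """Step 2: build the statement that loads the file into the table.

    Assumption: a COPY statement in PostgreSQL-style syntax.
    """
    return (
        f"COPY {table} FROM {quote_literal(path)} "
        f"WITH DELIMITER {quote_literal(delimiter)} CSV"
    )


# Hypothetical rows; the transformation (upper-casing the name) happens
# before the file is written, which is the point of the two-step process.
source_rows = [(1, "alice"), (2, "bob")]
transformed = [(row_id, name.upper()) for row_id, name in source_rows]

bulk_path = os.path.join(tempfile.mkdtemp(), "customers.csv")
write_bulk_file(transformed, bulk_path)
statement = build_load_statement("public.customers", bulk_path)
print(statement)
```

The statement is only built and printed here; executing it would require a database connection and a file path that the database server can read.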
tGreenplumBulkExec Standard properties
These properties are used to configure tGreenplumBulkExec running in the Standard Job framework.
The Standard tGreenplumBulkExec component belongs to the Databases family.
The component in this framework is generally available.
Basic settings
Property type |
Either Built-in or Repository. |
|
Built-in: No property data stored |
|
Repository: Select the repository |
Use an existing connection |
Select this check box and in the Component
Note: When a Job contains a parent Job and a child Job and you need to share an existing connection between the two levels, for example, to share the connection created by the parent Job with the child Job, you have to:
For an example about how to share a database connection across Job levels, see |
Host |
Database server IP address. |
Port |
Listening port number of the database server. |
Database |
Name of the database. |
Schema |
Exact name of the schema. |
Username and Password |
DB user authentication data. To enter the password, click the […] button next to the |
Table |
Name of the table to be written. Note that only one table can be |
Action on table |
On the table defined, you can perform one of the following operations:
None: No operation is carried out.
Drop and create a table: The table
Create a table: The table does not
Create a table if not exists: The
Drop a table if exists and create:
Clear a table: The table content is |
Filename |
Name of the file to be loaded.
Warning: This file is located on the machine specified by the URI in |
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to |
|
Built-In: You create and store the |
|
Repository: You have already created
When the schema to be reused has default values that are integers or
You can find more details about how to verify default |
Click Edit schema to make changes to the schema.
|
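As a rough illustration of the Action on table options listed above, the sketch below maps each option name to generic PostgreSQL-style statements. The mapping is inferred from the option names only; the function name, table name, and column definition are hypothetical, and neither the statements the component really emits nor the syntax accepted by a given Greenplum version is confirmed by this section.

```python
def statements_for_action(action, table, columns_ddl):
    """Return the SQL a given 'Action on table' option might correspond to.

    Assumption: generic PostgreSQL-style DDL, inferred from the option names.
    """
    create = f"CREATE TABLE {table} ({columns_ddl})"
    plans = {
        "None": [],
        "Drop and create a table": [f"DROP TABLE {table}", create],
        "Create a table": [create],
        "Create a table if not exists": [
            f"CREATE TABLE IF NOT EXISTS {table} ({columns_ddl})"
        ],
        "Drop a table if exists and create": [
            f"DROP TABLE IF EXISTS {table}",
            create,
        ],
        # Assumption: clearing is shown as DELETE; TRUNCATE would also empty it.
        "Clear a table": [f"DELETE FROM {table}"],
    }
    if action not in plans:
        raise ValueError(f"Unknown action: {action}")
    return plans[action]


# Hypothetical table and column definition.
for sql in statements_for_action(
    "Drop a table if exists and create", "customers", "id integer, name text"
):
    print(sql)
```

Whichever option is chosen, the table action would run before the bulk load itself, so a failed drop or create stops the load from starting.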
Advanced settings
Action on data |
Select the operation you want to perform:
Bulk insert |
Copy the OID for each row |
Retrieve the ID item for each row. |
Contains a header line with the names of each column in |
Specify that the table contains a header. |
File type |
Select the file type to process. |
Null string |
String displayed to indicate that the value is null. |
Fields terminated by |
Character, string or regular expression to separate fields. |
Escape char |
Character of the row to be escaped |
Text enclosure |
Character used to enclose text. |
Force not null for columns |
Define the nullability of the columns.
Force not null: Select the check |
tStat |
Select this check box to collect log data at the component |
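To show how the Advanced settings above relate to one another, the following sketch collects them in a small options object and renders a legacy PostgreSQL-style `COPY` clause. The class, field names, and default values are hypothetical, and the mapping from each setting to a `COPY` option is an assumption; option availability and ordering vary between Greenplum versions.

```python
from dataclasses import dataclass, field
from typing import List


def quote_literal(value):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


@dataclass
class BulkOptions:
    """Hypothetical container mirroring the Advanced settings fields."""
    copy_oids: bool = False            # Copy the OID for each row
    has_header: bool = False           # Contains a header line
    file_type: str = "CSV"             # File type: "CSV" or "TEXT"
    null_string: str = ""              # Null string
    field_separator: str = ";"         # Fields terminated by
    escape_char: str = '"'             # Escape char
    text_enclosure: str = '"'          # Text enclosure
    force_not_null: List[str] = field(default_factory=list)


def build_copy(table, path, opts):
    """Render a COPY statement from the options (assumed mapping)."""
    parts = [f"COPY {table} FROM {quote_literal(path)} WITH"]
    if opts.copy_oids:
        parts.append("OIDS")
    parts.append(f"DELIMITER {quote_literal(opts.field_separator)}")
    parts.append(f"NULL {quote_literal(opts.null_string)}")
    if opts.file_type.upper() == "CSV":
        # Header, quoting, escaping and FORCE NOT NULL are emitted in CSV mode only.
        parts.append("CSV")
        if opts.has_header:
            parts.append("HEADER")
        parts.append(f"QUOTE {quote_literal(opts.text_enclosure)}")
        parts.append(f"ESCAPE {quote_literal(opts.escape_char)}")
        if opts.force_not_null:
            parts.append("FORCE NOT NULL " + ", ".join(opts.force_not_null))
    return " ".join(parts)


# Hypothetical table, file path, and column name.
options = BulkOptions(has_header=True, force_not_null=["name"])
print(build_copy("public.customers", "/tmp/customers.csv", options))
```

Keeping the options in one object makes it easy to see which settings only matter for CSV files: in this sketch, the header, text enclosure, escape character, and force-not-null columns are ignored when the file type is plain text.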
Usage
Usage rule |
This component is generally used with a tGreenplumOutputBulk component. Used together they |
Dynamic settings |
Click the [+] button to add a
The Dynamic settings table is
For examples on using dynamic parameters, see Scenario: Reading data from databases through context-based dynamic connections and Scenario: Reading data from different MySQL databases using dynamically loaded connection parameters.
For more information on Dynamic |
Related scenarios
For more information about tGreenplumBulkExec, see: