The tRedshiftOutputBulk and tRedshiftBulkExec components can be used together in a two-step process
to load data into Amazon Redshift from a delimited/CSV file on Amazon S3. In the first
step, a delimited/CSV file is generated. In the second step, this file is used in the
INSERT statement used to feed Amazon Redshift. These two steps are fused together in the
tRedshiftOutputBulkExec component. The advantage of
using two separate steps is that the data can be transformed before it is loaded into
Amazon Redshift.
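The first step described above, generating a delimited/CSV file, can be sketched in plain Java. This is a minimal illustration and not the code the component itself generates: the class name DelimitedFileSketch, its two methods, and the rule of doubling an enclosure character found inside a field are all assumptions made for the example. The separator, enclosure, and append parameters loosely mirror the Field Separator, Text enclosure, and Append the local file settings.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Talend: writes rows to a delimited file.
final class DelimitedFileSketch {

    // Builds one line: each field is wrapped in the enclosure character and
    // joined with the separator. Doubling an enclosure character that occurs
    // inside a field is an assumed escaping rule for this sketch.
    static String formatRow(List<String> fields, char separator, char enclosure) {
        String quote = String.valueOf(enclosure);
        StringJoiner joiner = new StringJoiner(String.valueOf(separator));
        for (String field : fields) {
            joiner.add(quote + field.replace(quote, quote + quote) + quote);
        }
        return joiner.toString();
    }

    // Writes all rows to the file, either appending to existing content or
    // replacing it, depending on the append flag.
    static void writeRows(Path file, List<List<String>> rows, char separator,
                          char enclosure, boolean append) throws IOException {
        StandardOpenOption mode = append
                ? StandardOpenOption.APPEND
                : StandardOpenOption.TRUNCATE_EXISTING;
        try (BufferedWriter writer = Files.newBufferedWriter(
                file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, mode)) {
            for (List<String> row : rows) {
                writer.write(formatRow(row, separator, enclosure));
                writer.newLine();
            }
        }
    }
}
```

Keeping the row formatting separate from the file handling makes the separator and enclosure logic easy to check on its own, before any file is written or uploaded.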
Component family |
Databases/Amazon Redshift |
|
Function |
This component receives data from the preceding component, |
|
Purpose |
This component allows you to prepare a delimited/CSV file that can |
|
Basic settings |
Data file path at local |
Specify the local path to the file to be generated. Note that the file is generated on the same machine where the |
|
Schema and Edit schema |
A schema is a row description. It defines the number of fields to be processed and passed on to the next component. Since version 5.6, both the Built-In mode and the Repository mode are |
|
|
Built-In: You create and store the schema locally for this |
|
|
Repository: You have already created the schema and |
|
|
Click Edit schema to make changes to the schema. If the
|
|
Append the local file |
Select this check box to append data to the specified local file if it already exists, |
|
Compress the data file |
Select this check box and select a compression type from the list displayed to compress the data file. This check box disappears when the Append the local file check box is selected. |
S3 Setting |
Access Key |
Specify the Access Key ID that uniquely identifies an AWS Account. |
|
Secret Key |
Specify the Secret Access Key, which together with the Access Key constitutes the security credentials. To enter the secret key, click the […] button next to the Secret Key field. |
|
Bucket |
Type in the name of the Amazon S3 bucket to which the file is |
|
Key |
Type in an object key to assign to the file uploaded to Amazon S3. |
Advanced settings |
Field Separator |
Enter the character used to separate fields. |
|
Text enclosure |
Select the character used in pairs to enclose the fields. |
|
Delete local file after putting it to s3 |
Select this check box to delete the local file after it has been uploaded to Amazon S3. By |
|
Create directory if not exists |
Select this check box to create the directory specified in the |
|
Encoding |
Select an encoding type for the data in the file to be |
S3 Setting |
Config client |
Select this check box to configure client parameters for Amazon S3. Click the [+] button below the table displayed to
|
|
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the |
Global Variables |
NB_LINE: the number of rows processed. This is an After variable.
ERROR_MESSAGE: the error message generated by the component.
A Flow variable functions during the execution of a component, while an After variable functions after the execution of the component.
To fill in a field or expression with a variable, press Ctrl + Space.
For further information about variables, see Talend Studio User Guide. |
|
Usage |
This component is more commonly used with the tRedshiftBulkExec component to feed Amazon Redshift |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide. For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
For a related scenario, see Loading/unloading data from/to Amazon S3.
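The NB_LINE and ERROR_MESSAGE variables listed under Global Variables are typically read in a later subjob from the Job's globalMap, using a key formed from the component's unique name and the variable name. The sketch below imitates that lookup with an ordinary java.util.Map so that it runs outside Talend Studio. The component label tRedshiftOutputBulk_1, the class GlobalVariablesSketch, and the summarize method are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: reads After variables published by a component.
final class GlobalVariablesSketch {

    // Looks up <componentName>_NB_LINE and <componentName>_ERROR_MESSAGE
    // and returns a one-line summary. Missing values are treated as
    // "no error" and "0 rows" respectively.
    static String summarize(Map<String, Object> globalMap, String componentName) {
        Integer rows = (Integer) globalMap.get(componentName + "_NB_LINE");
        String error = (String) globalMap.get(componentName + "_ERROR_MESSAGE");
        if (error != null) {
            return componentName + " failed: " + error;
        }
        return componentName + " processed " + (rows == null ? 0 : rows) + " rows";
    }

    public static void main(String[] args) {
        // Stand-in for the map that a Job would populate at run time.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tRedshiftOutputBulk_1_NB_LINE", 1500);
        System.out.println(summarize(globalMap, "tRedshiftOutputBulk_1"));
    }
}
```

Because both variables are After variables, their values are only meaningful once the component has finished, so a check like this belongs in a subjob that runs afterwards.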