tS3Input
Reads data from a given S3N system (S3 Native Filesystem).
The tS3Input component loads S3N-formatted (S3 Native Filesystem)
files into the MapReduce process you are designing.
tS3Input MapReduce properties (deprecated)
These properties are used to configure tS3Input running in the MapReduce Job framework.
The MapReduce tS3Input component belongs to the MapReduce family.
The information in this section applies only to users who have subscribed to
Talend Data Fabric or to any Talend product with Big Data; it is not
applicable to Talend Open Studio for Big Data users.
The MapReduce framework is deprecated from Talend 7.3 onwards. Use Talend Jobs for Apache Spark to accomplish your integration tasks.
Basic settings
Property type |
A schema is a row description. It defines the number of fields |
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Schema and Edit |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Bucket and Folder |
Enter the bucket name and its folder you need to use. You |
Access key and Secret |
Enter the authentication information required to connect to To enter the password, click the […] button next to the |
Type |
Select the type of the file to be processed. The type of the file may be:
|
Row separator |
The separator used to identify the end of a row. |
Field separator |
Enter a character, string, or regular expression to separate fields for the transferred |
Header |
Enter the number of rows to be skipped at the beginning of the file. |
Custom encoding |
You may encounter encoding issues when you process the stored data. In that Select the encoding from the list or select Custom This option is not available for a Sequence file. |
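To illustrate how the Row separator, Field separator, Header, and Custom encoding settings combine when a delimited text file is read, here is a minimal Python sketch. It is not the code that tS3Input generates: the function name `read_delimited` and its parameters are hypothetical, the sample data is invented, and the field separator is treated as a literal string rather than a regular expression.

```python
def read_delimited(raw, row_sep="\n", field_sep=";", header=0, encoding="UTF-8"):
    """Split raw file bytes into rows of fields (hypothetical helper)."""
    text = raw.decode(encoding)      # Custom encoding: decode the stored bytes first
    rows = text.split(row_sep)       # Row separator: marks the end of each row
    if rows and rows[-1] == "":      # a trailing separator leaves one empty row; drop it
        rows.pop()
    rows = rows[header:]             # Header: skip this many rows at the start
    return [row.split(field_sep) for row in rows]  # Field separator: split each row


# Invented sample: one header row, two data rows, stored as ISO-8859-15.
sample = "id;name\n1;Zoé\n2;Ana\n".encode("ISO-8859-15")
records = read_delimited(sample, header=1, encoding="ISO-8859-15")
print(records)
```

Decoding the same bytes with a different encoding would either fail or produce a wrong character in place of "é", which is the kind of issue the Custom encoding setting is meant to avoid.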
Advanced settings
Advanced separator (for number) |
Select this check box to change the separator used for numbers. By default, the thousands separator is a comma (,) and the decimal separator is a period (.). This option is not available for a Sequence file. |
Trim all column |
Select this check box to remove the leading and trailing whitespace from all columns. This option is not available for a Sequence file. |
Check column to trim |
This table is filled automatically with the schema being used. Select the check This option is not available for a Sequence file. |
Enable parallel execution |
Select this check box to perform high-speed data processing by treating
multiple data flows simultaneously. Note that this feature depends on the ability of the database or the application to handle multiple inserts in parallel, as well as on the number of CPUs assigned. In the Number of parallel executions field, either:
|
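As a rough illustration of what the Advanced separator (for number) and trim options imply for a single value, the following Python sketch parses a number written with configurable thousands and decimal separators. The helper `parse_number` and its parameter names are hypothetical and are not part of the component; the defaults mirror the comma and period mentioned above.

```python
from decimal import Decimal


def parse_number(value, thousands_sep=",", decimal_sep="."):
    """Parse a number written with custom separators (hypothetical helper)."""
    cleaned = value.strip()                       # trim leading and trailing whitespace
    cleaned = cleaned.replace(thousands_sep, "")  # drop the thousands separators
    cleaned = cleaned.replace(decimal_sep, ".")   # normalise the decimal separator
    return Decimal(cleaned)


# Default separators: comma for thousands, period for decimals.
default_style = parse_number(" 1,234.56 ")
# Swapped separators: period for thousands, comma for decimals.
european_style = parse_number("1.234,56", thousands_sep=".", decimal_sep=",")
print(default_style, european_style)
```

The thousands separator is removed before the decimal separator is normalised, so swapping the two characters does not cause one replacement to interfere with the other.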
Global Variables
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
In a Once a Map/Reduce Job is opened in the workspace, tS3Input as well as the MapReduce family appears in the Palette of the Studio. Note that in this documentation, unless otherwise |
Hadoop Connection |
You need to use the Hadoop Configuration tab in the This connection is effective on a per-Job basis. |
Related scenario
This component is used in a similar way to the other input components that read data
from a given filesystem. Note, however, that when you configure the Hadoop connection in the
Hadoop configuration tab of the Run view, you need to select the Use
Datanode hostname check box.