tFileOutputDelimited

Outputs the input data to a delimited file according to the defined
schema.

Depending on the Talend solution you
are using, this component can be used in one, some or all of the following Job
frameworks:

Standard: see tFileOutputDelimited Standard properties.

The component in this framework is generally available.
MapReduce: see tFileOutputDelimited MapReduce properties.

The component in this framework is available only if you have subscribed to one
of the
Talend
solutions with Big Data.
Spark Batch: see tFileOutputDelimited properties for Apache Spark Batch.

The component in this framework is available only if you have subscribed to one
of the
Talend
solutions with Big Data.
Spark Streaming: see tFileOutputDelimited properties for Apache Spark Streaming.

The component in this framework is available only if you have subscribed to Talend Real-time Big Data Platform or Talend Data
Fabric.

tFileOutputDelimited Standard properties

These properties are used to configure tFileOutputDelimited running in the Standard Job framework.

The Standard
tFileOutputDelimited component belongs to the File family.

The component in this framework is generally available.

Basic settings

Property type	Either Built-In or Repository.
	Built-In: No property data stored centrally.
	Repository: Select the repository file where the properties are stored.
Use Output Stream	Select this check box to process the data flow of interest. Once you have selected it, the Output Stream field is displayed and you can type in the data flow of interest. The data flow to be processed must be added to the flow so that this component can fetch the data via the corresponding representative variable. This variable may already be predefined in your Studio or provided by the context or by the components you are using along with this component; otherwise, you can define it manually and use it according to the design of your Job, for example, using tJava or tJavaFlex. To avoid typing the variable by hand, you can select it from the auto-completion list (Ctrl+Space), provided that it has been properly defined. For further information about how to use a stream, see Scenario 2: Reading data from a remote file in streaming mode.
File name	Name or path to the output file and/or the variable to be used. This field becomes unavailable once you have selected the Use Output Stream check box. For further information about how to define and use a variable in a Job, see Talend Studio User Guide.
Row Separator	The separator used to identify the end of a row.
Field Separator	Enter a character, a string, or a regular expression to separate fields in the transferred data.
Append	Select this check box to add the new rows at the end of the file.
Include Header	Select this check box to include the column header in the file.
Compress as zip file	Select this check box to compress the output file in zip format.
Schema and Edit schema	A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available: View schema: choose this option to view the schema only. Change to built-in property: choose this option to change the schema to Built-in for local changes. Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window. This component offers the advantage of the dynamic schema feature. This allows you to retrieve unknown columns from source files or to copy batches of columns from a source without mapping each column individually. For further information about dynamic schemas, see Talend Studio User Guide. This dynamic schema feature is designed for the purpose of retrieving unknown columns of a table and is recommended to be used for this purpose only; it is not recommended for the use of creating tables.
	Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.
	Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.
Sync columns	Click to synchronize the output file schema with the input file schema. The Sync function only displays once the Row connection is linked with the output component.
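
As an illustration of how the Row Separator, Field Separator, Include Header and Append settings combine, here is a minimal Java sketch. It is not the code generated by the Studio; the class DelimitedOutputSketch and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper for illustration only, not Talend-generated code.
class DelimitedOutputSketch {

    // Joins the fields of each row with the field separator and
    // terminates every row with the row separator.
    static String render(List<String> header, List<List<String>> rows,
                         String fieldSeparator, String rowSeparator,
                         boolean includeHeader) {
        StringBuilder out = new StringBuilder();
        if (includeHeader) {
            out.append(String.join(fieldSeparator, header)).append(rowSeparator);
        }
        for (List<String> row : rows) {
            out.append(String.join(fieldSeparator, row)).append(rowSeparator);
        }
        return out.toString();
    }

    // append = true adds the content at the end of the file,
    // append = false replaces any existing content.
    static void write(Path file, String content, boolean append) throws IOException {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        if (append) {
            Files.write(file, bytes,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } else {
            Files.write(file, bytes,
                    StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        }
    }
}
```

Note that in this sketch the header is written on every call to render; when appending to an existing file you would typically pass includeHeader as false so that the header does not reappear in the middle of the data.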

Advanced settings

Advanced separator (for numbers)	Select this check box to change the separator used for numbers. By default, the thousands separator is a comma (,) and the decimal separator is a period (.). Thousands separator: define separators for thousands. Decimal separator: define separators for decimals.
CSV options	Select this check box to specify the following CSV parameters: Escape char: enter the escape character between double quotation marks. For example, you need to type in `"\\"` if the backslash character (\) is used as the escape character. Text enclosure: enter the enclosure character (only one character) between double quotation marks. For example, `"""` needs to be entered when double quotation marks (") are used as the enclosure character.
Create directory if not exists	This check box is selected by default. It creates the directory that holds the output delimited file, if it does not already exist.
Split output in several files	In case of very big output files, select this check box to divide the output delimited file into several files. Rows in each output file: set the number of lines in each of the output files.
Custom the flush buffer size	Select this check box to define the number of lines to write before emptying the buffer. Row Number: set the number of lines to write.
Output in row mode	Select this check box to ensure atomicity of the flush so that each row of data can remain consistent as a set and incomplete rows of data are never written to a file. This check box is mostly useful when using this component in the multi-thread situation.
Encoding	Select the encoding from the list or select Custom and define it manually. This field is compulsory for database data handling.
Don’t generate empty file	Select this check box if you do not want to generate empty files.
Throw an error if the file already exist	Select this check box to throw an exception if the output file specified in the File Name field on the Basic settings view already exists. Clear this check box to overwrite the existing file.
tStatCatcher Statistics	Select this check box to gather the Job processing metadata at a Job level as well as at each component level.
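
To make the effect of the Advanced separator (for numbers) option concrete, the following Java sketch formats the same number with two different pairs of separators. It relies on the standard java.text classes and only illustrates the result; how the component applies the separators internally is not covered here, and the class name NumberSeparatorSketch is hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper for illustration only.
class NumberSeparatorSketch {

    static String format(double value, char thousandsSeparator, char decimalSeparator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(thousandsSeparator);
        symbols.setDecimalSeparator(decimalSeparator);
        // In the pattern, ',' and '.' are only placeholders; the symbols
        // object decides which characters are actually written.
        DecimalFormat format = new DecimalFormat("#,##0.00", symbols);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234567.5, ',', '.')); // default separators
        System.out.println(format(1234567.5, '.', ',')); // swapped separators
    }
}
```

When the thousands or decimal separator is the same character as the field separator, the written number can no longer be told apart from two fields, so choose separators that do not collide.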

Global Variables

Global Variables	NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer. FILE_NAME: the name of the file being processed. This is a Flow variable and it returns a string. ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box. A Flow variable functions during the execution of a component while an After variable functions after the execution of the component. To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it. For further information about variables, see Talend Studio User Guide.
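
As a hedged sketch of how an After variable such as NB_LINE might be read once the component has finished, the Java code below looks the value up in a map. A plain HashMap stands in for the Job's globalMap, and the component name tFileOutputDelimited_1 as well as the class GlobalVariableSketch are assumptions made for this example; pick the real variable from the Ctrl+Space list in your Job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
class GlobalVariableSketch {

    // Builds the lookup key from the component's unique name and the
    // variable name, for example "tFileOutputDelimited_1_NB_LINE".
    static String key(String componentName, String variable) {
        return componentName + "_" + variable;
    }

    // Returns the number of rows written, or 0 if the variable is not set
    // yet (an After variable is only available once the component is done).
    static int rowsWritten(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(key(componentName, "NB_LINE"));
        return value == null ? 0 : (Integer) value;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for the Job's globalMap
        globalMap.put("tFileOutputDelimited_1_NB_LINE", 3);
        System.out.println(rowsWritten(globalMap, "tFileOutputDelimited_1"));
    }
}
```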

Usage

Usage rule	Use this component to write a delimited file and separate fields using a field separator value.
Limitation	Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find out and add all missing JARs easily on the Modules tab in the Integration perspective of your studio. You can find more details about how to install external modules in Talend Help Center (https://help.talend.com).

Scenario 1: Writing data in a delimited file

This scenario describes a three-component Job that extracts certain data from a file,
customers, that holds information about clients, and then writes the
extracted data to a delimited file.

In the following example, we have already stored the input schema under the Metadata node in the Repository tree view. For more information about storing schema metadata
in the Repository, see
Talend Studio User Guide.

Dropping and linking components

In the Repository tree view, expand
Metadata and File
delimited in succession and then browse to your input schema,
customers, and drop it on the design workspace. A
dialog box displays where you can select the component type you want to
use.
Click tFileInputDelimited and then
OK to close the dialog box. A tFileInputDelimited component holding the name of
your input schema appears on the design workspace.
Drop a tMap component and a tFileOutputDelimited component from the Palette to the design workspace.
Link the components together using Row >
Main connections.

Configuring the components

Configuring the input component

Double-click tFileInputDelimited to open
its Basic settings view. All its property
fields are automatically filled in because you defined your input file
locally.
If you do not define your input file locally in the Repository tree view, fill in the details manually after
selecting Built-in in the Property type list.
Click the […] button next to the
File Name field and browse to the input
file, customer.csv in this example.

Warning:

If the path of the file contains some accented characters, you
will get an error message when executing your Job.
In the Row Separator and Field Separator fields, enter "\n" and ";" as the row and field
separators respectively.
If needed, set the number of lines used as header and the number of lines
used as footer in the corresponding fields and then set a limit for the
number of processed rows.

In this example, Header is set to 6 while
Footer and Limit are not set.
In the Schema field,
schema is automatically set to Repository
and your schema is already defined since you have stored your input file locally
for this example. Otherwise, select Built-in and click the […] button next to Edit
Schema to open the [Schema]
dialog box where you can define the input schema, and then click OK to close the dialog box.

Configuring the mapping component

In the design workspace, double-click tMap to open its editor.
In the tMap editor, click the [+] button
on top of the panel to the right to open the [Add a new output table] dialog box.
Enter a name for the table you want to create, row2
in this example.
Click OK to validate your changes and
close the dialog box.
In the table to the left, row1, select the first
three rows (Id, CustomerName and
CustomerAddress) and drop them onto the table to the
right.
In the Schema editor view situated in the
lower left corner of the tMap editor,
change the type of RegisterTime to String in the table to the right.
Click OK to save your changes and close
the editor.

Configuring the output component

In the design workspace, double-click tFileOutputDelimited to open its Basic
settings view and define the component properties.
In the Property Type field, set the type
to Built-in and fill in the fields that
follow manually.
Click the […] button next to the
File Name field and browse to the
output file you want to write data in,
customerselection.txt in this example.
In the Row Separator and Field Separator fields, set
"\n" and ";" respectively as
row and field separators.
Select the Include Header check box if
you want to output the column headers as well.
Click Edit schema to open the schema
dialog box and verify that the retrieved schema corresponds to the input
schema. If not, click Sync Columns to
retrieve the schema from the preceding component.

Saving and executing the Job

Press Ctrl+S to save your Job.
Press F6 or click Run on the Run tab to
execute the Job.

The three specified columns Id,
CustomerName and
CustomerAddress are output in the defined output
file.
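
The data flow of this Job can be approximated in plain Java as follows. This is only a sketch of the logic (skip the header lines, keep the first columns, write them with the same separator), not the code the Studio generates; the class CustomerSelectionSketch and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class CustomerSelectionSketch {

    // Skips the first headerLines lines, then keeps the first
    // columnsToKeep fields of every remaining line.
    static List<String> select(List<String> inputLines, int headerLines,
                               int columnsToKeep, String fieldSeparator) {
        List<String> output = new ArrayList<>();
        for (int i = headerLines; i < inputLines.size(); i++) {
            // Pattern.quote treats the separator literally; the limit -1
            // keeps trailing empty fields instead of dropping them.
            String[] fields = inputLines.get(i).split(Pattern.quote(fieldSeparator), -1);
            int kept = Math.min(columnsToKeep, fields.length);
            output.add(String.join(fieldSeparator, Arrays.copyOfRange(fields, 0, kept)));
        }
        return output;
    }
}
```

With a header count of 6 and three columns to keep, as in this scenario, each output line would hold the Id, CustomerName and CustomerAddress values separated by semicolons.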

For an example of how to use dynamic schemas with tFileOutputDelimited, see Scenario 5: Writing dynamic columns from a database to an output file.

Scenario 2: Utilizing Output Stream to save filtered data to a local file

Based on the preceding scenario, this scenario saves the filtered data to a local file
using an output stream.

Dropping and linking components

Drop tJava from the Palette to the design workspace.
Connect tJava to tFileInputDelimited using a Trigger > On Subjob OK
connection.

Configuring the components

Double-click tJava to open its Basic settings view.
In the Code area, type in the following
command:

new java.io.File("C:/myFolder").mkdirs();
globalMap.put("out_file", new java.io.FileOutputStream("C:/myFolder/customerselection.txt", false));

Note:

In this scenario, the code in the Code area of tJava creates the folder
C:/myFolder if it does not exist yet and opens the output
file customerselection.txt in it. Because the second argument of the
FileOutputStream constructor is false, an existing file is overwritten rather than
appended to. Adapt the path to your own environment.
Double-click tFileOutputDelimited to open
its Basic settings view.
Select the Use Output Stream check box to
enable the Output Stream field, in which you
can define the output stream using a command.

Fill in the Output Stream field with
the following command:

(java.io.OutputStream)globalMap.get("out_file")

Note:

You can fill in the Output
Stream field either by pressing Ctrl+Space and
selecting a built-in variable from the list or by typing the expression
manually. In this scenario, the expression retrieves the object that tJava
stored under the out_file key and casts it to
java.io.OutputStream, so that the filtered
data is written to the local file defined in the Code area of tJava.
Click Sync columns to retrieve the schema
defined in the preceding component.
Leave the rest of the components as they were in the previous scenario.
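
The hand-over between tJava and tFileOutputDelimited in this scenario can be pictured with the following standalone Java sketch: one piece of code stores a stream in a shared map, another retrieves it with a cast and writes delimited rows to it. A plain HashMap stands in for globalMap, and the class OutputStreamSketch is a hypothetical name; the real component code is generated by the Studio and may differ.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
class OutputStreamSketch {

    // Retrieves the stream stored under the given key and writes each row
    // as one line, with ";" between fields and "\n" after every row.
    static void writeRows(Map<String, Object> sharedMap, String key,
                          List<String[]> rows) throws IOException {
        OutputStream stream = (OutputStream) sharedMap.get(key);
        Writer writer = new OutputStreamWriter(stream, StandardCharsets.UTF_8);
        for (String[] row : rows) {
            writer.write(String.join(";", row));
            writer.write("\n");
        }
        // Flush so that all rows reach the stream; in this sketch the code
        // that created the stream remains responsible for closing it.
        writer.flush();
    }
}
```

A stream opened with new java.io.FileOutputStream(path, false) and stored under "out_file" could be passed to writeRows in the same way as any other OutputStream.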

Saving and executing the Job

Press Ctrl+S to save
your Job.
Press F6 or click
Run on the Run tab to execute the Job.

The three specified columns Id, CustomerName and CustomerAddress
are output in the defined output file.

For an example of how to use dynamic schemas with
tFileOutputDelimited, see Scenario 5: Writing dynamic columns from a database to an output file.

tFileOutputDelimited MapReduce properties

These properties are used to configure tFileOutputDelimited running in the MapReduce Job framework.

The MapReduce
tFileOutputDelimited component belongs to the MapReduce family.

The component in this framework is available only if you have subscribed to one
of the
Talend
solutions with Big Data.

Basic settings

Property type	Either Built-In or Repository.
	Built-In: No property data stored centrally.
	Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop Cluster node of the Repository tree. The fields that come after are pre-filled in using the fetched data. For further information about the Hadoop Cluster node, see the Getting Started Guide.
	Click this icon to open a database connection wizard and store the database connection parameters you set in the component Basic settings view. For more information about setting up and storing database connection parameters, see Talend Studio User Guide.
Schema and Edit Schema	A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available: View schema: choose this option to view the schema only. Change to built-in property: choose this option to change the schema to Built-in for local changes. Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.
	Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.
	Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.
Folder	Browse to, or enter the path pointing to the data to be used in the file system. This path must point to a folder rather than a file, because a Talend Map/Reduce Job needs to write in its target folder not only the final result but also multiple part- files generated in performing Map/Reduce computations. Note that you need to ensure you have properly configured the connection to the Hadoop distribution to be used in the Hadoop configuration tab in the Run view.
Action	Select an operation for writing data: Create: Creates a file and writes data in it. Overwrite: Overwrites the file existing in the directory specified in the Folder field.
Row separator	The separator used to identify the end of a row.
Field separator	Enter a character, a string, or a regular expression to separate fields in the transferred data.
Include Header	Select this check box to include the column header in the file.
Custom encoding	You may encounter encoding issues when you process the stored data. In that situation, select this check box to display the Encoding list. Select the encoding from the list or select Custom and define it manually. This field is compulsory for database data handling.
Compress the data	Select the Compress the data check box to compress the output data. Hadoop provides different compression formats that help reduce the space needed for storing files and speed up data transfer. When reading a compressed file, the Studio needs to uncompress it before being able to feed it to the input flow.
Merge result to single file	Select this check box to merge the final part files into a single file and put that file in a specified directory. Once you have selected it, enter the path to, or browse to, the folder in which you want to store the merged file. This directory is automatically created if it does not exist. The following check boxes are used to manage the source and the target files: Remove source dir: select this check box to remove the source files after the merge. Override target file: select this check box to override the file already existing in the target location. This option does not override the folder. This option is not available for a Sequence file.
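
The idea behind Merge result to single file can be sketched in Java on a local file system, which here stands in for the Hadoop file system. The sketch concatenates the part- files of a source folder in name order into one target file and refuses to replace an existing target unless asked to; it does not use the Hadoop API, and the class MergePartFilesSketch is hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; works on local paths.
class MergePartFilesSketch {

    static void merge(Path sourceDir, Path targetFile, boolean overrideTarget)
            throws IOException {
        if (Files.exists(targetFile) && !overrideTarget) {
            throw new IOException("Target file already exists: " + targetFile);
        }
        // Create the target directory if it does not exist yet.
        if (targetFile.getParent() != null) {
            Files.createDirectories(targetFile.getParent());
        }
        // Collect the part files and sort them by name so that the
        // merged file keeps a stable order.
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(sourceDir, "part-*")) {
            for (Path part : stream) {
                parts.add(part);
            }
        }
        Collections.sort(parts);
        // Opening without options creates the file or truncates an existing one.
        try (OutputStream out = Files.newOutputStream(targetFile)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }
}
```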

Advanced settings

Advanced separator (for number)	Select this check box to change the separator used for numbers. By default, the thousands separator is a comma (,) and the decimal separator is a period (.). This option is not available for a Sequence file.
CSV options	Select this check box to include CSV specific parameters such as Escape char and Text enclosure.
Enable parallel execution	Select this check box to perform high-speed data processing by treating multiple data flows simultaneously. Note that this feature depends on the ability of the database or the application to handle multiple inserts in parallel as well as on the number of CPUs involved. In the Number of parallel executions field, either: Enter the number of parallel executions desired. Press Ctrl + Space and select the appropriate context variable from the list. For further information, see Talend Studio User Guide.

Global Variables

Global Variables	ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box. A Flow variable functions during the execution of a component while an After variable functions after the execution of the component. To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it. For further information about variables, see Talend Studio User Guide.

Usage

Usage rule	In a Talend Map/Reduce Job, it is used as an end component and requires a transformation component as input link. The other components used along with it must be Map/Reduce components, too. They generate native Map/Reduce code that can be executed directly in Hadoop. Once a Map/Reduce Job is opened in the workspace, tFileOutputDelimited as well as the MapReduce family appears in the Palette of the Studio. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs, and non Map/Reduce Jobs.
Hadoop Connection	You need to use the Hadoop Configuration tab in the Run view to define the connection to a given Hadoop distribution for the whole Job. This connection is effective on a per-Job basis.

Related scenarios

No scenario is available for the Map/Reduce version of this component yet.

tFileOutputDelimited properties for Apache Spark Batch

These properties are used to configure tFileOutputDelimited running in the Spark Batch Job framework.

The Spark Batch
tFileOutputDelimited component belongs to the File family.

The component in this framework is available only if you have subscribed to one
of the
Talend
solutions with Big Data.

Basic settings

Define a storage configuration component	Select the configuration component to be used to provide the configuration information for the connection to the target file system such as HDFS. If you leave this check box clear, the target file system is the local system. The configuration component to be used must be present in the same Job. For example, if you have dropped a tHDFSConfiguration component in the Job, you can select it to write the result in a given HDFS system.
Property type	Either Built-In or Repository.
	Built-In: No property data stored centrally.
	Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop Cluster node of the Repository tree. The fields that come after are pre-filled in using the fetched data. For further information about the Hadoop Cluster node, see the Getting Started Guide.
	Click this icon to open a database connection wizard and store the database connection parameters you set in the component Basic settings view. For more information about setting up and storing database connection parameters, see Talend Studio User Guide.
Schema and Edit Schema	A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available: View schema: choose this option to view the schema only. Change to built-in property: choose this option to change the schema to Built-in for local changes. Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.
	Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.
	Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.
Folder	Browse to, or enter the path pointing to the data to be used in the file system. Note that this path must point to a folder rather than a file. The button for browsing does not work with the Spark Local mode; if you are using the Spark Yarn or the Spark Standalone mode, ensure that you have properly configured the connection in a configuration component in the same Job, such as tHDFSConfiguration.
Action	Select an operation for writing data: Create: Creates a file and writes data in it. Overwrite: Overwrites the file existing in the directory specified in the Folder field.
Row separator	The separator used to identify the end of a row.
Field separator	Enter a character, a string, or a regular expression to separate fields in the transferred data.
Include Header	Select this check box to include the column header in the file.
Custom encoding	You may encounter encoding issues when you process the stored data. In that situation, select this check box to display the Encoding list. Select the encoding from the list or select Custom and define it manually. This field is compulsory for database data handling.
Compress the data	Select the Compress the data check box to compress the output data. Hadoop provides different compression formats that help reduce the space needed for storing files and speed up data transfer. When reading a compressed file, the Studio needs to uncompress it before being able to feed it to the input flow.
Merge result to single file	Select this check box to merge the final part files into a single file and put that file in a specified directory. Once you have selected it, enter the path to, or browse to, the folder in which you want to store the merged file. This directory is automatically created if it does not exist. The following check boxes are used to manage the source and the target files: Remove source dir: select this check box to remove the source files after the merge. Override target file: select this check box to override the file already existing in the target location. This option does not override the folder. This option is not available for a Sequence file.

Advanced settings

Advanced separator (for number)	Select this check box to change the separator used for numbers. By default, the thousands separator is a comma (,) and the decimal separator is a period (.). This option is not available for a Sequence file.
CSV options	Select this check box to include CSV specific parameters such as Escape char and Text enclosure.

Usage

Usage rule	This component is used as an end component and requires an input link. This component, along with the Spark Batch component Palette it belongs to, appears only when you are creating a Spark Batch Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs.
Spark Connection	You need to use the Spark Configuration tab in the Run view to define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files: Yarn mode: when using Google Dataproc, specify a bucket in the Google Storage staging bucket field in the Spark configuration tab; when using other distributions, use a tHDFSConfiguration component to specify the directory. Standalone mode: you need to choose the configuration component depending on the file system you are using, such as tHDFSConfiguration or tS3Configuration. This connection is effective on a per-Job basis.

Related scenarios

No scenario is available for the Spark Batch version of this component yet.

tFileOutputDelimited properties for Apache Spark Streaming

These properties are used to configure tFileOutputDelimited running in the Spark Streaming Job framework.

The Spark Streaming tFileOutputDelimited component belongs to the File family.

The component in this framework is available only if you have subscribed to Talend Real-time Big Data Platform or Talend Data Fabric.

Basic settings

Define a storage configuration component	Select the configuration component to be used to provide the configuration information for the connection to the target file system such as HDFS. If you leave this check box clear, the target file system is the local system. The configuration component to be used must be present in the same Job. For example, if you have dropped a tHDFSConfiguration component in the Job, you can select it to write the result in a given HDFS system.
Property type	Either Built-In or Repository.
	Built-In: No property data stored centrally.
	Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop Cluster node of the Repository tree. The fields that follow are pre-filled using the fetched data. For further information about the Hadoop Cluster node, see the Getting Started Guide.
	Click this icon to open a database connection wizard and store the database connection parameters you set in the component Basic settings view. For more information about setting up and storing database connection parameters, see Talend Studio User Guide.
Schema and Edit Schema	A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available: View schema: choose this option to view the schema only. Change to built-in property: choose this option to change the schema to Built-in for local changes. Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.
	Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.
	Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.
Folder	Browse to, or enter the path pointing to the data to be used in the file system. Note that this path must point to a folder rather than a file. The button for browsing does not work with the Spark Local mode; if you are using the Spark Yarn or the Spark Standalone mode, ensure that you have properly configured the connection in a configuration component in the same Job, such as tHDFSConfiguration.
Action	Select an operation for writing data: Create: Creates a file and writes data to it. Overwrite: Overwrites the file existing in the directory specified in the Folder field.
Row separator	The separator used to identify the end of a row.
Field separator	Enter a character, string, or regular expression to separate fields in the transferred data.
Include Header	Select this check box to include the column header in the file.
Custom encoding	You may encounter encoding issues when you process the stored data. In that situation, select this check box to display the Encoding list. Select the encoding from the list or select Custom and define it manually. This field is compulsory for database data handling.
Compress the data	Select the Compress the data check box to compress the output data. Hadoop provides different compression formats that help reduce the space needed for storing files and speed up data transfer. When reading a compressed file, the Studio needs to uncompress it before being able to feed it to the input flow.
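
To make the roles of Row separator, Field separator and Include Header concrete, the following plain-Java sketch assembles delimited text from a header and a few rows. It only illustrates the layout of the output, not how the component writes to the target file system; `DelimitedRowsSketch`, `render` and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Talend code: builds delimited text in memory.
class DelimitedRowsSketch {

    static String render(List<String> header, List<List<String>> rows,
                         String fieldSeparator, String rowSeparator,
                         boolean includeHeader) {
        StringBuilder out = new StringBuilder();
        if (includeHeader) {
            // The header line uses the same separators as the data rows.
            out.append(String.join(fieldSeparator, header)).append(rowSeparator);
        }
        for (List<String> row : rows) {
            // Fields are joined by the field separator; each row ends with the row separator.
            out.append(String.join(fieldSeparator, row)).append(rowSeparator);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("id", "name");
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "Ada"),
                Arrays.asList("2", "Linus"));
        // Semicolon as field separator, newline as row separator, header included.
        System.out.print(render(header, rows, ";", "\n", true));
    }
}
```

In this sketch, clearing the header flag simply drops the first line; the data rows are unchanged.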

Advanced settings

Advanced separator (for number)	Select this check box to change the separator used for numbers. By default, the thousands separator is a comma (,) and the decimal separator is a period (.). This option is not available for a Sequence file.
CSV options	Select this check box to include CSV specific parameters such as Escape char and Text enclosure.
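
The effect of the Escape char and Text enclosure parameters can be sketched in plain Java as follows. The source does not describe the component's exact escaping rules, so this is an assumption about one common convention: the field is wrapped in the enclosure character, and any enclosure or escape character inside it is prefixed with the escape character. `CsvEnclosureSketch` and `enclose` are hypothetical names.

```java
// Hypothetical illustration, not Talend code: wraps one field in a text enclosure.
class CsvEnclosureSketch {

    static String enclose(String field, char enclosure, char escape) {
        StringBuilder sb = new StringBuilder().append(enclosure);
        for (char c : field.toCharArray()) {
            // Assumed rule: prefix the enclosure and escape characters with the escape character.
            if (c == enclosure || c == escape) {
                sb.append(escape);
            }
            sb.append(c);
        }
        return sb.append(enclosure).toString();
    }

    public static void main(String[] args) {
        // Backslash as escape character, double quote as text enclosure.
        System.out.println(enclose("say \"hi\"", '"', '\\'));
        // Using the enclosure itself as escape character doubles embedded quotes.
        System.out.println(enclose("a\"b", '"', '"'));
    }
}
```

Enclosing a field this way lets it contain the field separator itself, for example a semicolon inside a text value, without breaking the column layout.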

Usage

Usage rule	This component is used as an end component and requires an input link. This component, along with the Spark Streaming component Palette it belongs to, appears only when you are creating a Spark Streaming Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs.
Spark Connection	You need to use the Spark Configuration tab in the Run view to define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files: Yarn mode: when using Google Dataproc, specify a bucket in the Google Storage staging bucket field in the Spark configuration tab; when using other distributions, use a tHDFSConfiguration component to specify the directory. Standalone mode: you need to choose the configuration component depending on the file system you are using, such as tHDFSConfiguration or tS3Configuration. This connection is effective on a per-Job basis.

Related scenarios

No scenario is available for the Spark Streaming version of this component yet.

Source: Talend documentation, https://help.talend.com
