Component family |
Orchestration |
|
Function |
Duplicate the incoming schema into two identical output If you have subscribed to one of the Talend solutions with Big Data, you are |
|
Purpose |
Allows you to perform different operations on the same |
|
Basic settings |
Schema and Edit |
A schema is a row description, it defines the number of fields Since version 5.6, both the Built-In mode and the Repository mode are Click Edit schema to make changes to the schema. If the
Click Sync columns to retrieve This component offers the advantage of the dynamic schema feature. This allows you to This dynamic schema feature is designed for the purpose of retrieving unknown columns |
|
|
Built-in: The schema will be |
|
|
Repository: The schema already |
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see Talend Studio |
|
Usage |
This component is not startable (green background), it requires an |
|
Usage in Map/Reduce Jobs |
If you have subscribed to one of the Talend solutions with Big Data, you can also For further information about a Talend Map/Reduce Job, see the sections Note that in this documentation, unless otherwise explicitly stated, a scenario presents |
|
Usage in Storm Jobs |
If you have subscribed to one of the Talend solutions with Big Data, you can also The Storm version does not support the use of the global variables. You need to use the Storm Configuration tab in the This connection is effective on a per-Job basis. For further information about a Talend Storm Job, see the sections Note that in this documentation, unless otherwise explicitly stated, a scenario presents |
|
Connections |
Outgoing links (from this component to another):
Row: Main.
Trigger: Run if; On Component Ok;
Incoming links (from one component to this one):
Row: Main; Reject;
For further information regarding connections, see |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
This scenario describes a Job that reads an input flow containing names and states
from a CSV file, replicates the input flow, sorts the two identical flows by
name and by state respectively, and displays the sorted data on the console.
-
Drop the following components from the Palette to the design workspace: one tFileInputDelimited component, one tReplicate component, two tSortRow components, and two tLogRow components.
-
Connect tFileInputDelimited to tReplicate using a Row > Main link.
-
In the same way, connect tReplicate to each of the two tSortRow
components, and connect each tSortRow to a tLogRow.
-
Label the components to better identify their functions.
-
Double-click the tFileInputDelimited
component to open its Basic settings view
in the Component tab.
-
Click the […] button next to the
File name/Stream field to browse to the
file from which you want to read the input flow. In this example, the input
file is Names&States.csv, which
contains two columns: name and state.
name;state
Andrew Kennedy;Mississippi
Benjamin Carter;Louisiana
Benjamin Monroe;West Virginia
Bill Harrison;Tennessee
Calvin Grant;Virginia
Chester Harrison;Rhode Island
Chester Hoover;Kansas
Chester Kennedy;Maryland
Chester Polk;Indiana
Dwight Nixon;Nevada
Dwight Roosevelt;Mississippi
Franklin Grant;Nebraska
-
Fill in the Header, Footer and Limit fields
according to your needs. In this example, enter 1 in the Header field to
skip the first row of the input file.
-
Click Edit schema to define the data
structure of the input flow.
-
Double-click the first tSortRow component
to open its Basic settings view.
-
In the Criteria panel, click the
[+] button to add one row and set the
sorting parameters for the schema column to be processed. To sort the input
data by name, select name under Schema column. Select alpha as the sorting type and asc as the sorting order. For more information about these parameters, see tSortRow properties.
-
Double-click the second tSortRow
component and repeat the step above to define the sorting parameters for the
state column.
-
In the Basic settings view of each
tLogRow component, select Table in the Mode area for a better view of the Job execution
result.
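The data flow configured in the steps above can be sketched in plain Java. This is only an illustration of the logic (read, skip the header, replicate, sort each copy on a different column), not the code that Talend Studio generates for the Job. The class name ReplicateSketch and its methods read, replicate and sortBy are hypothetical names chosen for this sketch, and plain String ordering is assumed to stand in for the alpha/asc sorting criteria.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: hypothetical names, not Talend-generated code.
final class ReplicateSketch {

    // Splits semicolon-delimited lines into rows (row[0] = name, row[1] = state)
    // and skips the first `header` lines, like a Header value of 1.
    static List<String[]> read(List<String> lines, int header) {
        List<String[]> rows = new ArrayList<>();
        int start = Math.min(header, lines.size());
        for (String line : lines.subList(start, lines.size())) {
            rows.add(line.split(";", -1));
        }
        return rows;
    }

    // Replicates the flow: every output gets its own list holding the same rows.
    static List<List<String[]>> replicate(List<String[]> input, int outputs) {
        List<List<String[]>> flows = new ArrayList<>();
        for (int i = 0; i < outputs; i++) {
            flows.add(new ArrayList<>(input));
        }
        return flows;
    }

    // Returns a copy of the flow sorted in ascending String order on one column.
    static List<String[]> sortBy(List<String[]> flow, int column) {
        List<String[]> sorted = new ArrayList<>(flow);
        sorted.sort(Comparator.comparing((String[] row) -> row[column]));
        return sorted;
    }

    public static void main(String[] args) {
        // A few rows taken from the sample file, deliberately out of order.
        List<String> lines = Arrays.asList(
            "name;state",
            "Dwight Nixon;Nevada",
            "Andrew Kennedy;Mississippi",
            "Calvin Grant;Virginia");

        List<List<String[]>> flows = replicate(read(lines, 1), 2);

        // First copy sorted by name, second copy sorted by state.
        for (String[] row : sortBy(flows.get(0), 0)) {
            System.out.println(String.join(" | ", row));
        }
        for (String[] row : sortBy(flows.get(1), 1)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Because each output of the replication step holds its own list, sorting one copy does not change the order of the other, which is what lets the two tSortRow components apply different criteria to the same input.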