Configuring the Job for aggregating values based on dynamic schema
Configure the Job to aggregate task assignment data in a CSV
file based on a dynamic schema column using the tAggregateRow component.
The Job then displays the aggregated data on the console
using the tLogRow component and writes it to an
output CSV file using the tFileOutputDelimited
component.
- Double-click the tFileInputDelimited component to open its Basic settings view.
- In the File name/Stream field, specify the path to the CSV file that holds the following task assignment data, D:/tasks.csv in this example.

    task;team;status
    task1;team1;done
    task2;team2;done
    task3;team1;done
    task4;team2;pending
    task5;team1;pending
    task6;team2;pending

- In the Header field, enter the number of rows to be skipped at the beginning of the file, 1 in this example.
  Note that the dynamic schema feature is only supported in the Built-In mode and requires the input file to have a header row.
- Click the button next to Edit schema to open the schema dialog box and define the schema by adding two columns, task of String type and other of Dynamic type. When done, click OK to save the changes and close the schema dialog box.
  Note that the dynamic column must be defined in the last row of the schema. For more information about dynamic schema, see Talend Studio User Guide.
- Double-click the tAggregateRow component, and on its Basic settings view, click the Sync columns button to retrieve the schema from the preceding component.
- Add one row in the Group by table by clicking the button below it, and select other from both the Output column and Input column position fields to group the input data by the other dynamic column.
  Note that aggregation on a dynamic column is supported only for the grouping operation.
- Add one row in the Operations table and define the operation to be carried out. In this example, the operation function is list. Then select task from both the Output column and Input column position fields to list the entries of the task column in the grouping result.
- Double-click the tLogRow component to open its Basic settings view, and then select Table (print values in cells of a table) in the Mode area for better readability of the result.
- Double-click the tFileOutputDelimited component to open its Basic settings view, and in the File Name field, specify the path to the CSV file into which the aggregated data will be written, D:/tasks_aggregated.csv in this example.
- Select the Include Header check box to include the header of each column in the CSV file.
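
The effect of the settings above can be sketched outside Talend Studio. The following Python snippet is not Talend code and does not show how tAggregateRow is implemented. It only illustrates, under stated assumptions, what grouping by the dynamic column (every column other than task) and applying a list operation to task does to the sample data. The function names aggregate_tasks and to_csv, the comma used to join the listed values, and the output column order are assumptions made for illustration.

```python
import csv
import io

# Sample input taken from the task assignment data in this example.
SAMPLE = """task;team;status
task1;team1;done
task2;team2;done
task3;team1;done
task4;team2;pending
task5;team1;pending
task6;team2;pending
"""


def aggregate_tasks(text, key_column="task", sep=";"):
    """Group rows by all columns except key_column and list key_column values.

    The columns other than key_column stand in for the dynamic column:
    they are discovered from the header row, not hard-coded.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    dynamic_columns = [name for name in reader.fieldnames if name != key_column]

    groups = {}  # dicts preserve insertion order, so groups appear in first-seen order
    for row in reader:
        group_key = tuple(row[name] for name in dynamic_columns)
        groups.setdefault(group_key, []).append(row[key_column])

    header = [key_column] + dynamic_columns
    # Assumption: listed values are joined with a comma.
    rows = [[",".join(tasks)] + list(key) for key, tasks in groups.items()]
    return header, rows


def to_csv(header, rows, sep=";"):
    """Serialize the aggregated rows with a header line, as Include Header would."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=sep, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


header, rows = aggregate_tasks(SAMPLE)
print(to_csv(header, rows))
```

With the sample data, this sketch groups task1 and task3 under team1 and done, and task4 and task6 under team2 and pending. The exact layout and delimiter of the listed values in the actual Job output may differ.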