Component family |
Processing |
|
Function |
tFilterRow filters input rows by |
|
Purpose |
tFilterRow helps parametrizing |
|
Basic settings |
Schema and Edit |
A schema is a row description, it defines the number of fields to The schema of this component is read-only. This component offers the advantage of the dynamic schema feature. This allows you to This dynamic schema feature is designed for the purpose of retrieving unknown columns |
|
Logical operator used to combine conditions |
Select a logical operator to combine simple conditions and to
And: returns the boolean value of
Or: returns the boolean value of |
|
Conditions |
Click the plus button to add as many simple conditions as needed.
Input column: Select the column of
Function: Select the function on
Operator: Select the operator to
Value: Type in the filtered value, |
|
Use advanced mode |
Select this check box when the operations you want to perform If multiple advanced conditions are defined, use a logical
|
Advanced settings |
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Note that this check box is not available in |
Global Variables |
ERROR_MESSAGE: the error message generated by the
NB_LINE: the number of rows read by an input component or
NB_LINE_OK: the number of rows matching the filter. This
NB_LINE_REJECTED: the number of rows rejected. This is an
A Flow variable functions during the execution of a component while an After variable
To fill up a field or expression with a variable, press Ctrl +
For further information about variables, see Talend Studio |
|
Usage |
This component is not startable (green background) and it requires |
|
Usage in Map/Reduce Jobs |
If you have subscribed to one of the Talend solutions with Big Data, you can also For further information about a Talend Map/Reduce Job, see the sections Note that in this documentation, unless otherwise explicitly stated, a scenario presents |
|
Usage in Storm Jobs |
If you have subscribed to one of the Talend solutions with Big Data, you can also The Storm version does not support the use of the global variables. You need to use the Storm Configuration tab in the This connection is effective on a per-Job basis. For further information about a Talend Storm Job, see the sections Note that in this documentation, unless otherwise explicitly stated, a scenario presents |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
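The After variables listed under Global Variables above are typically read in Java expressions once the component has finished. The sketch below illustrates the idea outside the Studio. It assumes the usual Talend convention of a `globalMap` keyed by `<component name>_<variable>`; the component name `tFilterRow_1`, the helper class and the counter values are hypothetical, and a plain `HashMap` stands in for the map that a generated Job would provide.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariableSketch {
    // Builds the key under which a component variable is assumed to be stored,
    // for example "tFilterRow_1_NB_LINE_OK".
    static String key(String componentName, String variable) {
        return componentName + "_" + variable;
    }

    // Reads an Integer counter; returns null if the variable has not been set.
    static Integer readCount(Map<String, Object> globalMap,
                             String componentName, String variable) {
        return (Integer) globalMap.get(key(componentName, variable));
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap; the values are made up for illustration.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFilterRow_1_NB_LINE", 15);
        globalMap.put("tFilterRow_1_NB_LINE_OK", 6);
        globalMap.put("tFilterRow_1_NB_LINE_REJECTED", 9);

        System.out.println("read:     " + readCount(globalMap, "tFilterRow_1", "NB_LINE"));
        System.out.println("accepted: " + readCount(globalMap, "tFilterRow_1", "NB_LINE_OK"));
        System.out.println("rejected: " + readCount(globalMap, "tFilterRow_1", "NB_LINE_REJECTED"));
    }
}
```

Inside a Job, the equivalent expression would look like `((Integer) globalMap.get("tFilterRow_1_NB_LINE_OK"))`, with the component name adjusted to the label shown in your design workspace.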
The following scenario shows a Job that uses simple conditions to filter a list of
records. This scenario will output two tables: the first will list all male persons with
a last name shorter than nine characters and aged between 10 and 80 years; the second
will list all rejected records. An error message for each rejected record will display
in the same table to explain why such a record has been rejected.
-
Drop tFixedFlowInput, tFilterRow and tLogRow from the Palette
onto the design workspace. -
Connect the tFixedFlowInput to the
tFilterRow, using a Row > Main
link. Then, connect the tFilterRow to the
tLogRow, using a Row > Filter
link. -
Drop tLogRow from the Palette onto the design workspace and rename it
as reject. Then, connect the tFilterRow to the reject, using a
Row > Reject link. -
Label the components to better identify their roles in the Job.
-
Double-click tFixedFlowInput to display
its Basic settings view and define its
properties. -
Click the […] button next to Edit schema to define the schema for the input
data. In this example, the schema is made of the following four columns:
LastName (type String), Gender
(type String), Age (type Integer) and
City (type String). When done, click OK to validate the
schema setting and close the dialog box. A new dialog box opens and asks
whether you want to propagate the schema. Click Yes. -
Set the row and field separators in the corresponding fields if needed. In
this example, use the default settings for both: the row separator is
a carriage return and the field separator is a semicolon. -
Select the Use Inline Content(delimited
file) option in the Mode
area and type in the input data in the Content field. The input data used in this example is shown
below:
Van Buren;M;73;Chicago
Adams;M;40;Albany
Jefferson;F;66;New York
Adams;M;9;Albany
Jefferson;M;30;Chicago
Carter;F;26;Chicago
Harrison;M;40;New York
Roosevelt;F;15;Chicago
Monroe;M;8;Boston
Arthur;M;20;Albany
Pierce;M;18;New York
Quincy;F;83;Albany
McKinley;M;70;Boston
Coolidge;M;4;Chicago
Monroe;M;60;Chicago
-
Double-click tFilterRow to display its
Basic settings view and define its
properties. -
In the Conditions table, add four
conditions and fill in the filtering parameters. -
From the InputColumn list field
of the first row, select LastName, from the
Function list field, select
Length, from the Operator list field, select Lower than, and in the Value column, type in
9 to accept only last names shorter than nine
characters. -
From the InputColumn list field
of the second row, select Gender, from the
Operator list field, select
Equals, and in the Value column, type in
M in double quotes to filter records of
male persons.
Warning: In the Value field, you must
type in your values between double quotes for all types of
values, except for integer values, which do not need
quotes. -
From the InputColumn list field
of the third row, select Age, from the
Operator list field, select
Greater than, and in the
Value column, type in
10 to set the lower limit to 10
years. -
From the InputColumn list field
of the fourth row, select Age, from the Operator list field, select Lower than, and in the Value column, type in
80 to set the upper limit to 80
years.
-
-
To combine the conditions, select And so
that only those records that meet all the defined conditions are
accepted. -
In the Basic settings view of each tLogRow component, select Table (print values in cells of a table) in the Mode area.
-
Save your Job and press F6 to execute
it. In the output, the first table lists the records of male persons aged
between 10 and 80 years whose last names are shorter than nine
characters, and the second table lists all the records that do not match the
filter conditions. Each rejected record has a corresponding error message
that explains the reason for the rejection.
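The filtering logic configured in the steps above can be sketched in plain Java. This is an illustration rather than the code Talend generates: the class and method names are hypothetical, and the wording of the rejection messages is invented for the example. The four conditions are combined with a logical AND, so a row is accepted only when the list of failed conditions is empty.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleFilterSketch {
    // Returns the conditions a row fails; an empty list means the row is accepted.
    // The message wording is illustrative, not the component's actual output.
    static List<String> failedConditions(String lastName, String gender, int age) {
        List<String> errors = new ArrayList<>();
        if (!(lastName.length() < 9)) errors.add("LastName length is not lower than 9");
        if (!"M".equals(gender))      errors.add("Gender does not equal \"M\"");
        if (!(age > 10))              errors.add("Age is not greater than 10");
        if (!(age < 80))              errors.add("Age is not lower than 80");
        return errors;
    }

    public static void main(String[] args) {
        // Inline content from the scenario, one row per element.
        String[] rows = {
            "Van Buren;M;73;Chicago", "Adams;M;40;Albany", "Jefferson;F;66;New York",
            "Adams;M;9;Albany", "Jefferson;M;30;Chicago", "Carter;F;26;Chicago",
            "Harrison;M;40;New York", "Roosevelt;F;15;Chicago", "Monroe;M;8;Boston",
            "Arthur;M;20;Albany", "Pierce;M;18;New York", "Quincy;F;83;Albany",
            "McKinley;M;70;Boston", "Coolidge;M;4;Chicago", "Monroe;M;60;Chicago"
        };
        for (String row : rows) {
            String[] f = row.split(";"); // semicolon is the field separator
            List<String> errors = failedConditions(f[0], f[1], Integer.parseInt(f[2]));
            if (errors.isEmpty()) {
                System.out.println("filter: " + row);
            } else {
                System.out.println("reject: " + row + " -> " + String.join(", ", errors));
            }
        }
    }
}
```

Collecting every failed condition, instead of stopping at the first one, mirrors the idea that each rejected record carries a message explaining why it was rejected.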
Based on the previous scenario, this scenario further filters the input data so that
only those records of people from New York and Chicago are accepted. Without changing
the filter settings defined in the previous scenario, advanced conditions are added in
this scenario to enable both logical AND and logical OR operations in the same tFilterRow component.
-
Double-click the tFilterRow component to show
its Basic settings view. -
Select the Use advanced mode check box, and
type in the following expression in the text field:
input_row.City.equals("Chicago") || input_row.City.equals("New York")
This defines two conditions on the City
column of the input data to filter records that contain the cities of Chicago
and New York, and uses a logical OR to combine the two conditions so that
records satisfying either condition will be accepted. -
Press Ctrl+S to save the Job and press
F6 to execute it. In the output, the result list of the previous scenario has been further
filtered, and only the records containing the cities of New York and Chicago are
accepted.
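A minimal Java sketch of the combined filter follows. It assumes, in line with the result described above, that the advanced expression is evaluated as one unit and joined to the simple conditions with the selected And operator. The class and method names are hypothetical.

```java
class AdvancedFilterSketch {
    // Simple conditions kept from the previous scenario, combined with AND.
    static boolean simpleConditions(String lastName, String gender, int age) {
        return lastName.length() < 9 && "M".equals(gender) && age > 10 && age < 80;
    }

    // Advanced expression from this scenario: either city is acceptable.
    static boolean advancedCondition(String city) {
        return city.equals("Chicago") || city.equals("New York");
    }

    // Assumption: the advanced expression is ANDed with the simple conditions.
    static boolean accepts(String lastName, String gender, int age, String city) {
        return simpleConditions(lastName, gender, age) && advancedCondition(city);
    }

    public static void main(String[] args) {
        String[] rows = {
            "Adams;M;40;Albany", "Harrison;M;40;New York", "Arthur;M;20;Albany",
            "Pierce;M;18;New York", "McKinley;M;70;Boston", "Monroe;M;60;Chicago"
        };
        for (String row : rows) {
            String[] f = row.split(";");
            boolean ok = accepts(f[0], f[1], Integer.parseInt(f[2]), f[3]);
            System.out.println((ok ? "filter: " : "reject: ") + row);
        }
    }
}
```

Keeping the OR expression in its own method matters because `&&` binds more tightly than `||` in Java. Writing all conditions in one unparenthesized expression would change which records are accepted.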