Component family |
Data Quality |
|
Function |
tIntervalMatch receives a main |
|
Purpose |
Returns a value from the lookup flow based on a Join relation. |
|
Basic settings |
Schema and Edit |
A schema is a row description. It defines the number of fields to be processed and passed on Since version 5.6, both the Built-In mode and the Repository mode are |
|
|
Built-in: The schema will be |
|
|
Repository: The schema already |
Click Edit schema to make changes to the schema. If the
|
Search Column |
Select the main flow column containing the values to be matched |
|
|
Column (LOOKUP) |
Select the lookup flow column containing the values to be returned |
|
Lookup Column (min) / Include the bound |
Select the column containing the minimum value of the range. |
|
Lookup Column (max) / Include the bound |
Select the column containing the maximum value of the range. |
Advanced settings |
tStatCatcher |
Select this check box to collect log data at the component |
Global Variables |
NB_LINE: the number of rows read by an input component or ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see Talend Studio |
|
Usage |
This component handles flow of data therefore it requires input |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
|
Limitation |
n/a |
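The range lookup configured through Search Column, Column (LOOKUP), and the two min/max lookup columns can be pictured with a small sketch. It is written in Python purely for illustration and is not the code Talend generates. The function name interval_match, its parameters, and the choice to return None when no range contains the value are all assumptions made for this example.

```python
from typing import Any, Iterable, Mapping, Optional


def interval_match(
    value: Any,
    lookup_rows: Iterable[Mapping[str, Any]],
    min_col: str,
    max_col: str,
    return_col: str,
    include_min: bool = True,
    include_max: bool = True,
) -> Optional[Any]:
    """Return return_col from the first lookup row whose range contains value."""
    for row in lookup_rows:
        low, high = row[min_col], row[max_col]
        # The include_* flags mimic the "Include the bound" check boxes.
        above_min = value >= low if include_min else value > low
        below_max = value <= high if include_max else value < high
        if above_min and below_max:
            return row[return_col]
    return None  # assumption: a value outside every range yields no result


# Hypothetical lookup data, not taken from the scenario.
ranges = [
    {"lo": 0, "hi": 10, "label": "low"},
    {"lo": 10, "hi": 20, "label": "high"},
]

print(interval_match(10, ranges, "lo", "hi", "label"))                     # low
print(interval_match(10, ranges, "lo", "hi", "label", include_max=False))  # high
print(interval_match(25, ranges, "lo", "hi", "label"))                     # None
```

The second call shows why the bound options matter: with the upper bound excluded, the value 10 no longer falls in the first range and is picked up by the second one instead.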
This scenario describes a four-component Job that checks the server IP addresses
listed in the main input file against a list of IP ranges given in a lookup file to
identify the hosting country for each server.
The Job requires two tFileInputDelimited components, a tIntervalMatch component, and a
tLogRow component.
-
Drop the components onto the design workspace.
-
Connect the components using Row > Main connections. Note that the connection from the
second tFileInputDelimited component to the tIntervalMatch component will appear as a
Lookup connection.
-
Double-click the first tFileInputDelimited component to open its Basic settings view.
-
Browse to the file to be used as the main input, which provides a list of
servers and their IP addresses:
    Server;IP
    Server1;057.010.010.010
    Server2;001.010.010.100
    Server3;057.030.030.030
    Server4;053.010.010.100
-
Click the […] button next to Edit schema to open the [Schema] dialog box and define the
input schema. According to the input file structure, the schema consists of two columns,
Server and IP, both of type String. Then click OK to close the dialog box.
-
Define the number of header rows to be skipped (one in this example, since the file
starts with the Server;IP header row), and keep the other settings as they are.
-
Define the properties of the second tFileInputDelimited component similarly.
The file to be used as the input to the lookup flow in this example lists
some IP address ranges and the corresponding countries:
    StartIP;EndIP;Country
    001.000.000.000;001.255.255.255;USA
    002.006.190.056;002.006.190.063;UK
    011.000.000.000;011.255.255.255;USA
    057.000.000.000;057.255.255.255;France
    012.063.178.060;012.063.178.063;Canada
    053.000.000.000;053.255.255.255;Germany
Accordingly, the schema of the lookup flow should have three columns matching the
file header: StartIP, EndIP and Country.
-
Double-click the tIntervalMatch component to open its Basic settings view.
-
From the Search Column list, select the main flow column containing the values to be
matched with the range values. In this example, we want to match the servers’ IP
addresses with the range values from the lookup flow.
-
From the Column (LOOKUP) list, select the lookup column that holds the values to be
returned. In this example, we want to get the names of countries where the servers
are hosted.
-
Set the min and max lookup columns corresponding to the range bounds
defined in the lookup schema, StartIP and
EndIP respectively in this
example.
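To make the expected result of this configuration concrete, the sketch below replays the scenario's two sample files in plain Python. It is an illustration, not the Job itself: the helper names read_delimited and country_for are hypothetical, both range bounds are assumed to be included, and an address outside every range is assumed to yield None. It also relies on the IP values being zero-padded strings, so that ordinary string comparison orders them the same way as their numeric values.

```python
import csv
import io

# Sample data copied from the scenario; in the Job these are two delimited files.
MAIN_FILE = """Server;IP
Server1;057.010.010.010
Server2;001.010.010.100
Server3;057.030.030.030
Server4;053.010.010.100
"""

LOOKUP_FILE = """StartIP;EndIP;Country
001.000.000.000;001.255.255.255;USA
002.006.190.056;002.006.190.063;UK
011.000.000.000;011.255.255.255;USA
057.000.000.000;057.255.255.255;France
012.063.178.060;012.063.178.063;Canada
053.000.000.000;053.255.255.255;Germany
"""


def read_delimited(text):
    """Parse semicolon-delimited text, using the first row as column names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def country_for(ip, ranges):
    """Return the Country of the first range containing ip (bounds included)."""
    for r in ranges:
        # Zero-padded octets make string comparison agree with numeric order.
        if r["StartIP"] <= ip <= r["EndIP"]:
            return r["Country"]
    return None  # assumption: unmatched addresses produce no country


servers = read_delimited(MAIN_FILE)
ranges = read_delimited(LOOKUP_FILE)
results = [(s["Server"], s["IP"], country_for(s["IP"], ranges)) for s in servers]

for server, ip, country in results:
    print(server, ip, country)
```

With the sample data, Server1 and Server3 fall in the 057 range (France), Server2 in the 001 range (USA), and Server4 in the 053 range (Germany).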