
Warning
This component will be available in the Palette of
Talend Studio on the condition that you have subscribed to one of
the Talend Platform products.
Component family |
Data Quality |
|
Function |
tFindRegexlibExpressions connects |
|
Purpose |
tFindRegexlibExpressions returns |
|
Basic settings |
Schema and Edit |
These fields are read-only. The schema of this component contains |
|
Regexp Substring |
Define a regular expression substring you want to use as a filter |
|
Key Words |
Enter the key word(s) you want to use as a filter on the regular |
|
Min Rate |
Define a regular expression rating you want to use as a filter on |
|
Relative path |
Type in the relative path pointing to the pattern folder you need In order to create definitely the pattern folder in the DQ |
Advanced settings |
tStat |
Select this check box to collect log data at the component |
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see Talend Studio |
|
Usage |
This component is a start component. It requires an output flow, For more information about importing patterns, see Talend Studio User |
|
Limitation |
n/a |
This scenario is a three-component Java Job created in Talend Studio.
This scenario:
- uses the tFindRegexlibExpressions component to connect to a web server and collect all regular expressions that have the word “email” in their description field,
- uses the tMap component to reorganize the incoming data in the output flow and to concatenate two fields from the incoming data flow into one output column,
- and finally writes all collected expressions to a CSV file.
This Job can also be generated automatically from the Patterns > Regex node in the DQ Repository tree view. For further information about how to generate a Job to retrieve regular expressions, see the Talend Studio User Guide.
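The filtering that this scenario asks of tFindRegexlibExpressions (keep patterns whose description mentions “email” and whose rating is at least 1, as set in the configuration steps) can be pictured with a small Java sketch. This is only an illustration of the selection logic, not the component's actual implementation: the `PatternRow` and `PatternFilterSketch` classes, the case-insensitive keyword match, and the sample rows are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for one row of the component's read-only schema;
// only the fields needed for filtering are modelled here.
final class PatternRow {
    final String title;
    final String expression;
    final String description;
    final int rating;

    PatternRow(String title, String expression, String description, int rating) {
        this.title = title;
        this.expression = expression;
        this.description = description;
        this.rating = rating;
    }
}

final class PatternFilterSketch {
    // Keeps rows whose description contains the keyword and whose rating is
    // at least minRate. Case-insensitive matching is an assumption.
    static List<PatternRow> filter(List<PatternRow> rows, String keyword, int minRate) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<PatternRow> kept = new ArrayList<>();
        for (PatternRow row : rows) {
            boolean keywordHit = row.description != null
                    && row.description.toLowerCase(Locale.ROOT).contains(needle);
            if (keywordHit && row.rating >= minRate) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not data returned by any real server.
        List<PatternRow> rows = Arrays.asList(
                new PatternRow("Simple email", "^\\S+@\\S+$", "Matches a basic email address", 3),
                new PatternRow("Unrated email", "^.+@.+$", "Loose email check", 0),
                new PatternRow("US ZIP", "^\\d{5}$", "Five-digit ZIP code", 4));
        for (PatternRow row : filter(rows, "email", 1)) {
            System.out.println(row.title);
        }
    }
}
```

With the sample rows above, only the first row passes both conditions: the second mentions “email” but is rated 0, and the third is rated 4 but does not mention “email”.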

- Drop the following components from the Palette onto the design workspace: tFindRegexlibExpressions, tMap, and tFileOutputDelimited.
- Double-click the tFindRegexlibExpressions component to open its Basic settings view and define its properties.
  The schema of this component is read-only and contains the following fields: Title, Expression, Description, Matches, Non-Matches, Author, Rating and Relative_path.
- In the Regexp Substring field, define a regular expression substring you want to use as a filter on the regular expression list.
- In the Key Words field, define the key word(s) you want to use as a filter on the regular expression list.
- In the Min Rate field, define a regular expression rating you want to use as a filter on the regular expression list.
- In the Relative path field, type in the relative path pointing to the folder to be created in the Patterns > Regex node of the DQ Repository tree view for the retrieved patterns. In this example, this folder is email.
  In this scenario, we want tFindRegexlibExpressions to collect all regular expressions on the web server that have the word “email” in their Description field and whose rating is at least 1.
- Connect tFindRegexlibExpressions and tMap using a Main row link.
- Double-click the tMap component to open the Map Editor and carry out the necessary field reorganization and concatenation.
- In the Map Editor, click the plus button in the upper-right corner to open a dialog box where you can give a name to the new output table, regex in this scenario.
  This creates a new link with the same name in the tMap component, which you can use to connect tMap to the next component.
- In the lower-right corner of the Map Editor, click the plus button to define the fields in the regex output table.
- In the upper half of the Map Editor, drop fields from the input table to fill the fields of the output schema as necessary. For more information regarding data mapping, see the Talend Studio User Guide.
  In this scenario, we want to concatenate the Matches and Non-Matches fields from the incoming data flow into one output column: Purpose. We also want a new column in the output schema called Path. Finally, we do not want any rating-related information in the output schema.
- Click OK to validate and close the Map Editor.
- Right-click tMap and select the regex link to connect tMap to tFileOutputDelimited.
- Double-click tFileOutputDelimited to display its Basic settings and define its properties.
- Click the three-dot button next to the File Name field to browse to the file where you want to write the output data.
- Define the row and field separators in the corresponding fields.
- Select the Append check box if you want to add the new rows at the end of the existing records.
- Select the Include Header check box to include column headers in the output data.
- If needed, click Edit schema to view the input and output data flows.
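The mapping configured in tMap for this scenario (Matches and Non-Matches concatenated into Purpose, a new Path column, no rating information) can be sketched in plain Java as follows. This is a rough illustration under stated assumptions rather than the code Talend Studio generates: the `RegexMappingSketch` class is hypothetical, the ` | ` separator used in the concatenation is an arbitrary choice, the remaining columns kept in the output are a guess, and filling Path from the Relative_path input field is an assumption, since the text does not say where Path gets its value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RegexMappingSketch {
    // Treats a missing input value as an empty string so that the
    // concatenation never produces the literal text "null".
    private static String orEmpty(String value) {
        return value == null ? "" : value;
    }

    // Builds one row of the "regex" output table from one input row.
    static Map<String, String> toRegexRow(Map<String, String> in) {
        Map<String, String> out = new LinkedHashMap<>();
        out.put("Title", in.get("Title"));
        out.put("Expression", in.get("Expression"));
        out.put("Description", in.get("Description"));
        // Concatenation of the two example fields; the separator is an assumption.
        out.put("Purpose", orEmpty(in.get("Matches")) + " | " + orEmpty(in.get("Non-Matches")));
        out.put("Author", in.get("Author"));
        // Assumption: Path is filled from the Relative_path input field.
        out.put("Path", in.get("Relative_path"));
        // Rating is deliberately not copied to the output.
        return out;
    }
}
```

A `LinkedHashMap` is used so that the output columns keep the order in which they are defined, which mirrors how the column order of the output table carries through to the delimited file.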
Save your Job and press F6 to execute it.
tFindRegexlibExpressions connects to the web server and collects all regular expressions that match the request. tMap performs the defined field reorganization and concatenation and passes the output flow to tFileOutputDelimited. The output file will look something like the following:

You can later import all collected regular expressions from a well-formatted CSV file into Talend Studio. For more information about importing patterns, see the Talend Studio User Guide.
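As a rough illustration of how the row separator, field separator, Include Header and Append options described for tFileOutputDelimited shape the output file, the following Java sketch formats rows into delimited text and writes them to a file. It is not the component's implementation: the `DelimitedWriterSketch` class and its method names are hypothetical, and UTF-8 encoding is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

final class DelimitedWriterSketch {
    // Joins each row with the field separator and ends it with the row
    // separator; the header line is written only when includeHeader is true.
    static String formatRows(List<String> header, List<List<String>> rows,
                             String fieldSep, String rowSep, boolean includeHeader) {
        StringBuilder sb = new StringBuilder();
        if (includeHeader) {
            sb.append(String.join(fieldSep, header)).append(rowSep);
        }
        for (List<String> row : rows) {
            sb.append(String.join(fieldSep, row)).append(rowSep);
        }
        return sb.toString();
    }

    // Writes the content to the file, either appending to the existing
    // records or replacing them.
    static void write(Path file, String content, boolean append) throws IOException {
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        if (append) {
            Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } else {
            Files.write(file, bytes); // creates the file or truncates an existing one
        }
    }
}
```

Note that this sketch does no quoting or escaping. Regular expressions often contain characters such as commas or semicolons, so the field separator should be one that does not occur in the collected values, otherwise the file cannot be split back into columns reliably.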