tFindRegexlibExpressions
Returns a dataset holding information about all of the regular expressions that
match the request sent to the web server.
tFindRegexlibExpressions connects to
a web service at http://regexlib.com to get a list of regular expressions for all languages, even those that are not
supported by Talend.
tFindRegexlibExpressions Standard properties
These properties are used to configure tFindRegexlibExpressions running in the Standard Job framework.
The Standard
tFindRegexlibExpressions component belongs to the Data Quality family.
This component is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.
Basic settings
Schema and Edit |
These fields are read-only. The schema of this component contains |
Regexp Substring |
Define a regular expression substring you want to use as a filter |
Key Words |
Enter the key word(s) you want to use as a filter on the regular |
Min Rate |
Define a regular expression rating you want to use as a filter on |
Relative path |
Type in the relative path pointing to the pattern folder you need In order to create definitely the pattern folder in the DQ |
Advanced settings
tStat |
Select this check box to collect log data at the component |
Global Variables
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
This component is a start component. It requires an output flow, For more information about importing patterns, see |
Connecting to a web service and returning a list of regular
expressions
This scenario applies only to Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.
This scenario is a three-component Java Job created in Talend Studio.
This scenario:
- uses the tFindRegexlibExpressions component to connect to a web server and collect all regular expressions that have the word “email” in their description field,
- uses the tMap component to reorganize the incoming data in the output flow and to concatenate two fields from the incoming data flow into one output column,
- and finally writes all collected expressions to a CSV file.
In this scenario, we want tFindRegexlibExpressions
to collect all regular expressions on the web server that have the word “email” in their
Description field and whose rating is at least 1.
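The selection rule described above can be sketched in plain Java. This is only an illustration of the filter logic, not the component's implementation or the regexlib.com API: the class `RegexlibFilterSketch`, the `Entry` type and the `filter` method are hypothetical names, the case-insensitive keyword match and the integer rating are assumptions, and the sample rows are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the "keyword in Description and rating >= Min Rate" rule.
final class RegexlibFilterSketch {

    // Hypothetical row type holding three of the schema fields used in this scenario.
    static final class Entry {
        final String title;
        final String description;
        final int rating; // assumption: the rating is a whole number

        Entry(String title, String description, int rating) {
            this.title = title;
            this.description = description;
            this.rating = rating;
        }
    }

    // Keeps the entries whose description contains the keyword
    // (case-insensitive, an assumption) and whose rating is at least minRate.
    static List<Entry> filter(List<Entry> entries, String keyword, int minRate) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            boolean hasKeyword = e.description != null
                    && e.description.toLowerCase(Locale.ROOT).contains(needle);
            if (hasKeyword && e.rating >= minRate) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Invented sample data, for illustration only.
        List<Entry> sample = Arrays.asList(
                new Entry("Email check", "Validates an email address", 3),
                new Entry("Unrated email", "Loose email matcher", 0),
                new Entry("US phone", "Matches US phone numbers", 5));
        for (Entry e : filter(sample, "email", 1)) {
            System.out.println(e.title);
        }
    }
}
```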
This Job can also be generated automatically from the Patterns > Regex node in the DQ Repository tree view. For further information about how to generate
a Job to retrieve regular expressions, see the
Talend Studio User Guide.
Configuring the tFindRegexlibExpressions component
- Drop the following components from the Palette onto the design workspace: tFindRegexlibExpressions, tMap, and tFileOutputDelimited.
- Double-click the tFindRegexlibExpressions component to open its Basic settings view and define its properties. The schema of this component is read-only and contains the following fields: Title, Expression, Description, Matches, Non-Matches, Author, Rating and Relative_path.
- In the Regexp Substring field, define a regular expression substring you want to use as a filter on the regular expression list.
- In the Key Words field, define the key word(s) you want to use as a filter on the regular expression list.
- In the Min Rate field, define a regular expression rating you want to use as a filter on the regular expression list.
- In the Relative path field, type in the relative path pointing to the folder to be created in the Patterns > Regex node of the DQ Repository tree view for the retrieved patterns. In this example, this folder is email.
- Connect tFindRegexlibExpressions and tMap using a Main row link.
Configuring the tMap component
- Double-click the tMap component to open the Map Editor and do the necessary field reorganization and concatenation.
- In the Map Editor, click the plus button in the upper-right corner to open a dialog box where you can give a name to the new output table, regex in this scenario. This creates a new link in the tMap component with the same name, which you can use to connect tMap to the next component.
- In the lower-right corner of the Map Editor, click the plus button to define the fields in the regex output table.
- In the upper half of the Map Editor, drop fields from the input table to fill the fields of the output schema as necessary. For more information regarding data mapping, see Talend Studio User Guide. In this scenario, we want to concatenate the Matches and Non-Matches fields from the incoming data flow into one output column: Purpose. We also want a new column in the output schema called Path. Finally, we do not want any rating-related information in the output schema.
- Click Ok to validate and close the Map Editor.
- Right-click tMap and select the regex link to connect tMap to tFileOutputDelimited.
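Because tMap expressions are written in Java, the concatenation of Matches and Non-Matches into Purpose can be sketched as a small Java method. This is a minimal illustration under stated assumptions: the class `TMapConcatSketch` and the method `purpose` are hypothetical names, and the label-and-separator layout and the handling of null values are choices made for the example, not something this scenario prescribes.

```java
// Hypothetical sketch of the Matches + Non-Matches concatenation done in tMap.
final class TMapConcatSketch {

    // Builds the Purpose value from the two incoming fields.
    // Assumption: a null input is treated as an empty string.
    static String purpose(String matches, String nonMatches) {
        String m = matches == null ? "" : matches;
        String n = nonMatches == null ? "" : nonMatches;
        // Assumption: the labels and the " | " separator are illustrative only.
        return "Matches: " + m + " | Non-Matches: " + n;
    }

    public static void main(String[] args) {
        // Invented sample values, for illustration only.
        System.out.println(purpose("john@example.com", "john.example.com"));
    }
}
```

In the Map Editor, the equivalent logic would go in the expression cell of the Purpose column, using the input row and column names of your own Job.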
Configuring the output component
- Double-click tFileOutputDelimited to display its Basic settings and define its properties.
- Click the three-dot button next to the File Name field to browse to the file where you want to write the output data.
- Define the row and field separators in the corresponding fields.
- Select the Append check box if you want to add the new rows at the end of the records.
- Select the Include Header check box to include column headers in the output data.
- If needed, click Edit schema to view the input and output data flows.
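The effect of the separator, Append and Include Header settings can be sketched with standard Java file I/O. This is not the code that tFileOutputDelimited generates: the class `DelimitedWriterSketch`, the method `write`, the file name and the sample rows are hypothetical, and skipping the header when appending to a non-empty file is an assumption made for the example. The sketch does no quoting, so it assumes that values never contain the separators.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of writing rows to a delimited file.
final class DelimitedWriterSketch {

    static void write(Path file, String[] header, List<String[]> rows,
                      String fieldSep, String rowSep,
                      boolean append, boolean includeHeader) throws IOException {
        boolean hasContent = Files.exists(file) && Files.size(file) > 0;
        StringBuilder out = new StringBuilder();
        // Assumption: the header is not repeated when appending to a non-empty file.
        if (includeHeader && !(append && hasContent)) {
            out.append(String.join(fieldSep, header)).append(rowSep);
        }
        for (String[] row : rows) {
            out.append(String.join(fieldSep, row)).append(rowSep);
        }
        byte[] bytes = out.toString().getBytes(StandardCharsets.UTF_8);
        if (append) {
            // Create the file if needed and add the rows at the end.
            Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } else {
            // Default options: create the file or overwrite its content.
            Files.write(file, bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("regex_email.csv"); // hypothetical output file
        // Invented sample rows, for illustration only.
        List<String[]> rows = Arrays.asList(
                new String[] {"Email check", "sample purpose", "email"},
                new String[] {"Email strict", "another purpose", "email"});
        write(file, new String[] {"Title", "Purpose", "Path"}, rows, ";", "\n", true, true);
    }
}
```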
Saving and executing the Job
- Press Ctrl + S to save the Job.
- Press F6 to run the Job.
tFindRegexlibExpressions
connects to the web server and collects all regular expressions that match the
request. tMap does all defined field
reorganization and concatenation and passes the output flow to tFileOutputDelimited. The output file will look something
like the following:
You can later import all collected regular expressions from a well-formatted
CSV file into Talend Studio. For more information about importing
patterns, see Talend Studio User Guide.