Scenario 2: Generating a matching model
This scenario applies only to a subscription-based Talend Platform solution with Big data or Talend Data Fabric.
The tMatchModel component reads the sample of suspect pairs that was generated by the
tMatchPairing component and that you have labeled manually.
For further information, see the tMatchPairing documentation on Talend Help Center (https://help.talend.com).
The tMatchModel component generates several matching models,
automatically searches for the best combination of the learning parameters, and keeps the
matching model that performs best in cross-validation.
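The parameter search with cross-validation can be illustrated outside Talend Studio. The following is a minimal Python sketch, not the component's implementation: the sample scores, the margin parameter and the toy threshold learner (used here in place of Random Forest to keep the example short) are all assumptions made for illustration.

```python
from itertools import product
from statistics import mean

# Hypothetical labeled pairs: (similarity score, label), 1 = match, 0 = no match.
SAMPLES = [(0.95, 1), (0.90, 1), (0.85, 1), (0.80, 1), (0.75, 1),
           (0.40, 0), (0.35, 0), (0.30, 0), (0.25, 0), (0.20, 0)]


def k_folds(rows, k):
    """Split rows into k interleaved folds and yield (train, validation) pairs."""
    folds = [rows[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, validation


def train_model(train, margin):
    """Toy learner (stand-in for Random Forest): the model is a score threshold,
    the midpoint between the two class means shifted by the margin parameter."""
    pos = mean(score for score, label in train if label == 1)
    neg = mean(score for score, label in train if label == 0)
    return (pos + neg) / 2 + margin


def accuracy(threshold, rows):
    """Fraction of rows whose predicted label (score >= threshold) is correct."""
    return mean(1.0 if (score >= threshold) == bool(label) else 0.0
                for score, label in rows)


def grid_search(rows, grid, k=5):
    """Try every parameter combination and keep the one with the best
    mean validation accuracy across the k folds."""
    best = None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = mean(accuracy(train_model(train, **params), validation)
                     for train, validation in k_folds(rows, k))
        if best is None or score > best[0]:
            best = (score, params)
    return best


best_score, best_params = grid_search(SAMPLES, {"margin": [-0.3, 0.0, 0.3]})
print(best_score, best_params)
```

Each candidate is scored only on pairs it was not trained on, so the model that is kept is the one that generalizes best to held-out labeled pairs. The grid dictionary can hold more than one parameter, in which case every combination is evaluated.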
The use case described here uses the following components:
- A tFileInputDelimited component reads the source file, which contains the suspect data pairs generated by tMatchPairing.
- A tMatchModel component generates the features from the suspect records, implements the Random Forest algorithm and creates a classification model.
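To make the data flow between the two components more concrete, the following Python sketch reads a delimited sample of labeled suspect pairs and turns each pair into a feature vector. It is only an illustration under stated assumptions: the file layout (two rows per pair sharing a pair_id), the column names, the label values and the chosen features (an exact-match flag and a string similarity ratio per column) are hypothetical and are not taken from the tMatchPairing or tMatchModel documentation.

```python
import csv
import io
from difflib import SequenceMatcher

# Hypothetical sample: two rows per suspect pair, sharing a pair_id and a manual label.
SAMPLE = """pair_id;name;city;label
1;John Smith;Paris;YES
1;Jon Smith;Paris;YES
2;Ann Lee;Lyon;NO
2;Bob Ray;Nice;NO
"""


def read_pairs(text, delimiter=";"):
    """Group the delimited rows by pair_id and return complete (left, right) pairs."""
    pairs = {}
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        pairs.setdefault(row["pair_id"], []).append(row)
    return [tuple(rows) for rows in pairs.values() if len(rows) == 2]


def pair_features(left, right, columns):
    """Build one feature vector per pair: for each column, an exact-match flag
    followed by a similarity ratio between the normalized values."""
    features = []
    for column in columns:
        a = (left.get(column) or "").strip().lower()
        b = (right.get(column) or "").strip().lower()
        features.append(1.0 if a == b else 0.0)
        features.append(SequenceMatcher(None, a, b).ratio())
    return features


# One (feature vector, label) entry per suspect pair, ready for a classifier.
dataset = [(pair_features(a, b, ["name", "city"]), a["label"])
           for a, b in read_pairs(SAMPLE)]
for features, label in dataset:
    print(label, features)
```

A classifier such as Random Forest would then be trained on these vectors, with the manual label as the target.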