August 15, 2023

Grouping the duplicate records – Docs for ESB 6.x

Grouping the duplicate records

  1. Right-click tMatchGroup to open its contextual menu and select
    Configuration Wizard.

    In the wizard, you can see what your groups look like and adjust the
    component settings until similar records are matched correctly.
    [Figure: the tMatchGroup Configuration Wizard]

  2. Click the plus button under the Key Definition table to add one row.
  3. In the Input Key Attribute column of this row, select acctName. This
    column becomes the reference used to match duplicates in the input data.
  4. In the Matching Function column, select
    the Jaro-Winkler matching algorithm.
  5. In the Match threshold field, enter the similarity threshold at which two
    record fields are considered a match. In this example, enter 0.6.
  6. Click Chart to execute this matching rule
    and show the result in this wizard.

    If the input records are not all placed in a single group, replace 0.6 with
    a smaller value and click Chart again. Repeat until all four records are in
    the same group.

    The Job in this scenario puts four similar records into a single duplicates
    group so that tRuleSurvivorship can create one survivor from them. This
    simple example gives a clear picture of how tRuleSurvivorship works with
    other components to create the best data. In a real-world case, however,
    you may need to process much more data with more complex duplication, and
    the data will therefore fall into many more groups.
  7. Click OK to close the Configuration wizard. The Basic settings view of the
    tMatchGroup component is automatically filled with the parameters you have
    set.

    For further information about the Configuration wizard, see Configuration
    wizard.
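
To make the effect of the match threshold in steps 4 to 6 more concrete, the
following Python sketch computes a textbook Jaro-Winkler similarity and groups
records against a threshold. It is an illustration only and does not reproduce
the internal implementation of tMatchGroup: the prefix scale of 0.1, the
"greater than or equal to" comparison, the rule of comparing each record with
the first record of a group, and the four sample account names are all
assumptions made for this example.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity, a value between 0.0 and 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1.0 - score)


def group_records(names, threshold):
    """Simplified grouping (an assumption, not Talend's algorithm): a record
    joins the first group whose first record scores at or above the
    threshold; otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if jaro_winkler(group[0], name) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical acctName values, invented for this sketch.
ACCT_NAMES = ["Acme Corp", "Acme Corp.", "Acme Corporation", "Acme Co"]

if __name__ == "__main__":
    for threshold in (1.0, 0.6):
        groups = group_records(ACCT_NAMES, threshold)
        print(threshold, len(groups), groups)
```

With a threshold of 1.0 every distinct name ends up in its own group, while 0.6
places all four sample names in one group. This mirrors the advice in step 6:
lowering the threshold merges more records into the same duplicates group.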

Source: Talend documentation, https://help.talend.com