August 15, 2023

Scenario 2: Matching customer data through multiple passes – Docs for ESB 6.x

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

The Job in this scenario groups similar customer records by running them through two
consecutive matching passes (two tMatchGroup components) and
outputs the calculated matches in groups. Each pass passes its matches on to the pass
that follows, so that the later pass can add more matches identified with new rules and
blocking keys.
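
The idea of chaining passes can be illustrated outside Talend Studio. The following is a minimal Python sketch, not the tMatchGroup implementation: the sample records, the field names, the 0.85 similarity threshold and the helper functions (find, run_pass, groups) are all assumptions made for illustration. Each pass compares records only within its own blocks, and the second pass adds to the groups already found by the first.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "name": "John Smith", "zip": "75001", "email": "jsmith@example.com"},
    {"id": 2, "name": "Jon Smith", "zip": "75001", "email": "jon.smith@example.com"},
    {"id": 3, "name": "J. Smith", "zip": "69002", "email": "jsmith@example.com"},
    {"id": 4, "name": "Mary Jones", "zip": "69002", "email": "mary@example.com"},
]


def find(parent, x):
    """Return the group representative of record id x (with path compression)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def run_pass(parent, rows, blocking_key, is_match):
    """One matching pass: compare records only inside each block and merge matches."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[blocking_key(row)].append(row)
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if is_match(a, b):
                ra, rb = find(parent, a["id"]), find(parent, b["id"])
                if ra != rb:
                    # Keep the smaller id as the group representative.
                    parent[max(ra, rb)] = min(ra, rb)


def groups(parent):
    """List the current groups as sorted lists of record ids."""
    out = defaultdict(list)
    for rid in parent:
        out[find(parent, rid)].append(rid)
    return sorted(sorted(g) for g in out.values())


def similar_name(a, b):
    """Assumed pass 1 rule: names are at least 85% similar."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= 0.85


# Every record starts in its own group.
parent = {r["id"]: r["id"] for r in records}

# Pass 1: block on zip code, match on similar names.
run_pass(parent, records, lambda r: r["zip"], similar_name)
after_pass_1 = groups(parent)

# Pass 2: block on last name, match on identical email.
# It starts from the groups produced by pass 1 and only adds matches.
run_pass(parent, records,
         lambda r: r["name"].split()[-1].lower(),
         lambda a, b: a["email"] == b["email"])
final_groups = groups(parent)

print(after_pass_1)
print(final_groups)
```

In this sketch, record 3 is missed by the first pass because it sits in a different zip block, and it joins the group of records 1 and 2 only when the second pass compares records under a different blocking key.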


In this Job:

  • The tMysqlInput component reads the
    customer records to be processed.

  • Each of the tGenKey components defines a way
    to partition data records. The first key partitions the data into many groups, and the
    second key creates fewer groups that overlap the previous blocks, depending on
    the blocking key definition.

  • The tMap component renames the key generated
    by the second tGenKey component.

  • The first tMatchGroup processes the
    partitions defined by the first tGenKey, and
    the second tMatchGroup processes those defined
    by the second tGenKey.


    The two tMatchGroup components must
    have the same schema.

  • The tLogRow component presents the matching
    results after the two passes.

Document sourced from Talend.