August 15, 2023

The Simple VSR Matcher algorithm – Docs for ESB 6.x

The Simple VSR Matcher algorithm compares each record within the same block with the
master records already stored in the lookup table.

If a record does not match any of the existing master records, it is considered a new
master record and is added to the lookup table. This means that the first record of the
dataset is necessarily a master record.

When a record matches a master record, the Simple VSR Matcher algorithm does not attempt
to match it against the remaining master records, because no two master records in the
lookup table are similar to each other. So, once a record matches a master record, the
chance of it matching another master record is low.

This means a record can belong to only one group of records and be linked to only one
master record.

For example, take the following set of records as input:

| id | fullName        |
|----|-----------------|
| 1  | John Doe        |
| 2  | Donna Lewis     |
| 3  | John B. Doe     |
| 4  | Louis Armstrong |

The algorithm processes the input records as follows:

  1. The algorithm takes record 1 and compares it with an empty set of records. Since
    record 1 does not match any record, it is added to the lookup table.
  2. The algorithm takes record 2 and compares it with record 1. Since it is not a match,
    record 2 is added to the lookup table.
  3. The algorithm takes record 3 and compares it with record 1 and record 2. Record 3
    matches record 1. So, record 3 is added to the group of record 1.
  4. The algorithm takes record 4 and compares it with record 1 and record 2 but not
    with record 3, which is not a master record. Since record 4 matches neither
    master record, it is added to the lookup table.
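
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the product's implementation: the `similarity` function (built on `difflib.SequenceMatcher`), the `0.8` threshold, and the name `simple_vsr_sketch` are all assumptions chosen so that the example records group the same way as in the walkthrough.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1] (assumption, not the product's matcher)."""
    return SequenceMatcher(None, a, b).ratio()


def simple_vsr_sketch(records, threshold=0.8):
    """Greedy grouping: compare each record with master records only."""
    masters = []  # lookup table of (group_id, master_name)
    result = []   # (record_id, group_id, is_master)
    for rec_id, name in records:
        for grp_id, master_name in masters:
            if similarity(name, master_name) >= threshold:
                # First matching master wins; remaining masters are not checked.
                result.append((rec_id, grp_id, False))
                break
        else:
            # No master matched: the record becomes a new master record.
            grp_id = len(masters)
            masters.append((grp_id, name))
            result.append((rec_id, grp_id, True))
    return result


records = [
    (1, "John Doe"),
    (2, "Donna Lewis"),
    (3, "John B. Doe"),
    (4, "Louis Armstrong"),
]
groups = simple_vsr_sketch(records)
```

With these assumed settings, records 1, 2 and 4 become master records and record 3 joins the group of record 1. Because the inner loop stops at the first match, each record ends up in exactly one group.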

The output will look like this:

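| id | fullName        | Grp_ID | Grp_Size | Master | Score | GRP_QUALITY |
|----|-----------------|--------|----------|--------|-------|-------------|
| 1  | John Doe        | 0      | 2        | true   | 1.0   | 0.72        |
| 3  | John B. Doe     | 0      | 0        | false  | 0.72  | 0           |
| 2  | Donna Lewis     | 1      | 1        | true   | 1.0   | 1.0         |
| 4  | Louis Armstrong | 2      | 1        | true   | 1.0   | 1.0         |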
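
In the example output, the group size appears on the master row only, while the non-master row shows 0. The following sketch reproduces that pattern from the group assignments. It is an interpretation of the example values, not a documented rule, and the helper name `group_sizes` is hypothetical.

```python
from collections import Counter

# (id, Grp_ID, Master) triples copied from the example output.
assignments = [(1, 0, True), (3, 0, False), (2, 1, True), (4, 2, True)]


def group_sizes(rows):
    """Report the group size on master rows and 0 on non-master rows (assumed rule)."""
    counts = Counter(grp_id for _, grp_id, _ in rows)
    return {
        rec_id: (counts[grp_id] if is_master else 0)
        for rec_id, grp_id, is_master in rows
    }


sizes = group_sizes(assignments)
```

Here `sizes` maps each record id to the value shown in the Grp_Size column of the example.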


Source: Talend documentation, https://help.talend.com