August 15, 2023

Scenario 3: Extracting exact match by using Index rules – Docs for ESB 6.x

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

In this scenario, you standardize long descriptions of customer products by
matching the input flow against the data contained in an index. The scenario explains how
to use Index rules to tokenize product data and then
check each token against an index to extract exact matches.
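
The idea of tokenizing a description and checking each token against an index can be pictured with a minimal Python sketch. This is only an illustration, not how the component is implemented: the `tokenize` and `exact_matches` helpers, the regular expression and the sample data are all assumptions, and a plain set stands in for the generated index.

```python
import re


def tokenize(description):
    """Split a description into lowercase alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", description.lower())


def exact_matches(description, index):
    """Return the tokens that appear verbatim in the index, in input order."""
    return [token for token in tokenize(description) if token in index]


# Hypothetical index content; a real index would be generated by a Job.
color_index = {"red", "blue", "black"}

print(exact_matches("Large BLUE backpack, 30 L", color_index))  # ['blue']
```

Only tokens that match an index entry exactly are kept; anything else in the description is ignored by this sketch.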

For this scenario, you must first create the indexes by using a Job with the tSynonymOutput component: one index each for the
brand, range, color and unit of the customer products. The tSynonymOutput component generates the indexes and feeds them with
entries and synonyms. The capture below shows an example Job:

use_case-tstandardizerow-index_rules.png

Below is a sample of the generated indexes for this scenario:

use_case-tstandardizerow-indexes.png

Each of the generated indexes has strings (sequences of words) in one column and their
corresponding synonyms in the second column. These strings are used as reference data
against which the product data, generated by tFixedFlowInput, will be matched. For further
information about index creation, see tSynonymOutput.
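
As a rough mental model of the two-column layout described above, the following Python sketch represents an index as reference strings mapped to their synonyms and resolves an input value back to its reference string. The dictionary layout, the `build_lookup` and `standardize` helpers and the sample brand are all hypothetical; the actual on-disk index format and matching behavior are not described here.

```python
# Hypothetical index content: reference string -> list of synonyms.
brand_index = {
    "Acme": ["acme corp", "acme inc"],
}


def build_lookup(index):
    """Map every reference string and synonym (lowercased) to its reference string."""
    lookup = {}
    for entry, synonyms in index.items():
        lookup[entry.lower()] = entry
        for synonym in synonyms:
            lookup[synonym.lower()] = entry
    return lookup


def standardize(value, lookup):
    """Return the reference string for an exact match, or None if there is none."""
    return lookup.get(value.strip().lower())


lookup = build_lookup(brand_index)
print(standardize("ACME Corp", lookup))  # Acme
print(standardize("Unknown", lookup))    # None
```

Lowercasing both sides is a design choice of this sketch so that case differences do not prevent a match.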

In this scenario, the generated indexes are defined as context variables. For further
information about context variables, see Talend Studio User Guide.


Document sourced from Talend: https://help.talend.com