August 15, 2023

Run the analysis with different probability distributions – Docs for ESB 6.x

  1. Switch back to the Integration perspective, select Poisson distribution
    in the basic settings of tDuplicateRow, and run the Job.
  2. In the Profiling perspective, click Chart below the Matching Key table
    to show the duplicates generated according to the Poisson distribution.
  3. Run the Job with the Geometric distribution, then click Chart in the
    Profiling perspective to show the duplicates generated according to the
    Geometric distribution.

    The table below shows how the generated duplicates differ depending on
    the probability distribution you select in the tDuplicateRow component.

    | Probability distribution | Duplicate results | Description |
    |---|---|---|
    | Bernoulli distribution | use_case-tduplicaterow-bernoulli_results.png | The curve is symmetrical. The groups of duplicates are distributed evenly on each side of an average value, 4 in this example. This value is the average number of duplicates in a group, and it is the number you set in the Average group size field in the basic settings of the tDuplicateRow component. |
    | Poisson distribution | use_case-tduplicaterow-poisson_results.png | The curve is not symmetrical. The groups of duplicates are distributed unevenly. |
    | Geometric distribution | use_case-tduplicaterow-geometric_results.png | The shape of the curve is determined by the percentage you set for the duplicated records in the tDuplicateRow basic settings. The higher the percentage, the fewer groups with many records you will have. In this example, the percentage for the duplicate records is set to 80%. This is why many groups of two duplicates are generated (148 groups), while there is only one group each with 14, 15, and 16 duplicates. |
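
The difference in curve shapes described above can be reproduced outside the Studio with a small simulation. The following is only an illustrative sketch in Python, not how tDuplicateRow generates its groups: the function names are hypothetical, modelling the Bernoulli option as a count of successes over 8 trials with probability 0.5 (mean 4) is an assumption, and the geometric parameter `p` is a plain success probability that is not claimed to correspond to the component's percentage setting.

```python
import math
import random
from collections import Counter


def binomial_sample(rng, n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def poisson_sample(rng, mean):
    """Poisson draw via Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def geometric_sample(rng, p):
    """Number of trials up to and including the first success (always >= 1)."""
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials


def histogram(samples):
    """Map each observed value to its frequency, sorted by value."""
    return dict(sorted(Counter(samples).items()))


def skewness(samples):
    """Sample skewness: near 0 for a symmetrical curve, positive for a right tail."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so reruns give the same draws
    draws = {
        # Assumed parameters: both of the first two models have mean 4.
        "bernoulli-style (binomial n=8, p=0.5)": lambda: binomial_sample(rng, 8, 0.5),
        "poisson (mean=4)": lambda: poisson_sample(rng, 4.0),
        "geometric (p=0.8)": lambda: geometric_sample(rng, 0.8),
    }
    for name, draw in draws.items():
        samples = [draw() for _ in range(10_000)]
        print(name, "skewness=%.2f" % skewness(samples))
        print("  ", histogram(samples))
```

In this sketch the binomial histogram is balanced around 4, the Poisson histogram has a longer right tail, and the geometric histogram is dominated by its smallest value, with large values becoming rarer as `p` increases.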


Document retrieved from Talend: https://help.talend.com