August 15, 2023

Loading the input data and removing duplicates – Docs for ESB 6.x

Loading the input data and removing duplicates

  1. Double-click tPigLoad to open its
    Basic settings view.

  2. Click the […] button next to Edit schema to open the [Schema] dialog box.

  3. Click the [+] button to add three columns
    according to the data structure of the input file: Name
    (string), Country (string) and Age
    (integer), and then click OK to save the
    setting and close the dialog box.
  4. Click Local in the Mode area.
  5. Fill in the Input file URI field with the
    full path to the input file.
  6. Select PigStorage from the Load function list, and leave the rest of the
    settings as they are.
  7. Double-click tPigDistinct to open its
    Basic settings view, and click
    Sync columns to make sure that the
    input schema structure is correctly propagated from the preceding
    component.

    This component will remove any duplicates from the data flow.
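
As a rough illustration of what the two components do to the data, the steps above can be sketched in plain Python. This is not the code Talend or Pig generates. The function names `load_rows` and `distinct`, the `Person` type, and the sample records are hypothetical, and the tab delimiter is an assumption based on PigStorage's own default (the field separator configured in tPigLoad may differ).

```python
import csv
from typing import Iterable, List, NamedTuple


class Person(NamedTuple):
    """Mirrors the three-column schema: Name, Country, Age."""
    name: str
    country: str
    age: int


def load_rows(lines: Iterable[str], delimiter: str = "\t") -> List[Person]:
    """Parse delimited text into typed rows, roughly what the load step does.

    The tab default is an assumption; pass the separator your input file uses.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    return [Person(r[0], r[1], int(r[2])) for r in reader if r]


def distinct(rows: Iterable[Person]) -> List[Person]:
    """Drop exact duplicate rows, roughly what the distinct step does.

    First-seen order is kept here for readability; Pig itself makes no
    ordering guarantee after DISTINCT.
    """
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical sample input with one duplicated record.
sample = [
    "Alice\tFrance\t30",
    "Bob\tUSA\t25",
    "Alice\tFrance\t30",
]
unique_rows = distinct(load_rows(sample))
```

Two rows count as duplicates only when all three columns match, so records that share a name but differ in country or age are both kept.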

Source: Talend documentation, https://help.talend.com