August 15, 2023

Loading the test set into the Job – Docs for ESB 6.x

  1. Double-click tFileInputDelimited to open its
    Component view.

    use_case-evaluation2.png

  2. Select the Define a storage configuration component check box
    and select the tHDFSConfiguration component
    to be used.

    tFileInputDelimited uses this
    configuration to access the test set to be used.
  3. Click the […] button next to Edit
    schema
    to open the schema editor.
  4. Click the [+] button five times to add five rows and in the
    Column column, rename them to reallabel, sms_contents, num_currency,
    num_numeric and num_exclamation, respectively.

    use_case-evaluation3.png

    The reallabel and sms_contents columns carry the raw data: the SMS text
    messages are in the sms_contents column, and the labels indicating whether
    a message is spam are in the reallabel column.
    The other columns carry the features added to the raw datasets,
    as explained previously in this scenario. They contain the number of
    currency symbols, the number of numeric values and the number of exclamation
    marks found in each SMS message.
  5. In the Type column, select Integer for the num_currency, num_numeric and
    num_exclamation columns.
  6. Click OK to validate these changes.
  7. In the Folder/File field, enter the
    directory where the test set to be used is stored.
  8. In the Field separator field, enter
    , which is the separator used by the
    datasets you can download for use in this scenario.
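
For readers who want to sanity-check a test set file outside the Studio, the schema defined in the steps above can be sketched in plain Python. This is only an illustration, not how tFileInputDelimited works internally: the SmsRow type and the read_test_set function are hypothetical names, the sample rows are made up, and the ";" delimiter is chosen only for this sample. Pass whichever separator your datasets actually use.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class SmsRow(NamedTuple):
    """One record following the five-column schema from the steps above."""
    reallabel: str        # label indicating whether the message is spam
    sms_contents: str     # raw SMS text
    num_currency: int     # number of currency symbols
    num_numeric: int      # number of numeric values
    num_exclamation: int  # number of exclamation marks


def read_test_set(handle: TextIO, delimiter: str) -> Iterator[SmsRow]:
    """Yield typed rows from a delimited file that has no header line."""
    for record in csv.reader(handle, delimiter=delimiter):
        if not record:
            continue  # skip blank lines
        label, text, currency, numeric, exclamation = record
        # The three feature columns are typed as Integer in the schema.
        yield SmsRow(label, text, int(currency), int(numeric), int(exclamation))


# Made-up sample data; the ";" delimiter is an assumption for this sketch only.
sample = io.StringIO("ham;See you at 5;0;1;0\nspam;Win $100 now!!;1;1;2\n")
rows = list(read_test_set(sample, delimiter=";"))
```

A row with a non-integer value in one of the three feature columns raises a ValueError here, which is a quick way to spot a column typed incorrectly before running the Job.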

Source: Talend documentation, https://help.talend.com