July 30, 2023

tDTDValidator – Docs for ESB 7.x

tDTDValidator

Helps control the data and structure quality of the file to be
processed.

Validates the XML input file against a DTD file and sends the
validation log to the defined output.
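As a rough illustration of what validating an XML file against a DTD involves, the following Java sketch uses the standard JAXP parser. It is not the component's actual implementation. The class name DtdCheck, the method firstError, and the choice to redirect the document's DOCTYPE to the reference DTD through an entity resolver are all assumptions made for this example.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Talend.
class DtdCheck {

    /** Returns "" when xmlFile is valid against dtdFile, else the first error message. */
    static String firstError(File dtdFile, File xmlFile)
            throws ParserConfigurationException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Assumption: whatever DTD the DOCTYPE names, load the reference DTD instead.
        // This simple resolver redirects every external entity to that file.
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(dtdFile.toURI().toString()));

        // Validity errors are recoverable for the parser; rethrow them so parsing stops.
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { }
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            builder.parse(xmlFile);
            return "";
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DtdCheck.java <dtd file> <xml file>
        String error = firstError(new File(args[0]), new File(args[1]));
        System.out.println(error.isEmpty() ? "valid" : "invalid: " + error);
    }
}
```

The XML file still needs a DOCTYPE declaration for the validating parser to load a DTD at all; a file without one is reported as invalid by this sketch.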

tDTDValidator Standard properties

These properties are used to configure tDTDValidator running in the Standard Job framework.

The Standard tDTDValidator component belongs to the XML family.

The component in this framework is available in all Talend products.

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields to
be processed and passed on to the next component.

The schema of this component is read-only. It contains standard
information regarding the file validation.

DTD file

Filepath to the reference DTD file.

XML file

Filepath to the XML file to be validated.

If XML is valid, display
If XML is invalid, display

Type in a message to be displayed in the Run console based on the result of the
comparison.

Print to console

Select this check box to display the validation message.

Advanced settings

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job
level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only when the Die on error check box, if the
component has one, is cleared.

DIFFERENCE: the result of the validation. This is a Flow
variable and it returns a string.

VALID: the validation result. This is a Flow variable and
it returns a boolean.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.
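The following sketch shows how such variables might be read in a Java expression. It assumes the component is labelled tDTDValidator_1, so that its variables are stored in globalMap under keys such as tDTDValidator_1_VALID, following the same naming pattern as tFileList_1_CURRENT_FILE. The class GlobalVarsDemo and the plain HashMap standing in for the Job's globalMap are hypothetical and serve only to make the example runnable outside Talend Studio.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; in a real Job, globalMap is provided by the generated code.
class GlobalVarsDemo {

    /** Builds a console message from the variables of one validation run. */
    static String describe(Map<String, Object> globalMap) {
        // Assumed key names: <component label>_<variable name>.
        Boolean valid = (Boolean) globalMap.get("tDTDValidator_1_VALID");
        String difference = (String) globalMap.get("tDTDValidator_1_DIFFERENCE");
        String file = (String) globalMap.get("tFileList_1_CURRENT_FILE");

        // Boolean.TRUE.equals(...) treats a variable that is not set yet as "not valid".
        return Boolean.TRUE.equals(valid)
                ? file + ": valid"
                : file + ": invalid (" + difference + ")";
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_1_CURRENT_FILE", "orders.xml");
        globalMap.put("tDTDValidator_1_VALID", Boolean.FALSE);
        globalMap.put("tDTDValidator_1_DIFFERENCE", "element not declared");
        System.out.println(describe(globalMap));
    }
}
```

Because globalMap stores plain objects, each value must be cast to the type the variable returns: a string for DIFFERENCE and a boolean for VALID.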

To fill a field or expression with a variable, press Ctrl+Space
to access the variable list and choose the variable to use from it.

For further information about variables, see the Talend Studio User Guide.

Usage

Usage rule

This component can be used as a standalone component, but it is
usually linked to an output component to gather the log data.

Validating XML files

This scenario describes a Job that validates the specified type of
files from a folder, displays the validation result on the Run tab console, and
outputs the log information for the invalid files into a delimited file.

tDTDValidator_1.png

  1. Drop the following components from the Palette to the design workspace: tFileList, tDTDValidator,
    tMap, tFileOutputDelimited.
  2. Connect the tFileList to the tDTDValidator with an Iterate
    link and the remaining components using Row > Main
    connections.
  3. Set the tFileList component properties to
    fetch the XML files from a folder.

    tDTDValidator_2.png

    Click the plus button to add a filemask line and enter the filemask: *.xml.
    Remember Java code requires double quotes.
    Set the path of the XML files to be verified.
    Select No from the Case Sensitive drop-down
    list.
  4. In the tDTDValidator Component view, the schema is read-only as it
    contains standard log information related to the validation process.

    tDTDValidator_3.png

    In the DTD file field, browse to the DTD file
    to be used as reference.
  5. Click in the XML file field, press Ctrl+Space
    to access the variable list, and double-click the current
    filepath global variable: tFileList.CURRENT_FILEPATH.
  6. In the messages to display in the Run
    tab console, use the jobName variable to include the Job
    name. Include the filename using the relevant global variable:
    ((String)globalMap.get("tFileList_1_CURRENT_FILE")). Remember
    Java code requires double quotes.

    Select the Print to console check box.
  7. In the tMap component, drag and drop the
    information data from the standard schema that you want to pass on to the output
    file.

    tDTDValidator_4.png

  8. Once the Output schema is defined as required, add a filter condition to
    select the log information data only when the XML file is invalid.

    As a best practice, type the expected value first, then the operator
    appropriate to the type of data filtered, then the variable that
    should meet the requirement. In this case: 0 == row1.validate.
  9. If not already done, connect the tMap
    to the tFileOutputDelimited component using a
    Row > Main connection. Name it as relevant, in this example:
    log_errorsOnly.
  10. In the tFileOutputDelimited Basic settings, define the destination
    filepath, the field delimiters and the encoding.
  11. Save your Job and press F6 to run it.

    tDTDValidator_5.png

    The Run console displays the defined message
    for each of the files. At the same time, the output file is filled with
    the log data for the invalid files.
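For readers who think in code, the flow of this Job can be approximated in plain Java as below. This is a sketch under stated assumptions, not the code Talend Studio generates: the class ValidateFolderSketch, the method logInvalid, the semicolon-delimited line layout, and the pluggable validator function (which returns an empty string for a valid file) are all hypothetical. Only the case-insensitive *.xml mask, the 0 == validate filter and the log_errorsOnly name come from the scenario.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for the tFileList -> tDTDValidator -> tMap -> tFileOutputDelimited chain.
class ValidateFolderSketch {

    /**
     * Checks every file matching *.xml (case-insensitive) in folder and returns
     * one "file;validate;message" line per invalid file.
     * The validator returns "" for a valid file, otherwise an error message.
     */
    static List<String> logInvalid(Path folder, Function<Path, String> validator)
            throws IOException {
        List<String> logErrorsOnly = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder)) {
            for (Path file : files) {                              // tFileList: iterate
                String name = file.getFileName().toString();
                if (!name.toLowerCase(Locale.ROOT).endsWith(".xml")) {
                    continue;                                      // filemask *.xml, Case Sensitive: No
                }
                String message = validator.apply(file);            // validation step
                int validate = message.isEmpty() ? 1 : 0;          // 0 marks an invalid file
                System.out.println(name + (validate == 1 ? " is valid" : " is invalid"));
                if (0 == validate) {                               // tMap filter: expected value first
                    logErrorsOnly.add(name + ";" + validate + ";" + message);
                }
            }
        }
        return logErrorsOnly;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ValidateFolderSketch.java <folder> <output file>
        // Placeholder validator: treats empty files as invalid.
        Function<Path, String> validator = p -> p.toFile().length() == 0 ? "empty file" : "";
        List<String> lines = logInvalid(Paths.get(args[0]), validator);
        Files.write(Paths.get(args[1]), lines, StandardCharsets.UTF_8); // delimited output
    }
}
```

Writing the constant on the left side of the comparison, as in 0 == validate, mirrors the filter used in the tMap: a mistyped single = then fails to compile instead of silently assigning a value.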

Source: Talend documentation, https://help.talend.com