August 15, 2023

tHMap – Docs for ESB 6.x

tHMap

Executes transformations (called maps) between different sources and destinations
by harnessing the capabilities of Talend Data Mapper, available in the Mapping
perspective.

tHMap transforms data from a wide range of sources to a wide range of
destinations. If you want to use multiple inputs and/or outputs, you must use
Talend Data Mapper I/O functions. For more information, see the Talend Data
Mapper User Guide.

tHMap Standard properties

These properties are used to configure tHMap running in the Standard Job framework.

The Standard tHMap component belongs to the Processing family.

This component is available in the Palette of the Studio only if you have subscribed to one of the Talend Platform products.

Basic settings

Open Map Editor

Click the […] button to open the
tHMap Structure Generate/Select
wizard where you can either have the hierarchical mapper structure
generated automatically based on the schema, or select an existing
hierarchical mapper structure. You must do this for both the input and
output sides of your Map.

Map Path

Specifies the map to be executed.

If the map was automatically created using the wizard described above,
this path is set automatically.

If you want to use an existing map, click the […] button next to the Map Path
field to open a dialog box in which you can select the map you want to use,
then click the […] button next to Open Map Editor to work with the selected
map. Note that this map must have previously been created in the Mapping
perspective.

Read Input As

Select the radio button which corresponds to how you want the input to
be read. Depending on your map, only some of the options may be
available.

  • Data Integration columns (default): Use this option if you are working
    with Talend Data Integration metadata.

  • Single column: Use this option if you are working with Talend Data Mapper
    metadata.

Write Output As

Select the radio button which corresponds to how you want the output
to be written. Depending on your map, only some of the options may be
available.

  • Data Integration columns (default): Use this option if you are working
    with Talend Data Integration metadata.

  • String (single column): Use this option if the data in the output column
    is to be a String.

  • Byte array (single column): Use this option if the data in the output
    column is to be a Byte array.

  • InputStream (single column): Use this option if you are working with
    Talend Data Mapper metadata and the input data is a stream.

  • Document (single column): Use this option if the output column is to be a
    Document.

Advanced settings

Map Variable

In this field, enter a context variable that you can use to define the
relative path to a map file. For instance, if you enter ${context.mymapfile}, then mymapfile can point to different map files
at runtime. This can be useful in cases where you want to use multiple
maps without creating a new Job each time.

In the Contexts tab, the value must be a relative path. For instance,
assuming you have a map called mapA in the folder Maps/FolderA, your context
variable should contain the value “FolderA/mapA.xml”. The .xml extension is
needed because this is a reference to a file on the file system.

Note that all maps that might be referenced by the context variable
must be present in the same Project. This way, when the job is built, it
will contain all candidate maps and it will be possible to switch from
one to another at runtime.
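
The path convention described above can be sketched in plain Java. This is only an illustration of how a relative value such as “FolderA/mapA.xml” relates to the Maps folder. The class MapPathSketch and the method resolveMap are hypothetical names and are not part of Talend, and the component itself may resolve the path differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Talend API: it only illustrates the path convention.
class MapPathSketch {

    // Resolves the relative value of a context variable against a Maps root folder.
    static Path resolveMap(Path mapsRoot, String contextValue) {
        Path relative = Paths.get(contextValue);
        if (relative.isAbsolute()) {
            throw new IllegalArgumentException("Map path must be relative: " + contextValue);
        }
        if (!contextValue.endsWith(".xml")) {
            throw new IllegalArgumentException("Map path must end with .xml: " + contextValue);
        }
        return mapsRoot.resolve(relative).normalize();
    }

    public static void main(String[] args) {
        // "Maps" stands in for the project's Maps folder.
        Path mapsRoot = Paths.get("Maps");
        System.out.println(resolveMap(mapsRoot, "FolderA/mapA.xml"));
    }
}
```

Rejecting absolute paths and values without the .xml extension mirrors the two constraints stated above; switching maps at runtime then amounts to passing a different context value.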

For further information on working with context variables, see the Talend
Studio User Guide.

Map each row (disable virtual component)

Select this check box to have tHMap
process the input as a single output row. This prevents tHMap from buffering the input rows before
delivering them downstream.

This can be useful, for example, when you use the tHMap component with tSAPIDocReceiver as the input component and any
schema-aware component as the output component, because telling the
tSAPIDocReceiver component to keep
listening forever would otherwise lead to rows never being
delivered.

Log Level

From the drop-down list, select how often you want events to be
logged.

  • Infrequent: Logs only events
    related to startup, shutdown and exceptions.

  • Frequent (default): Logs
    events related to startup, shutdown and exceptions, and once per
    map execution.

  • Info: Logs all events at an
    informational level or higher.

  • All: Logs all events.

  • None: Logs nothing.

Exception Threshold


Talend Data Mapper returns an execution status with a severity value which
can be OK, Info, Warning, Error or Fatal. By setting the exception threshold,
you can specify the severity level at which an exception is thrown, thus
enabling downstream components to detect the error in cases other than the
default value of Fatal.

From the drop-down list, select the severity level at which an
exception may be thrown during the execution of a map.

  • Fatal (default): An exception
    is thrown when a fatal error occurs.

  • Error: An exception is thrown
    when an error (or higher) occurs.

  • Warning: An exception is
    thrown when a warning (or higher) occurs.

Note that, in order to help you diagnose problems with your map, when you
test the map in the Studio, any errors that occur which are at warning level
or above will be printed in the console window, regardless of the setting of
the Exception Threshold.
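
The threshold rule can be pictured with a short Java sketch. It assumes that the five severity values are ordered from OK (lowest) to Fatal (highest); the enum and the method shouldThrow are hypothetical and do not reproduce the numeric constants defined by Talend Data Mapper.

```java
// Hypothetical model of the Exception Threshold setting, not Talend code.
class ExceptionThresholdSketch {

    // Severity values in increasing order, as listed in the text above (assumed ordering).
    enum Severity { OK, INFO, WARNING, ERROR, FATAL }

    // True when the map's severity reaches or exceeds the configured threshold.
    static boolean shouldThrow(Severity status, Severity threshold) {
        return status.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Severity[] thresholds = { Severity.FATAL, Severity.ERROR, Severity.WARNING };
        for (Severity threshold : thresholds) {
            System.out.println("threshold=" + threshold
                    + ", map severity ERROR throws: " + shouldThrow(Severity.ERROR, threshold));
        }
    }
}
```

With the default threshold of Fatal, a map that finishes with severity Error does not raise an exception in this model, which is why lowering the threshold is needed for downstream components to detect it.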

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job
level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

EXECUTION_STATUS: the pointer to the ExecutionStatus object, which is returned
whenever tHMap executes a Talend Data Mapper map. This is an After variable
and it returns a string.

EXECUTION_SEVERITY: the Overall Severity numeric value. This is an After
variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to
access the variable list and choose the variable to use from it.

For further information about variables, see the Talend Studio User Guide.
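
As a rough illustration of reading an After variable, the sketch below uses a plain java.util.Map as a stand-in for the globalMap that a Job provides. The key tHMap_1_EXECUTION_SEVERITY is an assumption that follows the component-name prefix pattern of tHMap_1_EXECUTION_STATUS; the helper severityOf, the fallback value and the sample value 0 are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone sketch: inside a real Job, globalMap is supplied by the generated code.
class GlobalMapSketch {

    // Reads the severity stored for a component, or a fallback if it is not set yet.
    static int severityOf(Map<String, Object> globalMap, String component, int fallback) {
        Object value = globalMap.get(component + "_EXECUTION_SEVERITY");
        return (value instanceof Integer) ? (Integer) value : fallback;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Made-up value: the real number is written by tHMap after it has executed.
        globalMap.put("tHMap_1_EXECUTION_SEVERITY", 0);

        System.out.println(severityOf(globalMap, "tHMap_1", -1));
        // No entry exists for tHMap_2, so the fallback is returned.
        System.out.println(severityOf(globalMap, "tHMap_2", -1));
    }
}
```

Because After variables are only populated once the component has finished, a lookup that tolerates a missing entry avoids a NullPointerException when the code runs earlier than expected.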

Usage

Usage rule

tHMap is used for Jobs that require
complex data mapping from a variety of different sources.

The input and output connections can use Talend Data Mapper metadata, Talend
Data Integration metadata, or a combination of the two. Each connection is
independent.

When you open the Map Editor for the first time for each connection, it
either generates a Talend Data Mapper structure definition based on the
schema of the Talend Data Integration component, or allows you to select an
existing Talend Data Mapper structure if you are using Talend Data Mapper
metadata. It then creates a map with the structure selected or generated.

This component can be used in several ways:

Note:

For further information about performing transformations using Talend Data
Mapper, see the Talend Data Mapper User Guide.

Scenario 1: Using Talend Data Mapper metadata

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

The following scenario creates a three-component Job that reads data from an
input file, transforms it using a map that was previously created in the
Mapping perspective, and then outputs the transformed data to a new file. It
works with Talend Data Mapper metadata.

use_case-thmap.png

Copying an editable version of the example files

  1. In the Mapping perspective, in the Data Mapper view, expand the
    Hierarchical Mapper node and the Other Projects folder, right-click
    Examples and then select Copy in the contextual menu.
  2. In the Data Mapper view, right-click at the
    root of the Hierarchical Mapper node, and then
    select Paste in the contextual menu.

    This copies an editable version of all the read-only example files to your
    local workspace.

Adding and linking the components

  1. In the Integration perspective, create a new Job and call it tdm_to_tdm.
  2. Click the point in the design workspace where you want to add the first
    component, start typing tFileInputRaw, and then
    click the name of the component when it appears in the list proposed in order to
    select it.
  3. Do the same to add a tHMap component and a
    tFileOutputRaw component as well.
  4. Connect the tFileInputRaw component to the
    tHMap component using a Row > Main link and rename it input, then connect the tHMap
    component to the tFileOutputRaw component using
    a Row > Main link and name it output. When you are asked if you want to get the
    schema of the target component, click Yes.

Defining the properties of tFileInputRaw

  1. Select the tFileInputRaw component to define
    its properties.

    use_case-thmap-tfileinput.png

  2. In the Basic settings tab, click the
    […] button next to the Filename field then browse to the location on your
    file system where the input file is stored, or enter the path manually between
    double quotes. For this example, use
    <PATH_TO_WORKSPACE>/<PROJECT_NAME>/Sample Data/CSV/PurchaseOrderPayPal/PayPalPO.csv.
  3. Set the Mode as Read the file as a string, and leave all the other
    parameters unchanged.

Defining the properties of tFileOutputRaw

  1. Select the tFileOutputRaw component to define
    its properties.

    use_case-thmap-tfileoutput.png

  2. In the Basic settings tab, click the
    […] button then browse to the location on
    your file system where the output file is to be stored, or enter the path
    manually between double quotes. Leave the other parameters unchanged.

Defining the properties of tHMap

  1. Select the tHMap component to define its
    properties.

    use_case-thmap-thmap.png

  2. Click the […] button next to the Map Path field to open the picker and select the map
    to use, Maps/CSV/POPayPalCsv_PO2, then click
    OK. This map transforms a CSV file into an
    XML file.
  3. Check that Read Input As is set to Single column.
  4. Check that Write Output As is set to String (single column).

Saving and executing the Job

  1. Press Ctrl+S to save your Job.
  2. In the Run tab, click Run to execute the Job.
  3. Browse to the location on your file system where the output file is stored to
    check that an XML file has been created containing the same data as the input
    CSV file.

Scenario 2: Using Talend Data Integration metadata

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

The following scenario creates a three-component Job that reads data from an
input file, transforms it using a map that you create in the Mapping
perspective, and then outputs the transformed data to a new file. It works
with Talend Data Integration metadata.

use_case-thmap-di.png

Copying an editable version of the example files

  1. In the Mapping perspective, in the Data Mapper view, expand the
    Hierarchical Mapper node and the Other Projects folder, right-click
    Examples and then select Copy in the contextual menu.
  2. In the Data Mapper view, right-click at the root
    of the Hierarchical Mapper node, and then select
    Paste in the contextual menu.

    This copies an editable version of all the read-only example files to your local
    workspace.

Adding and linking the components

  1. In the Integration perspective, create a new Standard Job and call it
    di_to_di.
  2. Click the point in the design workspace where you want to add the first component,
    start typing tFileInputDelimited, and then click
    the name of the component when it appears in the list proposed in order to select
    it.
  3. Do the same to add a tHMap component and a
    tFileOutputXML component as well.
  4. Connect the tFileInputDelimited component to the tHMap component using a
    Row > Main link, then connect the tHMap component to the tFileOutputXML
    component using a Row > Main link.

Defining the properties of tFileInputDelimited

  1. Select the tFileInputDelimited component to
    define its properties.

    use_case-thmap-tfileinputdelimited.png

  2. In the Basic settings tab, click the […] button next to the File
    name/Stream field then browse to the location on your file system where
    the input CSV file is stored, or enter the path manually between double quotes.
    For this example, use
    <PATH_TO_WORKSPACE>/<PROJECT_NAME>/Sample Data/CSV/PurchaseOrderPayPal/PayPalPO.csv.
  3. Select the CSV options check box.
  4. Change the Field Separator to a comma, between
    double quotes (“,”).
  5. Change the value of Header to 1.
  6. Click the […] button next to Edit schema to define the schema.
  7. Add three columns and rename them txn_id,
    payment_date and first_name (these correspond to the names of the first three columns in
    the input file, and are sufficient for the purposes of this example), and then click
    OK.
  8. Leave all the other parameters unchanged.

Defining the properties of tFileOutputXML

  1. Select the tFileOutputXML component to define its
    properties.

    use_case-thmap-fileoutputxml.png

  2. In the Basic settings tab, click the […] button next to the File Name
    field then browse to the location on your file system where the
    output file will be stored, or enter the path manually between double quotes.
  3. Click the […] button next to Edit schema to define the schema.
  4. Add three columns to the input schema on the left and rename them id, date and name, copy them to the output schema on the right, and then
    click OK.
  5. Leave the other elements unchanged.

Defining the properties of tHMap

  1. Select the tHMap component to define its
    properties.

    use_case-thmap-thmap-di.png

  2. Click the […] button next to the Open Map Editor field to create a new map based on the input
    and output of tHMap.
  3. In the tHMap Structure Generate/Select dialog box
    that opens, select Generate hierarchical mapper structure based on the
    schema and then click Next to generate the input structure.
  4. Do the same for the output structure.
  5. In the Map editor that opens, drag the txn_id
    element of Input (map) to the id element of Output (map). Do the same
    to map payment_date to date and first_name to name, and then save your changes.

    use_case-thmap-drag_root.png

Saving and executing the Job

  1. Press Ctrl+S to save your Job.
  2. In the Run tab, click Run to execute the Job.
  3. Browse to the location on your file system where the output file is stored to
    check that an XML file has been created containing the same data as the input CSV
    file.
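
To make the field mapping in this scenario concrete (txn_id to id, payment_date to date, first_name to name), here is a minimal plain-Java sketch of the same idea. It is not how tHMap or tFileOutputXML work internally: the class CsvToXmlSketch, the naive comma split (no support for quoted fields), the row element name and the sample record are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hand-written equivalent of the three-column mapping.
class CsvToXmlSketch {

    // Escapes the characters that must not appear raw in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Maps the first three CSV fields to the id, date and name elements.
    static String toXmlRow(String csvLine) {
        String[] fields = csvLine.split(",", -1); // naive split, no quoted fields
        if (fields.length < 3) {
            throw new IllegalArgumentException("Expected at least 3 fields: " + csvLine);
        }
        return "<row>"
                + "<id>" + escape(fields[0]) + "</id>"
                + "<date>" + escape(fields[1]) + "</date>"
                + "<name>" + escape(fields[2]) + "</name>"
                + "</row>";
    }

    // Skips the first line, mirroring the Header value of 1 set in tFileInputDelimited.
    static List<String> convert(List<String> lines) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            rows.add(toXmlRow(lines.get(i)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample record; the real PayPalPO.csv has more columns.
        List<String> csv = Arrays.asList("txn_id,payment_date,first_name", "TX-1,2020-01-01,Ann");
        convert(csv).forEach(System.out::println);
    }
}
```

In the Job itself, the same correspondence is expressed by dragging elements in the Map editor rather than by writing code.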

Scenario 3: Transforming from a Data Integration schema to a complex content schema

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

The following scenario creates a three-component Job that generates some
random data from an input component, transforms this data using a map which
was previously created in the Mapping perspective, and then outputs the
transformed data to a JSON file. It works with Talend Data Integration
metadata for the input and Talend Data Mapper metadata for the output.

use_case-thmap-di_to_json-scenario.png

Creating a new structure in the Mapping perspective

  1. In the Mapping perspective, in the Data Mapper view, expand the
    Hierarchical Mapper node, right-click Structures and then select
    New > Structure.
  2. In the [New Structure] dialog box that opens,
    select Create a new structure where you manually enter elements, and then
    click Next.
  3. Name your structure JSON_structure, and
    then click Next.
  4. In the [Select Representation] dialog box,
    select JSON from the list of available
    representations, and then click Next.
  5. Select Don’t select a sample document for now, and then click Finish.

Entering the elements for your new structure

  1. In the Mapping perspective, in the Data Mapper view, expand the
    Hierarchical Mapper node and the Structures node, and then open the
    JSON_structure structure you created earlier.
  2. In the JSON_structure, right-click to add a
    new element, click New element and name the new
    element Root.
  3. Follow the same steps to create a new element called people under the Root
    element, a person element under people, and four new elements under the person element: firstname, lastname,
    address and city.
  4. For the person element, change the
    Occurs Max value to -1 (unlimited).

    use_case-thmap-di_to_json-json_structure.png

  5. Press Ctrl + S to save your changes.
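
The nesting defined in the steps above (Root, then people, then a repeating person with four child elements) can be pictured with the following plain-Java sketch. It assumes that Root corresponds to the top-level JSON object and that the unlimited person element becomes a JSON array; the exact document that Talend Data Mapper writes may differ. The class JsonStructureSketch and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: builds one plausible JSON shape for the structure.
class JsonStructureSketch {

    // Minimal JSON string quoting (backslash and double quote only).
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One person element with its four child elements.
    static String person(String firstname, String lastname, String address, String city) {
        return "{" + quote("firstname") + ":" + quote(firstname)
                + "," + quote("lastname") + ":" + quote(lastname)
                + "," + quote("address") + ":" + quote(address)
                + "," + quote("city") + ":" + quote(city) + "}";
    }

    // Occurs Max = -1 (unlimited) on person is modeled as an array of any length.
    static String document(List<String> persons) {
        return "{" + quote("people") + ":{" + quote("person") + ":["
                + String.join(",", persons) + "]}}";
    }

    public static void main(String[] args) {
        List<String> persons = new ArrayList<>();
        // Made-up sample values.
        persons.add(person("Ann", "Lee", "1 Main St", "Springfield"));
        persons.add(person("Bob", "Ray", "2 Oak Ave", "Columbus"));
        System.out.println(document(persons));
    }
}
```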

Adding and linking the components

  1. In the Integration perspective, in the Repository, right-click Job
    Designs, and then click Create Standard Job to create a Job named
    di_to_json. Add a Purpose and Description if you wish, and then click
    Finish.
  2. Click the point in the design workspace where you want to add the first
    component, start typing tRowGenerator, and then
    click the name of the component when it is displayed in the list proposed in
    order to select it.
  3. Do the same to add a tHMap component, and a
    tFileOutputRaw component as well.
  4. Connect the tRowGenerator component to the tHMap component using a
    Row > Main link, then connect the tHMap component to the tFileOutputRaw
    component using a Row > Main link. When you are asked if you want to get
    the schema of the target component, click Yes.

Defining the properties of tRowGenerator

  1. Select the tRowGenerator component to define
    its properties.
  2. In the Basic settings tab, click the
    […] button next to RowGenerator Editor to define the rows to be generated.
  3. In the dialog box that opens, click the [+] button four times to add four new columns to the schema, and name them
    firstname, lastname, address and
    city.
  4. For each of the columns you just added, change the function to match what is
    shown in the table below by clicking in the Functions column and scrolling through the list of available
    functions until you find the one you want, and then click OK when you’re done.

    Column       Function
    firstname    TalendDataGenerator.getFirstName
    lastname     TalendDataGenerator.getLastName
    address      TalendDataGenerator.getUsStreet
    city         TalendDataGenerator.getUsCity

Defining the properties of tFileOutputRaw

  1. Select the tFileOutputRaw component to define
    its properties.
  2. In the Basic settings tab, click the
    […] button then browse to the location on
    your file system where the output file is to be stored, or enter the path
    manually between double quotes, and call the output file output.json.

    Leave the other parameters unchanged.

Defining the properties of tHMap

  1. Select the tHMap component to define its
    properties.

    use_case-thmap-di_to_json-output_settings.png

  2. Click the […] button next to the Open Map Editor field to create a new map.
  3. In the [tHMap Structure Generate/Select]
    dialog box that opens, select Generate hierarchical mapper
    structure based on the schema for the input structure, and then
    click Next and then Finish. This means that Talend Data Mapper will
    automatically generate a structure for you, based on the schema of the input
    component (tRowGenerator in this case).
  4. For the output structure, select Select an existing
    hierarchical mapper structure, and then click Next.
  5. Select the JSON_structure structure that
    you created earlier, and then click Next and
    then Finish.
  6. In the Map editor that opens, drag row from
    Input (Map) to person in Output (JSON) to map
    each of the input elements to its corresponding output element.

    use_case-thmap-di_to_json-mapping.png

  7. Double-click SimpleLoop in the Loop tab and, in the properties box that opens, select
    Stream Input and then click OK.
  8. Press Ctrl+S to save your changes to the
    map.

Saving and executing the Job

  1. Press Ctrl+S to save your Job.
  2. In the Run tab, click Run to execute the Job.
  3. Browse to the location on your file system where the output file is stored to
    check that a JSON file showing the expected data has been successfully
    created.

Scenario 4: Handling errors

This scenario applies only to a subscription-based Talend Platform solution or Talend Data Fabric.

The following scenario creates a six-component Job that shows how to handle error
conditions using the tHMap component.

When tHMap executes a Talend Data Mapper map, an ExecutionStatus object is
always returned. A pointer to this object is stored in the globalMap as
EXECUTION_STATUS. In addition, the Overall Severity numeric value (the
constants are defined in ExecutionStatus) is also stored in the globalMap as
EXECUTION_SEVERITY. Finally, a parameter called Exception Threshold specifies
the severity at which an exception is thrown, thus triggering Job-related or
component-related error processing. The default value for this parameter is
Fatal.

To consult the Javadoc for the ExecutionStatus object, see the Camel Runtime
documentation included in the Talend Data Mapper runtime kits and the Talend
Data Mapper User Guide.

use_case-thmap-scenario_error_handling.png

Adding and linking the components

  1. In the Integration perspective, create a new Standard Job and call it
    error_handling.
  2. Click the point in the design workspace where you want to add the first
    component, start typing tFixedFlowInput, and
    then click the name of the component when it appears in the list proposed in
    order to select it.
  3. Do the same to add a tHMap component, a
    tJavaRow component, and three tJava components as well.
  4. Connect the tFixedFlowInput component to the tHMap component using a
    Row > Main link, then connect the tHMap component to the tJavaRow
    component using a Row > Main link and rename the link out. If you are
    asked whether you want to get the schema of the target component, click
    Yes.
  5. Connect the tFixedFlowInput component to the first tJava component using
    an OnSubjobError trigger and to the second tJava component using an
    OnSubjobOk trigger, then connect the tHMap component to the third tJava
    component using a RunIf trigger.

Defining the properties of the RunIf trigger

  1. Select the RunIf trigger to define its
    properties.
  2. Add the following code in the Condition
    section.

    !((org.talend.transform.runtime.api.ExecutionStatus) globalMap.get("tHMap_1_EXECUTION_STATUS")).isOK()

Defining the properties of tFixedFlowInput

  1. Select the tFixedFlowInput component to
    define its properties.
  2. In the Basic settings tab, click the
    […] button next to Edit schema to define the schema.
  3. Add one column, of type String, rename it inputKey, and then click OK.
  4. In the Mode area, select the Use Inline Table radio button, then add three values
    to the inputKey column by clicking the
    [+]  button and then entering the value
    between double quotes in each case. In this example, the values used are
    value1, value2 and value3.

Defining the properties of tJavaRow

  1. Select the tJavaRow component to define its
    properties.
  2. In the Basic settings tab, click the
    […] button next to Edit schema to define the schema.
  3. Click the [+]  button to add a column to the
    input schema on the left, of type String, rename it outputKey, copy it to the output schema on the
    right, and then click OK.
  4. Add the following code in the Code
    section.

    String actualOutput = out.outputKey;

    System.out.println("======" + actualOutput + "======");

Defining the properties of tHMap

  1. Select the tHMap component to define its
    properties.
  2. Click the […] button next to the Open Map Editor field to create a new map based on
    the input and output of tHMap.
  3. In the tHMap Structure Generate/Select dialog
    box that opens, select Generate hierarchical mapper
    structure based on the schema
    and then click Next to generate the input structure.
  4. Do the same for the output structure.
  5. In the Map editor that opens, drag the inputKey element
    of Input (map) to the outputKey element of Output (map), and then save
    your changes.

    use_case-thmap-error_handling-mapping.png

  6. Back in the Job, in the tHMap component, leave the value
    of the Exception Threshold drop-down list (in
    the Advanced settings tab) as Fatal.
  7. In the Basic settings tab, check that Read Input As is set to Data
    Integration columns.
  8. Check that Write Output As is set to Data Integration columns.

Defining the properties of the first tJava component (OnSubjobError)

  1. Select the first tJava component to define
    its properties. This component displays information in the console when the Job
    contains an error.
  2. Add the following code in the Code
    section.

    System.out.println("tJava_1: Subjob ERROR");

    org.talend.transform.runtime.api.ExecutionStatus es = (org.talend.transform.runtime.api.ExecutionStatus)globalMap.get("tHMap_1_EXECUTION_STATUS");

    System.out.println("Execution result:" + es.getOverallSeverity());

    // ExecutionStatus object
    System.out.println(es.toString());

    // XML version of ExecutionStatus object
    java.io.StringWriter sw = new java.io.StringWriter();
    es.exportToXml(sw);
    System.out.println("ExecutionStatus as XML");
    System.out.println(sw.toString());

Defining the properties of the second tJava component (OnSubjobOk)

  1. Select the second tJava component to define
    its properties. This component displays information in the console when the Job
    runs successfully.
  2. Add the following code in the Code
    section.

    System.out.println("tJava_2: Subjob OK");

    org.talend.transform.runtime.api.ExecutionStatus es = (org.talend.transform.runtime.api.ExecutionStatus)globalMap.get("tHMap_1_EXECUTION_STATUS");

    System.out.println("Execution result:" + es.getOverallSeverity());

    // ExecutionStatus object
    System.out.println(es.toString());

    // XML version of ExecutionStatus object
    java.io.StringWriter sw = new java.io.StringWriter();
    es.exportToXml(sw);
    System.out.println("ExecutionStatus as XML");
    System.out.println(sw.toString());

Defining the properties of the third tJava component (RunIf)

  1. Select the third tJava component to define
    its properties. This component runs if any errors occur within the map itself
    (higher than informational status).
  2. Add the following code in the Code
    section.

    System.out.println("tJava_3: Run If");

    org.talend.transform.runtime.api.ExecutionStatus es = (org.talend.transform.runtime.api.ExecutionStatus)globalMap.get("tHMap_1_EXECUTION_STATUS");

    System.out.println("Execution result:" + es.getOverallSeverity());

    // ExecutionStatus object
    System.out.println(es.toString());

    // XML version of ExecutionStatus object
    java.io.StringWriter sw = new java.io.StringWriter();
    es.exportToXml(sw);
    System.out.println("ExecutionStatus as XML");
    System.out.println(sw.toString());

Running the Job under different conditions

  1. Press Ctrl+S to save your Job.
  2. In the Run tab, click Run to execute the Job.

    In this case, no errors occur, so the Job triggers the second tJava component only.
    use_case-thmap-error_handling-run1.png

  3. Double-click the tHMap component to open the Map
    editor.
  4. Right-click the outputKey element and click
    Go to Structure Element.
  5. Change the tHMap_1_output structure that
    opens from Read Only to Editable and then change the Data Type for
    outputKey to Integer (32). This means that this element
    can only be an Integer, and since this does not match the input, an
    error will occur.

    use_case-thmap-error_handling-structure.png

  6. In the Run tab, click Run to execute the Job again.

    In this case, the Job still triggers the second tJava component even though there is an error, because the
    threshold above which an exception should be thrown (thus enabling the
    downstream components to detect the error) is set as Fatal. However, the third tJava
    component is also triggered since there is an error in the execution of the
    map.
    use_case-thmap-error_handling-run2.png

  7. In the tHMap component, change the value of the Exception Threshold drop-down list (on the Advanced settings tab) to Error. This causes an exception to be thrown when the map has a
    severity of Error or higher.
  8. In the Run tab, click Run to execute the Job for a third time.

    In this case, the Job detects the error and triggers the first tJava component (OnSubjobError).
    use_case-thmap-error_handling-run3.png
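
The three runs above can be summarized with a small stand-alone Java model. It is a sketch under stated assumptions, not Talend code: severities are assumed to be ordered from OK to Fatal, the type mismatch is assumed to produce severity Error, the RunIf condition is modeled as "severity higher than Info", and the RunIf trigger is assumed not to fire once an exception has been thrown. The class ErrorHandlingSketch and the method firedTriggers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of which tJava components run in each of the three executions.
class ErrorHandlingSketch {

    enum Severity { OK, INFO, WARNING, ERROR, FATAL }

    // Returns the triggers that fire for a given map severity and Exception Threshold.
    static List<String> firedTriggers(Severity mapSeverity, Severity threshold) {
        List<String> fired = new ArrayList<>();
        boolean exceptionThrown = mapSeverity.compareTo(threshold) >= 0;
        if (exceptionThrown) {
            fired.add("OnSubjobError"); // first tJava component
            return fired;               // assumption: RunIf is not evaluated after a failure
        }
        fired.add("OnSubjobOk");        // second tJava component
        if (mapSeverity.compareTo(Severity.INFO) > 0) {
            fired.add("RunIf");         // third tJava component
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println("Run 1: " + firedTriggers(Severity.OK, Severity.FATAL));
        System.out.println("Run 2: " + firedTriggers(Severity.ERROR, Severity.FATAL));
        System.out.println("Run 3: " + firedTriggers(Severity.ERROR, Severity.ERROR));
    }
}
```

In this model the second run is the interesting case: the subjob is still reported as successful, and only the RunIf branch reveals that the map itself finished with an error.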


Document source: Talend, https://help.talend.com