August 17, 2023

tMap – Docs for ESB 5.x

tMap

tMap.png

tMap properties

Component family

Processing

 

Function

tMap is an advanced component,
which integrates itself as plugin to Talend Studio.

Purpose

tMap transforms and routes data
from single or multiple sources to single or multiple destinations.

Basic settings

Map editor

It allows you to define the tMap
routing and transformation properties.

Note

If you do not want to handle execution errors, you can click
the Property Settings button at
the top of the input area and select the Die on error check box (selected by default) in
the [Property Settings] dialog
box. It will kill the Job if there is an error.

Note

To maximize the data transformation performance in a Job that
handles multiple lookup input flows with large amounts of data,
you can select the Lookup in
parallel
check box in the [Property Settings] dialog box.

However, in a Map/Reduce Job, only one
expression key is allowed per mapping component. If you need to
use multiple expression keys to join different input tables, use
multiple tMap components one after another.

This component offers the advantage of the dynamic schema feature. This allows you to
retrieve unknown columns from source files or to copy batches of columns from a source
without mapping each column individually. For further information about dynamic schemas,
see Talend Studio
User Guide.

This dynamic schema feature is designed for the purpose of retrieving unknown columns
of a table and is recommended to be used for this purpose only; it is not recommended
for the use of creating tables.

 

Mapping links display as

Auto: the default setting is
curves links

Curves: the mapping display as
curves

Lines: the mapping displays as
straight lines. This last option allows to slightly enhance
performance.

  Temp data directory path Enter the path where you want to store the temporary
data generated for lookup loading. For more information on this folder,
see Talend Studio User
Guide
.

 

Preview

The preview is an instant shot of the Mapper data. It becomes
available when Mapper properties have been filled in with data. The
preview synchronization takes effect only after saving
changes.

 

Use replicated join

Select this check box to perform a replicated join between the
input flows. By replicating each lookup table into memory, this type
of join doesn’t require an additional shuffle-and-sort step, thus
speeding up the whole process.

Note that this feature is available only to the Map/Reduce version
of this component and you need to ensure that the entire lookup
tables fit in memory.

Advanced settings Max buffer size (nb of rows) Type in the size of physical memory, in number of rows,
you want to allocate to processed data.
  Ignore trailing zeros for BigDecimal Select this check box to ignore trailing zeros for
BigDecimal data.
 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the
Job level as well as at each component level. Note that this check box is not available in
the Map/Reduce version of the component.

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl +
Space
to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio
User Guide.

Usage

Possible uses are from a simple reorganization of fields to the
most complex Jobs of data multiplexing or demultiplexing
transformation, concatenation, inversion, filtering and
more…

Usage in Map/Reduce Jobs

If you have subscribed to one of the Talend solutions with Big Data, you can also
use this component as a Map/Reduce component. In a Talend Map/Reduce Job, this
component is used as an intermediate step and other components used along with it must be
Map/Reduce components, too. They generate native Map/Reduce code that can be executed
directly in Hadoop.

As explained earlier, If you need to use multiple expression keys
to join different input tables, use mutiple tMap components one after another.

For further information about a Talend Map/Reduce Job, see the sections
describing how to create, convert and configure a Talend Map/Reduce Job of the
Talend Big Data Getting Started Guide.

Note that in this documentation, unless otherwise explicitly stated, a scenario presents
only Standard Jobs, that is to say traditional Talend data
integration Jobs, and non Map/Reduce Jobs.

Log4j

The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User
Guide
.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

The use of tMap supposes minimum
Java knowledge in order to fully exploit its functionalities.

This component is a junction step, and for this reason cannot be
a start nor end component in the Job.

Note

For further information, see Talend Studio User
Guide
.

Scenario 1: Mapping data using a filter and a simple explicit join

The Job described below aims at reading data from a csv file with its schema stored in
the Repository, looking up at a reference file, the schema of which is also stored in
the Repository, then extracting data from these two files based on a defined filter to
an output file and reject files.

Linking the components

  1. Drop two tFileInputDelimited components,
    tMap and three tFileOutputDelimited components onto the design workspace.

  2. Rename the two tFileInputDelimited
    components as Cars and Owners,
    either by double-clicking the label in the design workspace or via the
    View tab of the Component view.

  3. Connect the two input components to tMap
    using Row > Main connections and label the connections as
    Cars_data and Owners_data
    respectively.

  4. Connect tMap to the three output
    components using Row > New Output (Main) connections and name the output
    connections as Insured, Reject_NoInsur and Reject_OwnerID
    respectively.

    UseCase_tMap2-1-1.png

Configuring the components

  1. Double-click the tFileInputDelimited
    component labelled Cars to display its Basic settings view.

    UseCase_tMap2-1-2.png
  2. Select Repository from the Property type list and select the component’s
    schema, cars in this scenario, from the [Repository Content] dialog box. The rest fields
    are automatically filled.

  3. Double-click the component labelled Owners and repeat
    the setting operation. Select the appropriate metadata entry,
    owners in this scenario.

    Note

    In this scenario, the input schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further
    information regarding metadata creation in the Repository, see
    Talend Studio User Guide.

  4. Double-click the tMap component to open
    the Map Editor.

    Note that the input area is already filled with the defined input tables
    and that the top table is the main input table, and the respective row
    connection labels are displayed on the top bar of the table.

  5. Create a join between the two tables on the ID_Owner
    column by simply dropping the ID_Owner column from the
    Cars_data table onto the
    ID_Owner column in the
    Owners_data table.

  6. Define this join as an inner join by clicking the tMap settings button, clicking in the Value field for Join Model,
    clicking the small button that appears in the field, and selecting Inner Join from the [Options] dialog box.

    UseCase_tmap2-1-3.png
  7. Drag all the columns of the Cars_data table to the
    Insured table.

  8. Drag the ID_Owner,
    Registration, and ID_Reseller
    columns of the Cars_data table and the
    Name column of the Owners_data
    table to the Reject_NoInsur table.

  9. Drag all the columns of the Cars_data table to the
    Reject_OwnerID table.

    For more information regarding data mapping, see Talend Studio
    User Guide
    .

  10. Click the plus arrow button at the top of the Insured
    table to add a filter row.

    Drag the ID_Insurance column of the
    Owners_data table to the filter condition area and
    enter the formula meaning ‘not undefined’: Owners_data.ID_Insurance !=
    null.

    With this filter, the Insured table will gather all
    the records that include an insurance ID.

    UseCase_tMap2-1-4.png
  11. Click the tMap settings button at the top
    of the Reject_NoInsur table and set Catch output reject to true to define the table as a standard reject output flow to
    gather the records that do not include an insurance ID.

    use_case-tmap2-1-5halftop.png
  12. Click the tMap settings button at the top
    of the Reject_OwnerID table and set Catch lookup inner join reject to true so that this output table will gather the
    records from the Cars_data flow with missing or
    unmatched owner IDs.

    use_case-tmap2-1-5halfbot.png

    Click OK to validate the mappings and
    close the Map Editor.

  13. Double-click each of the output components, one after the other, to define
    their properties. If you want a new file to be created, browse to the
    destination output folder, and type in a file name including the
    extension.

    UseCase_tMap2-1-6.png

    Select the Include header check box to
    reuse the column labels from the schema as header row in the output
    file.

Executing the Job

  1. Press Ctrl + S to save your Job.

  2. Press F6 to run the Job.

    The output files are created, which contain the relevant data as defined.

    UseCase_tmap2-1-7.png

    For examples of how to use dynamic
    schemas with tMap, see:

Scenario 2: Mapping data using inner join rejections

This scenario, based on scenario 1, adds one input file containing details about
resellers and extra fields in the main output table. Two filters on inner joins are
added to gather specific rejections.

Linking the components

  1. Drop a tFileInputDelimited component and
    a tFileOutputDelimited component to the
    design workspace, and label the components as Resellers
    and No_Reseller_ID respectively.

  2. Connect it to the Mapper using a Row >
    Main connection, and label the
    connection as Resellers_data.

  3. Connect the tMap component to the new
    tFileOutputDelimited component by using
    the Row connection named
    Reject_ResellerID.

    UseCase_tmap2-2-2.png

Configuring the components

  1. Double-click the Resellers component to display its
    Basic settings view.

    UseCase_tmap2-2-1.png
  2. Select Repository from the Property type list and select the component’s
    schema, resellers in this scenario, from the [Repository Content] dialog box. The rest fields
    are automatically filled.

    Note

    In this scenario, the input schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further
    information regarding metadata creation in the Repository, see
    Talend Studio User Guide.

  3. Double-click the tMap component to open
    the Map Editor.

    Note that the schema of the new input component is already added in the
    Input area.

  4. Create a join between the main input flow and the new input flow by
    dropping the ID_Reseller column of the
    Cars_data table to the
    ID_Reseller column of the
    Resellers_data table.

  5. Click the tMap settings button at the top
    of the Resellers_data table and set Join Model to Inner
    Join
    .

    UseCase_tmap2-2-3.png
  6. Drag all the columns except ID_Reseller of the
    Resellers_data table to the main output table,
    Insured.

    UseCase_tmap2-2-4.png

    Note

    When two inner joins are defined, you either need to define two
    different inner join reject tables to differentiate the two rejections
    or, if there is only one inner join reject output, both inner join
    rejections will be stored in the same output.

  7. Click the [+] button at the top of the
    output area to add a new output table, and name this new output table
    Reject_ResellerID.

  8. Drag all the columns of the Cars_data table to the
    Reject_ResellerID table.

  9. Click the tMap settings button and select
    Catch lookup inner join reject to
    true to define this new output table as
    an inner join reject output.

    If the defined inner join cannot be established, the information about
    the relevant cars will be gathered through this output flow.

    UseCase_tmap2-2-5.png
  10. Now apply filters on the two Inner Join reject outputs, in order for to
    distinguish the two types of rejection.

    In the first Inner Join output table, Reject_OwnerID,
    click the plus arrow button to add a filter line and fill it with the
    following formula to gather only owner ID related rejection:
    Owners_data.ID_Owner==null

  11. In the second Inner Join output table,
    Reject_ResellerID, repeat the same operation using
    the following formula: Resellers_data.ID_Reseller==null

    UseCase_tmap2-2-6.png

    Click OK to validate the map settings and
    close the Mapper Editor.

  12. Double-click the No_Reseller_ID component to display
    its Basic settings view.

    UseCase_tmap2-2-7.png

    Specify the output file path and select the Include
    Header
    check box, and leave the other parameters as they
    are.

  13. To demonstrate the work of the Mapper, in this example, remove reseller
    IDs 5 and 8 from the input file
    Resellers.csv.

Executing the Job

  1. Press Ctrl + S to save your Job.

  2. Press F6 to run the Job.

    The four output files are all created in the specified folder, containing
    information as defined. The output file
    No_Reseller_ID.csv contains the
    cars information related to reseller IDs
    5 and 8, which are missing in
    the input file Resellers.csv.

    UseCase_tmap2-2-8.png

    For examples of how to use dynamic
    schemas with tMap, see:

Scenario 3: Cascading join mapping

As third advanced use scenario, based on the scenario 2, add a new Input table
containing Insurance details for example.

Set up an Inner Join between two lookup input tables (Owners and Insurance) in the
Mapper to create a cascade lookup and hence retrieve Insurance details via the Owners
table data.

Scenario 4: Advanced mapping using filters, explicit joins and rejections

This scenario introduces a Job that allows you to find BMW owners who have two to six
children (inclusive), for sales promotion purpose for example.

Linking the components

  1. Drop three tFileInputDelimited
    components, a tMap component, and two
    tFileOutputDelimited components from
    the Palette onto the design workspace, and
    label them to best describe their functions.

  2. Connect the input components to the tMap
    using Row > Main connections.

    Pay attention to the file you connect first as it will automatically be
    set as Main flow, and all the other
    connections will be Lookup flows. In this
    example, the connection for the input component Owners
    is the Main flow.

    UseCase_tmap4-1.png

Configuring the components

  1. Define the properties of each input components in the respective Basic settings view. Define the properties of
    Owners.

    UseCase_tmap4-2.png
  2. Select Repository from the Property type list and select the component’s
    schema, owners in this scenario, from the [Repository Content] dialog box. The rest fields
    are automatically filled.

    Note

    In this scenario, the input schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further
    information regarding metadata creation in the Repository, see
    Talend Studio User Guide.

    In the same way, set the properties of the other input components:
    Cars and Resellers. These two
    Lookup flows will fill in secondary
    (lookup) tables in the input area of the Map
    Editor
    .

  3. Then double-click the tMap component to
    launch the Map Editor and define the
    mappings and filters.

    Set an explicit join between the Main
    flow Owner and the Lookup flow Cars by dropping the
    ID_Owner column of the Owners
    table to the ID_Owner column of the
    Cars table.

    The explicit join is displayed along with a hash key.

    use_case-tmap4-3cut2.png
  4. In the Expr. Key field of the
    Make column, type in a filter. In this use case,
    simply type in "BMW" as the search is focused on the owners of
    this particular make.

    use_case-tmap4-3cut3top.png
  5. Implement a cascading join between the two lookup tables
    Cars and Resellers on the
    ID_Reseller column in order to retrieve resellers
    information.

  6. As you want to reject the null values into a separate table and exclude
    them from the standard output, click the tMap
    settings
    button and set Join
    Model
    to Inner Join in each
    of the Lookup tables.

    use_case-tmap4-3cut3.png
  7. In the tMap settings, you can set Match
    Model
    to Unique match,
    First match, or All matches. In this use case, the All
    matches
    option is selected. Thus if several matches are found
    in the Inner Join, rows matching the explicit join as well as the filter,
    all of them will be added to the output flow (either in rejection or the
    regular output).

    Note

    The Unique match option functions as
    a Last match. The First match and All
    matches
    options function as named.

  8. On the output area of the Map Editor,
    click the plus button to add two tables, one for the full matches and the
    other for the rejections.

  9. Drag all the columns of the Owners table, the
    Registration, Make and
    Color columns of the Cars
    table, and the ID_Reseller and
    Name_Reseller columns of the
    Resellers table to the main output table.

  10. Drag all the columns of the Owners table to the
    reject output table.

  11. Click the Filter button at the top of the
    main output table to display the Filter
    expression area.

    Type in a filter statement to narrow down the number of rows loaded in the
    main output flow. In this use case, the statement reads:
    Owners.Children_Nr >= 2 && Owners.Children_Nr <=
    6
    .

  12. In the reject output table, click the tMap
    settings
    button and set the reject types.

    Set Catch output reject to true to collect data about BMW car owners who
    have less than two or more than six children.

    Set Catch lookup inner join reject to
    true to collect data about owners of
    other car makes and owners for whom the reseller information is not
    found.

    UseCase_tmap4-3.png

    Click OK to validate the mappings and
    close the Map Editor.

    On the design workspace, right-click the tMap and pull the respective output link to the relevant
    output components.

  13. Define the properties of the output components in their respective
    Basic settings view.

    In this use case, simple specify the output file paths and select the
    Include Header check box, and leave the
    other parameters as they are.

    UseCase_tmap4-4.png

Executing the Job

  1. Press Ctrl + S to save your Job.

  2. Press F6 to run it.

    The main output file contains the information related to BMW owners who
    have two to six children, and the reject output file contains the
    information about the rest of the car owners.

    UseCase_tmap4-5.png

    For examples of how to use dynamic
    schemas with tMap, see:

Scenario 5: Advanced mapping with filters and different rejections

This scenario is a modified version of the preceding scenario. It describes a Job that
applies filters to limit the search to BMW and Mercedes owners who have two to six
children and divides unmatched data into different reject output flows.

Linking the components

  1. Take the same Job as in Scenario 4: Advanced mapping using filters, explicit joins and rejections.

  2. Drop a new tFileOutputDelimited component
    from the Palette on the design workspace,
    and name it Rejects_BMW_Mercedes to present its
    functionality.

  3. Connect the tMap component to the new
    output component using a Row connection and
    label the connection according to the functionality of the output component.

    This connection label will appear as the name of the new output table in
    the Map Editor.

  4. Relabel the existing output connections and output components to reflect
    their functionality.

    The existing output tables in the Map
    Editor
    will be automatically renamed according to the
    connection labels. In this example, relabel the existing output connections
    BMW_Mercedes_withChildren and
    Owners_Other_Makes respectively.

    UseCase_tmap5-1.png

Configuring the components

  1. Double-click the tMap component to launch
    the Map Editor to change the mappings and
    the filters.

    Note that the output area contains a new, empty output table named
    Rejects_BMW_Mercedes. You can adjust the position
    of the table by selecting it and clicking the Up or Down arrow button at
    the top of the output area.

  2. Remove the Expr. key filter
    (“BMW”) from the Cars table in
    the input area.

  3. Click the Filters button to display the
    Filter field, and type in a new filter
    to limit the search to BMW or
    Mercedes car makes. The statement reads as follows:
    Cars.Make.equals("BMW") ||
    Cars.Make.equals("Mercedes")

    use_case-tmap5-2cut2.png
  4. Select all the columns of the main output table and drop them down to the
    new output table.

    Alternatively, you can also drag the corresponding columns from the
    relevant input tables to the new output table.

  5. Click the tMap settings button at the top
    of the new output table and set Catch output
    reject
    to true to collect
    data about BMW and Mercedes owners who have less than two or more than six
    children.

  6. In the Owners_Other_Makes table, set Catch lookup inner join reject to true to collect data about owners of other car
    makes and owners for whom the reseller information is not found.

    UseCase_tmap5-2.png
  7. Click OK to validate the mappings and
    close the Map Editor.

  8. Define the properties of the output components in their respective
    Basic settings view.

    In this use case, simple specify the output file paths and select the
    Include Header check box, and leave the
    other parameters as they are.

    UseCase_tmap5-3.png

Executing the Job

  1. Press Ctrl + S to save the Job.

  2. Press F6 to run it.

    The output files contain content of the main output flow shows that the
    filtered rows have correctly been passed on.

    UseCase_tmap5-4.png

    For examples of how to use dynamic
    schemas with tMap, see:

Scenario 6: Advanced mapping with lookup reload at each row

The following scenario describes a Job that retrieves people details from a lookup
database, based on a join on the age. The main flow source data is read from a MySQL
database table called people_age that contains people details such
as numeric id, alphanumeric first name and last name and numeric age. The people age is
either 40 or 60. The number of records in this table is intentionally restricted.

The reference or lookup information is also stored in a MySQL database table called
large_data_volume. This lookup table contains a number of
records including the city where people from the main flow have been to. For the sake of
clarity, the number of records is restricted but, in a normal use, the usefulness of the
feature described in the example below is more obvious for very large reference data
volume.

To optimize performance, a database connection component is used in the beginning of
the Job to open the connection to the lookup database table in order not to do that
every time we want to load a row from the lookup table.

An Expression Filter is applied to this lookup source flow, in order to select only
data from people whose age is equal to 60 or 40. This way only the relevant rows from
the lookup database table are loaded for each row from the main flow.

Therefore this Job shows how, from a limited number of main flow rows, the lookup join
can be optimized to load only results matching the expression key.

Note

Generally speaking, as the lookup loading is performed for each main flow row,
this option is mainly interesting when a limited number of rows is processed in the
main flow while a large number of reference rows are to be looked up to.

The join is solved on the age field. Then, using the relevant
loading option in the tMap component editor, the lookup
database information is loaded for each main flow incoming row.

For this Job, the metadata has been prepared for the source and connection components.
For more information on how to set up the DB connection schema metadata, see the
relevant section in the Talend Studio User Guide.

This Job is formed with five components, four database components and a mapping
component.

Linking the components

  1. Drop the DB Connection under the Metadata
    node of the Repository to the design
    workspace. In this example, the source table is called
    people_age.

  2. Select tMysqlInput from the list that
    pops up when dropping the component.

    UseCase_tMap6-2.png
  3. Drop the lookup DB connection table from the Metadata node to the design workspace selecting tMysqlInput from the list that pops up. In this
    Job, the lookup is called large_data_volume.

  4. The same way, drop the DB connection from the Metadata node to the design workspace selecting tMysqlConnection from the list that pops up. This
    component creates a permanent connection to the lookup database table in
    order not to do that every time we want to load a row from the lookup
    table.

  5. Then pick the tMap component from the
    Processing family, and the tMysqlOutput and tMysqlCommit components from the Database family in the Palette
    to the right hand side of the editor.

  6. Now connect all the components together. To do so, right-click the
    tMysqlInput component corresponding to
    the people table and drag the link towards tMap.

  7. Release the link over the tMap component,
    the main row flow is automatically set up.

  8. Rename the Main row link to
    people, to identify more easily the main flow
    data.

  9. Perform the same operation to connect the lookup table
    (large_data_volume) to the tMap component and the tMap
    to the tMysqlOutput component.

  10. A dialog box prompts for a name to the output link. In this example, the
    output flow is named: people_mixandmatch.

  11. Rename also the lookup row connection link to
    large_volume, to help identify the reference data
    flow.

  12. Connect tMysqlConnection to tMysqlInput using the trigger link OnSubjobOk.

  13. Connect the tMysqlInput component to the
    tMysqlCommit component using the
    trigger link OnSubjobOk.

    UseCase_tMap6-1.png

Configuring the components

  1. Double-click the tMap component to open
    the graphical mapping editor.

    UseCase_tMap6-3.png
  2. The Output table (that was created
    automatically when you linked the tMap to
    the tMySQLOutput will be formed by the
    matching rows from the lookup flow (large_data_volume)
    and the main flow (people_age).

    Select the main flow rows that are to be passed on to the output and drag
    them over to paste them in the Output table (to the right hand side of the
    mapping editor).

    In this example, the selection from the main flow include the following
    fields: id, first_name,
    last_Name and age.

    From the lookup table, the following column is selected:
    city.

    Drop the selected columns from the input tables
    (people and large_volume) to
    the output table.

  3. Now set up the join between the main and lookup flows.

    Select the age column of the main flow table (on top)
    and drag it towards the age column of the lookup flow
    table (large_volume in this example).

    A key icon appears next to the linked expression on the lookup table. The
    join is now established.

  4. Click the tMap settings button, click the
    three-dot button corresponding to Lookup
    Model
    , and select the Reload at each
    row
    option from the [Options] dialog box in order to reload the lookup for each
    row being processed.

    UseCase_tMap6-4.png
  5. In the same way, set Match Model to
    All matches in the Lookup table, in
    order to gather all instances of age matches in the
    output flow.

  6. Now implement the filtering, based on the age column,
    in the Lookup table. The GlobalMapKey field
    is automatically created when you selected the Reload
    at each row
    option. Indeed you can use this expression to
    dynamically filter the reference data in order to load only the relevant
    information when joining with the main flow.

    As mentioned in the introduction of the scenario, the main flow data
    contains only people whose age is either 40 or 60. To avoid the pain of
    loading all lookup rows, including ages that are different from 40 and 60,
    you can use the main flow age as global variable to feed the lookup
    filtering.

    UseCase_tMap6-5.png
  7. Drop the Age column from the main flow table to the
    Expr. field of the lookup table.

  8. Then in the globalMap Key field, put in
    the variable name, using the expression. In this example, it reads:
    "people.Age"

    Click OK to save the mapping setting and
    go back to the design workspace.

  9. To finalize the implementation of the dynamic filtering of the lookup
    flow, you need now to add a WHERE clause in the query of the database
    input.

    UseCase_tMap6-6.png
  10. At the end of the Query field, following
    the Select statement, type in the following WHERE clause:
    WHERE AGE
    ='"+((Integer)globalMap.get("people.Age"))+"'"

  11. Make sure that the type corresponds to the column used as variable. In
    this use case, Age is of Integer
    type. And use the variable the way you set in the globalMap key field of the map editor.

  12. Double-click the tMysqloutput component
    to define its properties.

    UseCase_tmap6-7.png
  13. Select the Use an existing connection
    check box to leverage the created DB connection.

    Define the target table name and relevant DB actions.

Executing the Job

  1. Press Ctrl + S to save the Job.

  2. Click the Run tab at the bottom of the
    design workspace, to display the Job execution tab.

  3. From the Debug Run view, click the
    Traces Debug button to view the data
    processing progress.

    For more comfort, you can maximize the Job design view while executing by
    simply double-clicking on the Job name tab.

    UseCase_tMap6-9.png

    The lookup data is reloaded for each of the main flow’s rows,
    corresponding to the age constraint. All age matches
    are retrieved in the lookup rows and grouped together in the output
    flow.

    Therefore if you check out the data contained in the newly created
    people_mixandmatch table, you will find all the
    age duplicates corresponding to different
    individuals whose age equals to 60 or 40 and the city where they have been
    to.

    UseCase_tmap6-10.png

    For examples of how to use dynamic
    schemas with tMap, see:

Scenario 7: Mapping with join output tables

The following scenario describes a Job that processes reject flows without separating
them from the main flow.

Linking the components

  1. In the Repository tree view, click
    Metadata > File delimited. Drag and drop the
    customers metadata onto the workspace.

    The customers metadata contains information about
    customers, such as their ID, their name or their address, etc.

    For more information about centralizing metadata, see Talend Studio
    User Guide
    .

  2. In the dialog box that asks you to choose which component type you want to
    use, select tFileInputDelimited and click
    OK.

  3. Drop the states metadata onto the design workspace.
    Select the same component in the dialog box and click OK.

    The states metadata contains the ID of the state,
    and its name.

  4. Drop a tMap and two tLogRow components from the Palette onto the design workspace.

  5. Connect the customers component to the tMap, using a Row >
    Main
    connection.

  6. Connect the states component to the tMap, using a Row >
    Main
    connection. This flow will automatically be defined as
    Lookup.

    UseCase_tmap7-1.png

Configuring the components

  1. Double-click the tMap component to open
    the Map Editor.

    Drop the idState column from the main input table to
    the idState column of the lookup table to create a
    join.

    Click the tMap settings button and set
    Join Model to Inner Join.

  2. Click the Property Settings button at the
    top of the input area to open the [Property
    Settings]
    dialog box, and clear the Die
    on error
    check box in order to handle the execution errors.

    The ErrorReject table is automatically
    created.

    UseCase_tMap7-2_TIS.png
  3. Select the id, idState,
    RegTime and RegisterTime in
    the input table and drag them to the ErrorReject table.

    UseCase_tMap7-3.png
  4. Click the [+] button at the top right of
    the editor to add an output table. In the dialog box that opens, select
    New output. In the field next to it,
    type in the name of the table, out1. Click OK.

  5. Drag the following columns from the input tables to the
    out1 table: id,
    CustomerName, idState, and
    LabelState.

    Add two columns, RegTime and
    RegisterTime, to the end of the
    out1 table and set their date formats:
    "dd/MM/yyyy HH:mm" and "yyyy-MM-dd
    HH:mm:ss.SSS"
    respectively.

  6. Click in the Expression field for the
    RegTime column, and press Ctrl+Space to display the auto-completion list. Find and
    double-click TalendDate.parseDate. Change the pattern to
    ("dd/MM/yyyy HH:mm",row1.RegTime).

  7. Do the same thing for the RegisterTime column, but
    change the pattern to ("yyyy-MM-dd
    HH:mm:ss.SSS",row1.RegisterTime)
    .

    UseCase_tMap7-4.png
  8. Click the [+] button at the top of the
    output area to add an output table. In the dialog box that opens, select
    Create join table from, choose
    Out1, and name it rejectInner.
    Click OK.

  9. Click the tMap settings button and set
    Catch lookup inner join reject to
    true in order to handle rejects.

  10. Drag the id, CustomerName, and
    idState columns from the input tables to the
    corresponding columns of the rejectInner table.

    Click in the Expression field for the
    LabelState column, and type in
    "UNKNOWN".

  11. Click in the Expression field for the
    RegTime column, press Ctrl+Space, and select TalendDate.parseDate. Change the
    pattern to ("dd/MM/yyyy HH:mm",row1.RegTime).

  12. Click in the Expression field for the
    RegisterTime column, press Ctrl+Space, and select TalendDate.parseDate, but change
    the pattern to ("yyyy-MM-dd
    HH:mm:ss.SSS",row1.RegisterTime)
    .

    If the data from row1 has a wrong pattern, it will
    be returned by the ErrorReject flow.

    UseCase_tMap7-4.png

    Click OK to validate the changes and
    close the editor.

  13. Double-click the first tLogRow component
    to display its Component view.

    Click Sync columns to retrieve the
    schema structure from the mapper if needed.

    In the Mode area, select Table.

    Do the same thing with the second tLogRow.

Executing the Job

  1. Press Ctrl + S to save your Job.

  2. Press F6 to execute it.

    The Run console displays the main out
    flow and the ErrorReject flow. The main output flow unites both valid data
    and inner join rejects, while the ErrorReject flow contains the error
    information about rows with unparseable date formats.

    UseCase_tMap7-5.png

    For examples of how to use dynamic
    schemas with tMap, see:


Document get from Talend https://help.talend.com
Thank you for watching.
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x