July 30, 2023

tMDMBulkLoad – Docs for ESB 7.x

tMDMBulkLoad

Uses bulk mode to write XML structured master data into the MDM
server.

Note: Your submitted ID will
be used to create the record even if the ID is set to be auto-generated in the data
model. An update operation will be performed if a record with the same ID already exists
in MDM.
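
The ID handling described in this note behaves like an upsert keyed on the submitted ID. The following is a minimal sketch of that behavior using an in-memory dictionary as a stand-in for the MDM server. The bulk_upsert helper and the record layout are hypothetical and are not part of any Talend API.

```python
def bulk_upsert(store, records):
    """Apply records to an in-memory store keyed by ID.

    Hypothetical stand-in for the MDM server: a record whose ID is
    already present is updated, otherwise it is created.
    """
    created, updated = 0, 0
    for record in records:
        record_id = record["Id"]
        if record_id in store:
            updated += 1
        else:
            created += 1
        store[record_id] = record  # the submitted ID is always the key
    return created, updated


store = {"1": {"Id": "1", "Name": "Shirts"}}
result = bulk_upsert(store, [
    {"Id": "1", "Name": "T-Shirts"},  # same ID: update
    {"Id": "2", "Name": "Hats"},      # new ID: create
])
print(result)  # (1, 1)
```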

tMDMBulkLoad Standard properties

These properties are used to configure tMDMBulkLoad running in
the Standard Job framework.

The Standard
tMDMBulkLoad component belongs to the Talend MDM
family.

The component in this framework is available in all Talend
products.

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of
fields that will be processed and passed on to the next component. The
schema is either built-in or stored remotely in the Repository.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this
    option to view the schema only.

  • Change to built-in property:
    choose this option to change the schema to Built-in for local changes.

  • Update repository connection:
    choose this option to change the schema stored in the repository and decide whether
    to propagate the changes to all the Jobs upon completion. If you just want to
    propagate the changes to the current Job, you can select No upon completion and choose this schema metadata
    again in the Repository Content
    window.

Click Sync columns
to collect the schema from the previous component.

 

Built-in: You create the schema
and store it locally for this component only. Related topic: see
Talend Studio User Guide.

 

Repository: You have already
created the schema and stored it in the Repository. You can reuse it in
various projects and Job designs. Related topic: see
Talend Studio User Guide.

XML field

Select the name of the column in which you want to write
the XML data.

URL

Type in the URL required to access the MDM server.

Username and Password

Type in the user authentication data for the MDM
server.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Data Model

Type in the name of the data model against which the data
to be written is validated.

Data Container

Type in the name of the data container where you want to
write the master data.

Entity

Type in the name of the entity that holds the data
record(s) you want to write.

Type

Select Master or
Staging to specify the
database on which the action should be performed.

Validate

Select this check box to validate the data you want to
write onto the MDM server against validation rules defined for the
current data model.

Note that for the PROVISIONING Data Container, validation
checks will always be performed on incoming records, regardless of
whether or not this check box is selected.

For more information on how to set the validation rules,
see Talend Studio User Guide.

Warning:

If you need faster loading performance, do not
select this check box.

Generate ID

Select this check box to generate an ID number for all of
the data written.

This check box is not available when the Validate
or Fire Create/Update event check box is
selected. The auto-generated ID will be used to create the record if
this check box is not available and the ID is not provided.

Warning:

If you need faster loading performance, do not
select this check box.

Insert only

Select this check box to skip the step of checking
whether the data records to be inserted already exist on the MDM server,
which improves loading performance.

However, before using this option, you need to make sure
that the data records do not exist in the database.

Commit size

Type in the row count of each batch to be written onto
the MDM server.

Use Transaction

Select this check box and then, in the Component List, select the existing
connection component to be used to commit the transaction.

Fire Create/Update event

Select this check box to add the actions carried out to a modification
report. In the Source Name field displayed, enter,
between double quotation marks, the name of the application to be used
to carry out the modifications.

This check box is available only when Master is
selected from the Type drop-down list.
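
The availability rules stated for Generate ID and Fire Create/Update event can be summarized as a small consistency check. This is only a sketch of those two rules as described above. The check_options function and its parameter names are hypothetical and do not correspond to anything exposed by Talend Studio.

```python
def check_options(db_type, validate, generate_id, fire_event):
    """Return a list of conflicts between the selected options.

    Encodes only the two availability rules stated above; the function
    and parameter names are hypothetical.
    """
    problems = []
    if generate_id and (validate or fire_event):
        problems.append(
            "Generate ID is not available with Validate or "
            "Fire Create/Update event"
        )
    if fire_event and db_type != "Master":
        problems.append("Fire Create/Update event requires Type = Master")
    return problems


# A combination that respects both rules yields no conflicts.
print(check_options("Master", validate=False, generate_id=True, fire_event=False))
# []
```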

Advanced settings

tStatCatcher Statistics

Select this check box to gather the processing metadata
at the Job level as well as at each component level.

Global Variables

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space
to access the variable list and choose the variable to use from it.

For further information about variables, see
Talend Studio User Guide.

Usage

Usage rule

This component always needs an incoming link that provides XML-structured
data. If your data is not yet in an XML structure, you need to use a
component such as tWriteXMLField to transform it into XML. For further
information about tWriteXMLField, see tWriteXMLField.

You can increase the timeout values for a Job using this component
to help process a large number of data records. For more information, see advanced
execution settings for JVM parameters in the article Timeout values for a Job using
MDM components on Talend Help Center (https://help.talend.com).

If you use a Job with the component tMDMBulkLoad to bulk
load large volumes of data into MDM, you can tune the bulk load
operation by adding a specific JVM argument (for example,
bulkload.concurrent.http.requests=25) in the
Advanced settings tab of the Job to limit the
maximum number of concurrent requests sent to the MDM server. Limiting
concurrency prevents the Job from consuming all available Tomcat
application server connections, which would otherwise lead to
transaction and deadlock issues.
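
To illustrate the general idea of capping concurrent requests, the sketch below bounds the number of simultaneous simulated requests with a fixed-size thread pool. It does not call an MDM server and is not how the component implements the limit. MAX_CONCURRENT, send_batch and the sleep-based "request" are assumptions made for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 3  # plays the role of the configured request limit

lock = threading.Lock()
stats = {"in_flight": 0, "peak": 0}


def send_batch(batch_id):
    """Simulate one HTTP request and track how many run at once."""
    with lock:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
    time.sleep(0.01)  # stand-in for network and server time
    with lock:
        stats["in_flight"] -= 1
    return batch_id


# The pool size bounds the number of simultaneous requests.
with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
    done = list(pool.map(send_batch, range(12)))

print(len(done), stats["peak"] <= MAX_CONCURRENT)  # 12 True
```

However many batches are queued, no more than MAX_CONCURRENT are in flight at any moment, which is the effect the JVM argument aims for on the server's connection pool.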

Connections

Outgoing links (from this component to another):

Row: Main

Trigger: Run if, On Component Ok,
On Component Error, On Subjob Ok, On Subjob Error

Incoming links (from one component to this one):

Row: Main

Trigger: Run if, On Component Ok,
On Component Error, On Subjob Ok, On Subjob Error

For further information regarding connections, see
Talend Studio User Guide.

Loading records into a business entity

This scenario applies only to Talend MDM Platform and Talend Data Fabric.

This scenario describes a Job that loads records into the ProductFamily business entity defined by a specific data model in the MDM
hub.

Prerequisites:

  • The Product data container: This data
    container is used to separate the product master data domain from the other
    master data domains.

  • The Product data model: This data model is
    used to define the attributes, validation rules, user access rights and
    relationships of the entities of interest. Thus it defines the attributes of the
    ProductFamily business entity.

  • The ProductFamily business entity: This
    business entity contains the Id and Name attributes, both defined by the Product data model.

For further information about how to create a data container, a data
model, and a business entity along with its attributes, see the MDM part of your
Talend Studio User Guide.

The Job in this scenario uses three components.

tMDMBulkLoad_1.png
  • tFixedFlowInput: This component generates
    the records to be loaded into the ProductFamily business
    entity. In a real-life project, your records to be loaded are often voluminous
    and stored in a specific file. However, to simplify the replication of this
    scenario, this Job uses tFixedFlowInput to
    generate four sample records.

  • tWriteXMLField: This component transforms
    the incoming data into XML structure.

  • tMDMBulkLoad: This component writes the
    incoming data into the ProductFamily business entity in
    bulk mode, generating an ID value for each record.

Dropping and linking components

  1. Drop tFixedFlowInput, tWriteXMLField and tMDMBulkLoad
    onto the design workspace.
  2. Connect tFixedFlowInput to tWriteXMLField using the Main link.
  3. Do the same to connect tWriteXMLField to
    tMDMBulkLoad.

Configuring the components

Generating the data records to be loaded into a business entity

  1. Double-click tFixedFlowInput to open its
    Basic settings view.

    tMDMBulkLoad_2.png

  2. Click the […] button next to Edit schema to open the schema editor.

    tMDMBulkLoad_3.png

  3. In the schema editor, click the [+] button to
    add one row.
  4. Name the new column. It is family in this
    example.
  5. Click OK to close the schema editor.
  6. In the Mode area of the Basic settings view, select the Use Inline Table
    option.
  7. Click the [+] button four times to add four
    rows in the table.
  8. In the inline table, click each of the added rows and then enter its value
    between quotes: Shirts, Hats,
    Pets, and Mugs.

Transforming the incoming data into XML structure

  1. Double-click tWriteXMLField to open its
    Basic settings view.

    tMDMBulkLoad_4.png

  2. Click the […] button next to the Edit schema field to open the schema editor and then
    add a row by clicking the [+] button.

    tMDMBulkLoad_5.png

  3. Click the newly added row in the right view of the schema editor and enter the
    name of the output column where you want to write the XML content. It is
    xmlRecord in this example.
  4. Click OK to validate this output schema and
    close the schema editor.

    In the dialog box that pops up, click OK to
    propagate this schema to the following component.
  5. In the Basic settings view, click the
    […] button next to Configure XML Tree to open the dialog box where you can create
    the XML structure.

    tMDMBulkLoad_6.png

  6. In the Link target area, click
    rootTag and rename it to
    ProductFamily, which is the name of the business entity
    used in this scenario.
  7. In the Linker source area, drop
    family to ProductFamily in the
    Link target area.

    A dialog box pops up, asking you to select one operation.
    Select Create as sub-element of target node
    to create a sub-element of the ProductFamily node. Then,
    the family element appears under the
    ProductFamily node.
  8. In the Link target area, click the family node and rename it to Name, which is one of the attributes of the
    ProductFamily business entity.

    Right-click the Name node and select Set As Loop Element
    from the contextual menu.

    Click OK to validate the XML structure you
    defined.
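
The XML structure configured in these steps is a ProductFamily root element with a single Name child for each incoming row. The sketch below builds equivalent records with Python's standard library to show the expected shape. The to_xml_record helper is hypothetical, and the exact serialization produced by tWriteXMLField (for example, an XML declaration or whitespace) may differ.

```python
import xml.etree.ElementTree as ET

# The four sample values entered in tFixedFlowInput.
families = ["Shirts", "Hats", "Pets", "Mugs"]


def to_xml_record(family):
    """Build one ProductFamily record with a single Name element."""
    root = ET.Element("ProductFamily")
    ET.SubElement(root, "Name").text = family
    return ET.tostring(root, encoding="unicode")


xml_records = [to_xml_record(f) for f in families]
print(xml_records[0])
# <ProductFamily><Name>Shirts</Name></ProductFamily>
```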

Writing the incoming data into a business entity

  1. Double-click tMDMBulkLoad to open its
    Basic settings view.

    tMDMBulkLoad_7.png

  2. Select xmlRecord from the XML Field
    drop-down list.
  3. In the URL field, enter the bulk loader URL
    between quotes. For example,
    http://localhost:8180/talendmdm/services/bulkload.
  4. In the Username and Password fields, enter your login and password to connect to the
    MDM server.
  5. In the Data Model and the Data Container fields, enter the names corresponding
    to the data model and the data container you need to use. Both are
    Product for this scenario.

    In the Entity field, enter the name of the
    business entity into which you want to load the records. In this example, enter
    ProductFamily.
  6. Select the Generate ID check box in order to
    generate ID values for the records to be loaded.
  7. In the Commit size field, type in the batch
    size to be written into the MDM hub in bulk mode.
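
As a rough picture of what the commit size controls, the sketch below splits a list of rows into batches of at most a given size. The batches helper and the commit size of 3 are assumptions chosen for illustration. The scenario does not prescribe a value, and the component's internal batching may work differently.

```python
def batches(rows, commit_size):
    """Yield consecutive slices of at most commit_size rows."""
    if commit_size < 1:
        raise ValueError("commit_size must be at least 1")
    for start in range(0, len(rows), commit_size):
        yield rows[start:start + commit_size]


rows = ["Shirts", "Hats", "Pets", "Mugs"]
result = list(batches(rows, 3))  # 3 is an arbitrary example value
print(result)  # [['Shirts', 'Hats', 'Pets'], ['Mugs']]
```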

Saving and executing the Job

  1. Press Ctrl+S to save your Job.
  2. Execute the Job by pressing F6 or clicking
    Run on the Run tab.

    Log in to your Talend MDM Web UI to check the newly
    added records for the ProductFamily business entity.

    tMDMBulkLoad_8.png


Document get from Talend https://help.talend.com