tMysqlOutputBulk

Writes a file with columns based on the defined delimiter and the MySQL or Aurora
standards.

The tMysqlOutputBulk and
tMysqlBulkExec components are used together in a two-step
process. In the first step, an output file is generated. In the second step, this file
is used in the INSERT statement that feeds the database. These two steps are fused
together in the tMysqlOutputBulkExec component, detailed in a
separate section. The advantage of using two separate steps is that the data can be
transformed before it is loaded into the database.

tMysqlOutputBulk Standard properties

These properties are used to configure tMysqlOutputBulk running in the Standard Job framework.

The Standard tMysqlOutputBulk component belongs to the Databases family.

The component in this framework is available in all Talend products.

Note: This component is a specific version of a dynamic database
connector. The properties related to database settings vary depending on your database
type selection. For more information about dynamic database connectors, see Dynamic database components.

Basic settings

Database	Select a type of database from the list and click Apply.
Property type	Either Built-in or Repository.
	Built-in: No property data stored centrally.
	Repository: Select the repository file in which the properties are stored. The fields that follow are completed automatically using the data retrieved.
File Name	Name of the file to be generated. This file is generated on the same machine where the Studio is installed or where the Job using tMysqlOutputBulk is deployed.
Append	Select this check box to add the new rows at the end of the file.
Schema and Edit Schema	A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word `line` when naming the fields.
	Built-In: You create and store the schema locally for this component only.
	Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. If they are, you must remove the quotation marks manually. You can find more details about how to verify default values in retrieved schema in Talend Help Center (https://help.talend.com).
	Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available: View schema: choose this option to view the schema only. Change to built-in property: choose this option to change the schema to Built-in for local changes. Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the Repository Content window.
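
The effect of the Append check box can be pictured with a short Java sketch. It only illustrates append versus overwrite semantics with the standard java.nio.file API and is not the code that Talend generates. The class name BulkFileWriter and the method writeRows are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper, not part of Talend: illustrates append vs. overwrite.
class BulkFileWriter {

    // Writes the given rows to the file, one per line. When append is true,
    // the rows are added after the existing content; otherwise the file is
    // emptied first. The file is created if it does not exist yet.
    static void writeRows(Path file, List<String> rows, boolean append) {
        StandardOpenOption mode = append
                ? StandardOpenOption.APPEND
                : StandardOpenOption.TRUNCATE_EXISTING;
        try {
            Files.write(file, rows, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE, mode);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling writeRows twice with append set to true leaves both batches in the file, while a call with append set to false keeps only the latest batch.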

Advanced settings

Row separator	String (ex: "\n" on Unix) to distinguish rows.
Field separator	Character, string or regular expression to separate fields.
Text enclosure	Character used to enclose the text.
Create directory if does not exist	This check box is selected by default. It creates a directory to hold the output table if required.
Custom the flush buffer size	Customize the amount of memory used to temporarily store output data. In the Row number field, enter the number of rows after which the memory is to be freed again.
Records contain NULL value	This check box is selected by default. It allows you to take account of NULL value fields. If you clear the check box, the NULL values will automatically be replaced with empty values.
Check disk space	Select this check box to throw an exception during execution if the disk is full.
Encoding	Select the encoding from the list or select Custom and define it manually. This field is compulsory for DB data handling.
tStatCatcher Statistics	Select this check box to collect the log data at the component level.
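
How the row separator, field separator, text enclosure and NULL settings combine into one output line can be sketched in Java as follows. This is a simplified illustration, not Talend's generated code. The class BulkRowFormatter and its methods are hypothetical, and writing NULL as \N is an assumption based on the marker that MySQL's LOAD DATA statement reads as NULL by default.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not Talend's generated code: shows how the row
// separator, field separator, text enclosure and NULL handling interact.
class BulkRowFormatter {

    // Assumption: a kept NULL is written as \N (backslash + N), which MySQL's
    // LOAD DATA reads as NULL by default. With keepNulls false, a NULL
    // becomes an empty value. Only String values are enclosed; enclosure
    // characters inside a value are not escaped in this simplified sketch.
    static String formatField(Object value, String enclosure, boolean keepNulls) {
        if (value == null) {
            return keepNulls ? "\\N" : "";
        }
        if (value instanceof String) {
            return enclosure + value + enclosure;
        }
        return String.valueOf(value);
    }

    // Joins the formatted fields with the field separator and ends the
    // line with the row separator.
    static String formatRow(List<Object> values, String fieldSep,
                            String enclosure, String rowSep, boolean keepNulls) {
        return values.stream()
                .map(v -> formatField(v, enclosure, keepNulls))
                .collect(Collectors.joining(fieldSep)) + rowSep;
    }
}
```

For instance, the values 1, "Doe" and null formatted with ";" as field separator, a double quote as text enclosure and "\n" as row separator give the line 1;"Doe";\N when NULL values are kept, and 1;"Doe"; when they are replaced with empty values.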

Global Variables

Global Variables	NB_LINE: the number of rows processed. This is an After variable and it returns an integer.
	ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.
	A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.
	To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.
	For further information about variables, see Talend Studio User Guide.
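
As an illustration of how these After variables are typically read in Java code (for example in a tJava component placed after this one), the sketch below uses a plain java.util.Map as a stand-in for the Job's globalMap. The component label tMysqlOutputBulk_1 and the class GlobalVariableDemo are assumptions made for the example. The actual key prefix depends on the component name shown in your Job.

```java
import java.util.Map;

// Stand-in demo: in a real Job, globalMap is provided by the generated code.
// The label tMysqlOutputBulk_1 is an assumed component name.
class GlobalVariableDemo {

    // Reads the two After variables and turns them into a short message.
    static String summarize(Map<String, Object> globalMap) {
        Integer nbLine = (Integer) globalMap.get("tMysqlOutputBulk_1_NB_LINE");
        String error = (String) globalMap.get("tMysqlOutputBulk_1_ERROR_MESSAGE");
        if (error != null) {
            return "Failed: " + error;
        }
        // NB_LINE may be absent if the component has not finished yet.
        return "Rows written: " + (nbLine == null ? 0 : nbLine);
    }
}
```

Because both are After variables, reading them only makes sense in a subjob that runs once the component has completed.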


Usage

Usage rule	This component is to be used along with the tMysqlBulkExec component. Used together, they offer gains in performance while feeding a MySQL or Aurora database.
Component family	Databases/MySQL
Limitation	Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find out and add all missing JARs easily on the Modules tab in the Integration perspective of your studio. You can find more details about how to install external modules in Talend Help Center (https://help.talend.com).

Inserting transformed data in MySQL database

This scenario describes a four-component Job that feeds a database
with data contained in a file, including transformed data. The Job requires two steps:
the first step creates the file, which is then used in the second step. The first step
includes a transformation phase for the data written to the file.

Dropping and linking components

Drag and drop a tRowGenerator, a
tMap, a tMysqlOutputBulk and a tMysqlBulkExec component.
Connect the main flow using Row > Main
links.
Then connect the start component (tRowGenerator in this example) to the tMysqlBulkExec component using a trigger connection of the OnComponentOk type.

Configuring the components

A tRowGenerator is used to generate
random data. Double-click the tRowGenerator
component to launch the editor.
Define the schema of the rows to be generated and the nature of the data to
generate. In this example, the clients file to be
produced will contain the following columns: ID,
First Name, Last Name,
Address, City. All are
defined as string data except ID, which is of integer type.

Some schema information does not necessarily need to be displayed. To hide
it, click the Columns list button
next to the toolbar, and clear the relevant entries, such as Precision or Parameters.

Use the plus button to add as many columns as needed to your schema
definition.

Click the Refresh button to preview the first generated row of your
output.
Then select the tMap component to set the
transformation.
Drag and drop all columns from the input table to the output table.
Apply the transformation on the LastName column by
adding .toUpperCase() in its expression field.

Then, click OK to validate the
transformation.
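
The transformation entered in the tMap expression field is plain Java. The sketch below reproduces it outside the Studio, under the assumption that the input connection is named row1 (the real name depends on your Job). The Row class and the method transformLastName are hypothetical stand-ins, and the null guard is an addition that the scenario itself does not require.

```java
// Hypothetical stand-in for the tMap input row; in the Job, the row name
// (row1 here) depends on the label of the incoming connection.
class TMapExpressionDemo {

    static class Row {
        String LastName;
    }

    // Equivalent of the tMap expression row1.LastName.toUpperCase(),
    // with a null guard added so that a missing last name does not
    // raise a NullPointerException.
    static String transformLastName(Row row1) {
        return row1.LastName == null ? null : row1.LastName.toUpperCase();
    }
}
```

In the expression field itself, the guarded form would read row1.LastName == null ? null : row1.LastName.toUpperCase().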
Double-click on the tMysqlOutputBulk
component.
Define the name of the file to be produced in the File
Name field. If the delimited file information is stored in
the Repository, select it in the Property Type field to retrieve the relevant data.
In this use case, the file name is clients.txt.

The schema is propagated from the tMap
component, if you accepted it when prompted.
In this example, don’t include the header information as the table should
already contain it.
Click OK to validate the output.
Then double-click on the tMysqlBulkExec
component to set the INSERT query to be executed.
Define the database connection details. We recommend that you store
this type of information in the Repository, so that you can retrieve it at any time
for any Job.
Set the table to be filled in with the collected data, in the Table field.
Fill in the column delimiters in the Field
terminated by area.
Make sure the encoding corresponds to the data encoding.
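
To make the role of these settings more concrete, the following Java sketch builds a MySQL LOAD DATA statement from a file name, a table name, the delimiters and the encoding. It assumes that the bulk load relies on MySQL's LOAD DATA syntax. The SQL that the component actually generates may differ, and the class LoadDataStatement with its methods is hypothetical.

```java
// Hypothetical helper: builds a MySQL LOAD DATA statement from the settings
// described above. Assumption: the bulk-load step comes down to a statement
// of this shape; the SQL Talend actually generates may differ.
class LoadDataStatement {

    // The table name and character set are inserted as-is, so they must
    // come from trusted configuration, never from user input.
    static String build(String file, String table, String fieldSep,
                        String enclosure, String rowSep, String charset) {
        return "LOAD DATA LOCAL INFILE '" + escape(file) + "'"
                + " INTO TABLE `" + table + "`"
                + " CHARACTER SET " + charset
                + " FIELDS TERMINATED BY '" + escape(fieldSep) + "'"
                + " ENCLOSED BY '" + escape(enclosure) + "'"
                + " LINES TERMINATED BY '" + escape(rowSep) + "'";
    }

    // Escapes backslashes, single quotes and newlines so that the value
    // can be placed inside a single-quoted SQL string literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("'", "\\'")
                .replace("\n", "\\n");
    }
}
```

The delimiters passed here have to match the ones used when the file was written, and the character set has to match the file encoding. Otherwise the rows are split incorrectly or the text is loaded with corrupted characters.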

Saving and executing the Job

Press Ctrl+S to save your Job.
Press F6 or click Run on the Run tab to
execute the Job.

The clients database table is filled with data from
the file, including the upper-case last names as transformed
in the Job.

For simple Insert operations that don’t include any transformations, the use of
tMysqlOutputBulkExec allows you to skip a step
in the process and thus improves performance.

Source: Talend documentation, https://help.talend.com

Docs 7.x
