Warning: This component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend solutions with Big Data.
Component family |
Big Data / MongoDB |
|
Function |
tMongoDBOutput inserts, updates, |
|
Purpose |
This component executes the action defined on the collection in |
|
Basic settings |
Use existing connection |
Select this check box and in the Component List click the |
|
DB Version |
List of the database versions. Available when the Use existing |
|
Use replica set address |
Select this check box to show the Replica In the Replica address table, you Available when the Use existing |
|
Server and Port |
IP address and listening port of the database server. Available when the Use existing |
|
Database |
Name of the database. |
Use SSL connection |
Select this check box to enable the SSL encrypted connection. Then you need to use the tSetKeystore component in the For further information about tSetKeystore, see tSetKeystore. Note that the SSL connection is available only for the version 2.4 + of MongoDB. |
|
|
Required authentication |
Select this check box to enable the database |
|
Username and Password |
DB user authentication data. To enter the password, click the […] button next to the Available when the Required |
|
Collection |
Name of the collection in the MongoDB database. |
|
Drop collection if exist |
Select this check box to drop the collection if it already |
|
Action on data |
The following operations are available: Insert: insert data. Update: update data. Upsert: update data, or insert it if it does not exist. Delete: delete data. |
Schema and Edit schema |
A schema is a row description. It defines the number of fields to be processed and passed on Click Edit schema to make changes to the schema. If the
Click Sync columns to retrieve |
|
|
|
Built-In: You create and store the schema locally for this |
|
|
Repository: You have already created the schema and When the schema to be reused has default values that are integers or functions, ensure that For more details, see https://help.talend.com/display/KB/Verifying+default+values+in+a+retrieved+schema. |
|
Mapping |
Specify the parent node for the column in the MongoDB Not available when the Generate JSON |
Die on error |
This check box is cleared by default, meaning to skip the row on |
|
Advanced settings |
Generate JSON Document |
Select this check box for JSON configuration: Configure JSON Tree: click the Group by: click the [+] button to add lines and choose the Remove root node: select this Data node and Query node (available for update and Warning: These nodes are mandatory for update and upsert actions. They are intended to enable the update and upsert actions but are not stored in the database. |
tStatCatcher Statistics |
Select this check box to collect the log data at the component |
|
Global Variables |
NB_LINE: the number of rows read by an input component or ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see Talend Studio |
|
Usage |
tMongoDBOutput executes the |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
|
Limitation |
Note
|
This scenario creates the collection blog and
writes post data to it.
- Drop tMongoDBConnection, tFixedFlowInput, tMongoDBOutput, tMongoDBClose, tMongoDBInput and tLogRow onto the workspace.
- Rename tFixedFlowInput as blog_post_data, tMongoDBOutput as write_data_to_collection, tMongoDBInput as read_data_from_collection and tLogRow as show_data_from_collection.
- Link tMongoDBConnection to tFixedFlowInput using the OnSubjobOk trigger.
- Link tFixedFlowInput to tMongoDBOutput using a Row > Main connection.
- Link tFixedFlowInput to tMongoDBInput using the OnSubjobOk trigger.
- Link tMongoDBInput to tMongoDBClose using the OnSubjobOk trigger.
- Link tMongoDBInput to tLogRow using a Row > Main connection.
- Double-click tMongoDBConnection to open its Basic settings view.
- From the DB Version list, select the MongoDB version you are using.
- In the Server and Port fields, enter the connection details.
  In the Database field, enter the name of the MongoDB database.
- Double-click tFixedFlowInput to open its Basic settings view.
  Select Use Inline Content (delimited file) in the Mode area.
  In the Content field, enter the data to write to the MongoDB database, for example:
      1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...
      3;Andy;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...
      2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...
- Double-click tMongoDBOutput to open its Basic settings view.
  Select the Use existing connection and Drop collection if exist check boxes.
  In the Collection field, enter the name of the collection, namely blog.
- Click the […] button next to Edit schema to open the schema editor.
- Click the [+] button to add five columns in the right part, namely id, author, title, keywords and contents, with id typed as Integer and the other columns as String.
  Click to copy all the columns to the input table.
  Click OK to close the editor.
- The columns now appear in the left part of the Mapping area.
  For the columns author, title, keywords and contents, enter post as their parent node. These nodes then reside under the node post in the MongoDB collection.
- Double-click tMongoDBInput to open its Basic settings view.
  Select the Use existing connection check box.
  In the Collection field, enter the name of the collection, namely blog.
- Click the […] button next to Edit schema to open the schema editor.
- Click the [+] button to add five columns, namely id, author, title, keywords and contents, with id typed as Integer and the other columns as String.
  Click OK to close the editor.
- The columns now appear in the left part of the Mapping area.
  For the columns author, title, keywords and contents, enter post as their parent node so that the data can be retrieved from the correct positions.
- In the Sort by area, click the [+] button to add one line and enter id under Column.
  Select asc from the Order asc or desc? column to the right of the id column. The retrieved records then appear in ascending order of the id column.
- Press Ctrl+S to save the Job.
- Press F6 to run the Job.
- Switch to the database talend and read data from the collection blog in the MongoDB command line client. You can see that author, title, keywords and contents all reside under the node post, and that the records are stored in the same order as the source input.
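To make the effect of the parent node mapping concrete, the following is a minimal Python sketch, not Talend-generated code. It turns the delimited rows from this scenario into documents in which author, title, keywords and contents are nested under post, then sorts them by id as the Sort by setting does. The names COLUMNS, PARENT_NODE and row_to_document are assumptions made for illustration, and the documents actually written by tMongoDBOutput may differ in detail (for example, MongoDB adds its own _id field).

```python
# Column order of the schema used in this scenario.
COLUMNS = ["id", "author", "title", "keywords", "contents"]

# Parent node per column, as entered in the Mapping area (id has none).
PARENT_NODE = {
    "author": "post",
    "title": "post",
    "keywords": "post",
    "contents": "post",
}


def row_to_document(line, separator=";"):
    """Build a nested document from one delimited input row (illustrative only)."""
    values = line.split(separator)
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    document = {}
    for column, value in zip(COLUMNS, values):
        if column == "id":
            value = int(value)  # id is typed as Integer in the schema
        parent = PARENT_NODE.get(column)
        if parent is None:
            document[column] = value
        else:
            # Create the parent node on first use, then nest the column under it.
            document.setdefault(parent, {})[column] = value
    return document


rows = [
    "1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...",
    "3;Andy;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...",
    "2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...",
]

# Documents keep the input order, as observed in the collection.
documents = [row_to_document(row) for row in rows]

# Reading back with "Sort by id, asc" corresponds to sorting on the id field.
sorted_documents = sorted(documents, key=lambda doc: doc["id"])
```

In this model, documents keeps the ids in input order (1, 3, 2), while sorted_documents lists them in ascending order, which mirrors the difference between what is stored and what tMongoDBInput retrieves with the Sort by setting.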
This scenario upserts the collection blog: an existing record has its author changed and a new record is added. Before the upsert, the collection blog looks like this:
    1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...
    2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...
    3;Andy;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...
Such records can be inserted into the database by following the instructions of Scenario 1: Creating a collection and writing data to it.
- Drop tMongoDBConnection, tFixedFlowInput, tMongoDBOutput, tMongoDBClose, tMongoDBInput and tLogRow from the Palette onto the design workspace.
- Rename tFixedFlowInput as blog_post_data, tMongoDBOutput as write_data_to_collection, tMongoDBInput as read_data_from_collection and tLogRow as show_data_from_collection.
- Link tMongoDBConnection to tFixedFlowInput using the OnSubjobOk trigger.
- Link tFixedFlowInput to tMongoDBOutput using a Row > Main connection.
- Link tFixedFlowInput to tMongoDBInput using the OnSubjobOk trigger.
- Link tMongoDBInput to tMongoDBClose using the OnSubjobOk trigger.
- Link tMongoDBInput to tLogRow using a Row > Main connection.
- Double-click tMongoDBConnection to open its Basic settings view.
- From the DB Version list, select the MongoDB version you are using.
- In the Server and Port fields, enter the connection details.
  In the Database field, enter the name of the MongoDB database.
- Double-click tFixedFlowInput to open its Basic settings view.
  Select Use Inline Content (delimited file) in the Mode area.
  In the Content field, enter the data for upserting the MongoDB database, for example:
      1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...
      2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...
      3;Anderson;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...
      4;Andy;Big Data Bang;Big Data,Talend;Talend, the driving force for Big Data applications...
  As shown above, the 3rd record has its author changed and the 4th record is new.
- Double-click tMongoDBOutput to open its Basic settings view.
  Select the Use existing connection and Die on error check boxes.
  In the Collection field, enter the name of the collection, namely blog.
  Select Upsert from the Action on data list.
- Click the […] button next to Edit schema to open the schema editor.
- Click the [+] button to add five columns in the right part, namely id, author, title, keywords and contents, with id typed as Integer and the other columns as String.
  Click to copy all the columns to the input table.
  Click OK to close the editor.
- In the Advanced settings view, select the Generate JSON Document check box.
  Select the Remove root node check box.
  In the Data node and Query node fields, enter “data” and “query”.
- Click the […] button next to Configure JSON Tree to open the configuration interface.
- Right-click the node rootTag and select Add Sub-element from the contextual menu.
  In the dialog box that appears, type in data for the Data node.
  Click OK to close the window.
  Repeat this operation to define query as the Query node.
  Right-click the node data and select Set As Loop Element from the contextual menu.
  Warning: These nodes are mandatory for update and upsert actions. They are intended to enable the update and upsert actions but are not stored in the database.
- Select all the columns under the Schema list and drop them onto the data node.
  In the window that appears, select Create as sub-element of target node.
  Click OK to close the window.
  Repeat this operation to drop the id column from the Schema list under the Query node.
- Right-click the node id under data and select Add Attribute from the contextual menu.
  In the dialog box that appears, type in type as the attribute name.
  Click OK to close the window.
  Right-click the node @type under id and select Set A Fix Value from the contextual menu.
  In the dialog box that appears, type in integer as the attribute value, which ensures that the id values are stored as integers in the database.
  Click OK to close the window.
  Repeat this operation to set this attribute for the id node under Query.
  Click OK to close the JSON Tree configuration interface.
- Double-click tMongoDBInput to open its Basic settings view.
  Select the Use existing connection check box.
  In the Collection field, enter the name of the collection, namely blog.
  Click the […] button next to Edit schema to open the schema editor.
  Click the [+] button to add five columns, namely id, author, title, keywords and contents, with id typed as Integer and the other columns as String.
  Click OK to close the editor.
  The columns now appear in the left part of the Mapping area.
  For the columns author, title, keywords and contents, enter post as their parent node so that the data can be retrieved from the correct positions.
- Double-click tLogRow to open its Basic settings view.
  In the Mode area, select Table (print values in cells of a table) for better display.
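The role of the Data node and Query node in an upsert can be pictured with a small in-memory model. The sketch below is plain Python, not Talend-generated code and not MongoDB driver code: the helper upsert and the list used as a collection are hypothetical, and the records are shortened to id, author and title for brevity. Each incoming row is wrapped in a query part (the id), which selects the record to change, and a data part, which holds the values to store. Neither wrapper ends up in the stored record.

```python
def upsert(collection, query, data):
    """Replace the first record matching every key in query, else append data.

    Hypothetical helper that models upsert semantics on a list of dicts.
    """
    for index, record in enumerate(collection):
        if all(record.get(key) == value for key, value in query.items()):
            collection[index] = dict(data)
            return "updated"
    collection.append(dict(data))
    return "inserted"


# State of the collection before the upsert (shortened records).
blog = [
    {"id": 1, "author": "Andy", "title": "Open Source Outlook"},
    {"id": 2, "author": "Andy", "title": "Data Integration Overview"},
    {"id": 3, "author": "Andy", "title": "ELT Overview"},
]

# Incoming rows: record 3 has a new author, record 4 is new.
incoming = [
    "1;Andy;Open Source Outlook",
    "2;Andy;Data Integration Overview",
    "3;Anderson;ELT Overview",
    "4;Andy;Big Data Bang",
]

results = []
for line in incoming:
    raw_id, author, title = line.split(";")
    record_id = int(raw_id)  # plays the role of the type="integer" attribute
    envelope = {
        "query": {"id": record_id},
        "data": {"id": record_id, "author": author, "title": title},
    }
    results.append(upsert(blog, envelope["query"], envelope["data"]))
```

In this model, a query on the string "3" would not match the stored integer 3, so the row would be appended instead of updating the existing record. This is one reason to keep the id type consistent between the data and query parts.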