tMongoDBBulkLoad
Imports data files in CSV, TSV, or JSON format into the specified MongoDB database so that the data can be further processed.
tMongoDBBulkLoad Standard properties
These properties are used to configure tMongoDBBulkLoad running in the Standard Job framework.
The Standard tMongoDBBulkLoad component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data
and in Talend Data Fabric.
Basic settings
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
MongoDB directory |
Fill in this field with the MongoDB home directory. |
Use local DB path |
Select this check box to provide the information of the local database that you want to
|
Use replica set address |
Select this check box to define a replica set to be connected.
|
Server |
Hostname or IP address of the database server. Note that the default This field is available only when the Use replica set address check box is not |
Port |
Listening port of the database server. Note that the default value This field is available only when the Use replica set address check box is not |
Database |
Type in the name of the database to import data to. |
Collection |
Type in the name of the collection to import data to. |
Use SSL connection |
Select this check box to enable the SSL or TLS encrypted connection. Then you need to use the tSetKeystore Note that the SSL connection is available only for the version 2.4 + of |
Drop collection if exist |
Select this check box to remove the collection if it already |
Required authentication |
Select this check box to enable the database authentication. Among the mechanisms listed on the Authentication mechanism For details about the other mechanisms in this list, see MongoDB Authentication from the MongoDB |
Set Authentication database |
If the username to be used to connect to MongoDB has been created in a specific For further information about the MongoDB Authentication database, see User Authentication database. |
Username and Password |
DB user authentication data. To enter the password, click the […] button next to the Available when the Required If the security system you have selected from the Authentication mechanism drop-down list is Kerberos, you need to |
Data file |
Type in the full path of the file from which the data will be imported Make sure that the data file is in standard format. For |
File type |
Select the proper file type from the list. CSV, TSV and JSON are |
The JSON file starts with an |
Select this check box to allow tMongoDBBulkLoad to read the JSON files starting with an This check box appears when the File |
Action on data |
Select the action that you want to perform on the data.
|
Upsert fields |
Customize the fields that you want to upsert as needed. This table is available when you select Upsert from the Action on data list. |
First line is header |
Select this check box to use the first line in CSV or TSV files as a This check box is available only when you select CSV or |
Ignore blanks |
Select this check box to ignore the empty fields in CSV or TSV This check box is available only when you select CSV or |
Print log |
Select this check box to print logs. |
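The effect of the First line is header and Ignore blanks options can be pictured with a small Python sketch. It is only an approximation of the behavior described in the table, not the component's actual implementation: the function name `rows_to_documents` is hypothetical, the second sample row has its author removed on purpose to show a blank field, and all values stay strings here, whereas the real import may infer other types.

```python
import csv
import io


def rows_to_documents(csv_text, ignore_blanks=True):
    """Turn CSV text with a header line into a list of dict 'documents'."""
    # The first line supplies the field names (First line is header).
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        if ignore_blanks:
            # Drop fields whose value is empty instead of storing "" (Ignore blanks).
            row = {name: value for name, value in row.items() if value != ""}
        documents.append(dict(row))
    return documents


# Illustrative data: the author of the second row is left blank on purpose.
sample = (
    "id,title,author,category\n"
    "1,Computer Networks,Larry Peterson,Computer Science\n"
    "2,David Copperfield,,Language&Literature\n"
)

docs = rows_to_documents(sample)
for doc in docs:
    print(doc)
```

With `ignore_blanks=False`, the second document would keep an `author` field holding an empty string instead of omitting it.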
Advanced settings
Additional arguments |
Complete this table to use the additional arguments as required. For example, you can use the argument "--jsonArray" to accept the imported data within a single JSON array. |
tStatCatcher Statistics |
Select this check box to collect the log data at a component level. |
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
This component can be used together with the tMongoDBInput component to check if the data is imported |
Limitation |
The MongoDB client tool needs to be installed on the machine where |
Importing data into MongoDB database
This scenario applies only to Talend products with Big Data.
The following scenario describes a Job that first imports data from a CSV file into
the specified MongoDB collection and reads the collection back to check that the
import succeeded. The Job then imports data from a JSON file with the same data
structure into the same collection and displays the collection again to show that
the data from the JSON file was also imported successfully.
Dropping and linking the components
- Drop the following components from the Palette onto the design workspace: two tMongoDBBulkLoad components, two tMongoDBInput components, and two tLogRow components.
- Connect the first tMongoDBBulkLoad to the first tMongoDBInput using a Trigger > OnSubjobOk link.
- Connect the first tMongoDBInput to the first tLogRow using a Row > Main link.
- Repeat the two steps above to connect the second tMongoDBBulkLoad to the second tMongoDBInput, and the second tMongoDBInput to the second tLogRow.
- Connect the first tMongoDBInput to the second tMongoDBBulkLoad using a Trigger > OnSubjobOk link.
- Label the two tLogRow components to better identify the data displayed on the console.
Configuring the components
Importing data from a CSV file
- Double-click the first tMongoDBBulkLoad component to open its Basic settings view in the Component tab.
- In the MongoDB directory field, type in the MongoDB home directory. In this example, it is D:/MongoDB.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the database to import data to, bookstore in this example.
- In the Collection field, type in the collection to import data to, books in this example.
- Select the Drop collection if exist check box to remove the specified collection if it already exists.
- Browse to the desired data file from which you want to import data. In this example, it is D:/Input/books.csv, which is a standard CSV file containing four columns: id, title, author, and category.
  id,title,author,category
  1,Computer Networks,Larry Peterson,Computer Science
  2,David Copperfield,Charles Dickens,Language&Literature
  3,Life of Pi,Yann Martel,Language&Literature
- Select CSV from the File type list.
- Select Insert from the Action on data list.
- Select the First line is header check box to use the first line in the CSV file as a header.
- Select the Ignore blanks check box to ignore the blank fields (if any) in the CSV file.
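For orientation, the settings above can be read as the arguments of a command-line import. The sketch below assumes that the MongoDB client tool mentioned under Limitation is `mongoimport` and that it sits in a `bin` folder under the MongoDB home directory; both points are assumptions, and the command the component really generates may differ. The helper name `build_mongoimport_args` is hypothetical. The flags used are standard `mongoimport` options.

```python
def build_mongoimport_args(mongo_home, host, port, db, collection, data_file,
                           file_type="csv", drop=False, header_line=False,
                           ignore_blanks=False, extra_args=()):
    """Build an argument list for mongoimport from component-style settings."""
    args = [
        f"{mongo_home}/bin/mongoimport",  # assumed location of the client tool
        "--host", host,
        "--port", str(port),
        "--db", db,
        "--collection", collection,
        "--type", file_type,
        "--file", data_file,
    ]
    if drop:
        args.append("--drop")          # Drop collection if exist
    if header_line:
        args.append("--headerline")    # First line is header
    if ignore_blanks:
        args.append("--ignoreBlanks")  # Ignore blanks
    args.extend(extra_args)            # Additional arguments
    return args


args = build_mongoimport_args(
    "D:/MongoDB", "localhost", 27017, "bookstore", "books",
    "D:/Input/books.csv",
    drop=True, header_line=True, ignore_blanks=True,
)
print(" ".join(args))
```

To actually run the import outside the Studio, the list could be passed to `subprocess.run(args, check=True)` on a machine where the tool is installed.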
Validating that the CSV file is imported successfully
- Double-click the first tMongoDBInput component to open its Basic settings view in the Component tab.
- From the DB Version list, select the MongoDB version you are using.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the database from which the data will be read, bookstore in this example.
- In the Collection field, type in the collection from which the data will be read, books in this example.
- Click Edit schema to define the data structure to be read from the MongoDB collection.
- In the Mapping table, the Column field is automatically populated with the defined schema. You do not need to fill in the Parent node path column.
- Double-click the first tLogRow component to open its Basic settings view in the Component tab.
- In the Mode area, select Table (print values in cells of a table).
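The check performed visually on the console can also be expressed as a comparison between the rows of the source file and the documents read back. The following is a minimal sketch on plain Python dictionaries; the function name `missing_documents` and the `_id` placeholder values are hypothetical, and comparing on the title field only is a simplification chosen for this example.

```python
def missing_documents(expected, fetched, key="title"):
    """Return the expected documents whose key value is absent from fetched."""
    seen = {doc.get(key) for doc in fetched}
    return [doc for doc in expected if doc.get(key) not in seen]


# Rows of books.csv, as listed in the scenario.
expected = [
    {"id": "1", "title": "Computer Networks", "author": "Larry Peterson"},
    {"id": "2", "title": "David Copperfield", "author": "Charles Dickens"},
    {"id": "3", "title": "Life of Pi", "author": "Yann Martel"},
]

# Stand-in for the documents read back from the collection. MongoDB adds
# an _id field to each document; the values here are placeholders.
fetched = [
    {"_id": "placeholder-1", "id": 1, "title": "Computer Networks"},
    {"_id": "placeholder-2", "id": 2, "title": "David Copperfield"},
    {"_id": "placeholder-3", "id": 3, "title": "Life of Pi"},
]

missing = missing_documents(expected, fetched)
print("all rows imported" if not missing else f"missing: {missing}")
```

An empty result means every row of the file has a counterpart in the collection.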
Importing data from a JSON file
- Double-click the second tMongoDBBulkLoad component to open its Basic settings view in the Component tab.
- In the MongoDB directory field, type in the MongoDB home directory. In this example, it is D:/MongoDB.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the target database to import data to, bookstore in this example.
- In the Collection field, type in the target collection to import data to, books in this example.
- Browse to the desired data file from which you want to import data. Here, select books.json.
  {
  "id": "4",
  "title": "Les Miserables",
  "author": "Victor Hugo",
  "category": "Language&Literature"
  }
  {
  "id": "5",
  "title": "Advanced Database Systems",
  "author": "Carlo Zaniolo",
  "category": "Database"
  }
- Select JSON from the File type list.
- Select Insert from the Action on data list.
- Click the Advanced settings tab to define the additional arguments as needed. In this example, add the argument "--jsonArray" to accept the imported data within a single JSON array.
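A JSON data file can hold its documents either as a sequence of top-level objects or as one JSON array, and the "--jsonArray" argument concerns the second layout. The sketch below shows, under that reading, how the two layouts relate: it parses either form and rewrites the documents as a single array. The function name `read_json_documents` is hypothetical and the code is not part of the component.

```python
import json


def read_json_documents(text):
    """Parse either one JSON array or a sequence of top-level JSON objects."""
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)  # already a single JSON array
    decoder = json.JSONDecoder()
    documents, pos = [], 0
    while pos < len(text):
        # raw_decode returns the parsed value and the index where it ended.
        doc, pos = decoder.raw_decode(text, pos)
        documents.append(doc)
        # Skip whitespace between consecutive objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return documents


sample = """
{"id": "4", "title": "Les Miserables", "author": "Victor Hugo",
 "category": "Language&Literature"}
{"id": "5", "title": "Advanced Database Systems", "author": "Carlo Zaniolo",
 "category": "Database"}
"""

documents = read_json_documents(sample)
as_array = json.dumps(documents, indent=2)  # the same data as one JSON array
print(as_array)
```

Parsing `as_array` again with the same function returns the same two documents, so the helper accepts both layouts.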
Validating that the JSON file is imported successfully
- Repeat Step 1 through Step 7 described in the procedure Validating that the CSV file is imported successfully to configure the second tMongoDBInput component.
- Repeat Step 8 through Step 9 described in the procedure Validating that the CSV file is imported successfully to configure the second tLogRow component.
Saving and executing the Job
- Press Ctrl + S to save the Job.
- Execute the Job by pressing F6 or clicking Run on the Run tab.
The console displays the data from the books collection in the MongoDB database
bookstore, including the data imported from both the CSV file books.csv and the
JSON file books.json.