Warning
This component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend solutions with Big Data.
Component family |
Big Data / MongoDB |
|
Function |
tMongoDBBulkLoad reads data from |
|
Purpose |
tMongoDBBulkLoad allows you to |
|
Basic settings |
Schema and Edit schema |
A schema is a row description. It defines the number of fields to be processed and passed on to the next component.
Click Edit schema to make changes to the schema. If the
|
MongoDB directory |
Fill in this field with the MongoDB home directory. |
|
Use local DB path |
Select this check box to provide the information of the local
|
|
Use replica set address |
Select this check box to define a replica set to be
|
|
Server |
Hostname or IP address of the database server. Note that the
Warning: This field is available only when the Use replica set address check box is not |
|
Port |
Listening port of the database server. Note that the default value
Warning: This field is available only when the Use replica set address check box is not |
|
Database |
Type in the name of the database to import data to. |
|
Collection |
Type in the name of the collection to import data to. |
|
Use SSL connection |
Select this check box to enable the SSL encrypted connection. Then you need to use the tSetKeystore component in the
For further information about tSetKeystore, see tSetKeystore.
Note that the SSL connection is available only for MongoDB version 2.4 and later. |
|
Drop collection if exist |
Select this check box to remove the collection if it already |
|
Required authentication |
Select this check box to provide credentials for MongoDB
To enter the password, click the […] button next to the |
|
Data file |
Type in the full path of the file from which the data will be
Warning: Make sure that the data file is in standard format. For |
|
File type |
Select the proper file type from the list. CSV, TSV and JSON are |
|
The JSON file starts with an |
Select this check box to allow tMongoDBBulkLoad to read the JSON files starting
This check box appears when the File |
|
Action on data |
Select the action that you want to perform on the data.
|
|
Upsert fields |
Customize the fields that you want to upsert as needed.
Warning: This table is available when you select Upsert from the Action on |
|
First line is header |
Select this check box to use the first line in CSV or TSV files as
Warning: This check box is available only when you select CSV or TSV |
|
Ignore blanks |
Select this check box to ignore the empty fields in CSV or TSV
Warning: This check box is available only when you select CSV or TSV |
|
Print log |
Select this check box to print logs. |
|
Advanced settings |
Additional arguments |
Complete this table to use the additional arguments as required.
For example, you can use the argument "--jsonArray" to accept the |
tStatCatcher Statistics |
Select this check box to collect the log data at a component |
|
Global Variables |
NB_LINE: the number of rows read by an input component or
ERROR_MESSAGE: the error message generated by the
A Flow variable functions during the execution of a component while an After variable
To fill up a field or expression with a variable, press Ctrl +
For further information about variables, see Talend Studio |
|
Usage |
This component can be used together with the tMongoDBInput component to check if the data is |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User
For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
|
Limitation |
The MongoDB client tool needs to be installed on the machine where |
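The Limitation entry and the MongoDB directory field suggest that the component relies on a locally installed MongoDB client tool. As a minimal sketch, assuming that the tool is the mongoimport executable and that it sits in the bin folder of the MongoDB home directory (both are assumptions, not statements from this page), a pre-flight check could look like the following. The function name find_mongoimport and its parameters are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def find_mongoimport(
    mongo_home: str,
    path_lookup: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a path to mongoimport, or None if it cannot be found.

    Assumption: the tool lives in <mongo_home>/bin; otherwise fall back
    to whatever is found on the PATH.
    """
    bin_dir = Path(mongo_home) / "bin"
    for name in ("mongoimport", "mongoimport.exe"):
        candidate = bin_dir / name
        if candidate.is_file():
            return str(candidate)
    return path_lookup("mongoimport")


if __name__ == "__main__":
    # D:/MongoDB is the home directory used in the scenario on this page.
    found = find_mongoimport("D:/MongoDB")
    print(found or "mongoimport not found; install the MongoDB client tools")
```

Running such a check before the Job starts can make a missing client tool easier to diagnose than a failed import.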
The following scenario describes a Job that first imports data from a CSV file into
the specified MongoDB collection and then reads the collection back to check that the
import succeeded. The Job then imports data from a JSON file with the same data
structure into the same collection and displays the collection again to show that the
data from the JSON file was also imported successfully.
- Drop the following components from the Palette onto the design workspace: two tMongoDBBulkLoad components, two tMongoDBInput components, and two tLogRow components.
- Connect the first tMongoDBBulkLoad to the first tMongoDBInput using a Trigger > OnSubjobOk link.
- Connect the first tMongoDBInput to the first tLogRow using a Row > Main link.
- Repeat the two steps above to connect the second tMongoDBBulkLoad to the second tMongoDBInput, and the second tMongoDBInput to the second tLogRow.
- Connect the first tMongoDBInput to the second tMongoDBBulkLoad using a Trigger > OnSubjobOk link.
- Label the two tLogRow components to better identify the data displayed on the console.
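The Trigger > OnSubjobOk links make each subjob start only after the previous one has finished without error. The sketch below imitates that ordering in plain Python. It is not Talend-generated code, and run_chain and the subjob names are hypothetical labels used only for illustration.

```python
from typing import Callable, List, Tuple


def run_chain(subjobs: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run subjobs in order and return the names of those that completed.

    A failing subjob stops the chain, so the subjobs linked after it
    are never started.
    """
    completed = []
    for name, action in subjobs:
        try:
            action()
        except Exception as exc:
            print(f"{name} failed: {exc}; later subjobs are not triggered")
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    chain = [
        ("import CSV", lambda: print("loading books.csv")),
        ("read and log collection", lambda: print("printing books")),
        ("import JSON", lambda: print("loading books.json")),
        ("read and log collection again", lambda: print("printing books")),
    ]
    print(run_chain(chain))
```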
Importing data from a CSV file
- Double-click the first tMongoDBBulkLoad component to open its Basic settings view in the Component tab.
- In the MongoDB directory field, type in the MongoDB home directory. In this example, it is D:/MongoDB.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the database to import data to, bookstore in this example. In the Collection field, type in the collection to import data to, books in this example.
- Select the Drop collection if exist check box to remove the specified collection if it already exists.
- Browse to the desired data file from which you want to import data. In this example, it is D:/Input/books.csv, which is a standard CSV file containing four columns: id, title, author, and category.
  id,title,author,category
  1,Computer Networks,Larry Peterson,Computer Science
  2,David Copperfield,Charles Dickens,Language&Literature
  3,Life of Pi,Yann Martel,Language&Literature
- Select CSV from the File type list.
- Select Insert from the Action on data list.
- Select the First line is header check box to use the first line in the CSV file as a header. Select the Ignore blanks check box to ignore the blank fields (if any) in the CSV file.
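For readers who want to reproduce this import outside the Studio, the settings above map fairly naturally onto options of MongoDB's mongoimport command-line tool. The following is a minimal sketch that builds such a command. That the component produces an equivalent call is an assumption, and the function build_csv_import_command is hypothetical.

```python
from typing import List


def build_csv_import_command(
    mongo_home: str,
    host: str,
    port: int,
    database: str,
    collection: str,
    data_file: str,
    drop: bool = True,
    header_line: bool = True,
    ignore_blanks: bool = True,
) -> List[str]:
    """Build a mongoimport argument list mirroring the Basic settings."""
    command = [
        f"{mongo_home}/bin/mongoimport",  # assumed location of the tool
        "--host", host,
        "--port", str(port),
        "--db", database,
        "--collection", collection,
        "--type", "csv",
        "--file", data_file,
    ]
    if drop:
        command.append("--drop")          # Drop collection if exist
    if header_line:
        command.append("--headerline")    # First line is header
    if ignore_blanks:
        command.append("--ignoreBlanks")  # Ignore blanks
    return command


if __name__ == "__main__":
    cmd = build_csv_import_command(
        "D:/MongoDB", "localhost", 27017,
        "bookstore", "books", "D:/Input/books.csv",
    )
    print(" ".join(cmd))
    # To run the import (needs the MongoDB tools and a running server):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids quoting problems when a path contains spaces.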
Validating that the CSV file is imported successfully
- Double-click the first tMongoDBInput component to open its Basic settings view in the Component tab.
- From the DB Version list, select the MongoDB version you are using.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the database from which the data will be read, bookstore in this example.
- In the Collection field, type in the collection from which the data will be read, books in this example.
- Click Edit schema to define the data structure to be read from the MongoDB collection.
- In the Mapping table, the Column field is automatically populated with the defined schema. You do not need to fill in the Parent node path column.
- Double-click the first tLogRow component to open its Basic settings view in the Component tab.
- In the Mode area, select Table (print values in cells of a table).
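The same check can be approximated outside the Studio. The sketch below assumes the third-party PyMongo driver is installed and a MongoDB server is listening on localhost:27017. The helper names fetch_books and render_table are hypothetical, and the table layout only loosely resembles the Table mode of tLogRow.

```python
from typing import Dict, Iterable, List

COLUMNS = ["id", "title", "author", "category"]


def fetch_books(host: str = "localhost", port: int = 27017) -> List[Dict]:
    """Read all documents from bookstore.books, leaving out MongoDB's _id."""
    from pymongo import MongoClient  # imported lazily; third-party package

    client = MongoClient(host, port)
    try:
        return list(client["bookstore"]["books"].find({}, {"_id": 0}))
    finally:
        client.close()


def render_table(rows: Iterable[Dict], columns: List[str] = COLUMNS) -> str:
    """Format the rows as a fixed-width table, one line per document."""
    cells = [[str(row.get(col, "")) for col in columns] for row in rows]
    widths = [
        max([len(col)] + [len(line[i]) for line in cells])
        for i, col in enumerate(columns)
    ]

    def fmt(values: List[str]) -> str:
        return "|" + "|".join(v.ljust(w) for v, w in zip(values, widths)) + "|"

    return "\n".join([fmt(columns)] + [fmt(line) for line in cells])


if __name__ == "__main__":
    print(render_table(fetch_books()))
```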
Importing data from a JSON file
- Double-click the second tMongoDBBulkLoad component to open its Basic settings view in the Component tab.
- In the MongoDB directory field, type in the MongoDB home directory. In this example, it is D:/MongoDB.
- In the Server and Port fields, fill in the information required for the connection to MongoDB. In this example, type in localhost and 27017.
- In the Database field, type in the target database to import data to, bookstore in this example. In the Collection field, type in the target collection to import data to, books in this example.
- Browse to the desired data file from which you want to import data. Here, select books.json.
  {
    "id": "4",
    "title": "Les Miserables",
    "author": "Victor Hugo",
    "category": "Language&Literature"
  }
  {
    "id": "5",
    "title": "Advanced Database Systems",
    "author": "Carlo Zaniolo",
    "category": "Database"
  }
- Select JSON from the File type list.
- Select Insert from the Action on data list.
- Click the Advanced settings tab to define the additional arguments as needed. In this example, add the argument "--jsonArray" to accept the imported data within a single JSON array.
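The --jsonArray argument concerns how the documents are laid out in the data file: either as separate JSON documents written one after another, or as the elements of one enclosing JSON array. The sketch below, which uses only the Python standard library, illustrates the difference by parsing the first layout and rewriting it as the second. The function names are hypothetical, and which layout a given books.json actually uses should be checked against the file itself.

```python
import json
from typing import Any, List


def parse_concatenated_json(text: str) -> List[Any]:
    """Parse JSON documents written one after another (no enclosing array)."""
    decoder = json.JSONDecoder()
    documents = []
    position = 0
    while position < len(text):
        # raw_decode does not accept leading whitespace, so skip it here.
        while position < len(text) and text[position].isspace():
            position += 1
        if position == len(text):
            break
        document, position = decoder.raw_decode(text, position)
        documents.append(document)
    return documents


def as_json_array(text: str) -> str:
    """Return the same documents wrapped in a single JSON array."""
    return json.dumps(parse_concatenated_json(text), indent=2)


if __name__ == "__main__":
    sample = (
        '{"id": "4", "title": "Les Miserables"}\n'
        '{"id": "5", "title": "Advanced Database Systems"}\n'
    )
    print(as_json_array(sample))
```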
Validating that the JSON file is imported successfully
- Repeat Step 1 through Step 7 described in the procedure Validating that the CSV file is imported successfully to configure the second tMongoDBInput component.
- Repeat Step 8 through Step 9 described in the procedure Validating that the CSV file is imported successfully to configure the second tLogRow component.
- Press Ctrl + S to save the Job.
- Execute the Job by pressing F6 or clicking Run on the Run tab. The data from the collection books in the MongoDB database bookstore is displayed on the console, which contains the data imported from both the CSV file books.csv and the JSON file books.json.