tMapRDBOutput
Writes columns of data into a given MapRDB database.
tMapRDBOutput receives data from its preceding component, creates a table in a given MapRDB database and writes the received data into this table.
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Standard: see tMapRDBOutput Standard properties. The component in this framework is available in all Talend products with Big Data and in Talend Data Fabric.
- MapReduce: see tMapRDBOutput MapReduce properties (deprecated). The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Batch: see tMapRDBOutput properties for Apache Spark Batch. The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Streaming: see tMapRDBOutput properties for Apache Spark Streaming. This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
tMapRDBOutput Standard properties
These properties are used to configure tMapRDBOutput running in the Standard Job framework.
The Standard tMapRDBOutput component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data and in Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally.
Repository: Select the repository file where the |
Use an existing connection |
Select this check box and in the Component List click the relevant connection component to |
Distribution and |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not
|
Hadoop version of the |
This list is displayed only when you have selected Custom from the distribution list to connect to a cluster not yet |
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are |
Use kerberos authentication |
If the database to be used is running with Kerberos security, select this check box, then enter the principal names in the displayed fields. You should be able to find the information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains Note that the user that executes a keytab-enabled Job is not necessarily |
Schema and Edit |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the When the schema to be reused has default values that are You can find more details about how to |
Set table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Table name |
Type in the name of the HBase table you need to create. |
Action on table |
Select the action you need to take for creating a table. |
Custom Row Key |
Select this check box to use the customized row keys. Once selected, the corresponding For example, you can type in |
Families |
Complete this table to specify the column or columns to be created |
Die on error |
This check box is cleared by default, meaning to skip the row on |
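The Custom Row Key setting above lets you define how row keys are built. As a rough illustration only, the following plain Java sketch derives a composite row key from two input columns. The column names `id` and `name`, the `_` separator and the helper names are hypothetical, and this is not code generated by the component.

```java
import java.nio.charset.StandardCharsets;

class RowKeySketch {
    // Joins two column values into one row key string.
    // The "_" separator is an arbitrary choice for this sketch.
    static String buildRowKey(String id, String name) {
        return id + "_" + name;
    }

    // HBase-style stores handle row keys as byte arrays,
    // so the string is encoded as UTF-8 here.
    static byte[] toBytes(String rowKey) {
        return rowKey.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = buildRowKey("42", "alice");
        System.out.println(key + " (" + toBytes(key).length + " bytes)");
    }
}
```

A composite key like this keeps rows that share the same leading value next to each other, which is one common reason for customizing row keys.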
Advanced settings
Use batch mode |
Select this check box to activate the batch mode for data processing. |
Batch size |
Specify the number of records to be processed in each batch. This field appears only when the Use batch mode |
Properties |
If you need to use custom configuration for your database, complete this table with the For example, you need to define the value of the dfs.replication property as 1 for the Note:
This table is not available when you are using an existing |
tStatCatcher Statistics |
Select this check box to collect log data at the component |
Family parameters |
Type in the names and, when needed, the custom performance options of the column Note: The parameter Compression type allows you to select the format for output data compression. |
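Batch mode, as configured with Use batch mode and Batch size above, groups records so that they are processed together rather than one by one. The sketch below only illustrates that grouping in plain Java. The `partition` helper, the sample rows and the batch size of 3 are hypothetical and do not reflect how the component is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchSketch {
    // Splits the records into consecutive batches of at most batchSize items each.
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5", "r6", "r7");
        // Each batch would be handed to the database in a single write call.
        for (List<String> batch : partition(rows, 3)) {
            System.out.println("writing batch of " + batch.size() + ": " + batch);
        }
    }
}
```

The last batch may be smaller than the configured size when the number of records is not a multiple of it.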
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
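In Talend Jobs, global variables such as NB_LINE and ERROR_MESSAGE are usually read from the Job's `globalMap` using a key made of the component label and the variable name. The following is a minimal sketch of that lookup. The component label `tMapRDBOutput_1`, the sample value and the helper methods are assumptions for illustration, and the `HashMap` only stands in for the `globalMap` object that a real Job provides at run time.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariableSketch {
    // Reads <component>_NB_LINE; returns 0 when the variable has not been set.
    static int readLineCount(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value instanceof Integer ? (Integer) value : 0;
    }

    // Reads <component>_ERROR_MESSAGE; returns an empty string when no error was recorded.
    static String readErrorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? "" : value.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap of a running Job, filled with a sample value.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMapRDBOutput_1_NB_LINE", 1250);

        System.out.println("rows: " + readLineCount(globalMap, "tMapRDBOutput_1"));
        System.out.println("error: '" + readErrorMessage(globalMap, "tMapRDBOutput_1") + "'");
    }
}
```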
Usage
Usage rule |
This component is normally an end component of a Job and always |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your The Hadoop distribution must be properly installed, so as to guarantee the interaction
For further information about how to install a Hadoop distribution, see the manuals |
Related scenario
This component is similar to tHBaseOutput. For a scenario related to tHBaseOutput, see Exchanging customer data with HBase.
tMapRDBOutput MapReduce properties (deprecated)
These properties are used to configure tMapRDBOutput running in the MapReduce Job framework.
The MapReduce tMapRDBOutput component belongs to the MapReduce and the Databases families.
The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
The MapReduce framework is deprecated from Talend 7.3 onwards. Use Talend Jobs for Apache Spark to accomplish your integration tasks.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally.
Repository: Select the repository file where the The properties are stored centrally under the Hadoop |
Distribution and |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not
|
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are |
Use kerberos authentication |
If the database to be used is running with Kerberos security, select this check box, then enter the principal names in the displayed fields. You should be able to find the information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains Note that the user that executes a keytab-enabled Job is not necessarily |
Schema and Edit |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Table name |
Type in the name of the table in which you need to write data. This table must already |
Table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Row key column |
Select the column used as the row key column of the table. Then if needed, select the Store row key column to HBase |
Families |
Complete this table to map the columns of the table to be used with the schema columns you The Column column of this table is automatically filled |
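The Families table above associates each schema column with a column family. As an illustration of what such a mapping expresses, the sketch below resolves a schema column to a `family:column` name in plain Java. The column names, the family names and the `qualify` helper are hypothetical and are not taken from the component.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FamilyMappingSketch {
    // Returns the "family:column" name for a schema column,
    // or fails when the column has no family assigned.
    static String qualify(Map<String, String> familyByColumn, String column) {
        String family = familyByColumn.get(column);
        if (family == null) {
            throw new IllegalArgumentException("No family mapped for column " + column);
        }
        return family + ":" + column;
    }

    public static void main(String[] args) {
        // One entry per row of the Families table: schema column -> family name.
        Map<String, String> familyByColumn = new LinkedHashMap<>();
        familyByColumn.put("name", "info");
        familyByColumn.put("email", "contact");

        for (String column : familyByColumn.keySet()) {
            System.out.println(column + " -> " + qualify(familyByColumn, column));
        }
    }
}
```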
Advanced settings
Properties |
If you need to use custom configuration for your database, complete this table with the For example, you need to define the value of the dfs.replication property as 1 for the |
Global Variables
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
In a The Hadoop configuration you use for the whole Job and the Hadoop distribution you use for |
Hadoop Connection |
You need to use the Hadoop Configuration tab in the This connection is effective on a per-Job basis. |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your The Hadoop distribution must be properly installed, so as to guarantee the interaction
For further information about how to install a Hadoop distribution, see the manuals |
Related scenarios
No scenario is available for the Map/Reduce version of this component yet.
tMapRDBOutput properties for Apache Spark Batch
These properties are used to configure tMapRDBOutput running in the Spark Batch Job framework.
The Spark Batch tMapRDBOutput component belongs to the Databases family.
The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
Basic settings
Storage configuration |
Select the tMapRDBConfiguration component from which the |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Table name |
Type in the name of the table in which you need to write data. This |
Table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Row key column |
Select the column used as the row key column of the table. Then if needed, select the Store row key |
Families |
Complete this table to map the columns of the table to be used with the schema columns you The Column column of this table is automatically filled |
Advanced settings
Use batch mode |
Select this check box to activate the batch mode for data processing. |
Usage
Usage rule |
This component is used as an end component and requires an input link. This component uses a tMapRDBConfiguration component present in the same Job to connect to |
Spark Connection |
In the Spark Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Batch Job, see Writing and reading data from MongoDB using a Spark Batch Job.
tMapRDBOutput properties for Apache Spark Streaming
These properties are used to configure tMapRDBOutput running in the Spark Streaming Job framework.
The Spark Streaming tMapRDBOutput component belongs to the Databases family.
This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
Basic settings
Storage configuration |
Select the tMapRDBConfiguration component from which the |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Table name |
Type in the name of the table in which you need to write data. This |
Table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Row key column |
Select the column used as the row key column of the table. Then if needed, select the Store row key |
Families |
Complete this table to map the columns of the table to be used with the schema columns you The Column column of this table is automatically filled |
Advanced settings
Use batch mode |
Select this check box to activate the batch mode for data processing. |
Usage
Usage rule |
This component is used as an end component and requires an input link. This component uses a tMapRDBConfiguration component present in the same Job to connect to |
Spark Connection |
In the Spark Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Streaming Job, see Reading and writing data in MongoDB using a Spark Streaming Job.