tMapRDBConfiguration
Stores connection information and credentials to be reused by other MapRDB
components.
You configure the connection to a MapR-DB database in tMapRDBConfiguration and configure the other
MapRDB components to reuse this configuration. At runtime, the Spark
executors read this configuration in order to connect to MapR-DB.
Depending on the Talend
product you are using, this component can be used in one, some or all of the following
Job frameworks:
- Spark Batch: see tMapRDBConfiguration properties for Apache Spark Batch.
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Streaming: see tMapRDBConfiguration properties for Apache Spark Streaming.
  This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
tMapRDBConfiguration properties for Apache Spark Batch
These properties are used to configure tMapRDBConfiguration running in the Spark Batch Job framework.
The Spark Batch
tMapRDBConfiguration component belongs to the Storage and the Databases families.
The component in this framework is available in all subscription-based Talend products with Big Data
and Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository.
Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop Cluster node of the Repository tree. |
Distribution and Version |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not |
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction between the Studio and your database. |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are using. |
Use kerberos |
If the database to be used is running with Kerberos security, select this
check box, then enter the principal names in the displayed fields. You should be able to find this information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains pairs of Kerberos principals and encrypted keys. Note that the user that executes a keytab-enabled Job is not necessarily the one a principal designates but must have the right to read the keytab file being used. For further information about how Kerberos can be configured for your cluster, see the documentation of your distribution. |
HBase Properties |
If you need to use custom configuration for your database, complete this table with the property or properties to be customized. At runtime, the customized property or properties override the corresponding ones used by the Studio. For example, you need to define the value of the dfs.replication property as 1 for the database configuration. Then you need to add one row to this table with this property name and this value. (A plain-code sketch of this kind of configuration is given after this table.) |
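The following sketch is an illustration only, not code generated by the component: it assumes that MapR-DB binary tables are reached through the standard HBase client API and shows how the Zookeeper quorum, the Zookeeper client port, and a custom property such as dfs.replication would look as plain client-side configuration. The host name maprdemo and the port 5181 are placeholder values.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MapRDBConfigSketch {
    // Builds a client-side configuration equivalent to the settings described above.
    public static Connection openConnection() throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Zookeeper quorum and Zookeeper client port (placeholder values).
        conf.set("hbase.zookeeper.quorum", "maprdemo");
        conf.set("hbase.zookeeper.property.clientPort", "5181");

        // HBase Properties table: one entry per property to override,
        // for example forcing a replication factor of 1.
        conf.set("dfs.replication", "1");

        // A Spark executor would use an equivalent configuration to reach the database.
        return ConnectionFactory.createConnection(conf);
    }
}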
Usage
Usage rule |
This component is used only with the other MapRDB components to provide the MapR-DB connection configuration to the whole Job. |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your database. The Hadoop distribution must be properly installed, so as to guarantee the interaction with the Studio.
For further information about how to install a Hadoop distribution, see the manuals corresponding to the Hadoop distribution you are using. |
Spark Connection |
In the Spark
Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files.
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Batch Job, see Writing and reading data from MongoDB using a Spark Batch Job.
tMapRDBConfiguration properties for Apache Spark Streaming
These properties are used to configure tMapRDBConfiguration running in the Spark Streaming Job framework.
The Spark Streaming
tMapRDBConfiguration component belongs to the Storage and the Databases families.
This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository.
Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop Cluster node of the Repository tree. |
Distribution and Version |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not |
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction between the Studio and your database. |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are using. |
Use kerberos |
If the database to be used is running with Kerberos security, select this
check box, then enter the principal names in the displayed fields. You should be able to find this information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains pairs of Kerberos principals and encrypted keys. Note that the user that executes a keytab-enabled Job is not necessarily the one a principal designates but must have the right to read the keytab file being used. (A generic keytab login pattern is sketched after this table.)
Do not use the keytab configuration in tMapRDBConfiguration
For further information about how Kerberos can be configured for your cluster, see the documentation of your distribution. |
HBase Properties |
If you need to use custom configuration for your database, complete this table with the property or properties to be customized. At runtime, the customized property or properties override the corresponding ones used by the Studio. For example, you need to define the value of the dfs.replication property as 1 for the database configuration. Then you need to add one row to this table with this property name and this value. |
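As a reminder of how keytab-based login generally works on the client side, the following sketch uses the Hadoop security API with placeholder principal and keytab values; it is a generic illustration, not code produced by this component.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLoginSketch {
    public static void login() throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // The user running the Job only needs read access to the keytab file;
        // it does not have to be the user the principal designates.
        UserGroupInformation.loginUserFromKeytab(
                "hbase-user@EXAMPLE.COM",       // principal (placeholder)
                "/path/to/hbase-user.keytab");  // keytab path (placeholder)
    }
}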
Usage
Usage rule |
This component is used only with the other MapRDB components to provide the MapR-DB connection configuration to the whole Job. |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your database. The Hadoop distribution must be properly installed, so as to guarantee the interaction with the Studio.
For further information about how to install a Hadoop distribution, see the manuals corresponding to the Hadoop distribution you are using. |
Spark Connection |
In the Spark
Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files.
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Streaming Job, see
Reading and writing data in MongoDB using a Spark Streaming Job.