tMapRDBInput
Reads data from a given MapRDB database and extracts the selected columns.
tMapRDBInput extracts the columns that correspond to the schema definition, then passes them to the
next component via a Main row link.
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Standard: see tMapRDBInput Standard properties.
  The component in this framework is available in all Talend products with Big Data and in Talend Data Fabric.
- MapReduce: see tMapRDBInput MapReduce properties (deprecated).
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Batch: see tMapRDBInput properties for Apache Spark Batch.
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
tMapRDBInput Standard properties
These properties are used to configure tMapRDBInput running in the Standard Job framework.
The Standard
tMapRDBInput component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data
and in Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop |
Use an existing connection |
Select this check box and in the Component List click the relevant connection component to |
Distribution and |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not
|
Hadoop version of the |
This list is displayed only when you have selected Custom from the distribution list to connect to a cluster not yet |
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are |
Use kerberos authentication |
If the database to be used is running with Kerberos security, select this check box, then enter the principal names in the displayed fields. You should be able to find the information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains Note that the user that executes a keytab-enabled Job is not necessarily |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Set table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Table name |
Type in the name of the table from which you need to extract columns. |
Define a row selection |
Select this check box and then in the Start row and the Different from the filters you can set using Is by |
Mapping |
Complete this table to map the columns of the table to be used with the schema columns you |
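The row selection described above can be pictured as a range lookup over row keys kept in sorted order. The following is only a minimal sketch in plain Java, not the code the component generates: a TreeMap stands in for the table, and the class name, row keys and values are hypothetical. It assumes HBase-style bounds, where the start row is included and the end row is excluded; whether the component applies exactly these bounds is an assumption to verify against your cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a table: a TreeMap keeps row keys sorted,
// which is what makes a start/end row range cheap to select.
final class RowRangeSketch {

    // Returns the row keys in [startRow, endRow): start inclusive, end exclusive.
    // The exclusive end bound is an assumption modeled on HBase scans.
    static List<String> selectRows(NavigableMap<String, String> table,
                                   String startRow, String endRow) {
        return new ArrayList<>(table.subMap(startRow, true, endRow, false).keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row001", "Alice");
        table.put("row002", "Bob");
        table.put("row003", "Carol");
        table.put("row004", "Dave");
        // Only the keys inside the range are visited; the other rows are never read.
        System.out.println(selectRows(table, "row002", "row004"));
    }
}
```

With the sample data, the call in main returns row002 and row003. The point of the sketch is that a key range is resolved from the sort order alone, without evaluating a condition against every row.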
Advanced settings
tStatCatcher Statistics |
Select this check box to collect log data at the component level. |
Properties |
If you need to use custom configuration for your database, complete this table with the For example, you need to define the value of the dfs.replication property as 1 for the Note:
This table is not available when you are using an existing |
Is by filter |
Select this check box to use filters to perform fine-grained data selection from your Once selecting it, the Filter table that is used to This feature leverages filters provided by HBase and subject to constraints explained in |
Logical operation |
Select the operator you need to use to define the logical relation between filters. The
available operators are:
|
Filter |
Click the button under this table to add as many rows as required, each row representing a
filter. The parameters you may need to set for a filter are:
Depending on the Filter type you are using, |
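To make the relation between the Filter table and the Logical operation concrete, here is a small sketch in plain Java. It is not the component's implementation: the filters are ordinary predicates over an in-memory row, and the names Logic.ALL and Logic.ANY are hypothetical labels chosen for this example, loosely modeled on the MUST_PASS_ALL and MUST_PASS_ONE operators of the HBase FilterList class. The sample rows and the two filters are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class FilterLogicSketch {

    // Hypothetical names for the two ways of relating filters to each other.
    enum Logic { ALL, ANY }

    // Builds one predicate from several: ALL requires every filter to pass,
    // ANY requires at least one filter to pass.
    static <T> Predicate<T> combine(List<Predicate<T>> filters, Logic logic) {
        return row -> logic == Logic.ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    // Returns the keys of the rows accepted by the combined filter, sorted.
    static List<String> select(Map<String, Map<String, String>> rows,
                               Predicate<Map<String, String>> filter) {
        return rows.entrySet().stream()
                .filter(e -> filter.test(e.getValue()))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    // Invented sample data: row key -> (column name -> value).
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new TreeMap<>();
        rows.put("r1", Map.of("city", "Paris", "age", "30"));
        rows.put("r2", Map.of("city", "Paris", "age", "20"));
        rows.put("r3", Map.of("city", "Rome", "age", "40"));
        return rows;
    }

    // Two invented filters, each standing for one row of the Filter table.
    static List<Predicate<Map<String, String>>> sampleFilters() {
        Predicate<Map<String, String>> inParis = row -> "Paris".equals(row.get("city"));
        Predicate<Map<String, String>> atLeast30 = row -> Integer.parseInt(row.get("age")) >= 30;
        return List.of(inParis, atLeast30);
    }

    public static void main(String[] args) {
        System.out.println(select(sampleRows(), combine(sampleFilters(), Logic.ALL)));
        System.out.println(select(sampleRows(), combine(sampleFilters(), Logic.ANY)));
    }
}
```

With the sample data, requiring all filters keeps only r1, while requiring any filter keeps r1, r2 and r3. Note that every row is tested against the filters here, which is the cost a filter-based selection carries compared with a key range.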
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
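As an illustration of how these variables are typically read, the sketch below looks them up in a map keyed by the component label followed by the variable name, which mirrors the globalMap lookup commonly written in Talend Job code. It is a self-contained approximation: the HashMap only simulates the Job's global map, the label tMapRDBInput_1 is a hypothetical component name, and the helper methods are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class GlobalVariableSketch {

    // Reads <component>_NB_LINE; returns 0 when the variable is not set yet.
    static int rowsRead(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value instanceof Integer ? (Integer) value : 0;
    }

    // Reads <component>_ERROR_MESSAGE; returns an empty string when no error was recorded.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? "" : value.toString();
    }

    public static void main(String[] args) {
        // Simulated global map; in a real Job the values are set by the component itself.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMapRDBInput_1_NB_LINE", 42);
        System.out.println(rowsRead(globalMap, "tMapRDBInput_1"));
        System.out.println(errorMessage(globalMap, "tMapRDBInput_1").isEmpty());
    }
}
```

Guarding against a missing key matters because a variable has no value until the component has set it.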
Usage
Usage rule |
This component is a start component of a Job and always needs an |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your The Hadoop distribution must be properly installed, so as to guarantee the interaction
For further information about how to install a Hadoop distribution, see the manuals |
Related scenario
This component is similar to tHBaseInput. For a scenario related to tHBaseInput, see Exchanging customer data with HBase.
tMapRDBInput MapReduce properties (deprecated)
These properties are used to configure tMapRDBInput running in the MapReduce Job framework.
The MapReduce
tMapRDBInput component belongs to the MapReduce and the Databases families.
The component in this framework is available in all subscription-based Talend products with Big Data
and Talend Data Fabric.
The MapReduce framework is deprecated from Talend 7.3 onwards. Use Talend Jobs for Apache Spark to accomplish your integration tasks.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. The properties are stored centrally under the Hadoop |
|
Click this icon to open a database connection wizard and store the For more information about setting up and storing database |
Distribution and |
Select the MapR distribution to be used. Only MapR V5.2 onwards is supported. If the distribution you need to use with your MapRDB database is not
|
Zookeeper quorum |
Type in the name or the URL of the Zookeeper service you use to coordinate the transaction |
Zookeeper client port |
Type in the number of the client listening port of the Zookeeper service you are |
Use kerberos authentication |
If the database to be used is running with Kerberos security, select this check box, then enter the principal names in the displayed fields. You should be able to find the information in the hbase-site.xml file of the cluster to be used.
If you need to use a Kerberos keytab file to log in, select Use a keytab to authenticate. A keytab file contains Note that the user that executes a keytab-enabled Job is not necessarily |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Table name |
Type in the name of the table from which you need to extract columns. |
Mapping |
Complete this table to map the columns of the table to be used with the schema columns you |
Die on error |
Select the check box to stop the execution of the Job when an error occurs. Clear the check box to skip any rows on error and complete the process for |
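The two behaviors of the Die on error check box can be sketched as follows. This is an illustration in plain Java rather than the generated Job code: the rows are plain strings, a number-parsing failure stands in for any row-level error, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DieOnErrorSketch {

    // Parses each raw row. With dieOnError set, the first bad row stops processing;
    // otherwise the bad row is skipped and the remaining rows are still processed.
    static List<Integer> readAll(List<String> rawRows, boolean dieOnError) {
        List<Integer> parsed = new ArrayList<>();
        for (String raw : rawRows) {
            try {
                parsed.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                if (dieOnError) {
                    throw new IllegalStateException("Stopping on bad row: " + raw, e);
                }
                // Check box cleared: ignore this row and continue with the next one.
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("1", "oops", "3");
        System.out.println(readAll(rows, false));
    }
}
```

Skipping keeps the Job running at the price of silently incomplete output, so this mode is usually paired with some form of error logging.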
Advanced settings
Properties |
If you need to use custom configuration for your database, complete this table with the For example, you need to define the value of the dfs.replication property as 1 for the |
Is by filter |
Select this check box to use filters to perform fine-grained data selection from your Once selecting it, the Filter table that is used to This feature leverages filters provided by HBase and subject to constraints explained in |
Logical operation |
Select the operator you need to use to define the logical relation between filters. The
available operators are:
|
Filter |
Click the button under this table to add as many rows as required, each row representing a
filter. The parameters you may need to set for a filter are:
Depending on the Filter type you are using, |
Global Variables
Global Variables |
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
In a You need to use the Hadoop Configuration tab in the The Hadoop configuration you use for the whole Job and the Hadoop distribution you use for |
Prerequisites |
Before starting, ensure that you have met the Loopback IP prerequisites expected by your The Hadoop distribution must be properly installed, so as to guarantee the interaction
For further information about how to install a Hadoop distribution, see the manuals |
Hadoop Connection |
You need to use the Hadoop Configuration tab in the This connection is effective on a per-Job basis. |
Related scenarios
No scenario is available for the Map/Reduce version of this component yet.
tMapRDBInput properties for Apache Spark Batch
These properties are used to configure tMapRDBInput running in the Spark Batch Job framework.
The Spark Batch
tMapRDBInput component belongs to the Databases family.
The component in this framework is available in all subscription-based Talend products with Big Data
and Talend Data Fabric.
Basic settings
Storage configuration |
Select the tMapRDBConfiguration component from which the |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the |
Table name |
Type in the name of the table from which you need to extract columns. |
Table Namespace mappings |
Enter the string to be used to construct the mapping between an Apache HBase table and a For the valid syntax you can use, see http://doc.mapr.com/display/MapR40x/Mapping+Table+Namespace+Between+Apache+HBase+Tables+and+MapR+Tables. |
Mapping |
Complete this table to map the columns of the table to be used with the schema columns you |
Is by filter |
Select this check box to use filters to perform fine-grained data selection from your Once selecting it, the Filter table that is used to This feature leverages filters provided by HBase and subject to constraints explained in |
Logical operation |
Select the operator you need to use to define the logical relation between filters. The
available operators are:
|
Filter |
Click the button under this table to add as many rows as required, each row representing a
filter. The parameters you may need to set for a filter are:
Depending on the Filter type you are using, |
Die on HBase error |
Select the check box to stop the execution of the Job when an error occurs. Clear the check box to skip any rows on error and complete the process for |
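One way to think about the Mapping table above is as a projection from the cells of a stored row onto the columns of the schema. The sketch below assumes, for illustration only, that each schema column is mapped to a cell addressed as family:qualifier, which is the usual HBase convention; the actual layout of the Mapping table may differ. All names and sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ColumnMappingSketch {

    // mapping: schema column name -> "family:qualifier" address of the source cell (assumed form).
    // cells:   "family:qualifier" -> stored value for one row.
    // Returns the record in schema order; cells that are not mapped are ignored.
    static Map<String, String> project(Map<String, String> cells, Map<String, String> mapping) {
        Map<String, String> record = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            record.put(entry.getKey(), cells.get(entry.getValue()));
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("id", "info:id");
        mapping.put("name", "info:name");

        Map<String, String> cells = Map.of(
                "info:id", "1",
                "info:name", "Ada",
                "audit:note", "not part of the schema");

        System.out.println(project(cells, mapping));
    }
}
```

Only the columns declared in the schema reach the output flow; a cell without a mapping, such as audit:note in the sample data, never leaves the table.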
Usage
Usage rule |
This component is used as a start component and requires an output This component uses a tMapRDBConfiguration component present in the same Job to connect to |
Spark Connection |
In the Spark
Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Batch Job, see Writing and reading data from MongoDB using a Spark Batch Job.