August 16, 2023

Connecting to a custom Hadoop distribution – Docs for ESB 6.x

As explained in the properties table, selecting the Custom option from the Distribution
drop-down list lets you connect to a Hadoop distribution that is not among the
distributions provided in that Distribution list in the Studio.

After selecting the Custom option, click the [...] button to display the
[Import custom definition] dialog box and proceed as follows:

  1. Depending on your situation, select Import from existing version or
    Import from zip to configure the custom Hadoop distribution you want to
    connect to.

    • If you have the zip file of the custom Hadoop distribution you need to
      connect to, select Import from zip.

      The Talend community provides zip files of this kind, which you can
      download from http://www.talendforge.org/exchange/index.php.

    • Otherwise, select Import from existing version to import an officially
      supported Hadoop distribution as a base, and then customize it by
      following the wizard.

    import_custom_definition.png

    Note that the check boxes in the wizard allow you to select the Hadoop
    element(s) you need to import. Not all check boxes are always displayed in
    your wizard; which ones appear depends on the context in which you are
    creating the connection. For example, if you are creating this connection
    for a Hive component, only the Hive check box appears.
  2. Whether you have selected Import from existing version or Import from
    zip, verify that the check box next to each Hadoop element you need to
    import is selected.
  3. Click OK and then, in the pop-up warning, click Yes to accept overwriting
    any previously implemented custom setup of jar files.

    Once done, the [Custom Hadoop version definition] dialog box becomes
    active.
    export_custom_hadoop_setup.png

    This dialog box lists the Hadoop elements you are importing and their jar
    files.
  4. If you have selected Import from zip, click OK to validate the imported
    configuration.

    If you have selected Import from existing version as a base, you typically
    still need to add jar files to customize that version. To do so, from the
    tab of the Hadoop element you need to customize, for example the
    HDFS/HCatalog/Oozie tab, click the [+] button to open the
    [Select libraries] dialog box.
  5. Select the External libraries option to open its view.
  6. Browse to and select any jar file you need to import.
  7. Click OK to validate the changes and to close the [Select libraries]
    dialog box.

    Once done, the selected jar file appears in the list on the tab of the
    Hadoop element being configured.
    Note that if you need to share the custom Hadoop setup with another Studio,
    you can export this custom connection from the
    [Custom Hadoop version definition] window using the export button
    (export.png).

  8. In the [Custom Hadoop version definition] dialog box, click OK to validate
    the customized configuration. This brings you back to the Distribution list
    in the Basic settings view of the component.

Now that the custom Hadoop version has been configured and you are back at the
Distribution list, you can continue entering the other parameters required by
the connection.

If the custom Hadoop version you need to connect to contains YARN and you want to use
it, select the Use YARN check box next to the Distribution list.

A video is available that demonstrates, taking HDFS as an example, how to set up
the connection to a custom Hadoop cluster, also referred to as an unsupported
Hadoop distribution: How to add an unsupported Hadoop distribution to the Studio.
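
Before choosing Import from zip in the wizard, it can be useful to check which
jar files an archive actually contains. The following is a minimal sketch in
Python, not part of the Studio; it assumes only that the archive is a regular
zip file with jar files somewhere inside it. The internal layout of the zip
files is not described on this page, and the helper name list_jars is
hypothetical.

```python
import sys
import zipfile
from pathlib import PurePosixPath


def list_jars(zip_path):
    """Return the sorted names of all .jar entries found in a zip archive.

    Assumption: jar files may sit at any depth inside the archive; no
    particular directory layout is expected.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Zip entry names always use "/" separators, hence PurePosixPath.
            if PurePosixPath(name).suffix.lower() == ".jar"
        )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one jar entry per line for the archive given on the command line.
    for jar in list_jars(sys.argv[1]):
        print(jar)
```

Run it with the path to the zip file as the only argument. You can then compare
the printed names with the jar files listed in the
[Custom Hadoop version definition] dialog box after the import.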


Source: Talend documentation, https://help.talend.com