tHashInput
Reads the data that tHashOutput has loaded into the cache memory, providing a
high-speed data feed for transactions that involve a large amount of data.
The components of the Technical family are hidden from the Palette by default. For more information about how to show
them on the Palette, see Talend Studio User Guide.
tHashInput Standard properties
These properties are used to configure tHashInput running in the Standard Job framework.
The Standard tHashInput component belongs to the Technical family.
The component in this framework is available in all Talend products.
Basic settings
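Schema and Edit schema | A schema is a row description. It defines the number of fields to be processed and passed on to the next component. Click Edit schema to make changes to the schema.
 | Built-in: The schema is created and stored locally for this component only.
 | Repository: The schema already exists and is stored in the Repository.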
Link with a tHashOutput | Select this check box to connect to a tHashOutput component. It is always selected by default.
Component list | Drop-down list of available tHashOutput components.
Clear cache after reading | Select this check box to clear the cache after reading the data.
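The effect of Clear cache after reading can be pictured with a small sketch. It is not code generated by Talend Studio: the class `ClearCacheSketch`, the method `read` and the use of a plain list as the cache are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a plain list stands in for the cache filled by tHashOutput.
class ClearCacheSketch {

    // Returns a copy of the cached rows. When clearAfterReading is true, the
    // cache is emptied afterwards, mirroring the check box described above.
    static List<String> read(List<String> cache, boolean clearAfterReading) {
        List<String> rows = new ArrayList<>(cache);
        if (clearAfterReading) {
            cache.clear();
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> cache = new ArrayList<>(List.of("row1", "row2"));
        System.out.println(read(cache, true).size());  // first reader
        System.out.println(read(cache, false).size()); // later reader
    }
}
```

In this sketch the first reader gets both rows and the later reader gets none, because the first read emptied the cache.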
Advanced settings
tStatCatcher Statistics | Select this check box to collect log data at the component level.
Global Variables
Global Variables | ERROR_MESSAGE: the error message generated by the component when an error occurs.
 | NB_LINE: the number of rows processed. This is an After variable.
 | A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.
 | To fill up a field or expression with a variable, press Ctrl + Space to access the variable list.
 | For further information about variables, see Talend Studio User Guide.
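As a rough illustration of reading these variables once the component has finished, the sketch below looks them up in a map. The map named `globalMap`, the key pattern `componentName_VARIABLE` and the component name `tHashInput_1` are assumptions that follow the usual Talend Job convention; they are not stated on this page. The class and helper methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a HashMap stands in for the Job's variable store.
class GlobalVariableSketch {

    // Assumed key pattern: component name, underscore, variable name.
    static String key(String componentName, String variable) {
        return componentName + "_" + variable;
    }

    // Returns the NB_LINE value, or 0 if the component has not set it yet.
    static int rowsProcessed(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(key(componentName, "NB_LINE"));
        return value instanceof Integer ? (Integer) value : 0;
    }

    // Returns the ERROR_MESSAGE value, or null if no error was recorded.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(key(componentName, "ERROR_MESSAGE"));
        return value instanceof String ? (String) value : null;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Simulates what the component would store after it has run.
        globalMap.put("tHashInput_1_NB_LINE", 100_000);

        System.out.println("Rows: " + rowsProcessed(globalMap, "tHashInput_1"));
        System.out.println("Error: " + errorMessage(globalMap, "tHashInput_1"));
    }
}
```

Because NB_LINE is an After variable, a lookup like this is only meaningful in a part of the Job that runs after tHashInput has completed.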
Usage
Usage rule | This component is used along with tHashOutput. It reads from the cache memory data loaded by tHashOutput.
Reading data from the cache memory for high-speed data access
The following Job reads from the cache memory a large amount of data loaded by two
tHashOutput components and passes it to a tFileOutputDelimited component. The goal of this scenario is to show
the speed at which mass data is read and written. In practice, a data feed generated in
this way can be used as lookup table input in use cases where a large amount of data
needs to be referenced.
Dropping and linking the components
- Drag and drop the following components from the Palette to the workspace: tFixedFlowInput (X2), tHashOutput (X2), tHashInput and tFileOutputDelimited.
- Connect the first tFixedFlowInput to the first tHashOutput using a Row > Main link.
- Connect the second tFixedFlowInput to the second tHashOutput using a Row > Main link.
- Connect the first subJob (from tFixedFlowInput_1) to the second subJob (to tFixedFlowInput_2) using an OnSubjobOk link.
- Connect tHashInput to tFileOutputDelimited using a Row > Main link.
- Connect the second subJob to the last subJob using an OnSubjobOk link.
Configuring the components
Configuring data inputs and hash cache
- Double-click the first tFixedFlowInput component to display its Basic settings view.
- Select Built-In from the Schema drop-down list.
  Note: You can select Repository from the Schema drop-down list to fill in the relevant fields automatically if the relevant metadata has been stored in the Repository. For more information about Metadata, see the Talend Studio User Guide.
- Click Edit schema to define the data structure of the input flow. In this case, the input has two columns: ID and ID_Insurance. Click OK to close the dialog box.
- Fill in the Number of rows field to specify the number of entries to output, e.g. 50000.
- Select the Use Single Table check box. In the Value column of the Values table, assign values to the columns, e.g. 1 for ID and 3 for ID_Insurance.
- Perform the same operations for the second tFixedFlowInput component. The only difference is in the values: 2 for ID and 4 for ID_Insurance in this case.
- Double-click the first tHashOutput to display its Basic settings view.
- Select Built-In from the Schema drop-down list and click Sync columns to retrieve the schema from the previous component. Select Keep all from the Keys management drop-down list and keep the Append check box selected.
- Perform the same operations for the second tHashOutput component, and select the Link with a tHashOutput check box.
Configuring data retrieval from hash cache and data output
- Double-click tHashInput to display its Basic settings view.
- Select Built-In from the Schema drop-down list. Click Edit schema to define the data structure, which is the same as that of tHashOutput.
- Select tHashOutput_1 from the Component list drop-down list.
- Double-click tFileOutputDelimited to display its Basic settings view.
- Select Built-In from the Property Type drop-down list. In the File Name field, enter the full path and name of the file, e.g. "E:/Allr70207V5.0/Talend-All-r70207-V5.0.0NB/workspace/out.csv".
- Select the Include Header check box and click Sync columns to retrieve the schema from the previous component.
Saving and executing the Job
- Press Ctrl+S to save the Job.
- Press F6, or click Run on the Run tab to execute the Job.
You can see that a large number of entries are written and read very rapidly.
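The data flow of this scenario can be approximated outside Talend Studio with the sketch below. It is not the code the Studio generates: the class `HashCacheScenario`, its methods, the list used as the cache and the semicolon field separator are all assumptions for illustration. Two loaders append to one shared cache, as the two linked tHashOutput components do with Append selected, and a reader renders the cached rows as delimited text with a header.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for the Job above; not code generated by Talend Studio.
class HashCacheScenario {

    // Plays the role of tFixedFlowInput -> tHashOutput with Append selected:
    // rows {ID, ID_Insurance} are added to whatever the cache already holds.
    static void load(List<int[]> cache, int id, int idInsurance, int numberOfRows) {
        for (int i = 0; i < numberOfRows; i++) {
            cache.add(new int[] {id, idInsurance});
        }
    }

    // Plays the role of tHashInput -> tFileOutputDelimited with Include Header:
    // renders the cached rows as delimited text (separator ';' is an assumption).
    static String readAsDelimited(List<int[]> cache) {
        StringBuilder out = new StringBuilder("ID;ID_Insurance\n");
        for (int[] row : cache) {
            out.append(row[0]).append(';').append(row[1]).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<int[]> cache = new ArrayList<>(); // shared by both loaders
        load(cache, 1, 3, 50_000);             // first subJob
        load(cache, 2, 4, 50_000);             // second subJob, linked to the first cache
        String csv = readAsDelimited(cache);   // last subJob
        System.out.println("Rows read from cache: " + cache.size());
        System.out.println("Characters written: " + csv.length());
    }
}
```

Under these assumptions the reader sees 100,000 rows, 50,000 from each loader, which corresponds to what tHashInput passes on in the Job.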
Clearing the memory before loading data to it in case an iterator exists in the same subJob
This scenario demonstrates the Append option of tHashOutput. Clearing this check box
removes repetitive or unwanted data when an iterator exists in the same subJob as tHashOutput.
To build the Job, do the following:
Dropping and linking the components
- Drag and drop the following components from the Palette to the workspace: tLoop, tFixedFlowInput, tHashOutput, tHashInput and tLogRow.
- Connect tLoop to tFixedFlowInput using a Row > Iterate link.
- Connect tFixedFlowInput to tHashOutput using a Row > Main link.
- Connect tHashInput to tLogRow using a Row > Main link.
- Connect tLoop to tHashInput using an OnSubjobOk link.
Configuring the components
Configuring data input and hash cache
- Double-click the tLoop component to display its Basic settings view.
- Select For as the loop type. Type in 1, 2 and 1 in the From, To and Step fields respectively. Keep the Values are increasing check box selected.
- Double-click the tFixedFlowInput component to display its Basic settings view.
- Select Built-In from the Schema drop-down list.
  Note: You can select Repository from the Schema drop-down list to fill in the relevant fields automatically if the relevant metadata has been stored in the Repository. For more information about Metadata, see the Talend Studio User Guide.
- Click Edit schema to define the data structure of the input flow. In this case, the input has one column: Name.
- Click OK to close the dialog box.
- Fill in the Number of rows field to specify the number of entries to output, for example 1.
- Select the Use Single Table check box. In the Values table, assign a value to the Name field, e.g. Marx.
- Double-click tHashOutput to display its Basic settings view.
- Select Built-In from the Schema drop-down list and click Sync columns to retrieve the schema from the previous component. Select Keep all from the Keys management drop-down list and deselect the Append check box.
Configuring data retrieval from hash cache and data output
- Double-click tHashInput to display its Basic settings view.
- Select Built-In from the Schema drop-down list. Click Edit schema to define the data structure, which is the same as that of tHashOutput.
- Select tHashOutput_2 from the Component list drop-down list.
- Double-click tLogRow to display its Basic settings view.
- Select Built-In from the Schema drop-down list and click Sync columns to retrieve the schema from the previous component. In the Mode area, select Table (print values in cells of a table).
Saving and executing the Job
- Press Ctrl+S to save the Job.
- Press F6, or click Run on the Run tab to execute the Job.
You can see that only one row was output although two rows were generated
by tFixedFlowInput.
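The result can be reasoned through with a small sketch of the loop. It is not Talend-generated code: the class `AppendOptionSketch`, the method `run` and the list used as the cache are assumptions for illustration. The sketch assumes that, with Append cleared, the cache is emptied each time the iterator starts a new load.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models tLoop (From 1, To 2, Step 1) feeding one row
// per iteration into a cache, with the Append option on or off.
class AppendOptionSketch {

    // Returns what a reader would find in the cache after the loop ends.
    static List<String> run(boolean append) {
        List<String> cache = new ArrayList<>();
        for (int i = 1; i <= 2; i += 1) {
            if (!append) {
                cache.clear(); // Append cleared: start each load from an empty cache
            }
            cache.add("Marx"); // the single row produced by tFixedFlowInput
        }
        return cache;
    }

    public static void main(String[] args) {
        System.out.println("Append cleared:  " + run(false));
        System.out.println("Append selected: " + run(true));
    }
}
```

With Append cleared the sketch leaves one row in the cache, matching the single row printed by tLogRow. With Append selected it leaves two, one per iteration, which is the repeated data the option is meant to avoid in this Job.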