tNeo4jOutput
Receives data from the preceding component and writes the data into Neo4j.
tNeo4jOutput is used to write data
into a Neo4j database, and/or update or delete entries in the database based on the index
defined.
tNeo4jOutput Standard properties
These properties are used to configure tNeo4jOutput running in the Standard Job framework.
The Standard
tNeo4jOutput component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data
and in Talend Data Fabric.
Basic settings
Use an existing connection |
Select this check box and in the Component List click the relevant connection component to |
DB version |
Select the Neo4j version you are using. This component does not support Neo4j version V3.2.X. Do not reuse the connection to V3.2.X defined in a tNeo4jConnection component. Do Neo4j version 2.X.X is compatible only with Java 7 or higher but it offers This list is not shown if the Use an Upon selecting a database version, you will be |
Remote server |
Select this check box if you use a Neo4j remote server, and specify the root URL in the Server URL field.
This check box appears only if you do not select the Use an existing connection check box. |
Database path |
If you use Neo4j in embedded mode, specify the directory This field appears only if you do not select the |
Shutdown after |
Select this check box to shut down the Neo4j database connection when no more Alternatively, you can use tNeo4jClose to shut down the This avoids errors such as “Id file not properly shutdown” at next execution This check box is available only if the Use an existing |
Mapping |
Click the […]
|
Use label (Neo4j > 2.0) |
Select this check box to create nodes with a label. Enter This check box is not shown if Neo4J 1.X.X is selected from the DB Version list or Delete is selected from the Data action list. Note that this option works only with Neo4j 2.0 onwards |
Data action |
On the data of the node, you can perform:
|
Index name |
Specify the index name to query. This field is available only if the action selected in |
Index key |
Specify the index key to query. This field is available only if the action selected in |
Index value |
Select the index value to query. This field is available only if the action selected in |
Schema and Edit schema |
A schema is a row description. It defines the number of fields Click Edit
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the When the schema to be reused has default values that are You can find more details about how to |
Advanced settings
Commit every |
Enter the number of rows to be completed before committing batches of Warning: This option is only supported by the
embedded mode of the database. You can’t make transactions in REST mode. |
Batch import |
Select this check box to activate the batch mode. Warning:
Note: If you have configured index creation on multiple
columns in the Mapping table, it is recommended that you select the Unique check box in the index setting for the last column to avoid creating unwanted redundant indexes that may cause batch load issues. If you want more explanations about memory mapping configuration of |
Node store mapped memory |
Type in the memory size in MB allocated to nodes. |
Relationship store mapped memory |
Type in the memory size in MB allocated to relationships. |
Property store mapped memory |
Type in the memory size in MB allocated to properties. |
String store mapped memory |
Type in the memory size in MB allocated to strings. |
Array store mapped memory |
Type in the memory size in MB allocated to arrays. |
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job |
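The Commit every setting described above can be pictured as a row counter: rows are written inside a transaction, and a commit is issued each time the configured number of rows has been processed. The following Python sketch illustrates that batching idea only. The function name commit_points and its parameters are hypothetical, and the final commit for leftover rows is an assumption, not documented behavior of the component.

```python
def commit_points(row_count, commit_every):
    """Return the row numbers after which a commit would be issued.

    Assumption: one commit after every `commit_every` rows, plus a final
    commit for any leftover rows. Illustration only.
    """
    if commit_every <= 0:
        raise ValueError("commit_every must be a positive integer")
    points = list(range(commit_every, row_count + 1, commit_every))
    if row_count > 0 and (not points or points[-1] != row_count):
        points.append(row_count)  # final commit for the remaining rows
    return points


# Ten incoming rows with "Commit every" set to 4
print(commit_points(10, 4))
```

A larger value means fewer commits but more uncommitted rows held at once, which is the usual trade-off when tuning a batch size.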
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
Usage
Usage rule |
This component is used as an output component and it always needs an incoming link. |
Limitation | n/a |
Writing data to a Neo4j database and reading specific data from it
This scenario applies only to Talend products with Big Data.
This basic scenario describes a Job composed of two subJobs: the first subJob reads
employee data from a CSV file and writes it to a Neo4j database, and then triggers the
second subJob, which reads the employee data matching certain query conditions from the
Neo4j database and displays it on the Run console.
Adding and linking components
- Create a Job and add the following components to the Job by typing their names in the design workspace or dropping them from the Palette:
  - a tFileInputDelimited component, to read the employee data from a CSV file,
  - a tNeo4jOutput component, to write the employee data to a Neo4j database,
  - a tNeo4jInput component, to read the employee data from the Neo4j database based on given conditions, and
  - a tLogRow component, to display the data on the Run console.
- Link the tFileInputDelimited component to the tNeo4jOutput component using a Row > Main connection.
- Link the tNeo4jInput component to the tLogRow component using a Row > Main connection.
- Link the tFileInputDelimited component to the tNeo4jInput component using a Trigger > On Subjob Ok connection.
- Label the components to better identify their roles in the Job.
Configuring the components
Importing data to the Neo4j database
- Double-click the tFileInputDelimited component to open its Basic settings view on the Component tab.
- In the File name/Stream field, specify the path to the CSV file that contains the employee data to read.
  The input CSV file used in this example is as follows:

      employeeID;employeeName;age;hireDate;salary;managerID
      1;Rutherford Roosevelt;38;06-10-2008;13336.58;m5
      2;Warren Adams;43;05-22-2008;11626.68;m6
      3;Andrew Roosevelt;55;04-01-2007;10052.95;m4
      4;Herbert Quincy;54;06-14-2007;10694.71;m6
      5;Woodrow Polk;33;08-14-2007;13751.50;m4
      6;Theodore Johnson;47;01-26-2008;12426.87;m6
      7;Benjamin Adams;32;02-25-2008;10438.65;m4
      8;Woodrow Harrison;51;10-11-2008;11188.27;m5
      9;George Truman;40;04-28-2008;14254.49;m5
      10;Harry Jackson;38;04-01-2008;12798.78;m6

- In the Header field, specify the number of rows to skip as header rows. In this example, the first row of the CSV file is the header row.
- Click the […] button next to Edit schema to open the Schema dialog box, and define the input schema based on the structure of the input file. In this example, the input schema is composed of six columns: employeeID (Integer), employeeName (String), age (Integer), hireDate (Date), salary (Float), and managerID (String).
  When done, click OK to close the Schema dialog box and propagate the schema to the next component.
- Click the tNeo4jOutput component and select the Component tab to open its Basic settings view.
- Define a Neo4j database connection. In this example, the Neo4j database is accessible in REST mode, so select the Remote server check box and specify the URL of the Neo4j server in the Server URL field, “http://localhost:7474/db/data” in this example.
- If needed, click the Sync columns button to ensure the component has the same schema as the preceding component.
  Keep the rest of the parameters as they are.
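To make the input schema above concrete, the following Python sketch parses two rows of the sample file into the six typed columns. It only illustrates the column types and the header skip, and is not what the Studio generates. The function name read_employees is hypothetical, and the month-day-year date pattern is an assumption inferred from values such as 05-22-2008.

```python
import csv
import io
from datetime import datetime

# First rows of the sample CSV file used in this scenario
SAMPLE = """employeeID;employeeName;age;hireDate;salary;managerID
1;Rutherford Roosevelt;38;06-10-2008;13336.58;m5
2;Warren Adams;43;05-22-2008;11626.68;m6
"""


def read_employees(text):
    """Parse semicolon-delimited rows, skipping one header row."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)  # the first row of the file is the header row
    rows = []
    for emp_id, name, age, hire_date, salary, manager in reader:
        rows.append({
            "employeeID": int(emp_id),
            "employeeName": name,
            "age": int(age),
            # Assumption: dates are written month-day-year
            "hireDate": datetime.strptime(hire_date, "%m-%d-%Y").date(),
            "salary": float(salary),
            "managerID": manager,
        })
    return rows


employees = read_employees(SAMPLE)
print(employees[0])
```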
Reading data from the Neo4j database
- Double-click the tNeo4jInput component to open its Basic settings view.
- As in the tNeo4jOutput component, specify the URL of the Neo4j server to connect to, “http://localhost:7474/db/data” in this example.
- Click the […] button next to Edit schema and define the schema for employees information display. When done, click OK to close the Schema dialog box and propagate the schema to the next component.
  The defined schema columns automatically appear in the Mapping table.
- In the Query field, type in the Cypher query to match the data to read from the Neo4j database. In this example, use the following Cypher query to find employees who are more than 40 years old and whose manager is m6:

      "MATCH (n) WHERE n.age > 40 AND n.managerID = 'm6' RETURN n;"

- Fill the Return parameter field for each schema column with a return parameter in double quotes to map the node properties in the Neo4j database with the schema columns.
- Double-click the tLogRow component to open its Basic settings view, and select the Table (print values in cells of a table) option to display the retrieved information in a table.
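To check which employees the query in this scenario should return, its WHERE clause can be mirrored as a plain filter over the sample CSV data. The Python sketch below does that without connecting to Neo4j. The helper name matches is hypothetical, and the result assumes the database contains only the ten nodes written by the first subJob.

```python
# (employeeName, age, managerID) taken from the sample CSV file
EMPLOYEES = [
    ("Rutherford Roosevelt", 38, "m5"),
    ("Warren Adams", 43, "m6"),
    ("Andrew Roosevelt", 55, "m4"),
    ("Herbert Quincy", 54, "m6"),
    ("Woodrow Polk", 33, "m4"),
    ("Theodore Johnson", 47, "m6"),
    ("Benjamin Adams", 32, "m4"),
    ("Woodrow Harrison", 51, "m5"),
    ("George Truman", 40, "m5"),
    ("Harry Jackson", 38, "m6"),
]


def matches(age, manager_id):
    """Mirror of: WHERE n.age > 40 AND n.managerID = 'm6'."""
    return age > 40 and manager_id == "m6"


selected = [name for name, age, manager in EMPLOYEES if matches(age, manager)]
print(selected)  # ['Warren Adams', 'Herbert Quincy', 'Theodore Johnson']
```

Note that the comparison is strict, so an employee who is exactly 40 years old is not returned.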
Executing the Job
- Press Ctrl+S to save the Job.
- Press F6 or click Run on the Run tab to run the Job.
  The employee data of the CSV file is written to the Neo4j database, and then the information of employees matching the set conditions is retrieved from the Neo4j database and displayed on the console.
Writing family information to Neo4j and creating relationships
This scenario applies only to Talend products with Big Data.
This scenario describes a Job that will write family information to labeled nodes in a
remote Neo4j database and create relationships based on the family names.
Adding and linking components
- Create a Job and add the following components to the Job by typing their names in the design workspace or dropping them from the Palette:
  - a tFileInputDelimited component, to read the family data from a CSV file, and
  - a tNeo4jOutput component, to write the family data to a Neo4j database and create relationships between husband and wife.
- Link the tFileInputDelimited component to the tNeo4jOutput component using a Row > Main connection.
- Label the components to better identify their roles in the Job.
Configuring the components
Configuring the data source
- Double-click the tFileInputDelimited component to open its Basic settings view on the Component tab.
- In the File name/Stream field, specify the path to the CSV file that contains the family data to read.
  The input CSV file used in this example is as follows:

      Name;Gender;Age;Family
      Jenny;Female;24;the Johnsons
      Jack;Male;26;the Johnsons
      Richard;Male;35;the Blacks
      Anne;Female;36;the Whites
      Helen;Female;28;the Blacks
      Tom;Male;38;the Whites

- In the Header field, specify the number of rows to skip as header rows. In this example, the first row of the CSV file is the header row.
- Click the […] button next to Edit schema to open the Schema dialog box, and define the input schema based on the structure of the input file. In this example, the input schema is composed of four columns: name (String), gender (String), age (Integer), and family (String).
  When done, click OK to close the Schema dialog box and propagate the schema to the next component.
Writing data to Neo4j and creating indexes and relationships
- Click the tNeo4jOutput component and select the Component tab to open its Basic settings view.
- From the DB Version list, select Neo4J 2.X.X to enable node labeling.
- Define a Neo4j database connection. In this example, the Neo4j database is accessible in REST mode, so select the Remote server check box and specify the URL of the Neo4j server in the Server URL field, “http://localhost:7474/db/data” in this example.
- Double-click the tNeo4jOutput component or click the Mapping button on the component’s Basic settings view to open the index and relationship mapping editor.
- With the name column selected from the schema panel, click the Index creation tab, click the [+] button to add a row in the table, and create an index named first_name on this column:
  - In the Name field, enter first_name between double quotation marks.
  - In the Key field, enter first_name between double quotation marks to give the index a key.
  Then click in the schema panel to validate your index creation.
- With the family column selected from the schema panel, click the Index creation tab, click the [+] button to add a row in the table, and create an index named family on this column:
  - In the Name field, enter family between double quotation marks.
  - In the Key field, enter family_name between double quotation marks to give the index a key.
  Then click in the schema panel to validate your index creation.
- With the family column selected from the schema panel, click the Relationship creation tab, click the [+] button to add a row in the table, and create a relationship named Spouse on this column based on the index named family:
  - In the Type field, enter Spouse between double quotation marks.
  - From the Direction list, select either Outgoing or Incoming.
  - In the Index Name field, enter family between double quotation marks.
  - In the Index Key field, enter family_name between double quotation marks.
  Then click in the schema panel to validate your relationship creation, and click OK to close the mapping editor.
- Select the Use label (Neo4j > 2.0) check box and enter Families between double quotation marks in the Label name field so that the nodes to be created will be labeled Families.
- From the Data action list, select Insert or update, and set a reference key in the Index area that appears:
  - In the Index name field, enter first_name between double quotation marks.
  - In the Index key field, enter first_name between double quotation marks.
  - From the Index value field, select name. As the Value field is left blank in index creation, the index value will be the value of the name column for each row.
  This way, when the Job is executed, nodes will be inserted or updated in the Neo4j database based on the first_name index: for each data row, if a node containing the same first name already exists in the database, the node will be updated; otherwise, a new node will be created.
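The Insert or update behavior described above can be sketched in a few lines of Python. This is a conceptual illustration only, not the component's implementation: a plain dictionary stands in for the first_name index, and the function name insert_or_update is hypothetical.

```python
def insert_or_update(index, rows, key_column="name"):
    """Upsert rows into `index`, a dict standing in for the first_name index.

    Returns a (created, updated) pair of counters.
    """
    created, updated = 0, 0
    for row in rows:
        key = row[key_column]  # the index value is the value of the name column
        if key in index:
            index[key].update(row)  # a node with this first name exists: update it
            updated += 1
        else:
            index[key] = dict(row)  # no match: create a new node
            created += 1
    return created, updated


index = {}
rows = [
    {"name": "Jenny", "gender": "Female", "age": 24, "family": "the Johnsons"},
    {"name": "Jack", "gender": "Male", "age": 26, "family": "the Johnsons"},
]
first_run = insert_or_update(index, rows)
second_run = insert_or_update(index, rows)
print(first_run, second_run)  # (2, 0) (0, 2)
```

Running the same input twice creates the nodes once and only updates them afterwards, which is why rerunning such a Job should not duplicate people who share a first name with an existing node.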
Executing the Job and checking the result
- Press Ctrl+S to save the Job, and press F6 or click Run on the Run tab to run the Job.
- In the address bar of your Web browser, enter the URL of the Neo4j database browser, http://localhost:7474/ in this example, and enter the following Cypher query in the command line to view the nodes.

      MATCH (n:`Families`) RETURN n;

  As shown in the graphic view, three pairs of nodes labeled Families have been created and those with the same family name are linked together via the relationship Spouse.
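The expected result of this scenario can be cross-checked against the input file by grouping the sample rows on the family column. The Python sketch below is only a sanity check on the data and does not show how the component builds relationships internally. The function name spouse_pairs is hypothetical.

```python
from collections import defaultdict

# (Name, Family) taken from the sample CSV file
FAMILY_ROWS = [
    ("Jenny", "the Johnsons"),
    ("Jack", "the Johnsons"),
    ("Richard", "the Blacks"),
    ("Anne", "the Whites"),
    ("Helen", "the Blacks"),
    ("Tom", "the Whites"),
]


def spouse_pairs(rows):
    """Group people by family name and keep the families with exactly two members."""
    by_family = defaultdict(list)
    for name, family in rows:
        by_family[family].append(name)
    return {
        family: tuple(names)
        for family, names in by_family.items()
        if len(names) == 2
    }


pairs = spouse_pairs(FAMILY_ROWS)
for family, (first, second) in pairs.items():
    print(f"{first} -[Spouse]- {second} ({family})")
```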