tCassandraOutput
Writes data into or deletes data from a column family of a Cassandra
keyspace.
tCassandraOutput receives data from
the preceding component and writes it into Cassandra.
Depending on the Talend
product you are using, this component can be used in one, some or all of the following
Job frameworks:
- Standard: see tCassandraOutput Standard properties. The component in this framework is available in all Talend products with Big Data and in Talend Data Fabric.
- Spark Batch: see tCassandraOutput properties for Apache Spark Batch. The component in this framework is available in all subscription-based Talend products with Big Data and in Talend Data Fabric.
- Spark Streaming: see tCassandraOutput properties for Apache Spark Streaming. This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
tCassandraOutput Standard properties
These properties are used to configure tCassandraOutput running in the Standard Job framework.
The Standard
tCassandraOutput component belongs to the Big Data and the Databases NoSQL families.
The component in this framework is available in all Talend products with Big Data
and in Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository.
Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. |
Use existing connection |
Select this check box and in the Component List click the relevant connection component to reuse the connection details you have already defined. |
DB Version |
Select the Cassandra version you are using. |
API type |
This drop-down list is displayed only when you have selected Cassandra 2.0 from the DB Version list. Note that the Hector API is deprecated. Along with the evolution of the CQL commands, the parameters to be set in the Basic settings view vary with the API type you select. |
Host |
Hostname or IP address of the Cassandra server. |
Port |
Listening port number of the Cassandra server. |
Required authentication |
Select this check box to provide credentials for the Cassandra authentication. This check box appears only if you do not select the Use existing connection check box. |
Username |
Fill in this field with the username for the Cassandra authentication. |
Password |
Fill in this field with the password for the Cassandra authentication. To enter the password, click the […] button next to the password field, enter the password between double quotes in the dialog box that opens, and click OK. |
Use SSL |
Select this check box to enable the SSL or TLS encrypted connection. Then you need to use the tSetKeystore component in the same Job to specify the encryption information. |
Keyspace |
Type in the name of the keyspace into which you want to write data. |
Action on keyspace |
Select the operation you want to perform on the keyspace to be used:
|
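For reference, the keyspace actions here correspond to CQL statements along these lines; this is a sketch, and the keyspace name and replication options are illustrative values, not taken from this page:

```python
# Illustrative CQL behind "create if not exists" / "drop if exists"
# style keyspace actions; demo_ks and the replication settings are
# example values only.
keyspace = "demo_ks"

create_if_not_exists = (
    f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)
drop_if_exists = f"DROP KEYSPACE IF EXISTS {keyspace}"
```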
Column family |
Type in the name of the column family into which you want to write data. |
Action on column family |
Select the operation you want to perform on the column family to be used:
|
Action on data |
On the data of the table defined, you can perform:
Note that the action list varies depending on the API type you are using. For more advanced actions, use the Advanced settings view. |
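As an illustration, the basic insert, update, and delete actions map onto parameterized CQL of the following shapes; the table and column names below are hypothetical placeholders:

```python
# Hypothetical table with an id key and a name column; '?' marks the
# bind parameters a driver would fill in for each incoming row.
table = "demo_ks.users"

insert_stmt = f"INSERT INTO {table} (id, name) VALUES (?, ?)"
update_stmt = f"UPDATE {table} SET name = ? WHERE id = ?"
delete_stmt = f"DELETE FROM {table} WHERE id = ?"
```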
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. Click Edit schema to make changes to the schema.
|
 |
Built-In: You create and store the schema locally for this component only. |
 |
Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. You can find more details about how to set default values in Talend Studio User Guide. |
Sync columns |
Click this button to retrieve the schema from the previous component connected in the Job. |
Die on error |
Select this check box to stop the execution of the Job when an error occurs. Clear the check box to skip any rows on error and complete the process for error-free rows. |
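The difference between the two error-handling modes can be sketched as follows; this is a simplified model, not the component's actual implementation:

```python
# "Die on error" stops at the first bad row; clearing it skips bad
# rows and completes the process for the error-free ones.
def write_rows(rows, write, die_on_error=True):
    written, skipped = 0, 0
    for row in rows:
        try:
            write(row)
            written += 1
        except ValueError:
            if die_on_error:
                raise  # stop the whole run on the first error
            skipped += 1  # skip this row and keep going
    return written, skipped

def must_be_positive(row):
    if row < 0:
        raise ValueError("negative row")

# With die_on_error cleared, the bad row (-2) is skipped.
result = write_rows([1, -2, 3], must_be_positive, die_on_error=False)
```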
Features available only with the Hector API (deprecated)
Row key column |
Select the row key column from the list. |
Include row key in columns |
Select this check box to include row key in columns. |
Super columns |
Select the super column from the list. This drop-down list appears only if you select Super from the Column family type drop-down list. |
Include super columns in standard |
Select this check box to include the super columns in standard columns. |
Delete row |
Select this check box to delete the row. This check box appears only if you select Delete from the Action on data drop-down list. |
Delete columns |
Customize the columns you want to delete. |
Delete super columns |
Select this check box to delete super columns. This check box appears only if you select the Delete row check box. |
Advanced settings
Batch Size |
Number of lines in each processed batch. When you are using the Datastax API, this field takes effect only if the Use unlogged batch check box is selected. |
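A minimal sketch of what a batch size does: incoming rows are grouped into fixed-size chunks before each write. This is illustrative code, not the component's implementation:

```python
# Group an iterable of rows into chunks of batch_size; the last
# chunk may be smaller if the row count is not a multiple.
def batches(rows, batch_size):
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final, possibly partial, batch

chunks = list(batches(range(7), 3))
```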
Use unlogged batch |
Select this check box to handle data in batch but with Cassandra's UNLOGGED approach. This feature is available to the Insert, Update and Delete actions. Then you need to configure how the batch mode works:
The ideal situation to use batches with Cassandra is when a small number of tables must synchronize the data to be inserted, updated or deleted. In this UNLOGGED approach, the Job does not write batches into Cassandra's batchlog system and thus avoids the performance penalty this writing incurs. |
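In CQL terms, an unlogged batch wraps the grouped statements like this; the statement texts are illustrative. UNLOGGED skips the batchlog, trading batch atomicity guarantees for speed:

```python
# Build the CQL text of an UNLOGGED batch from individual statements.
statements = [
    "INSERT INTO demo_ks.users (id, name) VALUES (1, 'a')",
    "INSERT INTO demo_ks.users (id, name) VALUES (2, 'b')",
]
unlogged_batch = (
    "BEGIN UNLOGGED BATCH\n"
    + ";\n".join(statements)
    + ";\nAPPLY BATCH"
)
```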
Insert if not exists |
Select this check box to insert rows. This row insertion takes place only when the rows do not already exist in the target table. This feature is available to the Insert action only. |
Delete if exists |
Select this check box to remove from the target table only the rows that have the same records as the incoming data. This feature is available only to the Delete action. |
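Both check boxes correspond to CQL lightweight transactions, which make the write conditional on the row's (non-)existence. The example statements below use placeholder names:

```python
# Conditional writes: the INSERT applies only if no row with this
# primary key exists; the DELETE applies only if the row does exist.
insert_if_not_exists = (
    "INSERT INTO demo_ks.users (id, name) VALUES (1, 'a') IF NOT EXISTS"
)
delete_if_exists = "DELETE FROM demo_ks.users WHERE id = 1 IF EXISTS"
```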
Use TTL |
Select this check box to write the TTL data in the target table. In the column list that is displayed, select the column to be used as the TTL column. This feature is available to the Insert action and the Update action. |
Use Timestamp |
Select this check box to write the timestamp data in the target table. In the column list that is displayed, select the column to be used as the timestamp column. This feature is available to the following actions: Insert, Update and Delete. |
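In CQL, the TTL (in seconds) and the write timestamp (in microseconds) travel in a USING clause; the values and names below are illustrative:

```python
# TTL in seconds, write timestamp in microseconds since the epoch;
# in the component these would come from per-row columns.
ttl_seconds = 86400
write_ts_micros = 1700000000000000
insert_with_using = (
    "INSERT INTO demo_ks.users (id, name) VALUES (1, 'a') "
    f"USING TTL {ttl_seconds} AND TIMESTAMP {write_ts_micros}"
)
```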
IF condition |
Add the condition to be met for the Update or the Delete action to take effect. |
Special assignment operation |
Complete this table to construct advanced SET commands of Cassandra to make the Update action more specific. For example, add a record to the beginning or the end of a given collection column. In the Update column column of this table, you need to
select the column to be updated and then select the operation to be used from the Operation column. The following operations are available:
For more details about these operations, see Datastax's related documentation. |
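For illustration, the kinds of SET operations CQL offers on collection columns include appending, prepending, and assigning by position or key; the table and column names below are placeholders:

```python
# Advanced SET forms on collection columns (list and map examples).
append_to_list = "UPDATE t SET tags = tags + ['new'] WHERE id = 1"
prepend_to_list = "UPDATE t SET tags = ['new'] + tags WHERE id = 1"
set_by_position = "UPDATE t SET tags[0] = 'new' WHERE id = 1"
put_map_key = "UPDATE t SET attrs['color'] = 'red' WHERE id = 1"
```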
Row key in the List type |
Select the column to be used to construct the WHERE clause of Cassandra to perform the defined action on rows. |
Delete collection column based on position/key |
Select the column to be used as reference to locate the particular row(s) to be deleted. This feature is available only to the Delete action. |
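In CQL, deleting a single collection element addressed by list position or map key looks like this; the names are illustrative:

```python
# Delete one element of a collection column rather than the whole row.
delete_list_element = "DELETE tags[2] FROM demo_ks.users WHERE id = 1"
delete_map_entry = "DELETE attrs['color'] FROM demo_ks.users WHERE id = 1"
```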
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job level as well as at each component level. |
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.
ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string.
A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.
To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.
For further information about variables, see Talend Studio User Guide. |
Usage
Usage rule |
This component is used as an output component and it always needs an input link. |
Related Scenario
For a scenario in which tCassandraOutput is used, see
Handling data with Cassandra.
tCassandraOutput properties for Apache Spark Batch
These properties are used to configure tCassandraOutput running in the Spark Batch Job framework.
The Spark Batch
tCassandraOutput component belongs to the Databases family.
The component in this framework is available in all subscription-based Talend products with Big Data
and Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository.
Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. |
Sync columns |
Click this button to retrieve the schema from the previous component connected in the Job. |
Keyspace |
Type in the name of the keyspace into which you want to write data. |
Action on keyspace |
Select the operation you want to perform on the keyspace to be used:
|
Column family |
Type in the name of the column family into which you want to write data. |
Action on column family |
Select the operation you want to perform on the column family to be used:
This list is available only when you have selected Update, Upsert or Insert from the Action on data drop-down list. |
Action on data |
On the data of the table defined, you can perform:
For more advanced actions, use the Advanced settings view. |
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. Click Edit schema to make changes to the schema.
The schema of this component does not support the Object type and the List type. |
 |
Built-In: You create and store the schema locally for this component only. |
 |
Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. You can find more details about how to set default values in Talend Studio User Guide. |
Advanced settings
Configuration |
Add the Cassandra properties you need to customize in upserting data into Cassandra.
The following list presents the numerical values you can put and the consistency levels they stand for:
For further details about each of the consistency policies, see Datastax's Cassandra documentation. When a row is added to the table, you need to click the new row in the Property name column to display the list of the available properties and select the one to be used. |
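As an assumption based on the Cassandra native protocol, which defines these codes, the numeric values and the consistency levels they stand for can be summarized as:

```python
# Native-protocol consistency level codes (assumption: the component
# uses the same code-to-level mapping as the protocol).
CONSISTENCY_LEVELS = {
    0: "ANY", 1: "ONE", 2: "TWO", 3: "THREE",
    4: "QUORUM", 5: "ALL", 6: "LOCAL_QUORUM",
    7: "EACH_QUORUM", 8: "SERIAL", 9: "LOCAL_SERIAL",
    10: "LOCAL_ONE",
}
```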
Use unlogged batch |
Select this check box to handle data in batch but with Cassandra's UNLOGGED approach. This feature is available to the Insert, Update and Delete actions. Then you need to configure how the batch mode works:
The ideal situation to use batches with Cassandra is when a small number of tables must synchronize the data to be inserted, updated or deleted. In this UNLOGGED approach, the Job does not write batches into Cassandra's batchlog system and thus avoids the performance penalty this writing incurs. |
Insert if not exists |
Select this check box to insert rows. This row insertion takes place only when the rows do not already exist in the target table. This feature is available to the Insert action only. |
Delete if exists |
Select this check box to remove from the target table only the rows that have the same records as the incoming data. This feature is available only to the Delete action. |
Use TTL |
Select this check box to write the TTL data in the target table. In the column list that is displayed, select the column to be used as the TTL column. This feature is available to the Insert action and the Update action. |
Use Timestamp |
Select this check box to write the timestamp data in the target table. In the column list that is displayed, select the column to be used as the timestamp column. This feature is available to the following actions: Insert, Update and Delete. |
IF condition |
Add the condition to be met for the Update or the Delete action to take effect. |
Special assignment operation |
Complete this table to construct advanced SET commands of Cassandra to make the Update action more specific. For example, add a record to the beginning or the end of a given collection column. In the Update column column of this table, you need to
select the column to be updated and then select the operation to be used from the Operation column. The following operations are available:
For more details about these operations, see Datastax's related documentation. |
Row key in the List type |
Select the column to be used to construct the WHERE clause of Cassandra to perform the defined action on rows. |
Delete collection column based on position/key |
Select the column to be used as reference to locate the particular row(s) to be deleted. This feature is available only to the Delete action. |
Usage
Usage rule |
This component is used as an end component and requires an input link. This component should use one and only one tCassandraConfiguration component present in the same Job to connect to Cassandra. This component, along with the Spark Batch component Palette it belongs to, appears only when you are creating a Spark Batch Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs. |
Spark Connection |
In the Spark
Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Batch Job, see Writing and reading data from MongoDB using a Spark Batch Job.
tCassandraOutput properties for Apache Spark Streaming
These properties are used to configure tCassandraOutput running in the Spark Streaming Job framework.
The Spark Streaming
tCassandraOutput component belongs to the Databases family.
This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository.
Built-In: No property data stored centrally.
Repository: Select the repository file where the properties are stored. |
Sync columns |
Click this button to retrieve the schema from the previous component connected in the Job. |
Keyspace |
Type in the name of the keyspace into which you want to write data. |
Action on keyspace |
Select the operation you want to perform on the keyspace to be used:
|
Column family |
Type in the name of the column family into which you want to write data. |
Action on column family |
Select the operation you want to perform on the column family to be used:
This list is available only when you have selected Update, Upsert or Insert from the Action on data drop-down list. |
Action on data |
On the data of the table defined, you can perform:
For more advanced actions, use the Advanced settings view. |
Schema and Edit schema |
A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. Click Edit schema to make changes to the schema.
The schema of this component does not support the Object type and the List type. |
 |
Built-In: You create and store the schema locally for this component only. |
 |
Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. You can find more details about how to set default values in Talend Studio User Guide. |
Advanced settings
Configuration |
Add the Cassandra properties you need to customize in upserting data into Cassandra.
The following list presents the numerical values you can put and the consistency levels they stand for:
For further details about each of the consistency policies, see Datastax's Cassandra documentation. When a row is added to the table, you need to click the new row in the Property name column to display the list of the available properties and select the one to be used. |
Use unlogged batch |
Select this check box to handle data in batch but with Cassandra's UNLOGGED approach. This feature is available to the Insert, Update and Delete actions. Then you need to configure how the batch mode works:
The ideal situation to use batches with Cassandra is when a small number of tables must synchronize the data to be inserted, updated or deleted. In this UNLOGGED approach, the Job does not write batches into Cassandra's batchlog system and thus avoids the performance penalty this writing incurs. |
Insert if not exists |
Select this check box to insert rows. This row insertion takes place only when the rows do not already exist in the target table. This feature is available to the Insert action only. |
Delete if exists |
Select this check box to remove from the target table only the rows that have the same records as the incoming data. This feature is available only to the Delete action. |
Use TTL |
Select this check box to write the TTL data in the target table. In the column list that is displayed, select the column to be used as the TTL column. This feature is available to the Insert action and the Update action. |
Use Timestamp |
Select this check box to write the timestamp data in the target table. In the column list that is displayed, select the column to be used as the timestamp column. This feature is available to the following actions: Insert, Update and Delete. |
IF condition |
Add the condition to be met for the Update or the Delete action to take effect. |
Special assignment operation |
Complete this table to construct advanced SET commands of Cassandra to make the Update action more specific. For example, add a record to the beginning or the end of a given collection column. In the Update column column of this table, you need to
select the column to be updated and then select the operation to be used from the Operation column. The following operations are available:
For more details about these operations, see Datastax's related documentation. |
Row key in the List type |
Select the column to be used to construct the WHERE clause of Cassandra to perform the defined action on rows. |
Delete collection column based on position/key |
Select the column to be used as reference to locate the particular row(s) to be deleted. This feature is available only to the Delete action. |
Usage
Usage rule |
This component is used as an end component and requires an input link. This component should use one and only one tCassandraConfiguration component present in the same Job to connect to Cassandra. This component, along with the Spark Streaming component Palette it belongs to, appears only when you are creating a Spark Streaming Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs. |
Spark Connection |
In the Spark
Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |
Related scenarios
For a scenario about how to use the same type of component in a Spark Streaming Job, see
Reading and writing data in MongoDB using a Spark Streaming Job.