tKinesisOutput
Acts as a data producer, putting data into an Amazon Kinesis stream for real-time ingestion.
Using the Kinesis Client Library (KCL) provided by Amazon, tKinesisOutput receives serialized messages
from its preceding component and publishes these messages to an existing
Amazon Kinesis stream.
tKinesisOutput properties for Apache Spark Streaming
These properties are used to configure tKinesisOutput running in the Spark Streaming Job framework.
The Spark Streaming
tKinesisOutput component belongs to the Messaging family.
The streaming version of this component is available in the Palette of the Studio only if you have subscribed to Talend Real-time Big Data Platform or Talend Data
Fabric.
Basic settings
Schema and Edit schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema of this component is read-only. You can click Edit schema to view the schema. The read-only serializedValue column is used to carry the serialized message to be published. The other columns are automatically retrieved from the schema of the preceding component.
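The serializedValue column simply carries the byte payload built by the upstream component. The following Python sketch is illustrative only (the Job itself generates Spark code, and the function name here is an assumption): it shows a row being collapsed into a single serialized column, which is the shape tKinesisOutput expects on its input.

```python
import json

def to_kinesis_record(row: dict) -> dict:
    # Illustrative: collapse an input row into one serialized column,
    # mirroring what a Write component produces upstream of tKinesisOutput.
    return {"serializedValue": json.dumps(row).encode("utf-8")}

record = to_kinesis_record({"id": 1, "name": "alice"})
```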

Access key

Enter the access key ID that uniquely identifies an AWS account.

Secret key

Enter the secret access key, which constitutes the security credentials in combination with the access key ID. To enter the secret key, click the […] button next to the Secret key field, then enter the key between double quotes in the dialog box that opens and click OK.

Stream name

Enter the name of the Kinesis stream you want to add data to.

Endpoint URL

Enter the endpoint of the Kinesis service to be used, for example, https://kinesis.us-east-1.amazonaws.com. More valid Kinesis endpoint URLs are listed in the AWS Regions and Endpoints documentation.
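For most commercial AWS regions the Kinesis endpoint follows a fixed pattern, sketched below in Python. This is illustrative, not exhaustive: some partitions (for example, AWS China) use a different domain suffix.

```python
def kinesis_endpoint(region: str) -> str:
    # Regional Kinesis endpoints in the standard AWS partition follow
    # the pattern https://kinesis.<region>.amazonaws.com
    return f"https://kinesis.{region}.amazonaws.com"

endpoint = kinesis_endpoint("us-east-1")
```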

Number of shard

Enter the number of partitions (shards, in Kinesis terms) to be created in the target Kinesis stream.
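Shards determine both stream throughput and how records are distributed: Kinesis takes the MD5 hash of each record's partition key and routes the record to the shard whose hash key range contains that value. The Python sketch below illustrates this routing, assuming the 128-bit hash space is split evenly across the shards, as it is when a stream is first created.

```python
import hashlib

HASH_SPACE = 2 ** 128  # Kinesis hash key space: 0 .. 2**128 - 1

def shard_for_key(partition_key: str, num_shards: int) -> int:
    # Map a partition key to a shard index, assuming equal hash key
    # ranges per shard (true at stream creation, before any resharding).
    h = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return h * num_shards // HASH_SPACE

shard = shard_for_key("order-42", 4)
```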
Advanced settings
Connection pool

In this area, you configure, for each Spark executor, the connection pool used to control the number of connections that stay open simultaneously.

Evict connections

Select this check box to define criteria to destroy connections in the connection pool. The fields for these criteria appear once you select the check box.
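As an illustration of what an eviction criterion does, the Python sketch below drops connections that have been idle longer than a threshold. The class and parameter names are assumptions made for the sketch, not Talend's actual settings.

```python
class PooledConnection:
    # Hypothetical pooled connection tracking when it was last used.
    def __init__(self, last_used: float):
        self.last_used = last_used

def evict_idle(pool: list, max_idle_s: float, now: float) -> list:
    # One eviction pass: keep only connections whose idle time is
    # within the allowed maximum; the rest are destroyed.
    return [c for c in pool if now - c.last_used <= max_idle_s]

pool = [PooledConnection(0.0), PooledConnection(90.0)]
pool = evict_idle(pool, max_idle_s=60.0, now=100.0)
```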
Usage
Usage rule

This component is used as an end component and requires an input link. It needs a Write component such as tWriteJSONField to define a serializedValue column in the input schema to carry the serialized data to be sent. This component, along with the Spark Streaming component Palette it belongs to, appears only when you are creating a Spark Streaming Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs.

Spark Connection

You need to use the Spark Configuration tab in the Run view to define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent JAR files for execution, you must specify the directory in the file system to which these JAR files are transferred so that Spark can access them. This connection is effective on a per-Job basis.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab.
Related scenario
For a related scenario, see Working with Amazon Kinesis and Big Data Streaming Jobs.