tFlumeInput
Acts as an interface that integrates Flume with a Spark Streaming Job developed in the Studio, so that the Job can continuously read data from a given Flume agent.
tFlumeInput streams data from a given Flume agent and sends this data to the components that follow it.
tFlumeInput properties for Apache Spark Streaming
These properties are used to configure tFlumeInput running in the Spark Streaming Job framework.
The Spark Streaming tFlumeInput component belongs to the Messaging family.
The streaming version of this component is available in the Palette of the Studio only if you have subscribed to Talend Real-time Big Data Platform or Talend Data Fabric.
Basic settings
Host and Port |
Enter the hostname and the port of the machine used as the sink (the data output point
|
Type |
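Select the approach used to read data from Flume. Spark Streaming supports two approaches: a push-based approach, in which Flume pushes data to a Spark receiver, and a pull-based approach, in which Spark pulls data from a custom Flume sink.
For further information about these two approaches, see https://spark.apache.org/docs/1.3.1/streaming-flume-integration.html. |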
Schema and Edit |
A schema is a row description. It defines the number of fields (columns) to
Built-In: You create and store the
Repository: You have already created
This read-only line column is used by tFlumeInput to automatically extract the body of an input Flume event |
Advanced settings
Encoding |
Select the encoding from the list or select Custom and define it manually. This encoding is used by tFlumeInput to decode the input |
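The effect of the encoding setting can be illustrated with a minimal sketch. It assumes that the body of a Flume event arrives as raw bytes and that the selected encoding is applied when those bytes are turned into text. The class `FlumeBodyDecoder` and its `decode` method are hypothetical names used for illustration only; they are not part of tFlumeInput or of the Flume API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: not part of Talend or Flume.
class FlumeBodyDecoder {

    // Decodes the raw bytes of one event body using the named encoding.
    // Charset.forName throws an exception if the name is illegal or unsupported.
    static String decode(byte[] body, String encodingName) {
        Charset charset = Charset.forName(encodingName);
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // Sample body: the word "caf\u00e9" (with an accented e) encoded as UTF-8.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoding with the encoding that produced the bytes restores the text.
        System.out.println(decode(body, "UTF-8"));

        // Decoding with a different encoding yields different characters:
        // the two UTF-8 bytes of the accented letter become two separate characters.
        System.out.println(decode(body, "ISO-8859-1"));
    }
}
```

If the decoded text shows unexpected characters, a mismatch of this kind between the encoding of the source data and the encoding selected in the component is a likely cause.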
Usage
Usage rule |
This component is used as a start component and requires an output link.
At runtime, the tFlumeInput component keeps listening to
This component, along with the Spark Streaming component Palette it belongs to, appears
Note that in this documentation, unless otherwise explicitly stated, a scenario presents |
Spark Connection |
Use the Spark Configuration tab in the Run view to define the connection to a given Spark cluster for the whole Job. In addition, because the Job needs its dependent JAR files at execution time, you must specify the directory in the file system to which these JAR files are transferred so that Spark can access them.
This connection is effective on a per-Job basis. |
Limitation |
Due to license incompatibility, one or more JARs required to use this component are not |
Related scenarios
No scenario is available for the Spark Streaming version of this component yet.