tDBFSGet
Copies files from a given DBFS (Databricks Filesystem) system, pastes them in a
user-defined directory and, if need be, renames them.
The DBFS (Databricks Filesystem) components are designed for quick and straightforward data transfer with Databricks. If you need to handle more sophisticated scenarios or need optimal performance, use Spark Jobs with Databricks.
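For orientation, the kind of transfer described above can be sketched against the public DBFS REST API 2.0 `read` call. This is only an illustration under stated assumptions: this page does not say how tDBFSGet talks to Databricks internally, and the function names, the example endpoint and the `TOKEN` value are hypothetical. The sketch builds the request and decodes a response body without sending anything over the network.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request

# The DBFS read call accepts at most 1 MB per request.
MAX_CHUNK = 1024 * 1024


def build_read_request(endpoint, token, dbfs_path, offset=0, length=MAX_CHUNK):
    """Build (but do not send) an authenticated DBFS read request.

    endpoint and token play the role of the Endpoint and Token settings;
    dbfs_path is the absolute path of the file in DBFS.
    """
    query = urlencode({"path": dbfs_path, "offset": offset, "length": length})
    url = f"{endpoint.rstrip('/')}/api/2.0/dbfs/read?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


def decode_chunk(response_body):
    """Return the raw bytes carried by one DBFS read response (JSON text)."""
    payload = json.loads(response_body)
    # The file content is returned base64-encoded in the "data" field.
    return base64.b64decode(payload["data"])
```

To fetch a whole file, such a request would be sent (for example with `urllib.request.urlopen`) repeatedly with an increasing `offset` until the response reports zero bytes read, and the decoded chunks would be appended to the local file.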
tDBFSGet Standard properties
These properties are used to configure tDBFSGet running in the Standard Job framework.
The Standard tDBFSGet component belongs to the Big Data and the File families.
The component in this framework is available in all Talend products with Big Data
and in Talend Data Fabric.
Basic settings
Property type |
Either Built-In or Repository. Built-In: No property data stored centrally.
Repository: Select the repository file where the |
Use an existing connection |
Select this check box and in the Component List click the HDFS connection component from which Note that when a Job contains the parent Job and the child Job, |
Endpoint |
In the Endpoint |
Token |
Click the […] button |
DBFS directory |
In the DBFS directory field, enter the path pointing |
Local directory |
Browse to, or enter the local directory to store the files copied from |
Overwrite file |
Options that determine whether an existing file is overwritten with the new one. |
Include subdirectories |
Select this check box if the selected input source type includes |
Files |
In the Files area, the fields to be completed are:
– File mask: type in the file name to be selected from
– New name: give a new name to the obtained file. |
Die on error |
Select the check box to stop the execution of the Job when an error occurs. Clear the check box to skip any rows on error and complete the |
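The interplay of the Local directory, Overwrite file and Files settings above can be pictured with a short Python sketch. It is an approximation under assumptions, not the component's actual logic: glob-style matching for File mask, "first matching row wins", and the function name `plan_copies` are all hypothetical choices made for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import Path, PurePosixPath


def plan_copies(remote_paths, rules, local_dir, overwrite=False):
    """Return (remote_path, local_target) pairs for the files to copy.

    rules is a list of (file_mask, new_name) pairs, mirroring the rows of
    the Files table; an empty new_name keeps the original file name.
    """
    planned = []
    for remote in remote_paths:
        name = PurePosixPath(remote).name
        for mask, new_name in rules:
            if fnmatchcase(name, mask):
                target = Path(local_dir) / (new_name or name)
                # Keep an existing local file unless overwriting is enabled.
                if overwrite or not target.exists():
                    planned.append((remote, target))
                break  # assumption: the first matching row decides
    return planned
```

For example, with the rows `("report.csv", "latest.csv")` and `("*.csv", "")`, a remote `report.csv` would be stored locally as `latest.csv`, other `.csv` files would keep their names, and files that match no row would be left alone.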
Advanced settings
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at the Job level |
Usage
Usage rule |
This component combines DBFS connection and data extraction, thus used as a It runs standalone and does not generate input or output flow for the other components. It is often connected to the Job using OnSubjobOk or OnComponentOk link, depending on the context. |