tPatternMasking properties for Apache Spark Streaming
These properties are used to configure tPatternMasking running in the Spark Streaming Job framework.
The Spark Streaming tPatternMasking component belongs to the Data Quality family.
This component is available in Talend Real Time Big Data Platform and Talend Data Fabric.
Basic settings
Schema and Edit Schema |
A schema is a row description. It defines the number of fields Click Sync Click Edit
The output schema of this component contains read-only
columns:
|
 |
Built-In: You create and store the schema locally for this component |
 |
Repository: You have already created the schema and stored it in the Repository. |
Modifications |
Define in the table what fields to change and how to change
Column to mask: You can mask data from different columns but you need to
Each column is processed sequentially, meaning that data masking
In a column, each data field is a fixed length field, except the last data
For fixed length fields, each value must contain the same number of
In a column, the last Enumeration or Enumeration
For variable length fields, each value might not always contain the same
Field type: Select the field type the data belongs to.
In the Values, Path,
When the input data is invalid, meaning that a value does not match the pattern defined in |
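The behavior described in this table can be illustrated with a minimal Java sketch. It is not the component's implementation: the class `PatternMaskSketch`, the `EnumField` type and the sample enumerations are hypothetical, every field is assumed to be a fixed-length enumeration, and a plain random substitution stands in for the component's masking methods. The sketch processes the fields of one value in order and returns an empty result when the value does not match the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Random;

final class PatternMaskSketch {

    /** Hypothetical field definition: an enumeration whose values all share one length. */
    static final class EnumField {
        final List<String> values;
        final int length;

        EnumField(List<String> values) {
            if (values.isEmpty()) {
                throw new IllegalArgumentException("enumeration must not be empty");
            }
            this.length = values.get(0).length();
            for (String v : values) {
                if (v.length() != this.length) {
                    throw new IllegalArgumentException("all values must have the same length");
                }
            }
            this.values = new ArrayList<>(values);
        }
    }

    /**
     * Masks one value field by field, in pattern order.
     * Returns Optional.empty() when the input does not match the pattern
     * (too short, too long, or a field value outside its enumeration).
     */
    static Optional<String> mask(String input, List<EnumField> pattern, Random random) {
        StringBuilder masked = new StringBuilder();
        int pos = 0;
        for (EnumField field : pattern) {
            int end = pos + field.length;
            if (end > input.length()) {
                return Optional.empty(); // input is shorter than the pattern
            }
            String part = input.substring(pos, end);
            if (!field.values.contains(part)) {
                return Optional.empty(); // field value is not in the enumeration
            }
            // Assumption: random substitution from the same enumeration.
            masked.append(field.values.get(random.nextInt(field.values.size())));
            pos = end;
        }
        // Trailing characters mean the input is longer than the pattern.
        return pos == input.length() ? Optional.of(masked.toString()) : Optional.empty();
    }

    public static void main(String[] args) {
        List<EnumField> pattern = Arrays.asList(
                new EnumField(Arrays.asList("FR", "DE", "US")),
                new EnumField(Arrays.asList("75", "13", "69")));
        Random random = new Random(42L);
        System.out.println(mask("FR75", pattern, random)); // matches the pattern
        System.out.println(mask("XX75", pattern, random)); // invalid: prints Optional.empty
    }
}
```

In this sketch an invalid value is signalled by an empty `Optional`, so that the caller can decide where to route it.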
Advanced settings
Method |
The component uses Format-Preserving Encryption (FPE)
The FPE methods are bijective methods, except when
The Basic method is the default
Note: Because they provide stronger masking, the FF1 algorithms are recommended over the Basic method.
The FF1 with AES method is based
You can use those methods only if the number of possible
Note: Java 8u161 is the minimum required version to use the FF1 with AES method. To use this FPE method with Java versions earlier than 8u161, download the Java Cryptography Extension (JCE) unlimited strength jurisdiction policy files from the Oracle website.
The FF1 with AES and |
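To check whether a given Java runtime already allows unlimited-strength cryptography, the standard `javax.crypto.Cipher.getMaxAllowedKeyLength` method can be queried. The following is a minimal sketch: the class name `JcePolicyCheck` is hypothetical, and the 256-bit threshold is an assumption used for illustration, since the key size required by the FF1 with AES method is not stated here.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

final class JcePolicyCheck {

    /** Returns the maximum AES key length, in bits, allowed by the installed policy files. */
    static int maxAesKeyBits() {
        try {
            // Returns Integer.MAX_VALUE when the unlimited strength policy is active.
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this runtime", e);
        }
    }

    /** Assumption: 256-bit AES keys are taken as the sign of an unrestricted policy. */
    static boolean allows256BitAes() {
        return maxAesKeyBits() >= 256;
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Maximum AES key length (bits): " + maxAesKeyBits());
        System.out.println("256-bit AES allowed: " + allows256BitAes());
    }
}
```

If the check reports that 256-bit keys are not allowed on a Java version earlier than 8u161, installing the JCE policy files mentioned in the note is the likely remedy.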
Password for FF1 |
Set the password |
Use tweaks with FF1 Encryption |
Select this If bijective |
Seed for random generator |
Set a random number if you want to generate
If you do not set the seed, the component |
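The general effect of seeding a random generator can be illustrated with `java.util.Random`: two generators created with the same seed produce the same sequence. This is a sketch only; the class `SeedSketch` and its method are hypothetical, and whether the component relies on `java.util.Random` internally is an assumption.

```java
import java.util.Random;

final class SeedSketch {

    /** Draws `count` pseudo-random digits from a generator created with the given seed. */
    static String digits(long seed, int count) {
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(random.nextInt(10)); // one digit in the range 0..9
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The same seed yields the same digits on every run.
        System.out.println(digits(12345L, 8).equals(digits(12345L, 8)));
    }
}
```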
Encoding |
Select the encoding from the list or select Custom and define it manually.
If you select Custom and leave the field empty, the supported
When you set Field |
Output the original row? |
Select this check box to output original data rows in addition to the |
Should Null input return NULL? |
This check box is selected by If the input is |
Should EMPTY input return EMPTY? |
When this check box is selected, empty values are left unchanged in |
Send invalid data to “Invalid” output flow |
This check box is selected by default.
Invalid data are any values that do not match the pattern. |
Usage
Usage rule |
This component, along with the Spark Streaming component Palette it belongs to, appears
This component is used as an intermediate step.
You need to use the Spark Configuration tab in the
This connection is effective on a per-Job basis.
For further information about a
Note that in this documentation, unless otherwise explicitly stated, a |
Spark Connection |
In the Spark Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |