Masking Medicare beneficiary identifiers
This scenario applies only to Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.
Using the tPatternMasking component, you can replace personally
identifiable information, such as Medicare Beneficiary Identifiers (MBI), with realistic
values in a consistent manner.
An MBI consists of 11 characters, excluding dashes, and uses the following pattern:
- A digit in the 1 to 9 range
- A letter in the A to Z range (minus S, L, O, I, B, Z)
- A digit or a letter in the A to Z range (minus S, L, O, I, B, Z)
- A digit in the 0 to 9 range
- A letter in the A to Z range (minus S, L, O, I, B, Z)
- A digit or a letter in the A to Z range (minus S, L, O, I, B, Z)
- A digit in the 0 to 9 range
- A letter in the A to Z range (minus S, L, O, I, B, Z)
- A letter in the A to Z range (minus S, L, O, I, B, Z)
- A digit in the 0 to 9 range
- A digit in the 0 to 9 range
For example, 1EG4-TE5-MK73 is a valid MBI.
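The pattern above can be checked with a regular expression. The following is a minimal sketch in Python, not part of the Talend Job; the names `ALPHA`, `MBI_RE` and `is_valid_mbi` are hypothetical and introduced here for illustration only.

```python
import re
import string

# Letters allowed in an MBI: A to Z minus S, L, O, I, B, Z.
ALPHA = "".join(c for c in string.ascii_uppercase if c not in "SLOIBZ")

# One character class per position, in the order listed above.
MBI_RE = re.compile(
    rf"[1-9][{ALPHA}][{ALPHA}0-9][0-9]"
    rf"[{ALPHA}][{ALPHA}0-9][0-9]"
    rf"[{ALPHA}]{{2}}[0-9]{{2}}"
)


def is_valid_mbi(value: str) -> bool:
    """Return True if value matches the 11-character pattern (dashes ignored)."""
    return MBI_RE.fullmatch(value.replace("-", "")) is not None


if __name__ == "__main__":
    print(is_valid_mbi("1EG4-TE5-MK73"))  # the example from the text
    print(is_valid_mbi("1SG4-TE5-MK73"))  # S is an excluded letter
```

A check like this only confirms the format; it does not tell you whether an identifier has actually been issued.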
This scenario describes a Job which uses the following components:
- the tFixedFlowInput component generates MBIs;
- the tPatternMasking component replaces the original MBIs with
  random digits or letters from a set of named values, or a random digit from a
  specified range;
- the tLogRow component outputs the substitute data set.
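To make the masking idea concrete, here is a rough Python sketch of position-by-position substitution. It is not how tPatternMasking is implemented: the `FIELDS` list, the `mask_mbi` function, the `seed` parameter and the hash-based seeding are all assumptions chosen to illustrate replacing each character with a value from its allowed set in a repeatable way. The dash placement in the output is also an assumption of this sketch.

```python
import hashlib
import random
import string

ALPHA = "".join(c for c in string.ascii_uppercase if c not in "SLOIBZ")
ALPHANUM = ALPHA + string.digits

# One candidate set per MBI position, following the 11-position pattern.
FIELDS = [
    "123456789", ALPHA, ALPHANUM, string.digits,
    ALPHA, ALPHANUM, string.digits,
    ALPHA, ALPHA, string.digits, string.digits,
]


def mask_mbi(mbi: str, seed: str = "demo-seed") -> str:
    """Replace each character with a random one from its position's allowed set."""
    raw = mbi.replace("-", "")
    if len(raw) != len(FIELDS):
        raise ValueError("expected 11 characters, excluding dashes")
    # Seed the generator from the input and the seed string so that the same
    # MBI always yields the same substitute (hypothetical consistency scheme).
    digest = hashlib.sha256((seed + raw).encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    masked = "".join(rng.choice(chars) for chars in FIELDS)
    # Re-insert dashes in the 4-3-4 layout used by the example in the text.
    return f"{masked[:4]}-{masked[4:7]}-{masked[7:]}"


if __name__ == "__main__":
    print(mask_mbi("1EG4-TE5-MK73"))
```

This sketch does not guarantee that the substitute differs from the original or that two different inputs never collide; it only shows that the output keeps a valid MBI shape and is repeatable for a given seed.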
- Setting up the Job
- Configuring the input component
- Configuring the masking operations
  The alpha_values.csv file contains the allowed alphabetic values: all letters in the A to Z range (minus S, L, O, I, B, Z). The alphanum_values.csv file contains the allowed alphanumeric values: the values from alpha_values.csv plus the digits 0 to 9.
- Configuring the output component and executing the Job
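If you need to create alpha_values.csv and alphanum_values.csv yourself, a short script can generate them. The sketch below assumes one value per line with no header row, which may differ from the layout your Job expects; `allowed_values` and `write_values` are hypothetical helper names.

```python
import string
from pathlib import Path

EXCLUDED = set("SLOIBZ")  # letters that never appear in an MBI


def allowed_values():
    """Return the alphabetic and alphanumeric value lists."""
    alpha = [c for c in string.ascii_uppercase if c not in EXCLUDED]
    alphanum = alpha + list(string.digits)
    return alpha, alphanum


def write_values(path, values):
    # Assumption: one value per line, no header row.
    Path(path).write_text("\n".join(values) + "\n", encoding="utf-8")


if __name__ == "__main__":
    alpha, alphanum = allowed_values()
    write_values("alpha_values.csv", alpha)
    write_values("alphanum_values.csv", alphanum)
```

Generating the files from the exclusion list rather than typing the letters by hand avoids accidentally including one of the six excluded letters.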