
Component family |
File/Input |
|
Function |
tFileInputMSDelimited reads a |
|
Purpose |
tFileInputMSDelimited opens a |
|
Basic settings |
Multi Schema Editor |
The [Multi Schema Editor] helps
For more information, see The Multi Schema Editor. |
|
Output |
Lists all the schemas you define in the [Multi Schema Editor], along with the related record |
|
Die on error |
Select this check box to stop the execution of the Job when an |
Advanced settings |
Trim all column |
Select this check box to remove leading and trailing whitespaces |
|
Validate date |
Select this check box to check the date format strictly against |
|
Advanced separator (for numbers) |
Select this check box to modify the separators used for
Thousands separator: define
Decimal separator: define |
|
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at a |
Global Variables |
NB_LINE: the number of rows processed. This is an After
ERROR_MESSAGE: the error message generated by the
A Flow variable functions during the execution of a component while an After variable
To fill up a field or expression with a variable, press Ctrl +
For further information about variables, see Talend Studio |
|
Usage |
Use this component to read multi-structured delimited files and |
|
Log4j |
The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User
For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
|
Limitation |
Due to license incompatibility, one or more JARs required to use this component are not |
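The Advanced settings options listed in the table above (Trim all column, Validate date, Advanced separator) can be illustrated in plain Java. The following is a minimal sketch of the ideas only, not the code that the component generates; the class name AdvancedSettingsSketch, its method names, and the sample values are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Locale;

// Hypothetical helper class; it only mirrors the ideas behind the options.
class AdvancedSettingsSketch {

    // "Trim all column": remove leading and trailing whitespace from a field.
    static String trimField(String raw) {
        return raw == null ? null : raw.trim();
    }

    // "Validate date": accept a value only if it is a real calendar date
    // in the given pattern. STRICT resolution requires "uuuu" for the year.
    static boolean isStrictDate(String value, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter
                .ofPattern(pattern, Locale.ROOT)
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse(value, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // "Advanced separator (for numbers)": parse a number that uses custom
    // thousands and decimal separators.
    static double parseNumber(String value, char thousands, char decimal) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(thousands);
        symbols.setDecimalSeparator(decimal);
        DecimalFormat format = new DecimalFormat("#,##0.###", symbols);
        try {
            return format.parse(value).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + value, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + trimField("  Disc One \t") + "]");
        System.out.println(isStrictDate("2024-02-30", "uuuu-MM-dd"));
        System.out.println(parseNumber("1.234,56", '.', ','));
    }
}
```

In this sketch, a strict date check rejects values such as 30 February that a lenient parser might silently adjust, which is the kind of error the Validate date option is meant to catch.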
The [Multi Schema Editor] enables you to:
-
set the path to the source file,
-
define the source file properties,
-
define data structure for each of the output schemas.
Note
When you define the data structure for each of the output schemas in the [Multi Schema Editor], column names in the different
data structures automatically appear in the input schema lists of the components
that come after tFileInputMSDelimited. However,
you can still define data structures directly in the Basic
settings view of each of these components.
The [Multi Schema Editor] also helps to declare
the schema that should act as the source schema (primary key) from the incoming data
to ensure its uniqueness. The editor uses this mapping to associate all schemas
processed in the delimited file to the source schema in the same file.
Note
The editor opens with the first column, which usually holds the record type
indicator, selected by default. However, once the editor is open, you can select
the check box of any of the schema columns to define it as a primary key.
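One possible reading of this mapping is sketched below in Java: records of the parent (source) schema must carry a unique key, and each record of another schema is attached to the most recent parent record. This is an assumption made for illustration, not a description of the component's internals; the class ParentChildSketch, the method groupUnderParent, and the sample records are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of a parent schema with a unique key column.
class ParentChildSketch {

    // One parent record together with the child records attached to it.
    static final class Group {
        final String[] parent;
        final List<String[]> children = new ArrayList<>();

        Group(String[] parent) {
            this.parent = parent;
        }
    }

    // Assumes column 0 holds the record type and that child records follow
    // their parent in file order (an assumption of this sketch).
    static List<Group> groupUnderParent(List<String[]> records,
                                        String parentType, int keyColumn) {
        List<Group> groups = new ArrayList<>();
        Set<String> seenKeys = new HashSet<>();
        Group current = null;
        for (String[] record : records) {
            if (record[0].equals(parentType)) {
                // Enforce uniqueness of the primary key column.
                if (!seenKeys.add(record[keyColumn])) {
                    throw new IllegalStateException(
                            "Duplicate key: " + record[keyColumn]);
                }
                current = new Group(record);
                groups.add(current);
            } else if (current != null) {
                current.children.add(record);
            } else {
                throw new IllegalStateException(
                        "Child record found before any parent record");
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up records; column 0 is the record type indicator.
        List<String[]> records = Arrays.asList(
                new String[] {"A", "Disc One", "Author X", "2009"},
                new String[] {"B", "Song One"},
                new String[] {"C", "Library One"},
                new String[] {"A", "Disc Two", "Author Y", "2010"},
                new String[] {"B", "Song Two"});
        for (Group group : groupUnderParent(records, "A", 1)) {
            System.out.println(group.parent[1] + " -> "
                    + group.children.size() + " child record(s)");
        }
    }
}
```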
The figure below illustrates an example of the [Multi Schema
Editor].

For detailed information about the usage of the Multi Schema
Editor, see Scenario: Reading a multi structure delimited file.
The following scenario creates a Java Job which aims at reading three schemas in a
delimited file and displaying their data structure on the Run Job
console.
The delimited file processed in this example looks like the following:

-
Drop a tFileInputMSDelimited component
and three tLogRow components from the
Palette onto the design
workspace.
-
In the design workspace, right-click tFileInputMSDelimited and connect it to tLogRow1, tLogRow2, and tLogRow3
using the row_A_1, row_B_1, and row_C_1 links
respectively.
-
Double-click tFileInputMSDelimited to
open the Multi Schema Editor.
-
Click Browse… next to the File name field to locate the multi-schema
delimited file you need to process.
-
In the File Settings area:
- Select from the list the encoding of the source file.
This setting is meant to ensure encoding consistency throughout all input
and output files.
- Select the field and row separators used in the source file.
Note
Select the Use Multiple Separator
check box and define the fields that follow accordingly if different
field separators are used to separate schemas in the source file.
A preview of the source file data displays automatically in the Preview panel.
Note
Column 0, which usually holds the
record type indicator, is selected by default. However, you can select
the check box of any of the other columns to define it as a primary
key.
-
Click Fetch Codes to the right of the
Preview panel to list the types of
schema and records you have in the source file. In this scenario, the source
file has three schema types (A, B, C).
Click each schema type in the Fetch Codes
panel to display its data structure below the Preview panel.
-
Click in the name cells and set column names for each of the selected
schemas.
In this scenario, column names read as follows:
- Schema A: Type, DiscName, Author, Date,
- Schema B: Type, SongName,
- Schema C: Type, LibraryName.
You now need to set the primary key from the incoming data to ensure its
uniqueness (DiscName in this scenario). To do that:
-
In the Fetch Codes panel, select the
schema holding the column you want to set as the primary key (schema
A in this scenario) to display its data
structure.
-
Click in the Key cell that corresponds to the
DiscName column and select the check box that
appears.
-
Click anywhere in the editor and the false in the
Key cell will become
true.
You now need to declare the parent schema by which you want to group the
other “children” schemas (DiscName in this scenario).
To do that:
-
In the Fetch Codes panel, select schema
B and click the right arrow button to move it to
the right. Then, do the same with schema C.
Note
The Cardinality field is not
compulsory. It helps you to define the number (or range) of fields in
“children” schemas attached to the parent schema. However, if you set
the wrong number or range and try to execute the Job, an error message
will display.
-
In the [Multi Schema Editor], click
OK to validate all the changes you made
and close the editor.
The three defined schemas along with the corresponding record types and
field separators display automatically in the Basic
settings view of tFileInputMSDelimited.
The three schemas you defined in the [Multi Schema
Editor] are automatically passed to the three tLogRow components.
-
If needed, click the Edit schema button
in the Basic settings view of each of the
tLogRow components to view the input
and output data structures you defined in the Multi
Schema Editor or to modify them.
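The Job built in this scenario can be pictured as a reader that splits each line on the field separator and routes the row to a different output depending on the record type in column 0. The Java sketch below shows that idea only; it is not the code Talend Studio generates. The class MultiSchemaRoutingSketch, the method routeByRecordType, the semicolon separator, and the sample lines are all hypothetical, since the actual source file content is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: route rows of a multi-structure delimited file
// to one flow per record type (A, B, C in the scenario).
class MultiSchemaRoutingSketch {

    // Groups rows by the value of column 0, keeping first-seen order.
    static Map<String, List<String[]>> routeByRecordType(
            List<String> lines, String fieldSeparator) {
        Map<String, List<String[]>> flows = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            // Quote the separator so it is not interpreted as a regex.
            String[] fields = line.split(Pattern.quote(fieldSeparator), -1);
            flows.computeIfAbsent(fields[0], k -> new ArrayList<>())
                 .add(fields);
        }
        return flows;
    }

    public static void main(String[] args) {
        // Made-up sample data following the column names of the scenario:
        // A: Type, DiscName, Author, Date; B: Type, SongName;
        // C: Type, LibraryName.
        List<String> lines = Arrays.asList(
                "A;Disc One;Author X;2009-01-01",
                "B;Song One",
                "B;Song Two",
                "C;Library One");
        Map<String, List<String[]>> flows = routeByRecordType(lines, ";");
        // Print each flow separately, loosely like three tLogRow outputs.
        for (Map.Entry<String, List<String[]>> flow : flows.entrySet()) {
            System.out.println("Schema " + flow.getKey());
            for (String[] row : flow.getValue()) {
                System.out.println("  " + String.join("|", row));
            }
        }
    }
}
```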