August 17, 2023

tAdvancedFileOutputXML – Docs for ESB 5.x

tAdvancedFileOutputXML


tAdvancedFileOutputXML properties

Component family

XML or File/Output

 

Function

tAdvancedFileOutputXML outputs
data to an XML type of file and offers an interface to deal with
loop and group by elements if needed.

Purpose

tAdvancedFileOutputXML writes an
XML file with separated data values according to an XML tree
structure.

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

 

 

Built-in: No property data stored
centrally.

 

 

Repository: Select the Repository
file where Properties are stored. The following fields are
pre-filled in using fetched data.

Use Output Stream

Select this check box to process the data flow of interest. Once you
have selected it, the Output Stream
field displays and you can type in the data flow of interest.

The data flow to be processed must be added to the flow in order
for this component to fetch these data via the corresponding
representative variable.

This variable could be already pre-defined in your Studio or
provided by the context or the components you are using along with
this component; otherwise, you could define it manually and use it
according to the design of your Job, for example, using tJava or tJavaFlex.

In order to avoid the inconvenience of hand writing, you could
select the variable of interest from the auto-completion list
(Ctrl+Space) to fill the
current field on condition that this variable has been properly
defined.

For further information about how to use a stream, see Scenario 2: Reading data from a remote file in streaming mode.
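As a minimal sketch of the idea, the snippet below stores an output stream under a key and later retrieves it with a cast, which is the kind of expression you would type into the Output Stream field. The `globalMap` here is a plain `HashMap` standing in for the Job-level variable map, and the key `out_stream` is a hypothetical name chosen for illustration; in a real Job the stream would typically be created in a tJava component placed before this one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class OutputStreamVariableSketch {
    // Stand-in for the Job-level variable map (assumption: a plain HashMap).
    static final Map<String, Object> globalMap = new HashMap<>();

    static String demo() {
        // Step 1 (e.g. in tJava): create the stream and register it under a key.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        globalMap.put("out_stream", buffer); // hypothetical key name

        // Step 2 (the Output Stream field): fetch the stream back with a cast.
        OutputStream out = (OutputStream) globalMap.get("out_stream");
        try {
            out.write("<root/>".getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

An in-memory buffer is used here only so that the sketch runs without touching the file system; a `FileOutputStream` would be registered in the same way.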

 

File name

Name or path to the output file and/or the variable to be used.

This field becomes unavailable once you have selected the
Use Output Stream check
box.

For further information about how to define and use a variable in
a Job, see Talend Studio
User Guide.

 

Configure XML tree

Opens the dedicated interface to help you set the XML mapping. For
details about the interface, see Defining the XML tree.

 

Schema and Edit
Schema

A schema is a row description: it defines the number of fields
that will be processed and passed on to the next component. The
schema is either built-in or remote in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the
current schema is of the Repository type, three options are
available:

  • View schema: choose this option to view the
    schema only.

  • Change to built-in property: choose this option
    to change the schema to Built-in for local
    changes.

  • Update repository connection: choose this option to change
    the schema stored in the repository and decide whether to propagate the changes to
    all the Jobs upon completion. If you just want to propagate the changes to the
    current Job, you can select No upon completion and
    choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema will be
created and stored locally for this component only. Related topic:
see Talend Studio User Guide.

 

 

Repository: The schema already
exists and is stored in the Repository, hence can be reused in
various projects and job designs. Related topic: see
Talend Studio User Guide.

 

Sync columns

Click to synchronize the output file schema with the input file
schema. The Sync function only displays once the Row connection is
linked with the Output component.

 

Append the source xml file

Select this check box to add the new lines at the end of your
source XML file.

 

Generate compact file

Select this check box to generate a file that does not contain any
empty space or line separators. All elements are then written on a
single line, which considerably reduces the file size.
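To illustrate the difference between a compact and an indented file, the following sketch serializes the same small document both ways using the JDK's own XML APIs. It is not the component's implementation; the element names `movies` and `movie` are sample values chosen for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class CompactXmlSketch {
    // Builds a tiny sample document: <movies><movie id="1">Alien</movie></movies>
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element movies = doc.createElement("movies");
        Element movie = doc.createElement("movie");
        movie.setAttribute("id", "1");
        movie.setTextContent("Alien");
        movies.appendChild(movie);
        doc.appendChild(movies);
        return doc;
    }

    // indent = false gives the "compact" form: no line separators at all.
    static String serialize(boolean indent) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(sampleDocument()), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `serialize(false)` yields the whole tree on one line, while `serialize(true)` adds line breaks and is therefore longer for the same data.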

 

Include DTD or XSL

Select this check box to add the DOCTYPE declaration,
indicating the root element, the access path and the DTD file, or to
add the processing instruction, indicating the type of stylesheet
used (such as XSL types), along with the access path and file
name.
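For readers unfamiliar with these two declarations, the sketch below produces both with the JDK's XML APIs so that their shape is visible in the output. The file names `movies.dtd` and `movies.xsl` and the root element `movies` are hypothetical, and the code only illustrates what the declarations look like, not how the component writes them.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class DoctypeSketch {
    static String serialize() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            // Processing instruction naming the stylesheet type and its path.
            doc.appendChild(doc.createProcessingInstruction(
                    "xml-stylesheet", "type=\"text/xsl\" href=\"movies.xsl\""));
            doc.appendChild(doc.createElement("movies"));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // DOCTYPE declaration naming the root element and the DTD file.
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "movies.dtd");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned text contains a `<!DOCTYPE movies SYSTEM "movies.dtd">` declaration and an `<?xml-stylesheet ...?>` instruction ahead of the root element.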

Advanced settings

Split output in several files

If the XML file output is big, you can split the file every
certain number of rows.
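The splitting logic can be pictured as cutting the row list into fixed-size chunks, one chunk per output file. This is a sketch under that assumption only; in particular, the file naming in `fileNameFor` is hypothetical and may not match the names the component actually generates.

```java
import java.util.ArrayList;
import java.util.List;

class SplitRowsSketch {
    // Cuts the rows into consecutive chunks of at most rowsPerFile rows.
    static <T> List<List<T>> split(List<T> rows, int rowsPerFile) {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += rowsPerFile) {
            int end = Math.min(start + rowsPerFile, rows.size());
            chunks.add(new ArrayList<>(rows.subList(start, end)));
        }
        return chunks;
    }

    // Hypothetical naming scheme: base name followed by the chunk index.
    static String fileNameFor(String baseName, int index) {
        return baseName + index + ".xml";
    }
}
```

With five rows and a limit of two rows per file, this produces three chunks, the last one holding a single row.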

 

Trim data

This check box is activated when you are using the dom4j
generation mode. Select this check box to trim the leading and
trailing whitespace from the value of an XML element.

 

Create directory only if not exists

This check box is selected by default. It creates a directory to
hold the output XML files if required.

 

Create empty element if needed

This check box is selected by default. If no column is associated with an
XML node, this option creates an open/close tag in place of the
expected tag.

 

Create attribute even if its value is NULL

Select this check box to generate the XML tag attribute for the
associated input column even when its value is null.

 

Create attribute even if it is unmapped

Select this check box to generate the XML tag attribute even when
no input column is mapped to it.

 

Create associated XSD file

If one of the XML elements is defined as a Namespace element, this
option will create the corresponding XSD file.

Note

To use this option, you must select Dom4J as the generation mode.

 

Add Document type as node

Select this check box to add column(s) of the Document type as node(s) instead of
escaped string(s) in the output XML file.

This check box appears only when the generation mode is set to
Slow and memory-consuming (Dom4j) in the Advanced settings tab.
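The difference between a node and an escaped string can be sketched as follows. The helper names and the sample fragment are hypothetical, and the escaping shown covers only the three characters needed for the illustration.

```java
class DocumentColumnSketch {
    // Minimal escaping of markup characters (illustration only).
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Document column written as an escaped string: the markup becomes plain text.
    static String asEscapedString(String element, String fragment) {
        return "<" + element + ">" + escape(fragment) + "</" + element + ">";
    }

    // Document column written as a node: the fragment stays real XML structure.
    static String asNode(String element, String fragment) {
        return "<" + element + ">" + fragment + "</" + element + ">";
    }
}
```

For the fragment `<cast>Weaver</cast>` placed under a `detail` element, the first method yields text containing `&lt;cast&gt;`, whereas the second keeps `<cast>` as a child element.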

 

Advanced separator (for number)

Select this check box to change the expected data
separator.

Thousands separator: define the
thousands separator, between inverted commas.

Decimal separator: define the
decimal separator, between inverted commas.
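The effect of custom separators on a number can be sketched with the standard `java.text` classes. This only illustrates the formatting concept; the pattern `#,##0.00` and the sample value are assumptions made for the example.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class NumberSeparatorSketch {
    // Formats a value with the given thousands and decimal separators.
    static String format(double value, char thousands, char decimal) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(thousands);
        symbols.setDecimalSeparator(decimal);
        // Two decimal places, digits grouped by three.
        return new DecimalFormat("#,##0.00", symbols).format(value);
    }
}
```

For example, `format(1234567.891, '.', ',')` gives `1.234.567,89`, while swapping the two separators gives `1,234,567.89`.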

 

Generation mode

Select the appropriate generation mode according to your memory
availability. The available modes are:

  • Slow and memory-consuming
    (Dom4j)

    Note

    This option allows you to use dom4j to process the XML
    files of high complexity.

  • Fast with low memory
    consumption

Once you select Append the source xml file in the Basic settings
view, this field disappears because in this situation, your
generation mode is automatically set to dom4j.
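As a rough illustration of why a streaming approach uses little memory, the sketch below writes elements one at a time with the JDK's StAX `XMLStreamWriter`, never holding the whole tree in memory. This is an assumption-based analogy for the fast mode, not a description of what the component does internally; a tree-based library such as dom4j builds the complete document before writing it.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StreamingWriterSketch {
    // Writes each title as soon as it is read; nothing is kept after it is written.
    static String write(List<String> titles) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("movies");
            for (String title : titles) {
                writer.writeStartElement("movie");
                writer.writeCharacters(title);
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

The trade-off is that a streaming writer cannot go back and modify elements already written, which is one reason tree-based processing remains useful for complex documents.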

 

Encoding

Select the encoding from the list or select Custom and define it
manually. This field is compulsory for DB data handling.

 

Don’t generate empty file

Select the check box to avoid the generation of an empty
file.

 

tStatCatcher Statistics

Select the check box to collect the log data at a Job level as
well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

NB_LINE: the number of rows processed. This is an After
variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space
to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio
User Guide.
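As a sketch of how an After variable such as NB_LINE is typically read once the component has finished, the snippet below looks it up in a map by a key built from the component name. The `globalMap` here is a plain `HashMap` standing in for the Job-level map, and the component name `tAdvancedFileOutputXML_1` is an assumed example label.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariableSketch {
    // Stand-in for the Job-level variable map (assumption: a plain HashMap).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Reads <componentName>_NB_LINE; returns 0 if the component has not run yet.
    static int rowsWritten(String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value == null ? 0 : (Integer) value;
    }
}
```

Because NB_LINE is an After variable, a lookup like this only makes sense in a component that runs after the output component has completed, for example in a subjob linked with an OnSubjobOk trigger.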

Usage

Use this component to write an XML file with data passed on from
other components using a Row link.

Log4j

The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Defining the XML tree

Double-click on the tAdvancedFileOutputXML
component to open the dedicated interface or click on the three-dot button on the
Basic settings vertical tab of the Component Settings tab.

XMLtree_adv_mapping1.png

To the left of the mapping interface, under Schema List, all of the
columns retrieved from the incoming data flow are listed (on the
condition that an input flow is connected to the tAdvancedFileOutputXML component).

To the right of the interface, define the XML structure you want to obtain as
output.

You can easily import the XML structure or create it manually, then map the input
schema columns onto each corresponding element of the XML tree.

Importing the XML tree

The easiest and most common way to fill out the XML tree panel is to import a
well-formed XML file.

  1. Rename the root tag that displays by
    default on the XML tree panel, by
    clicking on it once.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Import XML tree.

  4. Browse to the file to import and click OK.

    Note

    • You can import an XML tree from files in XML, XSD and DTD
      formats.

    • When importing an XML tree structure from an XSD file, you
      can choose an element as the root of your XML tree.

    XMLtree_adv_mapping2.png

The XML Tree column is hence automatically
filled out with the correct elements. You can remove and insert elements or
sub-elements from and to the tree:

  1. Select the relevant element of the tree.

  2. Right-click to display the contextual menu.

  3. Select Delete to remove the
    selection from the tree or select the relevant option among: Add sub-element,
    Add attribute, Add namespace to enrich the tree.

Creating the XML tree manually

If you don’t have any XML structure defined as yet, you can create it
manually.

  1. Rename the root tag that displays by
    default on the XML tree panel, by
    clicking on it once.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Add sub-element to create the first element of the
    structure.

You can also add an attribute or a child element to any element of the tree or
remove any element from the tree.

  1. Select the relevant element on the tree you just created.

  2. Right-click to the left of the element name to display the contextual
    menu.

  3. On the menu, select the relevant option among: Add sub-element,
    Add attribute, Add namespace or Delete.

Mapping XML data

Once your XML tree is ready, you can map each input column with the relevant XML
tree element or sub-element to fill out the Related Column:

  1. Click on one of the Schema column names.

  2. Drag it onto the relevant sub-element to the right.

  3. Release to implement the actual mapping.

XMLtree_adv_mapping3.png

A light blue link displays that illustrates this mapping. If available, use the
Auto-Map button, located to the bottom left of
the interface, to carry out this operation automatically.

You can disconnect any mapping on any element of the XML tree:

  1. Select the element of the XML tree that should be disconnected from its
    respective schema column.

  2. Right-click to the left of the element name to display the contextual
    menu.

  3. Select Disconnect linker.

The light blue link disappears.

Defining the node status

Defining the XML tree and mapping the data is not sufficient. You also need to
define the loop element and if required the group element.

Loop element

The loop element allows you to define the iterating object. Generally the Loop
element is also the row generator.

To define an element as loop element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual
    menu.

  3. Select Set as Loop Element.

XMLtree_adv_mapping4.png

The Node Status column shows the newly added
status.

Note

There can only be one loop element at a time.

Group element

The group element is optional. It represents a constant element on which the
group-by operation can be performed. A group element can be defined only
after a loop element has been defined.

When using a group element, the rows should be sorted, in order to be able to
group by the selected node.

To define an element as group element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual
    menu.

  3. Select Set as Group Element.

XMLtree_adv_mapping5.png

The Node Status column shows the newly added
status, and any required group statuses are defined automatically.

Click OK once the mapping is complete to
validate the definition and continue the job configuration where needed.
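The interplay of loop and group elements, and the reason the rows should be sorted, can be sketched as follows. The code assumes that a new group element is opened whenever the group value differs from that of the previous row; this is an illustrative model rather than the component's documented algorithm. The element names `movies`, `movie` and `cast` are sample values, and no XML escaping is applied to keep the sketch short.

```java
import java.util.List;

class LoopGroupSketch {
    // Each row is {movieName, castMember}. "movie" plays the group element,
    // "cast" plays the loop element (one per input row).
    static String toXml(List<String[]> rows) {
        StringBuilder xml = new StringBuilder("<movies>");
        String currentMovie = null;
        for (String[] row : rows) {
            // Open a new group only when the group value changes.
            if (!row[0].equals(currentMovie)) {
                if (currentMovie != null) {
                    xml.append("</movie>");
                }
                xml.append("<movie name=\"").append(row[0]).append("\">");
                currentMovie = row[0];
            }
            // The loop element is emitted once per row.
            xml.append("<cast>").append(row[1]).append("</cast>");
        }
        if (currentMovie != null) {
            xml.append("</movie>");
        }
        return xml.append("</movies>").toString();
    }
}
```

Under this model, sorted input gives one `movie` element per movie with all its `cast` children inside, while unsorted input opens the same movie more than once, which is why sorting matters when a group element is used.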

Scenario: Creating an XML file using a loop

The following scenario describes the creation of an XML file from a sorted flat file
gathering a video collection.

Use_Case_tAdvancedFileOutputXML1.png

Configuring the source file

  1. Drop a tFileInputDelimited and a tAdvancedFileOutputXML from the Palette onto the design workspace.

  2. Alternatively, if you configured a description for the input delimited file in
    the Metadata area of the Repository, then you can directly drag & drop the metadata
    entry onto the editor, to set up automatically the input flow.

  3. Right-click on the input component and drag a row main link towards the
    tAdvancedFileOutputXML component to
    implement a connection.

  4. Select the tFileInputDelimited component and
    display the Component settings tab located in
    the tab system at the bottom of the Studio.

    Use_Case_tAdvancedFileOutputXML2.png

  5. Select the Property type, according to
    whether you stored the file description in the Repository or not. If you dragged
    & dropped the component directly from the Metadata, no changes to the
    setting should be needed.

    If you didn’t set up the file description in the Repository, then select Built-in and manually fill out the fields displayed on the
    Basic settings vertical tab.

    The input file contains the following type of columns separated by
    semi-colons: id, name,
    category, year,
    language, director and
    cast.

    Use_Case_tAdvancedFileOutputXML3.png

    In this simple use case, the Cast field
    gathers different values and the id increments when changing movie.

  6. If needed, define the tFileInputDelimited
    schema according to the file structure.

    Use_Case_tAdvancedFileOutputXML4.png

  7. Once you have checked that the schema of the input file meets your expectations,
    click on OK to validate.
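To make the input layout concrete, the sketch below splits one semicolon-separated line into the seven columns listed in the steps above. The sample line used in the usage note is invented for illustration and is not taken from the scenario's actual source file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DelimitedRowSketch {
    // Column names as listed in the scenario, in file order.
    static final String[] COLUMNS =
            {"id", "name", "category", "year", "language", "director", "cast"};

    // Splits a line on semicolons and pairs each value with its column name.
    static Map<String, String> parse(String line) {
        String[] values = line.split(";", -1); // -1 keeps trailing empty fields
        if (values.length != COLUMNS.length) {
            throw new IllegalArgumentException(
                    "Expected " + COLUMNS.length + " fields but found " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            row.put(COLUMNS[i], values[i].trim());
        }
        return row;
    }
}
```

For instance, parsing the invented line `1;Alien;Sci-Fi;1979;English;Ridley Scott;Sigourney Weaver` gives a row whose `cast` value is `Sigourney Weaver`; a second line with the same id and a different cast member would represent another actor in the same movie.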

Configuring the XML output and mapping

  1. Select the tAdvancedFileOutputXML
    component and click on the Component settings
    tab to configure the basic settings as well as the mapping. Note that
    double-clicking the component opens the mapping interface directly.

    Use_Case_tAdvancedFileOutputXML5.png

  2. In the File Name field, browse to the file to
    be written if it exists, or type in the path and name of the file to be
    created for the output.

    By default, the schema (file description) is automatically propagated from the
    input flow, but you can edit it if needed.

  3. Then click on the three-dot button or double-click on the tAdvancedFileOutputXML component on the design
    workspace to open the dedicated mapping editor.

    The columns from the input file description are listed to the left of
    the interface.

  4. To the right of the interface, set the XML tree panel to reflect the expected
    XML structure output.

    You can create the structure node by node. For more information about the
    manual creation of an XML tree, see Defining the XML tree.

    In this example, an XML template is used to populate the XML tree
    automatically.

  5. Right-click on the root tag displayed by
    default and select Import XML tree at the end
    of the contextual menu options.

  6. Browse to the XML file to be imported and click OK
    to validate the import operation.

    Note

    You can import an XML tree from files in XML, XSD and DTD formats.

  7. Then drag & drop each column name from the Schema List
    to the matching (or relevant) XML tree
    elements as described in Mapping XML data.

    The mapping is shown as blue links between the left and right panels.

    Use_Case_tAdvancedFileOutputXML6.png

    Finally, define the node status where the loop should take place. In this use
    case, Cast is the changing element on which the
    iteration should operate, so this element will be the loop element.

    Right-click on the Cast element on the XML tree, and select Set as loop element.

  8. To group by movie, this use case also needs a group element to be
    defined.

    Right-click on the Movie parent node of the XML tree, and select Set as group element.

    The newly defined node statuses are shown on the corresponding element lines.

  9. Click OK to validate the
    configuration.

  10. Press F6 to execute the Job.

    Use_Case_tAdvancedFileOutputXML7.png

    The output XML file shows the structure as defined.


Document get from Talend https://help.talend.com