August 17, 2023

tAlfrescoOutput – Docs for ESB 5.x

tAlfrescoOutput

tAlfrescoOutput Properties

Component family

Business

 

Function

Creates dematerialized documents in an Alfresco server where they
are indexed under meaningful models.

Purpose

Allows you to create and manage documents in an Alfresco
server.

Basic settings

URL

Type in the URL to connect to the Alfresco Web application.

 

Login and Password

Type in the user authentication data to the Alfresco
server.

To enter the password, click the […] button next to the
password field, and then in the pop-up dialog box enter the password between double quotes
and click OK to save the settings.

Target Location

Base

Type in the base path where to put the document, or

Select the Map… check box and
then in the Column list, select the
target location column.

Note: When you type in the base
name, make sure to escape each backslash by doubling it
(\\).
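
As a minimal sketch of the escaping rule above, assuming the Base field is evaluated as a Java string expression (as is usual for Talend component fields), the following shows how a doubled backslash in a literal becomes a single backslash at run time. The path itself is hypothetical and only illustrates the escaping.

```java
// Sketch: a doubled backslash in a Java string literal is one backslash at run time.
class BasePathEscaping {
    // Hypothetical base path, written the way it would be typed between double quotes.
    static String basePath() {
        return "company_home\\talend\\documents";
    }

    public static void main(String[] args) {
        // Prints: company_home\talend\documents
        System.out.println(basePath());
    }
}
```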

Create Or Update Mode

Document Mode

Select in the list the mode you want to use for the created
document.

Create only: creates a document if
it does not exist.

Note that an error message is displayed if you try to create a
document that already exists.

Create or update: creates a
document if it does not exist or updates the document if it
exists.

 

Container Mode

Select in the list the mode you want to use for the destination
folder in Alfresco.

Update only: updates a destination
folder if the folder exists.

Note that an error message is displayed if you try to update a
destination folder that does not exist.

Create or update: creates a
destination folder if it does not exist or updates the destination
folder if it exists.

 

Define Document Type

Click the three-dot button to display the tAlfrescoOutput editor. This editor enables you
to:

- select the file where you defined the metadata according to
which you want to save the document in Alfresco

- define the type of the document

- select any of the aspects in the Available Aspects
list of the model file and click the plus
button to add it to the list on the left.

 

Property Mapping

Displays the parameters you set in the tAlfrescoOutput editor and according to which the
document will be created in the Alfresco server.

Note that in the Property Mapping
area, you can modify any of the input schemas.

 

Schema and Edit schema

A schema is a row description. It defines the number of fields to be processed and passed on
to the next component. The schema is either Built-In or
stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the
current schema is of the Repository type, three options are
available:

  • View schema: choose this option to view the
    schema only.

  • Change to built-in property: choose this option
    to change the schema to Built-in for local
    changes.

  • Update repository connection: choose this option to change
    the schema stored in the repository and decide whether to propagate the changes to
    all the Jobs upon completion. If you just want to propagate the changes to the
    current Job, you can select No upon completion and
    choose this schema metadata again in the [Repository Content]
    window.

 

Result Log File Name

Browse to the file where you want to save any logs related to the
Job execution.

 

Die on error

This check box is cleared by default, meaning to skip the row on
error and to complete the process for error-free rows. If needed,
you can retrieve the rows on error via a Row > Rejects
link.


Advanced settings

Configure Target Location Container

Allows you to configure the default type of containers
(folders).

Select this check box to display new fields where you can modify
the container type to use your own created types based on the
father/child model.

Permissions

Configure Permissions

When selected, allows you to manually configure access rights to
containers and documents.

Select the Inherit Permissions
check box to synchronize access rights between containers and
documents.

Click the Plus button to add new
lines to the Permissions list, then
you can assign roles to user or group columns.

 

Encoding

Select the encoding type from the list or select Custom and define it manually. This field
is compulsory.

 

Association Target Mapping

Allows you to create new documents in Alfresco with associated links
to other documents that already exist in Alfresco, for example to
make navigation easier.

To create associations:

  1. Open the tAlfresco
    editor.

  2. Click the Add button and
    select a model where you have already defined aspects that
    contain associations.

  3. Click the drop-down arrow at the top of the editor and
    select the corresponding document type.

  4. Click OK to close the
    editor and display the created association in the Association Target Mapping
    list.

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a
Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows read by an input component or
transferred to an output component. This is an After variable and it returns an
integer.

NB_LINE_REJECTED: the number of rows rejected. This is an
After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space
to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio
User Guide.
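
As an illustration of how these After variables are typically read in Java code placed after the component (for example in a tJava component), here is a minimal sketch. It assumes the component is labelled tAlfrescoOutput_1 and uses a plain HashMap as a stand-in for the globalMap that Studio-generated Job code normally provides; the values stored in it are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariablesSketch {
    // Reads the three After variables using the <component label>_<VARIABLE> key pattern.
    // "tAlfrescoOutput_1" is an assumed component label.
    static String summary(Map<String, Object> globalMap) {
        Integer written = (Integer) globalMap.get("tAlfrescoOutput_1_NB_LINE");
        Integer rejected = (Integer) globalMap.get("tAlfrescoOutput_1_NB_LINE_REJECTED");
        String error = (String) globalMap.get("tAlfrescoOutput_1_ERROR_MESSAGE");
        return "rows=" + written + ", rejected=" + rejected + ", error=" + error;
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap, filled with placeholder values.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tAlfrescoOutput_1_NB_LINE", 2);
        globalMap.put("tAlfrescoOutput_1_NB_LINE_REJECTED", 0);
        // Prints: rows=2, rejected=0, error=null
        System.out.println(summary(globalMap));
    }
}
```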

Usage

Usually used as an output component. An input component is
required.

Limitation/Prerequisites

To be able to use the tAlfrescoOutput component, a few relevant resources
need to be installed: see the Installation procedure
subsection below for more information.

Due to license incompatibility, one or more JARs required to use this component are not
provided. You can install the missing JARs for this particular component by clicking the
Install button on the Component tab view. You can also find out and add all missing JARs easily on
the Modules tab in the Integration perspective
of your studio. For details, see https://help.talend.com/display/KB/How+to+install+external+modules+in+the+Talend+products
or the section describing how to configure the Studio in the Talend Installation and Upgrade
Guide.

Installation procedure

To be able to use tAlfrescoOutput in the
Integration perspective of Talend Studio, you first need to install the Alfresco server with
a few relevant resources.

The subsections below detail the prerequisites and the installation
procedure.

Prerequisites

Start with the following operations:

  1. Download the file
    alfresco-community-tomcat-2.1.0.zip

  2. Unzip the file in an installation folder, for example:

  3. Install JDK 1.6.0+

  4. Update the environment variable

  5. From the installation folder (C:\alfresco), launch the
    Alfresco server using the script alf_start.bat

Warning

Make sure that the Alfresco server is launched
correctly before you start using the tAlfrescoOutput component.

Installing the Talend
Alfresco module

Note that the talendalfresco_20081014.zip is provided with the
tAlfrescoOutput component in the Integration perspective of Talend Studio.

To install the talendalfresco module:

  1. From talendalfresco_20081014.zip and in the
    talendalfresco_20081014/alfresco folder, look for the
    following jars: stax-api-1.0.1.jar, wstx-lgpl-3.2.7.jar,
    talendalfresco-client_1.0.jar, and
    talendalfresco-alfresco_1.0.jar and move them to
    C:\alfresco\tomcat\webapps\alfresco\WEB-INF\lib

  2. Add the authentication filter of the commands to the
    web.xml file located in the path

    following the model of the example provided in
    talendalfresco_20081014/alfresco folder of the zipped
    file talendalfresco_20081014.zip

    The following figures show the lines (in blue) to add to
    the Alfresco web.xml file.

    Use_Case_tAlfrescoOutput_Install.png
    Use_Case_tAlfrescoOutput_Install1.png

Useful information for advanced use

Installing new types for Alfresco:

From package_jeu_test.zip and in the
package_jeu_test/fichiers_conf_alfresco2.1 folder, look for the
following files: H76ModelCustom.xml (description of the model),
web-client-config-custom.xml (web interface of the model), and
custom-model-context.xml (registration of the new model) and
paste them in the following folder:
C:/alfresco/tomcat/shared/classes/alfresco/extension

Dates:

  • The dates must be of the Talend date type
    java.util.Date.

  • Columns without either mapping or default values, for example of the
    type Date, are written as empty strings.

  • Solution: delete all columns without mapping or default values. Note
    that any modification of the type Alfresco will put them back.
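
If a date arrives as text, it has to be converted to java.util.Date before it is mapped. The following is a minimal sketch using the standard SimpleDateFormat class; the "yyyy-MM-dd" pattern and the sample value are assumptions made for illustration, not values taken from the component.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class DateColumnSketch {
    // Converts text to java.util.Date; the "yyyy-MM-dd" pattern is an assumption.
    static Date toDate(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2008-13-40
        return format.parse(text);
    }

    public static void main(String[] args) throws ParseException {
        Date created = toDate("2008-10-14");
        System.out.println(created);
    }
}
```

In a Job, the same conversion would usually be written as an expression in tMap so that the output column is already of the Date type when it reaches tAlfrescoOutput.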

Content:

  • Do not confuse the path of the file whose content you want to create
    in Alfresco with its target location in Alfresco.

  • Provide a URL. It can use various protocols, such as file or
    HTTP.

  • For URLs referring to files on the file system, prefix them with
    “file:” for Windows used locally, and with “file://” for Windows on a
    network (which also accepts “file: “) or for Linux.

  • Do not double the backslash in the target base path (it is escaped
    automatically), unless you type in the path in the basic settings of the
    tAlfrescoOutput component or build it by
    concatenation, in the tMap editor for
    example.
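
One way to obtain a file URL from a local path is to let the JDK build it, as in the sketch below. The path is hypothetical, and the exact prefix produced by the JDK (typically "file:///") is not guaranteed to match the form the component expects on every platform, so treat this as a starting point to check against the rules above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SourceUrlSketch {
    // Turns a local path into a file URL string using the platform's own rules.
    static String toFileUrl(String localPath) {
        Path path = Paths.get(localPath).toAbsolutePath();
        return path.toUri().toString();
    }

    public static void main(String[] args) {
        // Hypothetical source document; the printed URL depends on the working directory.
        System.out.println(toFileUrl("docs/report.pdf"));
    }
}
```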

Multiple properties or associations:

  • You can create only one association per document if it is
    mapped to a string value, or one or more associations per document if it
    is mapped to a list value (object).

  • You can empty an association by mapping it to an empty list, which you
    can create, for example, by using new
    java.util.ArrayList()
    in the tMap
    component.

However, it is impossible to delete an association.
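
The three kinds of value described above can be sketched in Java as follows. The reference paths are hypothetical placeholders; in a Job these expressions would be written in the tMap output column that is mapped to the association.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssociationValuesSketch {
    // A string value: exactly one association for the document (hypothetical path).
    static Object singleAssociation() {
        return "/hypothetical/path/contract.pdf";
    }

    // A list value (object): one or more associations for the document.
    static List<Object> multipleAssociations() {
        return new ArrayList<Object>(Arrays.asList(
                "/hypothetical/path/contract.pdf",
                "/hypothetical/path/annex.pdf"));
    }

    // An empty list: empties the association.
    static List<Object> noAssociations() {
        return new ArrayList<Object>();
    }

    public static void main(String[] args) {
        System.out.println(singleAssociation());
        System.out.println(multipleAssociations().size()); // prints 2
        System.out.println(noAssociations().isEmpty());    // prints true
    }
}
```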

Building List(object) with tAggregate:

  • Define the table of the n-n relation in a file containing, for
    example, a name line (also included in the input rows) and
    a category line (which can be defined with its mapping in a
    third file).

  • Group by: input name, output name.

  • Operation: output categoryList, function
    list(object), input category. Caution: use
    list(object), not the simple list function.
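
To make the expected result of this aggregation concrete, the sketch below reproduces it in plain Java: rows are grouped by name and the categories of each name are collected into a List of objects. It only illustrates the data shape that the list(object) function is described as producing; the sample names and categories are invented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CategoryListSketch {
    // Groups (name, category) pairs by name, collecting categories into a list per name.
    static Map<String, List<Object>> groupCategories(String[][] rows) {
        Map<String, List<Object>> byName = new LinkedHashMap<>();
        for (String[] row : rows) {
            byName.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return byName;
    }

    public static void main(String[] args) {
        // Invented n-n relation: one document name can have several categories.
        String[][] rows = {
            {"report.pdf", "finance"},
            {"report.pdf", "legal"},
            {"memo.doc", "hr"}
        };
        // Prints: {report.pdf=[finance, legal], memo.doc=[hr]}
        System.out.println(groupCategories(rows));
    }
}
```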

References (documents and folders):

  • References are created by mapping one or more existing reference nodes
    (xpath or namepath) using String type or
    List(object).

  • An error in the association or the property of the reference type does
    not prevent the creation of the node that holds the reference.

  • Properties of the reference type are created in the Basic Settings view.

  • Associations are created in the Advanced
    Settings
    view.

Dematerialization, tAlfrescoOutput, and Enterprise Content Management

Dematerialization is the process that converts documents held in physical form into
electronic form, and thus helps to move away from the use of physical documentation
to the use of electronic Enterprise Content Management (ECM) systems. The range of
documents that can be managed with an Enterprise Content Management system includes
just about everything from basic documents to stock certificates, for example.

Enterprises dematerialize their content either through manual document handling,
performed by people, or through automatic, machine-based document handling.

Considering the varied nature of the content to be dematerialized, enterprises
have to use varied technologies to do it. Scanning paper documents, creating
interfaces to capture electronic documents from other applications, converting
document images into machine-readable/editable text documents, and so on are
examples of the technologies available.

Furthermore, scanned documents and digital faxes are not readable texts. To
convert them into machine-readable characters, different character recognition
technologies are used. Handwritten Character Recognition (HCR) and Optical Mark
Recognition (OMR) are two examples of such technologies.

Equally important as the content that is captured in various formats from numerous
sources in the dematerialization process is the supporting metadata that allows
efficient identification of the content via specific queries.

Now how can this document content along with the related metadata be aggregated
and indexed in an Enterprise Content Management system so that it can be retrieved
and managed in meaningful ways? Talend provides the answer through
the tAlfrescoOutput component.

The tAlfrescoOutput component allows you to store
and manage your electronic documents and the related metadata on the Alfresco
server, the leading open source enterprise content management system.

The following figure illustrates Talend's role between the
dematerialization process and the Enterprise Content Management system
(Alfresco).

Use_Case_tAlfrescoOutput_GeneralConcept.png

Scenario: Creating documents on an Alfresco server

This Java scenario describes a two-component Job that creates two document
files with the related metadata in an Alfresco server, the Java-based Enterprise Content
Management system.

Setting up your Job

  1. Drop the tFileInputDelimited and
    tAlfrescoOutput components from the
    Palette onto the design workspace.

  2. Connect the two components together using a Main > Row
    connection.

    Use_Case_tAlfrescoOutput.png

Setting up the schema

  1. In the design workspace, double-click tFileInputDelimited to display its basic settings.

  2. Set the File Name path and all related
    properties. Note that if you have already
    stored your input schemas locally in the Repository, you can simply drop the relevant file item
    from the Metadata folder onto the
    design workspace and the delimited file settings will automatically
    display in the relevant fields in the component Basic settings view.

    Note

    For more information about metadata, see Setting up a File Delimited schema in Talend Studio
    User Guide.

    Use_Case_tAlfrescoOutput1.png

In this scenario, the delimited file provides the metadata and path of two
documents we want to create in the Alfresco server. The input schema for the
documents consists of four columns: file_name,
destination_folder_name, source_path, and author.

Use_Case_tAlfrescoOutput2.png

The input schema of the delimited file is therefore as
follows:

Use_Case_tAlfrescoOutput3.png
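
For readers without access to the figures, the sketch below shows what one row of such a file could look like and how it splits into the four schema columns. The semicolon separator and every value in the sample row are assumptions made for illustration; they are not taken from the scenario's actual input file.

```java
class InputRowSketch {
    // Column order of the scenario's input schema.
    static final String[] COLUMNS = {
        "file_name", "destination_folder_name", "source_path", "author"
    };

    // Splits one delimited row into its four fields; ";" is an assumed separator.
    static String[] parseRow(String line) {
        String[] fields = line.split(";", -1);
        if (fields.length != COLUMNS.length) {
            throw new IllegalArgumentException("Expected 4 fields but found " + fields.length);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical row describing one document to create in Alfresco.
        String[] fields = parseRow("report.pdf;contracts;file:C:/docs/report.pdf;jsmith");
        for (int i = 0; i < COLUMNS.length; i++) {
            System.out.println(COLUMNS[i] + " = " + fields[i]);
        }
    }
}
```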

Setting up the connection to the Alfresco server

  1. In the design workspace, double-click tAlfrescoOutput to display its basic settings.

    Use_Case_tAlfrescoOutput4.png
  2. In the Alfresco Server area, enter the
    Alfresco server URL and user authentication information in the corresponding
    fields.

  3. In the Target Location area, either type
    in the base name where to put the document in the server, or select the
    Map… check box and then in the
    Column list, select the target location
    column, destination_folder_name in this scenario.

    Note

    When you type in the base name, make sure to escape each backslash
    by doubling it (\\).

  4. In the Document Mode list, select the
    mode you want to use for the created documents.

  5. In the Container Mode list, select the
    mode you want to use for the destination folder in Alfresco.

Defining the document

  1. Click the Define Document Type three-dot
    button to open the tAlfrescoOutput
    editor.

    Use_Case_tAlfrescoOutput5.png
  2. Click the Add button to browse and
    select the xml file that holds the metadata according to which you want to
    save the documents in Alfresco.

    All available aspects in the selected model file display in the Available Aspects list.

    Note

    You can browse for this model folder locally or on the network. After
    defining the aspects to use for the document to be created in Alfresco,
    this model folder is not needed any more.

  3. If needed, select in the Available Aspects
    list the aspect(s) to be included in the metadata to write in the
    Alfresco server. In this scenario we want the author name to be part of the
    metadata registered in Alfresco.

  4. Click the drop-down arrow at the top of the editor to select from the
    list the type to give to the created document in Alfresco,
    Content in this scenario.

    All the defined aspects used to select the metadata to write in the
    Alfresco server are displayed in the Property Mapping
    list in the Basic settings
    view of tAlfrescoOutput: three
    aspects in this scenario, two basic ones for the Content type
    (content and name) and an additional one
    (author).

Executing your Job

  1. Click Sync columns to auto propagate all
    the columns of the delimited file.

    If needed, click Edit schema to view the
    output data structure of tAlfrescoOutput.

    Use_Case_tAlfrescoOutput6.png
  2. Click the three-dot button next to the Result Log
    File Name
    field and browse to the file where you want to save
    any logs after Job execution.

  3. Save your Job, and press F6 to execute
    it.

    Use_Case_tAlfrescoOutput7.png

    The two documents are created in Alfresco using the metadata provided in
    the input schemas.

    Use_Case_tAlfrescoOutput8.png

Document retrieved from Talend: https://help.talend.com