tAlfrescoOutput (deprecated)
Creates dematerialized documents in an Alfresco server, where they are indexed under meaningful models.
Alfresco server installation procedure (deprecated)
To use tAlfrescoOutput in the Integration perspective of Talend Studio, you first need to install the Alfresco server along with a few relevant resources.
The subsections below detail the prerequisites and the installation procedure.
Prerequisites
Start with the following operations:
- Download the file alfresco-community-tomcat-2.1.0.zip
- Unzip the file in an installation folder, for example: C:\alfresco
- Install JDK 1.6.0+
- Update the environment variable JAVA_HOME (for example, JAVA_HOME=C:\Program Files\Java\jdk1.6.0_27)
- From the installation folder (C:\alfresco), launch the Alfresco server using the script alf_start.bat
Make sure that the Alfresco server has started correctly before you start using the tAlfrescoOutput component.
Installing the Talend Alfresco module
Note that the talendalfresco_20081014.zip file is provided with the tAlfrescoOutput component in the Integration perspective of Talend Studio.
To install the talendalfresco module:
- From talendalfresco_20081014.zip, in the talendalfresco_20081014\alfresco folder, look for the following jars: stax-api-1.0.1.jar, wstx-lgpl-3.2.7.jar, talendalfresco-client_1.0.jar, and talendalfresco-alfresco_1.0.jar, and move them to C:\alfresco\tomcat\webapps\alfresco\WEB-INF\lib
- Add the authentication filter of the commands to the web.xml file located in C:\alfresco\tomcat\webapps\alfresco\WEB-INF, following the model of the example provided in the talendalfresco_20081014/alfresco folder of the zipped file talendalfresco_20081014.zip
The following figures show the lines (in blue) to add to the Alfresco web.xml file.
Useful information for advanced use
Installing new types for Alfresco:
From package_jeu_test.zip, in the package_jeu_test/fichiers_conf_alfresco2.1 folder, look for the following files: H76ModelCustom.xml (description of the model), web-client-config-custom.xml (web interface of the model), and custom-model-context.xml (registration of the new model), and paste them in the following folder: C:/alfresco/tomcat/shared/classes/alfresco/extension
Dates:
- The dates must be of the Talend date type java.util.Date.
- Columns without either mapping or default values, for example of the type Date, are written as empty strings.
- Solution: delete all columns without mapping or default values. Note that any modification of the Alfresco type will put them back.
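As an illustration of the java.util.Date requirement, here is a minimal sketch of converting a text value into a java.util.Date before it is mapped. The class name, the helper name, and the yyyy-MM-dd pattern are assumptions made for this example and are not part of the component.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

final class DateColumnExample {

    // Hypothetical helper: parses text such as "2008-10-14" into a java.util.Date.
    // The "yyyy-MM-dd" pattern is an assumption about the incoming data.
    static Date toDate(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject impossible dates such as 2008-13-40
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date created = toDate("2008-10-14");
        // Format the value back to confirm the round trip.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(created));
    }
}
```

Disabling lenient parsing is a design choice for the sketch: it surfaces malformed input early instead of silently rolling the date over.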
Content:
- Do not confuse the path of the file whose content you want to create in Alfresco with its target location in Alfresco.
- Provide a URL. It can target various protocols, including file and HTTP.
- For URLs referring to files on the file system, prefix them with “file:” for a local Windows path, and with “file://” for a Windows network path (which also accepts “file:”) or for Linux.
- Do not double the backslash in the target base path (it is escaped automatically), unless you type the path in the basic settings of the tAlfrescoOutput component or build it by concatenation, for example in the tMap editor.
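The prefix rules for file URLs can be sketched as a small helper. This is only an illustration under stated assumptions: the class name, the method name, and the sample paths are hypothetical, and the helper does nothing beyond applying the two prefixes described above.

```java
final class ContentUrlExample {

    // Hypothetical helper that applies the prefixes listed above:
    // "file:" for a local Windows path, "file://" for a network or Linux path.
    static String toFileUrl(String path, boolean localWindowsPath) {
        String prefix = localWindowsPath ? "file:" : "file://";
        return prefix + path;
    }

    public static void main(String[] args) {
        // In Java source code, one backslash inside a string literal is written "\\".
        System.out.println(toFileUrl("C:\\docs\\report.pdf", true));
        System.out.println(toFileUrl("/home/user/docs/report.pdf", false));
    }
}
```

The doubled backslashes in the first call are Java literal escaping only; the resulting string contains single backslashes.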
Multiple properties or associations:
- It is possible to create only one association per document if it is mapped to a string value, or one or more associations per document if it is mapped to a list value (object).
- You can empty an association by mapping it to an empty list, which you can create, for example, by using new java.util.ArrayList() in the tMap component. However, it is impossible to delete an association.
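A minimal sketch of building such a list value follows. It assumes, purely for illustration, that the target references arrive as one separator-delimited string; the class name, the helper name, and the sample reference values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class AssociationListExample {

    // Hypothetical helper: turns a delimited string of target references into a list value.
    // A null or blank input yields an empty list, which empties the association.
    static List<Object> toAssociationList(String references, String separator) {
        List<Object> targets = new ArrayList<Object>();
        if (references == null || references.trim().isEmpty()) {
            return targets;
        }
        for (String reference : references.split(Pattern.quote(separator))) {
            String trimmed = reference.trim();
            if (!trimmed.isEmpty()) {
                targets.add(trimmed);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Sample reference values are invented for the example.
        System.out.println(toAssociationList("/folderA/doc1;/folderA/doc2", ";"));
        System.out.println(toAssociationList("", ";"));
    }
}
```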
Building List(object) with tAggregate:
- Define the table of the n-n relation in a file, containing a name line for example (included in the input rows), and a category line (that can be defined with its mapping in a third file).
- Group by: input name, output name.
- Operation: output categoryList, function list(object), input category. Attention: use list(object), not the simple list.
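To make the intended result of this aggregation concrete, the sketch below reproduces the grouping in plain Java. It is not the code that tAggregate generates; it assumes each input row carries a name and a category value, and the class name, method name, and sample data are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CategoryListExample {

    // Groups (name, category) pairs by name and collects the categories of each
    // name into a List<Object>, which is the shape a list(object) output would carry.
    static Map<String, List<Object>> groupCategories(String[][] rows) {
        Map<String, List<Object>> byName = new LinkedHashMap<String, List<Object>>();
        for (String[] row : rows) {
            String name = row[0];
            String category = row[1];
            List<Object> categories = byName.get(name);
            if (categories == null) {
                categories = new ArrayList<Object>();
                byName.put(name, categories);
            }
            categories.add(category);
        }
        return byName;
    }

    public static void main(String[] args) {
        // Invented sample rows: two categories for doc1, one for doc2.
        String[][] rows = {
            {"doc1", "finance"},
            {"doc1", "legal"},
            {"doc2", "hr"}
        };
        System.out.println(groupCategories(rows));
    }
}
```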
References (documents and folders):
- References are created by mapping one or more existing reference nodes (xpath or namepath) using the String type or List(object).
- An error in the association or the property of the reference type does not prevent the creation of the node that holds the reference.
- Properties of the reference type are created in the Basic settings view.
- Associations are created in the Advanced settings view.
Dematerialization, tAlfrescoOutput, and Enterprise Content Management
Dematerialization is the process that converts documents held in physical form into
electronic form, and thus helps to move away from the use of physical documentation to
the use of electronic Enterprise Content Management (ECM) systems. The range of
documents that can be managed with an Enterprise Content Management system includes just
about everything from basic documents to stock certificates, for example.
Enterprises dematerialize their content either through manual document handling, done by people,
or through automatic, machine-based document handling.
Considering the varied nature of the content to be dematerialized, enterprises have to
use varied technologies to do it. Scanning paper documents, creating interfaces to
capture electronic documents from other applications, converting document images into
machine-readable/editable text documents, and so on are examples of the technologies
available.
Furthermore, scanned documents and digital faxes are not readable texts. To convert
them into machine-readable characters, different character recognition technologies are
used. Handwritten Character Recognition (HCR) and Optical Mark Recognition (OMR) are two
examples of such technologies.
Equally important as the content that is captured in various formats from numerous
sources in the dematerialization process is the supporting metadata that allows
efficient identification of the content via specific queries.
Now how can this document content along with the related metadata be aggregated and
indexed in an Enterprise Content Management system so that it can be retrieved and
managed in meaningful ways?
Talend provides the answer through the tAlfrescoOutput component.
The tAlfrescoOutput component allows you to store and
manage your electronic documents and the related metadata on the Alfresco server, the
leading open source enterprise content management system.
The following figure illustrates Talend's role between the
dematerialization process and the Enterprise Content Management system
(Alfresco).

tAlfrescoOutput Standard properties
These properties are used to configure tAlfrescoOutput running in the Standard Job framework.
The Standard tAlfrescoOutput component belongs to the Business family.
The component in this framework is generally available.
Basic settings
URL |
Type in the URL to connect to the Alfresco Web application. |
Login and Password |
Type in the user authentication data to the Alfresco To enter the password, click the […] button next to the |
Base |
Type in the base path where to put the document, or Select the Map… check box and
Note: When you type in the base |
Document Mode |
Select in the list the mode you want to use for the created
Create only: creates a document if Note that an error message will display if you try to create a
Create or update: creates a |
Container Mode |
Select in the list the mode you want to use for the destination
Update only: updates a destination Note that an error message will display if you try to update a
Create or update: creates a |
Define Document Type |
Click the three-dot button to display the tAlfrescoOutput editor. This editor enables you to: - select the file where you defined the metadata according to - define the type of the document - select any of the aspects in the available |
Property Mapping |
Displays the parameters you set in the tAlfrescoOutput editor and according to which the Note that in the Property Mapping |
Schema and Edit |
A schema is a row description. It defines the number of fields (columns) to Click Edit schema to make changes to the schema.
|
Result Log File Name |
Browse to the file where you want to save any logs related to the |
Die on error |
This check box is cleared by default, meaning to skip the row on |
Advanced settings
Configure Target Location Container |
Allows to configure the (by default) type of containers Select this check box to display new fields where you can modify |
Configure Permissions |
When selected, allows to manually configure access rights to Select the Inherit Permissions Click the Plus button to add new |
Encoding |
Select the encoding type from the list or select Custom and define it manually. This field |
Association Target Mapping |
Allows to create new documents in Alfresco with associated links To create associations:
|
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at a |
Global Variables
Global Variables |
NB_LINE: the number of rows read by an input component or
NB_LINE_REJECTED: the number of rows rejected. This is an
ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see |
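As a hedged illustration of how these variables are usually read in a Job, the sketch below looks them up in a map under keys made of the component's unique name followed by the variable name. The map here is only a stand-in for the globalMap available at run time, the component name tAlfrescoOutput_1 is an assumption, and the stored values are invented.

```java
import java.util.HashMap;
import java.util.Map;

final class GlobalVariablesExample {

    // Reads an Integer variable and falls back to 0 when it has not been set.
    static int readCount(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return value instanceof Integer ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        // Stand-in for the run-time globalMap; the values below are invented.
        Map<String, Object> globalMap = new HashMap<String, Object>();
        globalMap.put("tAlfrescoOutput_1_NB_LINE", 2);
        globalMap.put("tAlfrescoOutput_1_NB_LINE_REJECTED", 0);

        int written = readCount(globalMap, "tAlfrescoOutput_1_NB_LINE");
        int rejected = readCount(globalMap, "tAlfrescoOutput_1_NB_LINE_REJECTED");
        String error = (String) globalMap.get("tAlfrescoOutput_1_ERROR_MESSAGE");

        System.out.println("rows: " + written + ", rejected: " + rejected
                + ", error: " + (error == null ? "none" : error));
    }
}
```

Falling back to 0 for a missing key is a choice made for the sketch; it avoids a NullPointerException when a variable is read before the component has run.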
Usage
Usage rule |
Usually used as an output component. An input component is |
Limitation/Prerequisites |
To be able to use the tAlfrescoOutput component, few relevant resources Due to license incompatibility, one or more JARs required to use this component are not |
Scenario: Creating documents on an Alfresco server (deprecated)
This Java scenario describes a two-component Job that creates two document
files with the related metadata in an Alfresco server, the Java-based Enterprise Content
Management system.
Setting up your Job
- Drop the tFileInputDelimited and tAlfrescoOutput components from the Palette onto the design workspace.
- Connect the two components together using a Main > Row connection.
Setting up the schema
- In the design workspace, double-click tFileInputDelimited to display its basic settings.
- Set the File Name path and all related properties. Note that if you have already stored your input schemas locally in the Repository, you can simply drop the relevant file item from the Metadata folder onto the design workspace and the delimited file settings will automatically display in the relevant fields in the component Basic settings view.
Note: For more information about metadata, see Setting up a File Delimited schema in Talend Studio User Guide.
In this scenario, the delimited file provides the metadata and path of the two
documents we want to create in the Alfresco server. The input schema for the
documents consists of four columns: file_name,
destination_folder_name, source_path, and author.
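To visualize a row of such a file, the following sketch splits one delimited line into the four columns named above. The semicolon separator, the class and method names, and the sample values are assumptions made for illustration; they are not taken from the scenario's actual input file.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class InputRowExample {

    // The four columns of the input schema, in file order.
    static final String[] COLUMNS = {
        "file_name", "destination_folder_name", "source_path", "author"
    };

    // Splits one delimited line and pairs each value with its column name.
    static Map<String, String> parseRow(String line, String separator) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] values = line.split(Pattern.quote(separator), -1);
        if (values.length != COLUMNS.length) {
            throw new IllegalArgumentException(
                "Expected " + COLUMNS.length + " fields but found " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<String, String>();
        for (int i = 0; i < COLUMNS.length; i++) {
            row.put(COLUMNS[i], values[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Invented sample line; the real file content is not shown in this document.
        System.out.println(parseRow("report.pdf;/reports;C:/docs/report.pdf;J. Smith", ";"));
    }
}
```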

The input schema of the delimited file is therefore defined with these four columns.
Setting up the connection to the Alfresco server
- In the design workspace, double-click tAlfrescoOutput to display its basic settings.
- In the Alfresco Server area, enter the Alfresco server URL and user authentication information in the corresponding fields.
- In the TargetLocation area, either type in the base name where to put the document in the server, or select the Map… check box and then, in the Column list, select the target location column, destination_folder_name in this scenario.
Note: When you type in the base name, make sure to use the double backslash (\\) escape character.
- In the Document Mode list, select the mode you want to use for the created documents.
- In the Container Mode list, select the mode you want to use for the destination folder in Alfresco.
Defining the document
- Click the Define Document Type three-dot button to open the tAlfrescoOutput editor.
- Click the Add button to browse and select the xml file that holds the metadata according to which you want to save the documents in Alfresco.
All available aspects in the selected model file display in the Available Aspects list.
Note: You can browse for this model folder locally or on the network. After defining the aspects to use for the document to be created in Alfresco, this model folder is not needed any more.
- If needed, select in the Available Aspects list the aspect(s) to be included in the metadata to write in the Alfresco server. In this scenario we want the author name to be part of the metadata registered in Alfresco.
- Click the drop-down arrow at the top of the editor to select from the list the type to give to the created document in Alfresco, Content in this scenario.
All the defined aspects used to select the metadata to write in the Alfresco server display in the Property Mapping list in the Basic settings view of tAlfrescoOutput: three aspects in this scenario, two basic ones for the Content type (content and name) and an additional one (author).
Executing your Job
- Click Sync columns to auto propagate all the columns of the delimited file.
If needed, click Edit schema to view the output data structure of tAlfrescoOutput.
- Click the three-dot button next to the Result Log File Name field and browse to the file where you want to save any logs after Job execution.
- Save your Job, and press F6 to execute it.
The two documents are created in Alfresco using the metadata provided in the input schemas.