Component family |
Business |
|
Function |
Creates dematerialized documents in an Alfresco server where they |
|
Purpose |
Allows you to create and manage documents in an Alfresco |
|
Basic settings |
URL |
Type in the URL to connect to the Alfresco Web application. |
|
Login and Password |
Type in the user authentication data to the Alfresco To enter the password, click the […] button next to the |
Target Location |
Base |
Type in the base path where to put the document, or select the Map… check box and
Note: When you type in the base |
Create Or Update Mode |
Document Mode |
Select in the list the mode you want to use for the created
Create only: creates a document if Note that an error message will display if you try to create a
Create or update: creates a |
|
Container Mode |
Select in the list the mode you want to use for the destination
Update only: updates a destination Note that an error message will display if you try to update a
Create or update: creates a |
|
Define Document Type |
Click the three-dot button to display the tAlfrescoOutput editor. This editor enables you to: - select the file where you defined the metadata according to - define the type of the document - select any of the aspects in the available |
|
Property Mapping |
Displays the parameters you set in the tAlfrescoOutput editor and according to which the Note that in the Property Mapping |
|
Schema and Edit |
A schema is a row description. It defines the number of fields to be processed and passed on Since version 5.6, both the Built-In mode and the Repository mode are Click Edit schema to make changes to the schema. If the
|
|
Result Log File Name |
Browse to the file where you want to save any logs related to the |
Die on error |
This check box is cleared by default, meaning to skip the row on |
|
Configure Target Location Container |
Allows you to configure the default type of containers Select this check box to display new fields where you can modify |
|
Permissions |
Configure Permissions |
When selected, allows you to manually configure access rights to Select the Inherit Permissions Click the Plus button to add new |
|
Encoding |
Select the encoding type from the list or select Custom and define it manually. This field |
|
Association Target Mapping |
Allows you to create new documents in Alfresco with associated links To create associations:
|
|
tStatCatcher Statistics |
Select this check box to gather the Job processing metadata at a |
Global Variables |
NB_LINE: the number of rows read by an input component or
NB_LINE_REJECTED: the number of rows rejected. This is an ERROR_MESSAGE: the error message generated by the A Flow variable functions during the execution of a component while an After variable To fill up a field or expression with a variable, press Ctrl + For further information about variables, see Talend Studio |
|
Usage |
Usually used as an output component. An input component is |
|
Limitation/Prerequisites |
To be able to use the tAlfrescoOutput component, few relevant resources Due to license incompatibility, one or more JARs required to use this component are not |
To use tAlfrescoOutput in the Integration perspective of Talend Studio, you
first need to install the Alfresco server along with a few relevant resources.
The subsections below detail the prerequisites and the installation procedure.
Start with the following operations:
- Download the file alfresco-community-tomcat-2.1.0.zip.
- Unzip the file in an installation folder, for example: C:\alfresco
- Install JDK 1.6.0+.
- Update the environment variable JAVA_HOME (for example, JAVA_HOME=C:\Program Files\Java\jdk1.6.0_27).
- From the installation folder (C:\alfresco), launch the Alfresco server using the script alf_start.bat.
Warning
Make sure that the Alfresco server is launched correctly before you start
using the tAlfrescoOutput component.
Note that talendalfresco_20081014.zip is provided with the tAlfrescoOutput
component in the Integration perspective of Talend Studio.
To install the talendalfresco module:
- From talendalfresco_20081014.zip, in the talendalfresco_20081014\alfresco folder, look for the following JARs: stax-api-1.0.1.jar, wstx-lgpl-3.2.7.jar, talendalfresco-client_1.0.jar, and talendalfresco-alfresco_1.0.jar, and move them to C:\alfresco\tomcat\webapps\alfresco\WEB-INF\lib
- Add the authentication filter of the commands to the web.xml file located in C:\alfresco\tomcat\webapps\alfresco\WEB-INF, following the model of the example provided in the talendalfresco_20081014/alfresco folder of the zipped file talendalfresco_20081014.zip.
The following figures show the lines (in blue) to add to the Alfresco
web.xml file.
Installing new types for Alfresco:
From package_jeu_test.zip, in the package_jeu_test/fichiers_conf_alfresco2.1
folder, look for the following files: H76ModelCustom.xml (description of the
model), web-client-config-custom.xml (web interface of the model), and
custom-model-context.xml (registration of the new model), and paste them in
the following folder:
C:/alfresco/tomcat/shared/classes/alfresco/extension
Dates:
- The dates must be of the Talend date type java.util.Date.
- Columns without either a mapping or a default value, for example of the type Date, are written as empty strings.
- Solution: delete all columns without a mapping or a default value. Note that any modification of the Alfresco type will put them back.
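As a rough illustration of the first point, the sketch below turns a text value into a java.util.Date with plain JDK classes. The class name, the sample value, and the yyyy-MM-dd pattern are assumptions made for this example; in a Job, the conversion would normally be handled by the schema's date pattern or a mapping expression.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class DateColumnSketch {
    // Parses a text value into a java.util.Date using the given pattern.
    // The pattern and the sample value below are assumptions for illustration.
    static Date toDate(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject impossible dates instead of rolling them over
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date: " + value, e);
        }
    }

    public static void main(String[] args) {
        Date created = toDate("2008-10-14", "yyyy-MM-dd");
        // Formatting back with the same pattern shows the value survived the round trip.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(created)); // prints 2008-10-14
    }
}
```

A value that cannot be parsed raises an exception here rather than silently becoming an empty string, which makes unmapped or malformed date columns easier to spot before they reach the output component.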
Content:
- Do not confuse the path of the file whose content you want to create in Alfresco with its target location in Alfresco.
- Provide a URL. It can target various protocols, among which are file, HTTP, and so on.
- For URLs referring to files on the file system, precede them with “file:” for Windows used locally, and with “file://” for Windows on a network (which also accepts “file:”) or for Linux.
- Do not double the backslash in the target base path (automatic escape), unless you type in the path in the basic settings of the tAlfrescoOutput component, or build it by concatenation in the tMap editor, for example.
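The URL requirement above can be sketched with standard JDK calls. This is a minimal example, not the component's own logic: the class and method names are hypothetical, and the URL form produced by Path.toUri() depends on the platform and may differ from the prefixes listed above, so check the result against what your Alfresco setup accepts.

```java
import java.nio.file.Paths;

class ContentUrlSketch {
    // Returns true when the value already starts with a protocol such as file: or http:.
    // Schemes of a single letter are ignored so that Windows drive letters (C:) do not match.
    static boolean hasProtocol(String value) {
        return value.matches("[a-zA-Z][a-zA-Z0-9+.-]+:.*");
    }

    // Converts a local file system path into a file: URL string using the JDK.
    static String toFileUrl(String localPath) {
        return Paths.get(localPath).toAbsolutePath().toUri().toString();
    }

    // Leaves existing URLs untouched and converts plain paths to file: URLs.
    static String toContentUrl(String value) {
        return hasProtocol(value) ? value : toFileUrl(value);
    }

    public static void main(String[] args) {
        // "report.pdf" is a hypothetical file name used only for illustration.
        System.out.println(toContentUrl("report.pdf"));
        System.out.println(toContentUrl("http://example.com/report.pdf"));
    }
}
```

Keeping the source URL in its own column, separate from the target location column, also helps avoid mixing up the two paths.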
Multiple properties or associations:
- It is possible to create only one association per document if it is mapped to a string value, or one or more associations per document if it is mapped to a list value (object).
- You can empty an association by mapping it to an empty list, which you can create, for example, by using new java.util.ArrayList() in the tMap component.
However, it is impossible to delete an association.
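The three kinds of values described above can be sketched as follows. The class name and the reference strings are hypothetical placeholders; only the shapes of the values (a single string, a list of objects, an empty list) come from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssociationValuesSketch {
    // Several associations: a list value holding one entry per target reference.
    static List<Object> targets(String... references) {
        return new ArrayList<Object>(Arrays.asList(references));
    }

    // Emptied association: an empty list, as with new java.util.ArrayList() in tMap.
    static List<Object> noTargets() {
        return new ArrayList<Object>();
    }

    public static void main(String[] args) {
        // One association: the column is mapped to a plain string (placeholder value).
        String single = "referenceA";
        List<Object> several = targets("referenceA", "referenceB");
        List<Object> none = noTargets();
        System.out.println(single);
        System.out.println(several.size()); // prints 2
        System.out.println(none.isEmpty()); // prints true
    }
}
```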
Building List(object) with tAggregate:
- Define the table of the n-n relation in a file, containing a name line for example (included in the input rows), and a category line (that can be defined with its mapping in a third file).
- Group by: input name, output name.
- Operation: output categoryList, function list(object), input category.
Attention: use list(object), not the simple list function.
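To make the grouping concrete, here is a plain Java sketch of what the aggregation amounts to: rows of (name, category) pairs are grouped by name, and the categories of each group are collected into a list of objects. It does not use tAggregate itself, and the class name and sample rows are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CategoryListSketch {
    // Groups (name, category) rows by name; each name ends up with a categoryList.
    static Map<String, List<Object>> groupByName(String[][] rows) {
        Map<String, List<Object>> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            // row[0] is the name (group-by key), row[1] is the category to collect.
            grouped.computeIfAbsent(row[0], key -> new ArrayList<>()).add(row[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical n-n relation: doc1.pdf belongs to two categories.
        String[][] rows = {
            {"doc1.pdf", "finance"},
            {"doc1.pdf", "legal"},
            {"doc2.pdf", "hr"}
        };
        System.out.println(groupByName(rows)); // prints {doc1.pdf=[finance, legal], doc2.pdf=[hr]}
    }
}
```

Each resulting list is the kind of value that can then be mapped to an association or multi-valued property, one list per document.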
References (documents and folders):
- References are created by mapping one or more existing reference nodes (xpath or namepath) using the String type or List(object).
- An error in the association or the property of the reference type does not prevent the creation of the node that holds the reference.
- Properties of the reference type are created in the Basic settings view.
- Associations are created in the Advanced settings view.
Dematerialization is the process that converts documents held in physical form into
electronic form, and thus helps to move away from the use of physical documentation
to the use of electronic Enterprise Content Management (ECM) systems. The range of
documents that can be managed with an Enterprise Content Management system includes
just about everything from basic documents to stock certificates, for example.
Enterprises dematerialize their content through either manual document handling,
performed by people, or automatic, machine-based document handling.
Considering the varied nature of the content to be dematerialized, enterprises
have to use varied technologies to do it. Scanning paper documents, creating
interfaces to capture electronic documents from other applications, converting
document images into machine-readable/editable text documents, and so on are
examples of the technologies available.
Furthermore, scanned documents and digital faxes are not readable texts. To
convert them into machine-readable characters, different character recognition
technologies are used. Handwritten Character Recognition (HCR) and Optical Mark
Recognition (OMR) are two examples of such technologies.
Equally important as the content that is captured in various formats from numerous
sources in the dematerialization process is the supporting metadata that allows
efficient identification of the content via specific queries.
Now how can this document content along with the related metadata be aggregated
and indexed in an Enterprise Content Management system so that it can be retrieved
and managed in meaningful ways? Talend provides the answer through
the tAlfrescoOutput component.
The tAlfrescoOutput component allows you to store
and manage your electronic documents and the related metadata on the Alfresco
server, the leading open source enterprise content management system.
The following figure illustrates Talend's role between the
dematerialization process and the Enterprise Content Management system
(Alfresco).
This Java scenario describes a two-component Job that creates two document
files with the related metadata in an Alfresco server, the Java-based Enterprise
Content Management system.
- Drop the tFileInputDelimited and tAlfrescoOutput components from the Palette onto the design workspace.
- Connect the two components together using a Main > Row connection.
- In the design workspace, double-click tFileInputDelimited to display its basic settings.
- Set the File Name path and all related properties. Note that if you have already stored your input schemas locally in the Repository, you can simply drop the relevant file item from the Metadata folder onto the design workspace, and the delimited file settings will automatically display in the relevant fields in the component Basic settings view.
Note: For more information about metadata, see Setting up a File Delimited schema in Talend Studio User Guide.
In this scenario, the delimited file provides the metadata and path of two
documents we want to create in the Alfresco server. The input schema for the
documents therefore consists of four columns: file_name,
destination_folder_name, source_path, and author.
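As a rough picture of one input row with these four columns, the sketch below parses a single delimited line into named fields. The semicolon separator, the class name, and the sample values are assumptions made for this example; the actual separator is whatever you set in tFileInputDelimited.

```java
class InputRowSketch {
    final String fileName;
    final String destinationFolderName;
    final String sourcePath;
    final String author;

    InputRowSketch(String fileName, String destinationFolderName, String sourcePath, String author) {
        this.fileName = fileName;
        this.destinationFolderName = destinationFolderName;
        this.sourcePath = sourcePath;
        this.author = author;
    }

    // Splits one line on ";" (assumed separator) into the four schema columns.
    static InputRowSketch parse(String line) {
        String[] fields = line.split(";", -1); // -1 keeps trailing empty fields
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 columns but found " + fields.length);
        }
        return new InputRowSketch(fields[0], fields[1], fields[2], fields[3]);
    }

    public static void main(String[] args) {
        // Hypothetical row: file name, destination folder, source URL, author.
        InputRowSketch row = parse("report.pdf;reports;file:/tmp/report.pdf;Jane Doe");
        System.out.println(row.fileName + " -> " + row.destinationFolderName); // prints report.pdf -> reports
    }
}
```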
- In the design workspace, double-click tAlfrescoOutput to display its basic settings.
- In the Alfresco Server area, enter the Alfresco server URL and user authentication information in the corresponding fields.
- In the Target Location area, either type in the base name where to put the document in the server, or select the Map… check box and then, in the Column list, select the target location column, destination_folder_name in this scenario.
Note: When you type in the base name, make sure to use the double backslash (\\) escape character.
- In the Document Mode list, select the mode you want to use for the created documents.
- In the Container Mode list, select the mode you want to use for the destination folder in Alfresco.
- Click the Define Document Type three-dot button to open the tAlfrescoOutput editor.
- Click the Add button to browse and select the XML file that holds the metadata according to which you want to save the documents in Alfresco. All available aspects in the selected model file display in the Available Aspects list.
Note: You can browse for this model folder locally or on the network. After defining the aspects to use for the document to be created in Alfresco, this model folder is not needed any more.
- If needed, select in the Available Aspects list the aspect(s) to be included in the metadata to write in the Alfresco server. In this scenario we want the author name to be part of the metadata registered in Alfresco.
- Click the drop-down arrow at the top of the editor to select from the list the type to give to the created document in Alfresco, Content in this scenario. All the defined aspects used to select the metadata to write in the Alfresco server display in the Property Mapping list in the Basic settings view of tAlfrescoOutput: three aspects in this scenario, two basic ones for the Content type (content and name) and an additional one (author).
- Click Sync columns to auto propagate all the columns of the delimited file. If needed, click Edit schema to view the output data structure of tAlfrescoOutput.
- Click the three-dot button next to the Result Log File Name field and browse to the file where you want to save any logs after Job execution.
- Save your Job, and press F6 to execute it. The two documents are created in Alfresco using the metadata provided in the input schemas.