tALSModel
Generates a user-ranking-product associated matrix, based on given user-product
interaction data.
This matrix is used by tRecommend to estimate these users’
preferences.
tALSModel leverages Spark to process a
large amount of information about users’ preferences over given products.
It receives this kind of information from its preceding Spark
component and performs ALS (Alternating Least Squares) computations over
these sets of information in order to generate and write a fine-tuned
product recommender model in a given file system in the Parquet
format.
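The ALS computation described above can be illustrated with a minimal, pure-Python sketch. It is not the Spark MLlib implementation that tALSModel relies on. The function names (solve, update, als, predict), the (user, product, rating) tuple layout, and the sample values are assumptions made for illustration only. The rank, iterations, and reg arguments play the roles of the number of latent factors, the number of iterations, and the regularization factor.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def update(fixed, observations, rank, reg):
    """Recompute the factors of one side while the other side stays fixed."""
    out = {}
    for key, obs in observations.items():
        # Regularized normal equations: (sum f f^T + reg * I) x = sum r * f
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, rating in obs:
            f = fixed[other]
            for i in range(rank):
                b[i] += rating * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        out[key] = solve(a, b)
    return out


def als(ratings, rank=2, iterations=10, reg=0.1, seed=0):
    """Alternate between solving for user factors and product factors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for user, product, rating in ratings:
        by_user.setdefault(user, []).append((product, rating))
        by_item.setdefault(product, []).append((user, rating))
    # Product factors start from random values; user factors are solved first.
    items = {p: [rng.random() for _ in range(rank)] for p in by_item}
    users = {}
    for _ in range(iterations):
        users = update(items, by_user, rank, reg)
        items = update(users, by_item, rank, reg)
    return users, items


def predict(users, items, user, product):
    """Estimated rating: dot product of the two latent factor vectors."""
    return sum(x * y for x, y in zip(users[user], items[product]))


# Hypothetical sample data: (user, product, rating)
ratings = [("u1", "p1", 1.0), ("u1", "p2", 2.0),
           ("u2", "p1", 2.0), ("u2", "p2", 4.0)]
users, items = als(ratings, rank=2, iterations=10, reg=0.1)
print(predict(users, items, "u2", "p2"))  # should land near the observed 4.0
```

Each half-step is an ordinary regularized least-squares problem, which is why the method alternates: with one side fixed, the other side has a closed-form solution. Spark's own ALS may scale the regularization term differently, so the numbers from this sketch are not expected to match a model produced by the component.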
tALSModel properties for Apache Spark Batch
These properties are used to configure tALSModel running in the Spark Batch Job framework.
The Spark Batch
tALSModel component belongs to the Machine Learning family.
This component is available in the Palette of the Studio only if you have subscribed to a Talend Platform product with Big Data or to Talend Data Fabric.
Basic settings
Define a storage configuration |
Select the configuration component to be used to provide the configuration If you leave this check box clear, the target file system is the local The configuration component to be used must be present in the same Job. For |
Feature table |
Complete this table to map the input columns with the three factors
required to compute the recommender model.
This map allows tALSModel to read the |
Training percentage |
Enter the percentage (expressed in the decimal form) of the input data |
Number of latent factors |
Enter the number of the latent factors, with which each user or |
Number of iterations |
Enter the number of iterations you want the Job to perform to train This number should be smaller than 30 in order to avoid stack overflow However, if you need to perform more than 30 iterations, you must |
Regularization factor |
Enter the regularization number you want to use to avoid |
Build model for implicit feedback data |
Select this check box to enable tALSModel to handle the implicit data sets. Contrary to the explicit data sets such as the ranking of a product, If you leave this check box clear, tALSModel handles the explicit data sets only. For related details about how the ALS model handles the implicit data |
Confidence coefficient for implicit |
Enter the number to indicate the level of confidence you have in the |
Parquet model path |
Enter the directory in which you need to store the generated The button for browsing does not work with the Spark Local mode; if you are using the Spark Yarn or the Spark Standalone mode, |
Parquet model name |
Enter the name you need to use for the recommender model. |
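Two of the settings above, Training percentage and Confidence coefficient for implicit, can be sketched in plain Python. This is an illustration under stated assumptions, not the component's code: it assumes that the training percentage is the fraction of input rows used for training, with the remainder held out for testing, and that the confidence coefficient (called alpha here) is applied as in the usual implicit-feedback ALS formulation, where confidence = 1 + alpha * |value|. The function names and sample rows are hypothetical.

```python
import random


def split_rows(rows, training_percentage, seed=42):
    """Randomly assign each row to the training set or the test set.

    training_percentage is a decimal such as 0.8 (assumed meaning: the share
    of rows used for training). The split is approximate, not exact.
    """
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < training_percentage else test).append(row)
    return train, test


def implicit_target(raw_value, alpha):
    """Map an implicit observation (e.g. a click count) to (preference, confidence).

    Assumed formulation: preference is 1 when any interaction was observed,
    and confidence grows linearly with the strength of the observation.
    """
    preference = 1.0 if raw_value > 0 else 0.0
    confidence = 1.0 + alpha * abs(raw_value)
    return preference, confidence


# Hypothetical input rows: (user, product, number of views)
rows = [("u%d" % i, "p%d" % (i % 3), i % 5) for i in range(20)]
train, test = split_rows(rows, training_percentage=0.8)
print(len(train), len(test))
print(implicit_target(3, alpha=0.5))
```

A larger alpha makes observed interactions weigh more heavily against unobserved ones, which is the trade-off the confidence setting controls under this formulation.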
Usage
Usage rule |
This component is used as an end component and requires an input link. Note that the parameters you need to set are free parameters and so |
MLlib installation |
In Apache Spark V1.3 or earlier versions of Spark, the Spark machine For further information about MLlib and this library, see the related |
RMSE score |
These scores can be output to the console of the Run view
when you execute the Job, provided that you have added the following code to the Log4j view in the [Project Settings] dialog box.
These scores are output along with the other Log4j INFO-level information. If you want to If you are using a subscription-based version of the Studio, the activity of this For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html. |
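For reference, an RMSE (root mean squared error) score of the kind mentioned above can be computed as in the following sketch. The function name and the (actual, predicted) pair layout are assumptions for illustration; the component's own scoring code is not shown in this documentation.

```python
import math


def rmse(pairs):
    """Root mean squared error over (actual, predicted) rating pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (actual, predicted) pair is required")
    squared_errors = [(actual - predicted) ** 2 for actual, predicted in pairs]
    return math.sqrt(sum(squared_errors) / len(pairs))


# Hypothetical held-out ratings and the model's predictions for them
print(rmse([(3.0, 1.0), (1.0, 3.0)]))  # → 2.0
```

A lower score means the predicted ratings are closer to the held-out ratings, so the score is useful when comparing runs with different parameter values.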
Related scenarios
No scenario is available for the Spark Batch version of this component
yet.