July 30, 2023

tMelissaDataAddress – Docs for ESB 7.x

tMelissaDataAddress

Verifies if an address is properly formatted and corrects any formatting or
spelling errors in each row.

This address management component is the result of Talend's collaboration with
Melissa Data, one of the world leaders in global address validation.

For more information about the enterprise and its software tools, visit http://www.melissadata.com/.

tMelissaDataAddress validates, corrects and standardizes Canadian and
United States addresses. It iterates over each row and checks every input address
against a Melissa Data data file.

tMelissaDataAddress uses the June 2017 release of the AddressObject library from Melissa Data.

The Data Quality Suite from Melissa Data and the data used to validate addresses
are updated regularly but the AddressObject library
has not been modified since June 2017.

APIs used in tMelissaDataAddress

The following APIs provided by Melissa Data are used in the tMelissaDataAddress component:

  • Address Object is used to clean up contact
    data,
  • GeoCoder Object is used to access geographic
    data. GeoCode and GeoPoint are two different GeoCoder API methods. Using
    GeoCode, you can retrieve the latitude and longitude coordinates of a 9-digit
    ZIP code centroid. Using GeoPoint, you can retrieve the rooftop level latitude
    and longitude coordinates of addresses, provided that you purchased the
    license,
  • RightFielder Object is used to parse and
    reorganize input data into usable data types. The Parse(String address1Str) method is used to parse
    fields.

Input fields used in tMelissaDataAddress

The table below lists all input fields in tMelissaDataAddress:

Field name    Description
Address1      This field is used to map the first line of the address.
Address2      This field is used to map the second line of the address.
Company       This field is used to map the company name.
City          This field is used to map the city name.
State         This field is used to map the state name.
Postal        This field is used to map the postal ZIP code.

Output standard columns used in tMelissaDataAddress

The table below lists all the output standard columns in tMelissaDataAddress. These read-only columns are automatically added to
the output schema.

Output column                Description
COMPANY_STANDARDIZED         This column returns a standardized company name.
ADDRESSLINE1_STANDARDIZED    This column returns the first line of the address.
ADDRESSLINE2_STANDARDIZED    This column returns the second line of the address.
CITY_STANDARDIZED            This column returns a standardized city name.
STATE_STANDARDIZED           This column returns a two-letter abbreviation for the state name.
COUNTRY_STANDARDIZED         This column returns a two-letter abbreviation for the country name.
RESULTS_CODE                 This column returns verification codes to indicate data quality, statuses and errors. These codes are written in comma-delimited lists. Each code consists of two letters followed by two numbers.

For example, the AC02 code means that the
state name is corrected based on the combination of city name and
ZIP code.

For a complete list of the result codes, visit http://www.melissadata.com/.
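As an illustration of the RESULTS_CODE format described above, the following minimal Java sketch splits a comma-delimited value into individual codes and keeps only tokens made of two letters followed by two digits. The class name ResultCodes and the method parse are hypothetical helpers, not part of Talend or Melissa Data, and the sketch assumes the letters are uppercase, as in AC02.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Talend or Melissa Data API.
final class ResultCodes {
    // Assumed format: two uppercase letters followed by two digits, e.g. AC02.
    private static final Pattern CODE = Pattern.compile("[A-Z]{2}[0-9]{2}");

    /** Splits a comma-delimited RESULTS_CODE value; malformed tokens are skipped. */
    static List<String> parse(String resultsCode) {
        List<String> codes = new ArrayList<>();
        if (resultsCode == null || resultsCode.trim().isEmpty()) {
            return codes;
        }
        for (String token : resultsCode.split(",")) {
            String code = token.trim();
            if (CODE.matcher(code).matches()) {
                codes.add(code);
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        List<String> codes = parse("AS01,AC02");
        // Check whether the state name was corrected for this row.
        System.out.println("State corrected: " + codes.contains("AC02"));
    }
}
```

Working with a list of codes rather than the raw string avoids accidental substring matches when testing for a specific code.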

tMelissaDataAddress Standard properties

These properties are used to configure tMelissaDataAddress running in the Standard Job framework.

The Standard tMelissaDataAddress component belongs to the Data Quality family.

This component is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.

Basic settings

Schema and Edit schema

A schema is a row description. It defines the number of fields to be
processed and passed on to the next component. The schema is either
Built-in or stored remotely in the Repository.

 

Built-in: You create the schema and store it locally for this component only.
Related topic: see Talend Studio User Guide.

Repository: You have already created the schema and stored it in the
Repository. You can reuse it in various projects and Job designs. Related
topic: see Talend Studio User Guide.

Input address

Click the [+] button to add lines to
the table.

Click on Address field and select
from the predefined list the fields that hold the input address
data.

The component will map the values of these fields to the input columns
you set in the table.

Click on Input Column and select from
the list the columns from the input schema that hold the input address
data.

Output address

Use this table to add extra columns to the output.

Click the [+] button to add lines to
the table.

Click on Address field and select
from the predefined list the fields that hold the output address
data.

The component will map the values of these fields to the output
columns you set in the table.

Click on Output Column and select
from the list the columns from the output schema that will hold the
extra information.

Specify your MelissaData license

Enter the Melissa Data license key provided by Melissa Data when you
order the Data Quality Suite or the Address Object API.

This software key unlocks the full functionality of Address
Object.

For more information, visit http://www.melissadata.com/ and download the Reference Guide
for Address Object from the Support Center of MelissaData.

If your GeoCoder license has expired, you can use it in demo mode.
This means that you can only process records from Nevada. Records from
other states return a GE03 (Demo Mode) code in the
RESULTS_CODE column.
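To spot rows that were rejected because of demo mode, custom Java code downstream of the component could test the RESULTS_CODE value for GE03. The sketch below is one way to do this. The class DemoModeCheck and its method are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not a Talend or Melissa Data API.
final class DemoModeCheck {
    /** Returns true when the comma-delimited RESULTS_CODE value contains GE03. */
    static boolean isDemoModeResult(String resultsCode) {
        if (resultsCode == null) {
            return false;
        }
        return Arrays.stream(resultsCode.split(","))
                .map(String::trim)
                .anyMatch("GE03"::equals);
    }

    public static void main(String[] args) {
        System.out.println(isDemoModeResult("AS01,GE03"));
    }
}
```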

Specify your MelissaData DataFile folder

Set the path to the MelissaData Data folder provided by MelissaData
and installed locally. You can also enter a path to a shared
folder.

You must order and download the Data Quality Suite or the Address
Object API from http://www.melissadata.com/.

Advanced settings

GeoCoder Licensing Agreement

Select the license you purchased:

  • No Melissa GeoCoder License Was
    Purchased

  • The Melissa GeoPoint License Was
    Purchased

  • The Melissa GeoCoder License Was
    Purchased

You cannot check the license validity at the initialization of the
component.

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level
as well as at each component level.

Global Variables


ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access
the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
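In the Java code of a Job, After variables such as ERROR_MESSAGE are typically read from the Job's globalMap under a key of the form <component name>_ERROR_MESSAGE. The following self-contained sketch imitates that lookup with an ordinary HashMap standing in for globalMap. The component name tMelissaDataAddress_1, the helper class ErrorMessageLookup, and the sample message are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class ErrorMessageLookup {
    /** Returns the ERROR_MESSAGE After variable of a component, or a fallback text. */
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? "(no error reported)" : value.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap available inside a Talend Job (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMelissaDataAddress_1_ERROR_MESSAGE", "sample message");
        System.out.println(errorMessage(globalMap, "tMelissaDataAddress_1"));
    }
}
```

Because ERROR_MESSAGE is an After variable, a lookup like this is only meaningful once the component has finished executing.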

Usage

Usage rule

This component is usually used as an intermediate component, and it requires an
input component and an output component.

Editing addresses against a Melissa Data data file

This Job uses three components: tFixedFlowInput generates the address data to
be analyzed; tMelissaDataAddress analyzes the input schema and validates,
corrects and standardizes the US addresses; and tLogRow outputs the correctly
formatted addresses on the console.

tMelissaDataAddress_1.png

This scenario applies only to Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.

Prerequisites to using the tMelissaDataAddress component

Before you can use the tMelissaDataAddress component, make sure that the
following prerequisites are met:

  • To retrieve longitude and latitude data and the GeoCode result codes, you
    must have purchased a GeoCode or a GeoPoint license.
  • To successfully execute a Job with the
    tMelissaDataAddress component, you must have
    installed Melissa Data with the GeoPoint and GeoCode data files.
  • Add the path to the folder containing the mdAddr library to the system
    environment variables. For example:
      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path to folder containing libmdAddr.so>
    on Linux, and
      PATH=%PATH%;<path to folder containing mdAddr.dll>
    on Windows. If the system environment variable is not set correctly,
    expect the following error: java.lang.Error: java.lang.UnsatisfiedLinkError.
  • On Linux, restart your computer after setting your system environment
    variables to take the changes into account.
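Before running the Job, it can help to confirm that the environment variable really points at the native library. The sketch below is a hypothetical standalone check, not part of Talend or Melissa Data. It scans each folder listed in LD_LIBRARY_PATH (or PATH on Windows) for the library file named in the prerequisites above.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper for illustration only.
final class NativeLibraryCheck {
    /** Returns true if any folder in the separator-delimited list contains fileName. */
    static boolean isOnPath(String pathList, String separator, String fileName) {
        if (pathList == null || pathList.isEmpty()) {
            return false;
        }
        for (String dir : pathList.split(Pattern.quote(separator))) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        String variable = windows ? "PATH" : "LD_LIBRARY_PATH";
        String fileName = windows ? "mdAddr.dll" : "libmdAddr.so";
        boolean found = isOnPath(System.getenv(variable), File.pathSeparator, fileName);
        System.out.println(fileName + (found ? " found via " : " not found via ") + variable);
    }
}
```

If the check reports that the file is not found, correct the variable before launching the Job, since the UnsatisfiedLinkError mentioned above is the likely outcome otherwise.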

Setting up the Job

  1. Drop the following components from the Palette onto the design workspace: tFixedFlowInput, tMelissaDataAddress and tLogRow.
  2. Connect the three components together using Row > Main connections.

Configuring the input component

  1. Double-click tFixedFlowInput to open its Basic settings view in the
    Component tab.

    tMelissaDataAddress_2.png

  2. Click Edit schema to make changes to the
    schema.
  3. Click the [+] button to add the columns that will hold
    the address data to your input schema.

    For this example, add:

    • input_company
    • input_address1
    • input_address2
    • input_city
    • input_state
    • input_postal

    tMelissaDataAddress_3.png

  4. Click OK.
  5. In the Number of rows field, set the number of rows as 1.
  6. In the Mode area, select the Use Inline Content (delimited file) option,
    and set the row and field separators in the corresponding fields.
  7. In the Content table, enter the address data you want to analyze.

    For example:

Configuring the tMelissaDataAddress component

  1. Double-click tMelissaDataAddress to display the Basic settings view and
    define the component properties.

    tMelissaDataAddress_4.png

  2. Click Sync columns to retrieve the schema
    from the preceding component.
  3. Click the Edit schema
    button to view the input and output schema and edit the output schema, if
    necessary.

    tMelissaDataAddress_5.png

    Read-only columns are added to the output schema:

    • COMPANY_STANDARDIZED returns the standard company name.
    • ADDRESSLINE1_STANDARDIZED returns the first line of the street
      address.
    • ADDRESSLINE2_STANDARDIZED returns the second line of the street
      address.
    • CITY_STANDARDIZED returns the standard city name.
    • STATE_STANDARDIZED returns a two-letter abbreviation for the
      state name.
    • POSTAL_STANDARDIZED returns the postal ZIP code.
    • COUNTRY_STANDARDIZED returns a two-letter abbreviation for the
      country name.
    • RESULTS_CODE returns verification codes.

  4. Click OK to close the dialog box.
  5. In the Input Address
    table:

    1. Use the [+] button to add lines in the
      table.
    2. Click in the Address Field column and select
      from the predefined list the fields that hold the input address
      data.

      The component will map the values of these fields to the input
      columns you set in this table.

    3. Click in the Input Column column and select from
      the list the columns from the input schema that hold the input address
      data you want to parse.
  6. In the Output Address
    table, you can define additional address fields:

    1. Use the [+] button to add lines in the table.
      These lines will hold the extra information you want to retrieve from
      Melissa Data, such as the Address Key, the country name or longitude and
      latitude data.
    2. Click in the Address Field column and select
      from the predefined list the fields that hold the output address
      data.

      The component will map the values of these fields to the output
      columns you set in this table.

    3. Click in the Output Column column and select
      from the list the columns from the output schema that will hold the
      extra information.

      If you click Sync Columns after adding columns
      to the output schema, they are removed.

  7. In the Specify your MelissaData license field, set the license key
    provided by Melissa Data when you order the Data Quality Suite or the
    Address Object API.

    If the license key you entered is not correct, you can use GeoCoder in demo
    mode.

  8. In the Specify your MelissaData DataFile folder field, set the path to
    the Melissa Data data folder provided by Melissa Data.
  9. In the Advanced settings view of the component, select the license you
    purchased.

    If you have not purchased a GeoPoint or a GeoCode license, select
    No Melissa GeoCoder License Was Purchased to run
    the Job. Note that you will not be able to retrieve latitude and longitude
    data and GeoCode result codes.

Saving and executing the Job

Save your Job and press F6 to execute it.

tMelissaDataAddress reads the input address rows, corrects and formats the
addresses, and returns the results as standardized address output rows.

tMelissaDataAddress_6.png

In addition to verifying and standardizing an address,
tMelissaDataAddress will also match street names against a
ZIP code, match geographic data to ZIP code and city information and finally parse
street addresses and return all these results via different output columns. This
example shows only some of the output columns written by the
tMelissaDataAddress component:

  • GetAddressKey returns the Address Key.
  • GetCountyName returns the county names.
  • GetTimeZone returns the time zone.
  • GetLongitude returns the longitude data.
  • GetLatitude returns the latitude data.
  • GeoCodeResult returns the GeoCode result codes.
  • The output standard columns return the standard company name, up to two
    street address lines, the standard city name, two-letter abbreviation for
    the state name, the postal ZIP code, and two-letter abbreviation for the
    country name.
  • The RESULTS_CODE output column returns verification codes for each of the
    processed address rows. These codes are written in comma-delimited
    lists. Each code consists of two letters followed by two numbers. These
    codes indicate different statuses and errors. For example, the
    AC02 code means that the state name is corrected
    based on the combination of city name and ZIP code, and the
    AS01 code means that the street address is valid
    and deliverable.

For a complete list of the result codes and for further information about all the
output columns, visit http://www.melissadata.com/.


Document get from Talend https://help.talend.com