August 17, 2023

tBigQueryInput – Docs for ESB 5.x

tBigQueryInput

tBigQueryInput_icon32_white.png

Warning

This component will be available in the Palette of the studio on the condition that you have subscribed to
one of the Talend solutions with Big
Data.

tBigQueryInput Properties

Component family

Big Data / Google BigQuery

 

Function

This component connects to Google BigQuery and performs queries in
it.

Purpose

This component performed the queries supported by Google
BigQuery.

Basic settings

Schema and Edit
Schema

A schema is a row description. It defines the number of fields to be processed and passed on
to the next component. The schema is either Built-In or
stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are
available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the
current schema is of the Repository type, three options are
available:

  • View schema: choose this option to view the
    schema only.

  • Change to built-in property: choose this option
    to change the schema to Built-in for local
    changes.

  • Update repository connection: choose this option to change
    the schema stored in the repository and decide whether to propagate the changes to
    all the Jobs upon completion. If you just want to propagate the changes to the
    current Job, you can select No upon completion and
    choose this schema metadata again in the [Repository
    Content]
    window.

 

 

Built-In: You create and store the schema locally for this
component only. Related topic: see Talend Studio
User Guide.

 

 

Repository: You have already created the schema and
stored it in the Repository. You can reuse it in various projects and Job designs. Related
topic: see Talend Studio User Guide.

Connection

Client ID and Client
secret

Paste the client ID and the client secret, both created and viewable on the API Access tab
view of the project hosting the BigQuery service and the Cloud Storage service you need to
use.

To enter the client secret, click the […] button next
to the client secret field, and then in the pop-up dialog box enter the client secret
between double quotes and click OK to save the
settings.

 

Project ID

Paste the ID of the project hosting the BigQuery service you need to use.

The default ID of this project can be found in the URL of the Google API Console, or by
hovering your mouse pointer over the name of the project in the BigQuery Browser
Tool.

 

Authorization code

Paste the authorization code provided by Google for the access you are building.

To obtain the authorization code, you need to execute the Job using this component and
when this Job pauses execution to print out an URL address, you navigate to this address to
copy the authorization code displayed.

 

Query

Enter the query you need to use.

Advanced settings

token properties File Name

Enter the path to, or browse to the refresh token file you need to use.

At the first Job execution using the Authorization code
you have obtained from Google BigQuery, the value in this field is the directory and the
name of that refresh token file to be created and used; if that token file has been created
and you need to reuse it, you have to specify its directory and file name in this
field.

With only the token file name entered, Talend Studio considers the directory of that token file
to be the root of the Studio folder.

For further information about the refresh token, see the manual of Google BigQuery.

 

Advanced Separator (for number)

Select this check box to change the separator used for the
numbers.

 

Encoding

Select the encoding from the list or select Custom and
define it manually. This field is compulsory for database data handling.

 

tStatCatcher Statistics

Select this check box to collect the log data at the component
level.

Global Variables

ERROR_MESSAGE: the error message generated by the
component when an error occurs. This is an After variable and it returns a string. This
variable functions only if the Die on error check box is
cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable
functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl +
Space
to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio
User Guide.

Usage

This is an input component. It sends the extracted data to the
component that follows it.

Log4j

The activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User
Guide
.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

N/A

Scenario: Performing a query in BigQuery

This scenario uses two components to perform the SELECT query in BigQuery and present
the result in the Studio.

use_case-tbigqueryinput1.png

The following figure shows the schema of the table, UScustomer, we use as example to perform the SELECT query in.

use_case-tbigqueryinput2.png

We will select the State records and count the occurrence of each State among those
records.

Linking the components

  1. In the Integration perspective
    of Studio, create an empty Job, named BigQueryInput for example, from the
    Job Designs
    node in the Repository tree view.

    For further information about how to create a Job, see the Talend Studio User
    Guide.

  2. Drop tBigQueryInput and tLogRow onto the workspace.

  3. Connect them using the Row > Main
    link.

Creating the query

Building access to BigQuery

  1. Double-click tBigQueryInput to open its
    Component view.

    use_case-tbigqueryinput3.png
  2. Click Edit schema to open the
    editor

    use_case-tbigqueryinput4.png
  3. Click the Button_Plus.png button twice to add two rows and enter the names of
    your choice for each of them in the Column
    column. In this scenario, they are: States and Count.

  4. Click OK to validate these changes and
    accept the propagation prompted by the pop-up dialog box.

  5. Navigate to the Google APIs Console in your web browser to access the
    Google project hosting the BigQuery and the Cloud Storage services you need
    to use.

  6. Click the API Access tab to open its view.

  7. In the Component view of the Studio,
    paste Client ID, Client secret and Project ID from the API Access tab view
    to the corresponding fields, respectively.

Getting Authorization code

  1. In the Run view of the Studio, click
    Run to execute this Job. The execution
    will pause at a given moment to print out in the console the URL address
    used to get the authorization code.

  2. Navigate to this address in your web browser and copy the authorization
    code displayed.

  3. In the Component view of tBigQueryInput, paste the authorization code in
    the Authorization Code field.

Writing the query

  • In the Query field, enter

Executing the Job

The tLogRow component presents the execution
result of the Job. You can configure the presentation mode on its Component view.

To do this, double-click tLogRow to open the
Component view and in the Mode area, select the Table (print values in
cells of a table)
option.

  • To execute this Job, press F6.

Once done, the Run view is opened automatically,
where you can check the execution result.

use_case-tbigqueryinput-result.png

Document get from Talend https://help.talend.com
Thank you for watching.
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x