USE CASE

Use the Talend random data generator to load records into a CUSTOMERS table on the Netezza emulator, using Talend’s Netezza-specific functions for the load.

BUSINESS CASE

Recently, a client wanted to compare some of the leading data integration tools on the market for their ability to integrate with Netezza. Because the client was coming from an Oracle solution, most of their existing code was written for singleton (row-at-a-time) transactions, and they needed to take advantage of an MPP platform’s ability to bulk load and bulk transform their data sets. Since Talend’s data integration tool contains a library of Netezza-specific functions, it was chosen as part of the evaluation.
One challenge for the evaluation was that the client did not yet have a Netezza server in house, as they were still in their acquisition phase. To evaluate Talend’s built-in Netezza functions, we used the virtual Netezza emulator along with version 5.6 of Talend.

PRE-REQUISITES

Software Versions and Hardware:

  • Talend Open Studio for Big Data version 5.6
  • IBM Netezza Emulator 7.1.0
  • Windows 8.1
  • Laptop with 16GB RAM and 20GB of spare SSD storage

Software Configuration:
Before you begin, you will need an environment configured with the emulator and Talend.

  1. Download and configure the Netezza Emulator from IBM’s site
  2. Download and install Talend 5.6
  3. Download the Netezza JDBC driver and copy it to ${TALEND_HOME}/configuration/lib/java
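
Once the emulator and driver are in place, it can help to confirm that the emulator is reachable over JDBC before building anything in Talend. The following is a minimal sketch of such a check, not part of the original setup. It assumes the driver class `org.netezza.Driver` and the default Netezza port 5480; the host, database, user, and password shown are placeholder assumptions to replace with your emulator's values.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Connection smoke test for the emulator (illustrative sketch only).
class NetezzaConnectionCheck {

    // Netezza JDBC URLs have the form jdbc:netezza://<host>:<port>/<database>.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:netezza://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder assumptions: adjust to your emulator's address and login.
        String url = jdbcUrl("localhost", 5480, "SYSTEM");
        String user = "admin";
        String password = "password";

        try {
            // Explicit load, in case the driver jar does not self-register.
            Class.forName("org.netezza.Driver");
        } catch (ClassNotFoundException e) {
            System.err.println("Netezza JDBC driver is not on the classpath");
            return;
        }

        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            if (rs.next()) {
                System.out.println("Connected to " + url);
            }
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Run it with the Netezza driver jar on the classpath. The same host, port, database, and credentials are what the Talend connection wizard asks for later.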

 

CREATE THE NETEZZA TABLE

On Netezza, create the CUSTOMERS table

  1. Start the emulator
  2. Use your favorite SQL tool to create a simple CUSTOMERS table
     CREATE TABLE customers (
       custid INT
       ,custfirstname VARCHAR(50)
       ,custlastname VARCHAR(50)
       ,custage SMALLINT
       ,custstreet VARCHAR(100)
       ,custcity VARCHAR(100)
       ,custstate VARCHAR(100)
     ) DISTRIBUTE ON RANDOM;

 

TALEND DEVELOPMENT

  1. Step 1: In Talend, create a new project and job
    1. Create the project
    2. Create a new job
  2. Step 2: In Talend, create a connection to Netezza
    1. Navigate to Metadata > DB Connections > Create Connection
    2. Configure the connection
  3. Step 3: In Talend, create a data generator with tRowGenerator
    1. Drag tRowGenerator to the canvas
    2. Create fields that match the columns of the CUSTOMERS table in Netezza and choose an appropriate random data function for each field
  4. Step 4: In Talend, map the tRowGenerator to a Netezza table
    1. Bring in a tMap and tNetezzaOutput to the job
    2. Configure tNetezzaOutput to point to the Netezza CUSTOMERS table
    3. Connect tRowGenerator to tMap and tMap to tNetezzaOutput
    4. Configure the tMap with direct mappings
    5. Validate that the NetezzaCustomerGenerator job has no errors
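
Conceptually, the job built in the steps above generates rows shaped like the CUSTOMERS table and writes them to Netezza over JDBC. The sketch below is a plain-Java stand-in for that flow and is not the code Talend generates. The class, the value pools, the age range, and the batch size parameter are all hypothetical, chosen only to illustrate the idea of generating rows and sending them as batched inserts.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Stand-in for the tRowGenerator -> tMap -> tNetezzaOutput flow (hypothetical names).
class CustomerLoadSketch {

    // One row shaped like the CUSTOMERS table.
    static final class Customer {
        final int custId;
        final String firstName, lastName, street, city, state;
        final short age;

        Customer(int custId, String firstName, String lastName, short age,
                 String street, String city, String state) {
            this.custId = custId;
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
            this.street = street;
            this.city = city;
            this.state = state;
        }
    }

    // Small illustrative pools standing in for Talend's random data functions.
    private static final String[] FIRST = {"Ann", "Raj", "Maria", "Chen", "Omar"};
    private static final String[] LAST = {"Smith", "Patel", "Garcia", "Li", "Haddad"};
    private static final String[] STREET = {"Main St", "Oak Ave", "Pine Rd"};
    private static final String[] CITY = {"Austin", "Denver", "Columbus"};
    private static final String[] STATE = {"Texas", "Colorado", "Ohio"};

    private static String pick(Random rnd, String[] pool) {
        return pool[rnd.nextInt(pool.length)];
    }

    // Generates `count` customers with sequential ids starting at 1.
    static List<Customer> generate(int count, long seed) {
        Random rnd = new Random(seed);
        List<Customer> rows = new ArrayList<>(count);
        for (int id = 1; id <= count; id++) {
            short age = (short) (18 + rnd.nextInt(73)); // 18..90 inclusive
            String street = (1 + rnd.nextInt(9999)) + " " + pick(rnd, STREET);
            rows.add(new Customer(id, pick(rnd, FIRST), pick(rnd, LAST), age,
                    street, pick(rnd, CITY), pick(rnd, STATE)));
        }
        return rows;
    }

    // Sends rows as batched parameterized INSERTs; returns the number of rows sent.
    static int load(Connection conn, List<Customer> rows, int batchSize) throws SQLException {
        String sql = "INSERT INTO customers (custid, custfirstname, custlastname, "
                + "custage, custstreet, custcity, custstate) VALUES (?, ?, ?, ?, ?, ?, ?)";
        int sent = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Customer c : rows) {
                ps.setInt(1, c.custId);
                ps.setString(2, c.firstName);
                ps.setString(3, c.lastName);
                ps.setShort(4, c.age);
                ps.setString(5, c.street);
                ps.setString(6, c.city);
                ps.setString(7, c.state);
                ps.addBatch();
                if (++sent % batchSize == 0) {
                    ps.executeBatch();
                }
            }
            if (sent % batchSize != 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
        return sent;
    }
}
```

Given an open `java.sql.Connection` named `conn`, a call such as `CustomerLoadSketch.load(conn, CustomerLoadSketch.generate(100, 42L), 50)` would send 100 rows in two batches. In the Talend job, the direct tMap mappings play the role of the one-to-one field assignments in `load`.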

TESTING

  1. Step 1: In Talend, run the data generator job
    1. Click the Talend run button

 

DATA VALIDATION

  1. Step 1: In Netezza, validate the records
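
One way to validate the load is to compare the row count in CUSTOMERS with the number of rows the generator was configured to produce, and to check that no row is missing its key. The sketch below shows this as a small JDBC helper. The class and method names are hypothetical, and `expectedRows` assumes the table was empty before the job ran.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Post-load checks against the CUSTOMERS table (illustrative sketch only).
class CustomerValidationSketch {

    static final String COUNT_ROWS = "SELECT COUNT(*) FROM customers";
    static final String COUNT_NULL_IDS =
            "SELECT COUNT(*) FROM customers WHERE custid IS NULL";

    // Runs a query that returns a single numeric value.
    static long scalar(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    // The load looks complete when every expected row arrived with a custid.
    static boolean loadLooksComplete(long rowCount, long nullIds, long expectedRows) {
        return rowCount == expectedRows && nullIds == 0;
    }

    // Runs both checks on an open connection and reports the counts.
    static boolean validate(Connection conn, long expectedRows) throws SQLException {
        long rowCount = scalar(conn, COUNT_ROWS);
        long nullIds = scalar(conn, COUNT_NULL_IDS);
        System.out.println("rows=" + rowCount + ", null custid=" + nullIds);
        return loadLooksComplete(rowCount, nullIds, expectedRows);
    }
}
```

The two query strings can also be pasted directly into your SQL tool, and `SELECT * FROM customers LIMIT 10;` gives a quick look at a sample of the generated values.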

 

SUMMARY

Using Talend’s built-in Netezza functionality along with the Netezza emulator, we demonstrated how easy it is to develop a Talend job, point it at Netezza, and load randomly generated customer records.