USE CASE
Use the Talend random data generator (tRowGenerator) to populate a CUSTOMERS table residing on the Netezza emulator, and use Talend’s Netezza-specific functions to load the records into Netezza.
BUSINESS CASE
Recently, a client wanted to compare some of the leading data integration tools on the market for their ability to integrate with Netezza. Because the client was coming from an Oracle solution, most of its existing code was written for singleton (row-at-a-time) transactions, and it needed to take advantage of the ability of massively parallel processing (MPP) to bulk load and bulk transform data sets. Talend’s data integration tool contains a library of Netezza-specific functions, so it was chosen as part of the evaluation.
One challenge for the evaluation was that the client did not yet have a Netezza server in house, as it was still in the acquisition phase. To evaluate Talend’s built-in Netezza functions, we used the virtual Netezza emulator along with version 5.6 of Talend.
PRE-REQUISITES
Software Versions and Hardware:
- Talend Open Studio for Big Data version 5.6
- IBM Netezza Emulator 7.1.0
- Windows 8.1
- Laptop with 16GB RAM and 20GB of spare SSD storage
Software Configuration
Before you begin, you will need an environment configured with the emulator and Talend.
- Download and configure the Netezza Emulator from IBM’s site
- Download and install Talend 5.6
- Download the Netezza JDBC driver and copy it to ${TALEND_HOME}/configuration/lib/java
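Once the driver is in place, a quick connectivity check outside Talend can confirm that the emulator is reachable before any job is built. The following is a minimal sketch, not part of the original walkthrough. It assumes the Netezza JDBC driver class org.netezza.Driver and the default port 5480; the host address, database name, and credentials are placeholders to replace with your emulator's values, and the class name NetezzaConnectionCheck is hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class NetezzaConnectionCheck {

    // Builds a Netezza JDBC URL of the form jdbc:netezza://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:netezza://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions): substitute your emulator's address and login.
        String url = jdbcUrl("192.168.1.100", 5480, "SYSTEM");
        try {
            // Fails fast if the driver jar is not on the classpath.
            Class.forName("org.netezza.Driver");
            try (Connection conn = DriverManager.getConnection(url, "admin", "password");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT CURRENT_TIMESTAMP")) {
                if (rs.next()) {
                    System.out.println("Connected; server time: " + rs.getString(1));
                }
            }
        } catch (ClassNotFoundException e) {
            System.err.println("Netezza JDBC driver not found on the classpath");
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

To try it, compile the class and run it with the Netezza driver jar on the classpath. If it prints a server time, the same host, port, database, and credentials should work in the Talend connection.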
CREATE THE NETEZZA TABLE
On Netezza, create the CUSTOMERS table:
- Start the emulator
- Use your favorite SQL tool to create a simple CUSTOMERS table
CREATE TABLE customers (
     custid        INT
    ,custfirstname VARCHAR(50)
    ,custlastname  VARCHAR(50)
    ,custage       SMALLINT
    ,custstreet    VARCHAR(100)
    ,custcity      VARCHAR(100)
    ,custstate     VARCHAR(100)
)
DISTRIBUTE ON RANDOM;
TALEND DEVELOPMENT
- Step 1: In Talend, create a new project and job
- Step 2: In Talend, create a connection to Netezza
- Step 3: In Talend, create a data generator with tRowGenerator
- Step 4: In Talend, map the tRowGenerator to a Netezza table
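Talend generates Java behind the scenes for each job, so it can help to see roughly what steps 3 and 4 amount to in plain Java. The sketch below is an illustration only and is not the code Talend produces. It generates random rows shaped like the CUSTOMERS table and binds them to a parameterized INSERT sent as a single JDBC batch. The class name CustomerRowSketch, the sample name and city lists, and the 18 to 80 age range are all hypothetical; tRowGenerator provides its own data routines.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class CustomerRowSketch {

    // One generated row, with one field per column of the CUSTOMERS table.
    static final class Customer {
        final int custId;
        final String custFirstName;
        final String custLastName;
        final short custAge;
        final String custStreet;
        final String custCity;
        final String custState;

        Customer(int custId, String custFirstName, String custLastName, short custAge,
                 String custStreet, String custCity, String custState) {
            this.custId = custId;
            this.custFirstName = custFirstName;
            this.custLastName = custLastName;
            this.custAge = custAge;
            this.custStreet = custStreet;
            this.custCity = custCity;
            this.custState = custState;
        }
    }

    // Hypothetical sample values, used only for this illustration.
    static final String[] FIRST_NAMES = {"Ana", "Ben", "Chloe", "Dev", "Emma"};
    static final String[] LAST_NAMES = {"Garcia", "Johnson", "Lee", "Patel", "Smith"};
    static final String[] STREETS = {"Main St", "Oak Ave", "Pine Rd", "Elm Dr"};
    static final String[] CITIES = {"Austin", "Boston", "Denver", "Seattle"};
    static final String[] STATES = {"Texas", "Massachusetts", "Colorado", "Washington"};

    // Column order matches the CREATE TABLE statement.
    static final String INSERT_SQL =
        "INSERT INTO customers (custid, custfirstname, custlastname, custage, "
        + "custstreet, custcity, custstate) VALUES (?, ?, ?, ?, ?, ?, ?)";

    static String pick(Random rnd, String[] values) {
        return values[rnd.nextInt(values.length)];
    }

    // Generates `count` rows with sequential ids; a fixed seed makes runs repeatable.
    // City and state are picked independently, so they will not always agree.
    static List<Customer> generate(int count, long seed) {
        Random rnd = new Random(seed);
        List<Customer> rows = new ArrayList<>();
        for (int id = 1; id <= count; id++) {
            short age = (short) (18 + rnd.nextInt(63)); // 18..80 inclusive
            String street = (1 + rnd.nextInt(9999)) + " " + pick(rnd, STREETS);
            rows.add(new Customer(id, pick(rnd, FIRST_NAMES), pick(rnd, LAST_NAMES),
                                  age, street, pick(rnd, CITIES), pick(rnd, STATES)));
        }
        return rows;
    }

    // Binds each row to the INSERT and sends all rows to the database as one batch.
    static void load(Connection conn, List<Customer> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Customer c : rows) {
                ps.setInt(1, c.custId);
                ps.setString(2, c.custFirstName);
                ps.setString(3, c.custLastName);
                ps.setShort(4, c.custAge);
                ps.setString(5, c.custStreet);
                ps.setString(6, c.custCity);
                ps.setString(7, c.custState);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }

    public static void main(String[] args) {
        // Print a few generated rows without touching the database.
        for (Customer c : generate(5, 42L)) {
            System.out.println(c.custId + ", " + c.custFirstName + " " + c.custLastName
                + ", " + c.custAge + ", " + c.custStreet + ", " + c.custCity + ", " + c.custState);
        }
    }
}
```

Running main only prints sample rows; calling load requires an open JDBC connection to the emulator. A batched INSERT keeps the sketch short, but it is still row-oriented, which is the pattern the client wanted to move away from, so for large volumes a bulk-load path is the better fit.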
TESTING
DATA VALIDATION
SUMMARY
Using Talend’s built-in Netezza functionality along with the Netezza emulator, we were able to demonstrate how easy it is to develop a Talend job, point it at Netezza, and load the CUSTOMERS table with randomly generated customer records.