Talend Data Preparation for Developers

Talend Data Preparation is a self-service application that enables information workers to prepare data for analysis and other data-driven tasks. This course helps you start using the Talend Data Preparation web interface right away.

You learn how to create datasets and preparations that deliver cleansed, structured, and enriched data to business users. You also learn how to use Talend Studio to execute preparations and create datasets in DI Jobs.

Target audience
Data owners, DI developers, and administrators who want to deploy, manage, and deliver ready-to-use data to business users
Prerequisites
Completion of Talend Data Integration Basics and a fundamental understanding of administrative tasks
Course objectives
After completing this course, you will be able to:
  • Use Talend Administration Center (TAC) to configure Data Preparation users and manage tasks
  • Create and share datasets and preparations
  • Handle large data volumes in Data Preparation
  • Use Talend Dictionary Service to associate data with standard semantic types and to create new semantic types
  • Execute a user-defined data preparation in a Talend Job
  • Design and publish live and batch data flows as datasets to authorized users
Course agenda

Data Preparation in context

  • Concepts and purpose

Getting started

  • Exploring the environment
  • Creating users and groups in TAC
  • Connecting to Talend Data Preparation

Creating a data preparation

  • Creating a data preparation and related dataset
  • Adding a join to a data preparation
  • Promoting the preparation

Working with large data volumes

  • Creating a dataset from a database
  • Using selective sampling
  • Exporting preparations

Using Talend Dictionary Service

  • Discovering Talend Dictionary Service
  • Creating a dictionary semantic type
  • Creating a regular expression semantic type
  • Creating a compound semantic type

Using DI for Data Preparation

  • Publishing a dataset to Data Preparation
  • Executing a preparation in Talend Studio

Implementing a live dataset

  • Implementing Live Dataset mode in Studio
  • Deploying a Job in TAC
  • Creating a dataset from a Talend Job