Sample catalogs for the extract, transform, and load (ETL) process

To extract data from a data source, you must know the structure of the data, create an extract, transform, and load (ETL) catalog that reproduces this structure in IBM® Cognos® Data Manager, and then set up the processes that transform the data into the star schema or metadata format that is required by the data mart or IBM Cognos Framework Manager. To simplify this process, IBM Rational® Insight provides sample catalogs for extracting data from the data services, loading the data into the operational data store in the IBM Rational Insight data warehouse, and building the conformed data marts. This section is an overview of these sample catalogs.

An IBM Cognos Data Manager catalog is a central repository that describes how information is extracted from external sources, transformed, and loaded.

The Rational Insight sample catalogs facilitate the extraction, transformation, and loading of data from supported data sources, such as IBM Rational RequisitePro.

These sample catalogs contain predefined job streams and builds that map data extracted from these sources into corresponding tables, and transform these tables into facts and dimensions. Every sample catalog contains predefined libraries, builds, and metadata.
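
As a rough illustration of that fact-and-dimension transformation, the following Python sketch splits flat extracted rows into a dimension table with generated surrogate keys and a fact table that references it. The row shape, the field names, and the split_into_star function are assumptions made for illustration; they are not part of the Rational Insight catalogs or schemas.

    # Illustrative only: split flat rows into a dimension (with generated
    # surrogate keys) and a fact table that references it.
    def split_into_star(rows):
        dimension = {}  # natural key (project name) -> surrogate key
        facts = []
        for row in rows:
            key = dimension.setdefault(row["project"], len(dimension) + 1)
            facts.append({"project_key": key, "requirement_id": row["req_id"]})
        return dimension, facts

    dim, facts = split_into_star([
        {"project": "Payroll", "req_id": "R1"},
        {"project": "Payroll", "req_id": "R2"},
        {"project": "Billing", "req_id": "R3"},
    ])
    # dim == {"Payroll": 1, "Billing": 2}; each fact row carries project_key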

Example

Consider the sample ETL build that is used with RequisitePro to extract requirements management data. The build loads data from the source as follows (a sketch of this mapping appears after the list):
  • All types of requirements into the REQUIREMENT table of the operational data store
  • Requirement type information into the REQUIREMENT_CLASSIFICATION table
  • Requirement revision data into the REQUIREMENT_REVISION_HISTORY table
  • Requirement parent-child relationships into the REQUIREMENT_REQUIREMENT_LOOKUP table
  • Requirement traceability data into the REQUIREMENT_HIERARCHY table
  • Project data into the PROJECT table
  • User data into the RESOURCE table
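
The following Python sketch summarizes this source-to-table mapping. The target table names come from the list above; the entity names and the extract and load callables are hypothetical placeholders, not IBM Cognos Data Manager APIs.

    # Target table names match the list above; the entity names and the
    # extract/load callables are illustrative assumptions.
    SOURCE_TO_ODS_TABLE = {
        "requirements": "REQUIREMENT",
        "requirement_types": "REQUIREMENT_CLASSIFICATION",
        "requirement_revisions": "REQUIREMENT_REVISION_HISTORY",
        "parent_child_links": "REQUIREMENT_REQUIREMENT_LOOKUP",
        "traceability_links": "REQUIREMENT_HIERARCHY",
        "projects": "PROJECT",
        "users": "RESOURCE",
    }

    def load_all(extract, load):
        # extract(entity) yields source rows; load(table, rows) writes them
        # to the operational data store. Both stand in for the connectivity
        # that your environment provides.
        for entity, table in SOURCE_TO_ODS_TABLE.items():
            load(table, extract(entity))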
The sample ETL model for IBM Rational RequisitePro also contains predefined ETL job streams that organize jobs into predefined categories. A job stream is defined for each external data source, and each job stream can be scheduled independently of the others.
The OOTBE2E job is the top-level job that defines the overall flow. It runs the jobs that load data from the external operational data sources into their corresponding tables, and then runs the job that builds the data mart in the data warehouse. The Build_Star job populates the data mart. It does the following (see the sketch after this list):
  • Starts the Dimensions job to populate the dimension tables
  • Starts the Facts job to populate the fact tables
  • Starts the HistoricalLookups job to populate the historical lookup tables
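
The following Python sketch shows the control flow that this section describes. The job names OOTBE2E, Build_Star, Dimensions, Facts, and HistoricalLookups come from this section; the run_job helper and the Load_RequisitePro job name are illustrative assumptions, not Data Manager APIs.

    # Control-flow sketch only: run_job stands in for starting a real
    # Data Manager job.
    def run_job(name):
        print(f"running {name}")

    def build_star():
        # Build_Star populates the data mart in dependency order.
        run_job("Dimensions")         # dimension tables first
        run_job("Facts")              # then the fact tables that reference them
        run_job("HistoricalLookups")  # then the historical lookup tables

    def ootb_e2e(source_load_jobs):
        # OOTBE2E loads each external operational data source, then builds
        # the data mart.
        for job in source_load_jobs:
            run_job(job)
        build_star()

    ootb_e2e(["Load_RequisitePro"])  # assumed job name for illustration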
You can customize these builds and job streams to suit your business situation and your implementation of these products.
