
Lesson 1.2: Modify an existing Data Manager build

In this lesson, you will learn how to customize an existing Data Manager build to extract data, transform it, and load it into the data warehouse.
Prerequisites:

In lesson 1.1, you modified the Rational® ClearQuest® data configuration file by adding new attributes to a data table. These attributes now need to be put into the build stream so that they can be exposed in reports.

In this lesson, you will modify the Data Manager build so that the new attributes are included in the data collection process: you will extract the new attributes from the data source, transform them, and deliver them to the data warehouse.

Prepare the ETL catalog
  1. Start IBM® Cognos® Data Manager.
  2. Select Create a new catalog, specify the name to be RationalInsightETL, specify the connection information of the new empty database you created before you began this lesson, and click Finish.
  3. When the new catalog is created and opens, click File > Restore Catalog, and select the OOTBDMCatalog.ctg file. The default location of this file is the [rational_insight_installation_directory]/datamgr/insight_catalog/ directory.
Extract new attributes from the data source
  1. Expand the folder Builds and JobStreams > ODS > ClearQuest > CQEnterprise. Click CQ_ENT_Defect.
    Concept: In Data Manager, the ETL is specified in fact or dimension builds. A fact build delivers fact data and can also deliver dimension data, so you can construct a private data mart from within a single build. A dimension build delivers data to describe a single business dimension, such as a product or a customer. Rational Insight uses fact builds to load data into the consolidated operational data store. Fact and dimension builds are used to build the data marts or star schema. The builds are organized into JobStreams as a series of steps that can be executed in sequence or in parallel in Data Manager.
  2. Double-click the RI_Stg data source icon. Click the Query tab.
    Concept: By using the data source element of the build, you can select a subset of the database required for the build.

    In lesson 1.1, you modified the ClearQuest defect entity to expose the related component ID and status attributes. In the next few steps, you will modify the data source query to include the new attributes.

  3. Expand the ClearQuest node to list the database schemas (modeled as resource groups through XML Data Configuration). Expand the CQEnterprise schema to list all of the data tables that the data configuration file from lesson 1.1 defines in the CQEnterprise resource group. Expand the ENTDefect table to show all of the defined columns.

    You will now modify the existing SQL to retrieve the component_dbid and status columns. Make sure you add a comma to separate the column names.

  4. Drag the ENTDefect columns Status and Component_dbid from the panel at the left to the SQL Query panel at the right. Place the new columns after the existing "href" entry. Make sure that the columns are separated by a comma.
  5. To test the new SQL statement, click the play icon at the lower left of the window. In the window that prompts for variables, type the schema name (CQEnterprise) for the $RESOURCEGROUP entry. Leave the $MODIFIED_SINCE field empty so that the data service uses a null value and returns all records.
  6. Click the Result columns tab and click Refresh. When prompted for variables, specify the same values as in step 5: CQEnterprise for $RESOURCEGROUP and an empty $MODIFIED_SINCE field. The columns defined on this tab are the result columns of the SQL statement that you modified in step 4, and they are used in downstream objects. Click OK.
  7. Right-click DataStream and select Properties. In the DataStream items tab, double-click the new columns in the left panel to add them into the data stream, and click OK.
    Concept: A DataStream object gathers together all the data sources from which a build or reference structure acquires data. This provides a method to combine different data sources into one object. An example of this can be seen in the RequisitePro® builds.
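After steps 4 and 5, the SELECT list of the data source query carries the two new columns after the existing href entry. The following fragment is an illustrative sketch only: the actual query is generated from the data configuration file, and the other column names and the FROM clause shown here are assumptions.

```sql
-- Illustrative fragment; your generated query will differ.
SELECT
    dbid,           -- assumed existing column
    headline,       -- assumed existing column
    href,           -- existing last entry in the tutorial's query
    Status,         -- new: defect state, added in step 4
    Component_dbid  -- new: natural key of the related component
FROM ENTDefect
```

Note that each new column is separated from the previous entry by a comma, as step 4 requires.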
You have now added two new columns to the data source query and added them to the data stream of the build.

Next, you will transform the component_dbid attribute to the surrogate key of the component in the data warehouse.

Transform the attributes
  1. Right-click the Transformation Model icon and click Mapping.
    Concept: The transformation model is central to a fact build. You use it to manipulate the acquired source data in a number of ways, such as merging data from different sources and aggregating data. In this lesson, you will load the new attributes without applying any business rules.
  2. Double-click the Status data stream.
    Concept: You will create a new attribute in the transformation model and map the Data Stream item to the attribute.
  3. Click Add, click Dimension, specify the name to be Lookup_Component, and select the (no reference) child node. Click the button to the right of this node.
  4. Select the node Component > Lookup_Component and click OK.
  5. Return to the mapping window. In the left panel, select the Component_dbid data stream item and in the right panel, select the Lookup_Component=>EXTERNAL_KEY1 attribute, and click Map.
    Concept: In the next few steps, you are building the relationship between the defect and component entity in the data warehouse. The EXTERNAL_KEY1 specifies the natural key (dbid for Rational ClearQuest) for the component. The lookup will return the surrogate key if the component exists in the data warehouse. If the record does not exist, the build will insert the default -1 entry representing "info not available."
  6. Click the Add button and click Attribute. Specify the name to be COMPONENT_ID, and click OK twice.
  7. Right-click the Transformation Model icon and select Show Build Elements. Double-click Lookup_Component, click the Unmatched Members tab, select Accept unmatched member identifiers, and click OK.
  8. In the Transformation Model tab, right-click the COMPONENT_ID attribute and click Convert to derivation.
  9. In the Transformation Model tab, double-click the COMPONENT_ID attribute, click the Calculation tab, and type the following code:
    If NOT Unmatched( Lookup_Component )
    then RETURN Lookup_Component.COMPONENT_ID ;
    else RETURN -1; 
    Click OK.
    Concept: The script uses a double-negative construction (If NOT Unmatched) to express a positive match: when the lookup finds the component, the derivation returns its surrogate key; otherwise, it returns -1.
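The lookup and derivation that you just configured are conceptually similar to a left outer join with a default of -1 for unmatched components. This is a sketch only: Data Manager performs the lookup in memory rather than through SQL, and the COMPONENT table and column names here are assumptions based on the lesson's mapping.

```sql
-- Conceptual equivalent of Lookup_Component plus the COMPONENT_ID derivation:
-- return the component's surrogate key, or -1 when no match exists.
SELECT d.Status,
       COALESCE(c.COMPONENT_ID, -1) AS COMPONENT_ID
FROM ENTDefect d
LEFT OUTER JOIN COMPONENT c
       ON c.EXTERNAL_KEY1 = d.Component_dbid
```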
Deliver the attributes to the target table
  1. Double-click the "{$ODS}"."REQUEST" table delivery.
  2. In the Table Properties tab, specify STATUS and COMPONENT_ID as the column names for the Status and COMPONENT_ID elements. Select the Automatically add columns to table checkbox. Click OK.
    Concept: Because you did not modify the defect table in the data warehouse to add columns for the new attributes, selecting Automatically add columns to table causes the delivery to alter the table if the columns do not exist. Alternatively, you can use Rational Data Architect to modify the table before you run the ETL job, or to update the data model from the data warehouse after the ETL job has modified the table.
  3. Save the catalog.
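The automatic column addition described in step 2 is roughly equivalent to the following DDL, which the delivery would issue only when the columns are missing. The data types here are illustrative assumptions; the actual types derive from the transformation model.

```sql
-- Issued only if the columns do not already exist in the target table.
ALTER TABLE "{$ODS}"."REQUEST" ADD STATUS VARCHAR(64);
ALTER TABLE "{$ODS}"."REQUEST" ADD COMPONENT_ID INTEGER;
```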
You have now modified an existing build.

To add attributes to an existing entity, you modify an existing build, as you did in this lesson. To load a new entity, you create a new build, because the ETL process requires one build for each entity. In the next lesson (Lesson 1.3), you will create a new build.

Lesson checkpoint

You added new attributes to the Data Manager build, specified where they are stored in the data warehouse, and saved the catalog to deploy the changes.
