< Previous | Next >

Lesson 2.2: Customize the default ETL catalog

In this lesson you will learn how to customize the default ETL catalog by creating new dimensions and fact tables. You will add a new delivery module to the Data Manager build and schedule the module for execution.
In this lesson, you will:
Create a reference dimension
  1. Launch IBM® Cognos® Data Manager and load the RationalInsightETL catalog.
  2. Expand the Library node.
  3. Right-click Dimensions and select Insert Reference Dimension. Specify the name to be Component_LabEx.
    Insert a reference dimension
  4. Right-click Component_LabEx, select Insert Hierarchy, specify the name to be Component_LabEx_Dim, and click OK.

    You have now added a new dimension called Component_LabEx_Dim. Next, you will specify where to retrieve the data for the dimension. To do so, you will use a template and specify the columns that you want to load as part of the dimension.

  5. Right-click Component_LabEx_Dim, select Insert Level, specify the name to be Component, and click the Attributes tab.
  6. Click the New button to the right of the Template field, and specify the name to be Component_LabEx_Dim.
  7. Click the Attributes tab and add the following two attribute definitions:
    • COMPONENT_ID, business key, primary key, true
    • NAME, normal
    Click OK.
    Properties of the template
  8. In the Available Attributes section, add COMPONENT_ID and Name as chosen attributes. Select the ID checkbox for the COMPONENT_ID attribute and the Caption checkbox for the Name attribute.
  9. Click the Data Access tab, select Use Template for data access, and specify the following values:
    • For Connection, select Rational Data Warehouse.
    • For Table name, click Browse, and select RIODS -> COMPONENT.
    Click OK.
Create a dimension build
  1. Expand the RationalInsightETL node to list the dimensions under Builds and JobStreams.
  2. Right-click Dimensions, select Insert Dimension Build, type D_COMPONENT_LABEX, and click the Dimension tab.
  3. In the Dimension to be delivered section, select Component_LabEx.
  4. In the Hierarchy/Lookup to be delivered section, select Component_LabEx_Dim(H).
  5. In the Deliver into database section, select Rational Data Warehouse. Click OK.
    Properties of the dimension build
  6. Right-click D_COMPONENT_LABEX, select Insert Table, click table icon, and select the table RIDW > D_COMPONENT_LABEX.
  7. Click the Columns tab. In the Use template section, select Component_LabEx_Dim
  8. In the Available Sources section, expand the node Component_LabEx_Dim > Component. Drag and drop COMPONENT_ID (id) to the Sourced From column for COMPONENT_ID. Drag and drop NAME (Caption) to the Sourced From column for NAME. Click OK.
Create a fact build
  1. Expand the RationalInsightETL node to list the tables under Builds and JobStreams > Facts.
  2. Select Fact, right-click, and click Insert Fact Build. Specify the name of the new build to be F_DEFECT_METRICS and click OK.
  3. Right-click the newly created F_DEFECT_METRICS fact build node and click Properties. Click the Input tab. From the Duplicate Key Handling list, select Merge Records with duplicate keys. Click OK.
    Properties of the fact build
    Concept: The Merge Records with duplicate keys option is equivalent to the GROUP BY clause in SQL. You are using this option so that you can aggregate the rows or else, aggregation on delivery will not work and you might get as many rows as the number of input rows.
  4. Expand the F_DEFECT_METRICS build, right-click the Data Stream node, and click Insert Data Source. Specify the name of the new data source to be RI_DW.
  5. Click the Query tab. From the database list, select the Rational Data Warehouse connection.
  6. Expand the Rational Data Warehouse node to list the tables under the RIODS schema and select REQUEST.
    Concept: RIODS schema is the operational data store.
  7. In the Query panel, add the following SQL code:
    SELECT "REQUEST_ID",
    {$ETL_DATEID} AS DATE_ID, COMPONENT_ID
    FROM "RIODS"."REQUEST"
    WHERE "REQUEST_CLASS_ID"=1 
    SQL query for fact build
    Concept: {$ETL_DATEID} is a variable defined at the master job level and its value is also determined at the top level. The variable points to a record in the D_DATE dimension. In this case, the date associated in the fact table reflects an ETL collection date. See the variables declared in the Build_Star job.
  8. Click the Result columns tab, select Prepare, and click Refresh. Specify any integer value for ETL_DATEID. Click OK twice.

    You have now created a new build named F_DEFECT_METRICS, and added the SQL code to specify which records to fetch from the data source.

  9. Right-click the DataStream icon, and click Properties.
  10. Click Automap, and click OK.
  11. Right-click the Transformational Model icon icon for transformation model and select Mapping.
  12. Click Add, select Attribute, and specify the name of the new attribute to be DATE_ID. Again click Add, select Attribute, and specify the name of the new attribute to be COMPONENT_ID.
  13. Click Add, select Measure, and specify the name of the new measure to be TOTAL_DEFECTS. Click OK twice. Right-click the measure and click Properties. In the Aggregation tab, select SUM from the list. In the Merge tab, select COUNT from the list. Click OK.
    Properties of the measure
  14. Expand the F_DEFECT_METRICS fact build, right-click Transformational Model, and click Insert dimension. In the General tab, specify the name to be Component_Dim. Click the Reference Tab and in the Dimension list, select the Component_LabEx dimension. From the Structure list, select the Component_LabEx(H) hierarchy. Select the output box. In the Unmatched Member tab, select the Accept unmatched member identifiers box. Click OK.
    Concept: With the Accept unmatched member identifiers option, you are accepting rows from the input that do not match a component in D_COMPONENT_LABEX. You can then treat such rows upon delivery (such as specifying a default value if the row is null) or through more derivations in the transformation that have unmatched members.
  15. Click the Transformation Model icon icon for transformation model, expand the nodes in the mapping window, and drag the following items to the Maps To column:
    • DATE_ID to DATE_ID
    • TOTAL_DEFECTS to REQUEST_ID
    • COMPONENT_ID to COMPONENT_ID
    Click OK.

    You have now created a data stream for each element of the data source. You have mapped the DATE_ID element as an attribute, and the TOTAL_DEFECTS element as a measure to contain the count of the attribute element. Next, you will define a fact delivery to specify the location in the data warehouse where this information will be inserted.

  16. Expand the F_DEFECT_METRICS build to list the delivery modules under Delivery Modules.
  17. Select Fact Delivery, right-click and click Insert Relational Table Delivery. Specify the name of the new fact build to be RIDW.F_DEFECT_METRICS.
  18. In the Table Properties tab, select the connection as Rational Data Warehouse, and type "RIDW"."F_DEFECT_METRICS".
  19. In the Columns section:
    • For the DATE_ID element, select the key box.
    • For the other element, select the update box.
    Make sure the Record Identity box is selected.
  20. Click Import Target Table.
  21. Expand the Rational Data Warehouse node to list the tables under RIDW, and select F_DEFECT_METRICS.
  22. Drag the DATE_ID element to the DATE_ID column, the DEFECT_METRICS_ID element to the (Record Identity) column and the TOTAL_DEFECTS element to the TOTAL_DEFECTS column. Click OK.
    import the target table
    Concept: The Record Identity element is present in every table delivery in IBM Cognos Data Manager. The element acts as a surrogate key generator. In this tutorial, you are using the element to generate the values for the primary key of the table.
  23. In the Module Properties tab, from the Table Delivery window select APPEND as the refresh type. Set the commit interval to 1000.
    Fact module properties
    Concept: What this means is that this fact table will always add records to the table and will commit rows to the database 1000 rows at a time.
Add the builds to the job and publish the job
  1. Expand the RationalInsightETL node to list the job streams under Builds and JobStreams > Jobs > StarJobs.
  2. Select Facts, right-click, and click Insert Node > Fact Build Node. Specify the business name of the new node to be F_DEFECT_METRICS.
  3. In the Associated Builds section, select Facts > Fact > F_DEFECT_METRICS.
  4. In the Predecessor tab, select the Start node, and click OK.
  5. Expand the RationalInsightETL node to list the jobs under Builds and JobStreams > Jobs > StarJobs, right-click Dimensions, and select Insert Node > Dimension Build Node.
  6. Specify the name to be D_COMPONENT_LABEX and in the associated build section, select Dimensions > D_COMPONENT_LABEX.
  7. On the Predecessor tab, select the Start node. Click OK.
  8. Republish the Build_Star job as a data movement task:
    1. Select the Build_Star job.
    2. On the menu bar, click Actions > Publish Data Movement Task.
    3. Expand Jobs > Star Jobs, select Build_Star, and click OK.

Lesson checkpoint

You learned how to create reference dimensions, create a dimension builds and fact builds, add builds to jobs, and publish jobs.
< Previous | Next >

Feedback