< Previous | Next >

Lesson 2.1: Customize the default schema of the data warehouse

In this lesson you will learn how to add a new table to the default schema and how to relate two tables.
Prerequisite: Before making any changes to the data catalog and to the data warehouse, create a backup. To back up the data catalog, use the Data Manager catalog backup utility. To back up the data warehouse, refer to the documentation of the database that you are using for the data warehouse.
In this lesson, you will: You will use IBM® InfoSphere™ Data Architect during this lesson.
  1. Install the sample physical database model for your database. The sample model can be installed automatically when installing IBM Rational® Insight. The default location of the sample model is C:/Program Files/IBM/Rational Insight/datawarehouse/dw. Extract the sample model.
  2. Start IBM InfoSphere Data Architect. When prompted, select the default workspace.
  3. Create a new data design project for your tutorial:
    1. Click New > Project > Data Design Project.
    2. Specify the name RISDW and accept the default location.
    3. Click Finish. If prompted, choose to display the default view.

      A new data design project is created with containers for the data objects.

      Data project explorer

    4. In the Windows® explorer, navigate to the directory where you downloaded the sample data model in Step 1. Copy the sample data model to the default location (step b).
    5. Return to IBM InfoSphere Data Architect, right-click the RISDW project, and click Refresh.

      The data design project now displays the physical data model sample.

      Data project explorer with data model

  4. In the Data Project Explorer view (Window > Show View > Data Project Explorer), expand Data Models to show the physical data model (nodes with a dbm extension). Double-click the data model to expand it.
    Concept: The first node in the tree structure designates the database alias. To see the components of the database, such as schemas, tablespaces, and bufferpools, expand the physical model.
    Data project explorer with data model exapnded

    In this lesson, default schema names are used. If you changed the names of schemas in the database, the names used in this lesson might not match schema name in your database. The default names are CONFIG, RIODS, and RIDW. The CONFIG schema is a static schema and should not be changed. The RIODS schema is the designated operational data store. The RIDW schema is designated for the data mart (facts and dimensions) and can be renamed to fit the schema naming conventions of your Rational Insight data warehouse.

    In this lesson, you will modify the RIDW schema.

  5. Right-click RIDW and click Add Data Object > Table.
  6. Specify the table properties:
    1. Specify the table name to be F_DEFECT_METRICS.
    2. In the Properties view for the table, click Table Spaces, select VSTR_32K for Regular and VSTR_IDX for Index.
      Data project explorer with data model expanded
      Concept:
      • For naming the fact tables in the data mart, the convention is to prefix the names with F_.
      • VSTR_32K is the tablespace designated for the data mart.
      • VSTR_IDX is the tablespace designated for the indexes for the data mart.
  7. Add columns to the table:
    Concept: Your table will contain a measure for the total number of defects by the two dimensions Date and Component. You will pick up the date dimension from an existing dimension table named D_DATE. For the Component dimension, you will create a new dimension table.

    Your fact table will have five columns: TOTAL_DEFECTS, DEFECT_METRICS_ID, REC_DATETIME, COMPONENT_ID and DATE_ID. You will designate the DEFECT_METRICS_ID column as the primary key and use it to uniquely identify the fact row.

    1. In the Data Project Explorer view, select the F_DEFECT_METRICS table, right-click and click Add Data Object > Column.
    2. Specify the name to be TOTAL_DEFECTS. In the Properties view, click Type, and specify the following attributes: Data type as INTEGER, Default Value as 0, Not Null is selected.
    3. Repeat step (a). Specify the name to be DEFECT_METRICS_ID. In the Properties view, click Type, and specify the following attributes: Data type as INTEGER, and Primary Key and Not Null are selected.

      A primary key named F_DEFECT_METRICS1 is created automatically.

    4. Repeat step (a). Specify the name to be REC_DATETIME. In the Properties view, click Type, and specify the following attributes: Data type as TIMESTAMP, Default value as CURRENT_TIMESTAMP, Not Null is selected.
    5. Repeat step (a). Specify the name to be COMPONENT_ID. In the Properties view, click Type, and specify the following attributes: Data type as INTEGER, Not Null is selected.
    6. Repeat step (a). Specify the name to be DATE_ID. In the Properties view, click Type, and specify the following attributes: Data type as INTEGER, Not Null is selected.
    table columns

    You have now created a new fact table named F_DEFECT_METRICS and added a primary key to the table. In the next few steps, you will add a foreign key to this table.

    By default, referential integrity is enforced in the data warehouse. Therefore, in the next step you will implement referential integrity between the fact table and the dimensions that are referenced in the fact table.

  8. Add a foreign key to the F_DEFECT_METRICS table:
    1. Right-click the F_DEFECT_METRICS table, and click Add Data Object > Foreign Key
    2. Specify the parent table to be the D_DATE table. Click OK.
      Concept: D_DATE is the date dimension table used in the data warehouse.
    3. Specify the name of the foreign key to be DATE_ID.
    4. Select DATE_ID and in the Properties view, click Details. In the Parent section, in the Unique constraint or index list, select the D_DATE_PK primary key. If prompted, select Use the existing child/attribute column.
      Concept: D_DATE_PK is the primary key in the other table (D_DATE table) that the foreign key of this table (F_DEFECT_METRICS) is making a reference to.

      A foreign key named DATE_ID is created for the F_DEFECT_METRICS fact table. Through this foreign key, you have created a relationship to an existing table D_DATE.

      In the next step, you will add a new dimension table in the data mart. The dimension table will be referenced by the fact table F_DEFECT_METRICS you created.

  9. Right-click RIDW, and click Add Data Object > Table.
  10. Specify the table properties:
    1. Specify the table name to be D_COMPONENT_LABEX.
      Concept: For naming the dimension tables in the data mart, the convention is to prefix the names with D_.
    2. In the Properties view for the table, click Table Spaces, select VSTR_32K for Regular and VSTR_IDX for Index.
  11. Add columns to the table:
    Concept: Your table will have two columns: NAME and COMPONENT_ID. You will designate COMPONENT_ID as the primary key and use it to uniquely identify the dimension row.
    1. Right-click the D_COMPONENT_LABEX table and click Add Data Object > Column.
    2. Specify the column name to be NAME. In the Properties view of the column, specify the following attributes: Data type as VARCHAR, Length as 255, Not Null is selected.
    3. Repeat step (a). Specify the column name to be COMPONENT_ID. In the Properties view of the column, specify the following attributes: Data type as INTEGER, and Primary Key and Not Null are selected.

      A primary key called D_COMPONENT_LABEX1 is created automatically.

    You have now created a new dimension table named D_COMPONENT_LABEX and added a primary key to the table. In the next few steps, you will use this primary key to link it to the F_DEFECT_METRICS fact table you created earlier.

  12. Link the F_DEFECT_METRICS fact table to the D_COMPONENT_LABEX dimension table:
    1. Right-click F_DEFECT_METRICS and click Add Data Object > Foreign Key.
    2. Specify the parent table to be D_COMPONENT_LABEX. Click OK.
    3. Specify the name of the foreign key to be COMPONENT_ID.
    4. Select COMPONENT_ID and in the Properties view, click Details. In the Parent section, in the Unique constraint or index list, select the D_COMPONENT_LABEX1 primary key. If prompted, select Use the existing child/attribute column.

    You have linked the F_DEFECT_METRICS fact table to two dimension tables: the D_DATE table for the date attribute and the D_COMPONENT_LABEX table for the component attribute.

    You will now produce the template for the new tables you added. To do so, you will export the tables to a DDL script and run the script.

  13. Generate a DDL script for the D_COMPONENT_LABEX table:
    1. Right-click the D_COMPONENT_LABEX table and click Generate DDL.
    2. In the Options page, select:
      • Fully qualified names
      • CREATE statements
      • COMMENT ON statements
      • IN TABLESPACE clause
      Click Next.
    3. In the Objects page, select:
      • Check constraint
      • Foreign key constraint
      • Primary key constraint
      • Tables
      Click Next.
    4. Select Run DDL on server and click Next.
      Remember: For the tutorial, you will use a sample database (RIDW). Be careful when you use the Run DDL on server option to run a DDL script on a live database. With IBM InfoSphere Data Architect, you cannot use a migration function; so, data cannot be preserved for an existing table. The table is either created, or dropped and then created.

      You can choose to just save the generated DDL script and run the script later.

    5. In the Connection page, select Use an existing connection, select RIDW, and click Next.
    6. Type the user name and password of the database administrator, click Next, and click Finish.
    7. Save the model.
  14. Generate a DDL script for the F_DEFECT_METRICS table:
    1. Right-click the F_DEFECT_METRICS table and click Generate DDL.
    2. In the Options page, select the following options:
      • Fully qualified names
      • CREATE statements
      • COMMENT ON statements
      • IN TABLESPACE clause
      Click Next.
      elements for DDL script
    3. In the Objects page, select the following options:
      • Check constraint
      • Foreign key constraint
      • Primary key constraint
      • Tables
      Click Next.
      objects for DDL script
    4. Select Run DDL on server and click Next.
      generate DDL and run on server
    5. In the Connection page, select Use an existing connection, select RIDW, and click Next.
    6. Type the user name and password of the database administrator, click Next, and click Finish.
    7. Save the model.

    You have now applied the changes so that the structure of your tables are modified.

In this lesson, you have:

Lesson checkpoint

You created a new data source, defined the measures for the elements of the data source, and saved the changes.
< Previous | Next >

Feedback