Data Models

The starting point is a reporting requirement. This requirement is formalized by drawing a logical user model. A user model captures the end user's model or view of the data required. User models help to determine the structure of the data marts. Drawing the user model involves picking out the measures or facts from the reporting requirement and the dimensions that the measure is sliced by. The user model should also identify the level of granularity required. The granularity is very important, as it determines what and how much data is captured.

Modeling the Data Marts

In order to model the Data Mart, one must logically grouping the user models. One related group of user models will form the starting point for modeling a single Data Mart. Data Marts are defined as dimensional Star Schemas. The Data Marts contain the measures and the dimensions that are of interest. A measure is usually a numerical 'fact', e.g. 'sales', and dimensions are a way of narrowing the view of this fact, e.g. 'sales for product x in the month of June'. It should be possible to trace each item on a user model to a column in the Data Mart tables. The Data Mart must also provide the granularity specified in the user models. The datamarts are de-normalized and this makes querying them easier.

Models in the Data Mart conceptual model are engineered as star schema structure where one fact table references any number of dimension tables.

Modeling the CDW

The next step is to design the CDW. The CDW is the main data storage area. This supports the data needs of the various Data Marts and also captures the underlying business processes.

The CDW model is engineered a normalized Entity-Relationship structure. It contains the lowest level of granularity that may be of interest and is a model of the business processes from a reporting viewpoint.

The following model is part of the CDW conceptual model, which is further discussed in the 'Directory Structure & Artifacts' section.

Modeling the Staging Database

The final data model is the Staging area. This is where the data of interest from the source OLTP data source is stored. It is derived by mapping the CDW to the source system. It contains a copy of every table of interest from the source system.