Metadata management

Every phase of dimensional modeling produces some amount of metadata. Metadata describes the dimensional model in business terms that are easily understood.

At the dimensional model level, a list should be provided of what is available in the data warehouse. This list should contain the models, dimensions, and facts that are available, as these objects will all be used as initial entry points for data analysis.

For each model, provide a name, definition, and purpose. The name simply gives something to focus on when searching. Usually, the name is the same as the name of the fact table. The definition identifies what is modeled, and the purpose describes what the model is used for. The metadata for the model should also contain a list of dimensions, and measures that are associated with the model. These data objects will be used as initial entry points for data analysis.

A name must be provided for all dimensions, dimension attributes or columns, and measures.

Metadata for a dimension should also include hierarchies, slowly changing dimensions, and the attributes and measures that are associated with the dimension. A hierarchy defines the relationships between attributes of the dimension that identify the different levels that exist within the dimension. For example, in a sample Seller dimension, you have the Sales Region, Outlet Type (corporate or retail), Outlet, and Salesperson hierarchies. The hierarchies document the roll-up structure of the dimension. Slowly changing dimensions identify how changes to attributes within a dimension are dealt with. If you must track the history of individual columns or attributes, record the slowly changing dimensions with the columns or attributes.

The columns or attributes of a dimension are used to identify which measures to analyze. For columns or attributes to be used effectively, metadata about the columns or attributes should include the data type. At this point, a general indication of the data type (such as character, date, and numeric) is sufficient. Exact data type definitions can be developed when you design the model. The domain of a column defines the set of valid values.

Metadata about a fact should include the grain of time or date for the fact. Although you can derive the grain of time for a fact through its relationship to the time dimension, you should define the grain to properly analyze the data in a dimensional model.

Dimensional modeling in the workbench

In the workbench, the metadata for the dimensional model is known as dimensional notation. The metadata is contained within the data model. The workbench displays dimensional notation if you enable dimensional notation for a data model. You can discover or specify dimensional object types in the data model.


Feedback