Dimension Table

Summary: in this article, we will examine dimension table concept, surrogate keys in dimension tables and a brief overview of slowly changing dimension.

In data warehousing, a dimension table is one of the companion tables to a fact table in the star schema. Different from a fact table that contains measures or business facts, a dimension table contains the textual descriptor of the business. The fields of dimension table are designed to satisfy these two important requirements:

  • Query constraining / grouping / filtering.
  • Report labeling

Dimension table example

In the schema below we have 3 dimension tables Dim_Date, Dim_Store and Dim_Product surrounding the fact table Fact_Sales.

 Dimension Table Example

Surrogate keys in dimension tables

It is critical that primary key’s value of a dimension table remain unchanged.  And it is highly recommended that all dimension tables use surrogate keys as primary keys.

Surrogate keys are key generated and managed inside the data warehouse rather than keys extracted from data source systems.

There are several advantages of using surrogate keys in dimension tables:

  • Performance – join processing between dimension tables and fact table is much more efficient by using single field surrogate key.
  • Integration – in term of data acquisition, the surrogate key allows integrating data from multiple data sources even if they lack consistent source keys.
  • Manage version of data – keep track of changes in dimension field values in the dimension table.

It is so important that the dimension tables should be designed in such a way that they can be shared between multiple data marts and cubes within a data warehouse. This ensures that data warehouse provides consistent information for similar queries. And surrogate key must be used as primary keys of dimension tables to enable the dimension tables to be shared easier.

Slowly changing dimension

The attributes of a given record in dimension table could be changed e.g. product description, shipping address. This phenomenon is known as slowly changing dimension and there are corresponding techniques to deal with each type of slowly changing dimension effectively:

  • Type 1 is used when the history of the data is not important. When data in the data source changes, the corresponding dimension attribute is overwritten.
  • Type 2 is used when the change of data in the data source is important and you want to preserve the historic context of facts with the corresponding to the changing data. When data in the data source changes, a new row is inserted into the dimension table. The previous row remains unchanged.
  • Type 3 – this happens when you want to learn about every fact before and after the attribute changes. To deal with this, you can introduce a new attribute to the existing row and update value to both fields.

We have examined how the dimension tables fit into star schema and their relationships with the fact table. We also discussed surrogate keys and slowly changing dimension concepts applied to the dimension table.

  • Was this tutorial helpful ?
  • YesNo