Data Warehousing
Star Schema
The Star Schema or Snowflake Schema is a database schema that is structured in a
way to efficiently answer complex queries. The design is much more business oriented
than a transactional system would be which is important as the main purpose of the
data warehouse is to provide useful information to the business. For example, powerful
analysis can be done on large amounts of data, often with many millions of records.
The star schema follows Kimball theory and consists of historical change dimensions,
reference dimensions and fact tables. They are related by alternate keys which have
no direct connection to the actual data. For example a customer dimension may have
a customer code, but it will also have an alternate key (kCustomer). In fact in
our designs we have 2 alternate keys for historic change control.
We recommend a snowflake schema for the relational tables which will be collapsed
to a star schema when dimensions are created on the cubes in Microsoft SQL Server
Analysis Services (SSAS). The cube utilises a star schema so it is important to
understand exactly how the star / snowflake should be designed on the relational
tables. We do not generally recommend that OLAP cubes be built directly on top of
the transactional systems. An example of where this is a problem is a product cost
field on a product table in the transactional system, when translated on to a cube
directly it will probably end up in the Product Dimension, not the Fact table, so
it will not easily be able to sum up the cost for that product.