Designing Dimensions and Attributes


 

Dimension design is one of the most critical factors in designing a high performance SSAS solution. A bad dimension design can very easily break an otherwise well designed cube. As you would already know, dimensions are composed of attributes and these attributes can be related to one another thru User Defined Hierarchies as well as Attribute Relationships. Understanding and mastering the difference between Hierarchies and Relationships is important and can immensely help improve cube performance.

KeyColumn

The most important property that can be used to define an attribute is the KeyColumn property. The KeyColumn property can point to a single column in the source table or even multiple columns and is needed to uniquely identify each member of the attribute. Keep in mind the following recommendations while working with the KeyColumn property of an attribute.

  1. KeyColumns should hold as small a value as possible. When given a choice between pointing the KeyColumn of an attribute like Country to a string column called CountryName or a numeric column called CountryID, it is always better to go with CountryID. This reduces the size of the dimension and the processing time.
  2. Another benefit of going with the CountryID column instead of CountryName is that an ID is always constant over different locales, whereas Names can change. For example, a country name of India may be used wherever the locale is English, but if the default regional settings of an end-user is Hindi, then we would like the country name returned to him to be भारत. The KeyColumn needs to be a constant value not dependent on factors like locale, culture, etc.
  3. Sometimes you may need to pay special attention to the uniqueness of the values returned by the KeyColumn. For example, when choosing the KeyColumn for the ZipCode attribute of the Geography dimension, you may be tempted to choose the zipcode column itself. However there is every possibility that two cities in different corners of the world may have the same zipcode. So zipcode simply may not be enough to enforce uniqueness between different members of the ZipCode attribute. You may have to concatenate the ZipCode column with the City column, to enforce uniqueness.
  4. Special care needs to be taken with attributes of the Time dimension. For example, if I point the KeyColumn of the Month attribute to the MonthNumber or MonthName column in my source table, then the output produced by the cube would be something like this. There would be no way the user will be able to differentiate between the same month falling in different years or quarters.

The solution would be to have a composite key, comprising of Month and Year as the KeyColumn for the Month attribute. Composite keys should be used only when required as they increase the size of the dimension and the processing time.

NameColumn

The NameColumn property specifies the column whose value will be displayed to users. By default, if the NameColumn is not explicitly assigned, it is automatically set to the value of the KeyColumn property. In situations where the value returned by the KeyColumn is the final value that you want to show to your end-users, this behavior is fine and works well. However in most situations if your KeyColumn is something like CategoryID, then you probably want CategoryName to be returned and shown in your reports.

The NameColumn property becomes mandatory, whenever the KeyColumn is composite. The reason for this should be self-explanatory. If the KeyColumn is a composite value, then which one of those numerous values that it is composed of should be returned to the enduser?

The source column that NameColumn points to is valid only for the default locale. Using Dimension or Cube translations, the NameColumn of an attribute can be assigned to different source columns for different locales. For example, in the AdventureWorksDW database, the Product table has columns like EnglishProductName, SpanishProductName, FrenchProductName etc. The NameColumn can point to EnglishProductName for the default locale, but point to SpanishProductName for the Spanish locale.

Since every dimension must have a Key Attribute (analogous to the primary key in a table), the Key attribute mostly points to the primary key in the underlying table. For example, the Product dimension will use ProductID as its Key Attribute. However ProductID in most cases would be of no use to end-users. They would rather see the names of the products rather than some cryptic id or a running integer value. The Key attributes in such cases are excellent candidates for making effective use of the NameColumn property. The Key attribute of the Product Dimension (ProductID) can have its KeyColumn point to ProductId, but have its NameColumn property point to ProductName. This way we continue to use ProductID as the identifying column, but show a much more useful value to the end-user. This also allows us to eliminate the ProductName attribute from the dimension. Reducing the number of attributes in a dimension is always welcome.

ValueColumn

The ValueColumn property is perhaps one of least understood and used property. It allows us to hold some additional information about the attribute. For example, for an ProductName attribute, the KeyColumn could be ProductId, NameColumn could be ProductName and the ValueColumn could be ProductPhoto. The ValueColumn is typically used for calculations and its value can be accessed using the MemberValue MDX function.

AttributeHierarchyEnabled

In SSAS, every attribute added to a dimension results in the creation of attribute hierarchy. It also automatically creates member properties for every attribute that is directly related to the Key Attribute of the dimension. Attribute hierarchies can be used in the query axis of MDX queries, but member properties cannot. They can be used only by querying the attribute (mostly the key attribute) that contains the member property. For example, a employees date of birth of email address may be needed for informational purposes in reports, but not to slice or dice a measure on a query axis. Hence date of birth and email address can be classified as member properties rather than attribute hierarchies. So if we only want to access an attribute as a member property, we can disable the automatically generate attribute hierarchy by setting the AttributeHierarchyEnabled property to false.

Disabling AttributeHierarchyEnabled improves the processing performance of the cube. It also decreases the cube size because the attribute will no longer be indexed or aggregated. However this turn into a drawback as well since filtering measures using member properties will be much slower than attribute hierarchies.

AttributeHierarchyOptimizedState

The AttributeHierarchyOptimizedState property can be used to turn off the creation of bitmap indexes for an attribute. The following section is taken from the SSAS 2008R2 Performance Guide.

“During processing of the primary key attribute, bitmap indexes are created for every related attribute. Building the bitmap indexes for the primary key can take time if it has one or more related attributes with high cardinality. At query time, the bitmap indexes for these attributes are not useful in speeding up retrieval, because the storage engine still must sift through a large number of distinct values. This may have a negative impact on query response times.

For example, the primary key of the customer dimension uniquely identifies each customer by account number; however, users also want to slice and dice data by the customer’s social security number. Each customer account number has a one-to-one relationship with a customer social security number. You can consider removing the creation of bitmaps for the social security number.

You can also consider removing bitmap indexes from attributes that are always queried together with other attributes that already have bitmap indexes that are highly selective. If the other attributes have sufficient selectivity, adding another bitmap index to filter the segments will not yield a great benefit.

For example, you are creating a sales fact and users always query both date and store dimensions. Sometimes a filter is also applied by the store clerk dimension, but because you have already filtered down to stores, adding a bitmap on the store clerk may only yield a trivial benefit. In this case, you can consider disabling bitmap indexes on store clerk attributes.”

The full guide can be downloaded from here.

AttributeHierarchyOrdered

The AttributeHierarchyOrderded property controls whether members of an attribute hierarchy are sorted in a particular order or not. The OrderBy and OrderByAttribute property can be further used to control the exact way in which sorting is desired. In case there are certain attributes where the order of members is not important, then setting AttributeHierarchyOrdered to false can save processing time for the dimension.

IsAggregatable

The IsAggregatable property indicates whether an attribute can be aggregated in a hierarchy. If this property is true, then SSAS will automatically add an “ALL” member to the attribute.

In a majority of dimensions, an aggregation of all the descendants which rolls up to the ALL level is desired, but there are certain cases where it could be detrimental. The best example for this can be found in cubes which contain data that deals in multiple currencies. In such cases you will have a currency dimension. Performing a summation across all currencies would not make any business sense. So in order to prevent queries from accessing the ALL level, you can disable the ALL member by setting IsAggregatable to false.

Still to come – Attribute Relationships and User Defined hierarchies

Advertisements

One Response to Designing Dimensions and Attributes

  1. Pingback: Designing an OLAP Solution using SSAS « nauzadk

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: