Proactive Caching Explained


What is Proactive Caching?

SSAS provides us with different storage modes for cube partitions and dimensions – MOLAP, HOLAP and ROLAP. In MOLAP, the source data as well as the aggregations are stored in a multidimensional structure. Hence this mode gives us the best performance, but very high latency. At the other end, the ROLAP storage mode does not cause a copy of the source data to be stored in the multidimensional database and leaves it in the underlying relational database itself. Because of this, performance is not as good as MOLAP, but there is a misconception that is produces less latency and users can see the data as it changes in the source system. Proactive caching comes into the picture when you want the best of both worlds – users to be able to see the most recent data, but at the same time take advantages of performance benefits of the MOLAP mode. As the name suggests – data and aggregations are cached by SSAS. The cache gives us the performance boost. At the same time, SSAS can monitor the source system for changes to the data. At a set schedule or whenever the data in the underlying system changes, SSAS can invalidate the old cache and build a new one with the most up-to-date data from the source system.

Proactive caching can be configured on a partition in a cube or on a dimension. Also note that this feature is available only in the Enterprise edition of SSAS and not in the Standard edition.

Exploring Proactive Caching

In order to understand proactive caching a little better, I have put together a small demo thru which I will explain some other details. The demo is based on the AdventureWorks Analysis Services Project that ships as a sample with SQL Server.

If you look at the storage settings for any partition in a cube, you have the following standard settings –

With the exception of the MOLAP option (on the extreme right hand side), all the other options use Proactive caching (by default) in some way or another. The Scheduled MOLAP is the only option which uses proactive caching, but does not monitor the source system for changes. Rather it updates the cache on a fixed predefined interval. The Automatic, Medium Latency and Low Latency options are more or less the same and only differ in the values used for certain properties. To experiment with proactive caching, I am going to use one of these three options, let’s say the Automatic MOLAP option. Since this option depends on notifications sent by SQL Server whenever data in the source changes, we must specify one or more tracking tables. Alternatively we could use Scheduled polling instead of tracking tables.

If I click on the Options button, I am presented with the following dialog thru which I can enable or disable proactive caching, change its behavior and control notifications.

Right now, I am going to leave all these default values untouched and go to the Notifications tab. I have selected FactInternetSales as the tracking table (because the partition on which I am experimenting, is based on the FactInternetSales table).

Before I start browsing the cube, I am going to use SQL Server Profiler and start two profiler traces. One trace will run on the SSAS database and the other one will run on the AdventureWorksDW2008R2 relational database.

For the trace on SSAS, I have only taken events from the Queries and Query Processing Category.

For the trace on the AdventureWorksDW relational database, I have only taken the RPC:Completed, SQL:BatchStarting and SQL:BatchCompleted events. Furthermore, to reduce clutter, I have a column filter created on the DatabaseName column.

In the cube browser, I drag the Internet Order Quantity measure from the Internet Sales measure group. The current value displayed in 60398.

If I take a look at the Profiler trace for SSAS, I can see the presence of a number of Query Dimension and Calculate NonEmpty Begin events. The presence of these events indicates the building of the MOLAP cache in memory.

If I come back to the cube browser and refresh the window or drag some other measure from the same measure group, this time SSAS simply picks up the data from its internal cache.

This is evident from the Profiler trace which now shows the presence of the Get Date from Cache event.

All along, the SQL Profiler trace for the source relational database has been completely blank, proving that whatever queries we have fired so far have not needed to access the source system.

Now I am going to modify a row in the FactInternetSales table in the AdventureWorksDW2008R2 database. I have increased the value in the OrderQuantity column by one for one of the rows. If I come back to the cube browser, and refresh it, (and remember we have not re-processed the cube or the partition after changing the data), we will see the Internet Order Quantity incremented by one.

We can also now see a flurry of events generated in the trace window of the RDBMS related to the notifications sent by SQL Server to SSAS and the subsequent processing of the affected partitions triggered by SSAS.

This sequence of activities shows us the most basic capabilities of proactive caching. Cache the data, wait for notifications about changes in data, when notifications arrive, recreate the cache and serve hot fresh data.

Now let’s look at some other ways in which we can control proactive caching. There are 4 important properties –

  1. Silence Interval – This property allows you to instruct SSAS to ignore a change notification for the amount of time as specified by this property. For example, if I set Silence Interval to 5 mins, and if some data changes in the source system at 08:00 hrs. SQL Server will ignore this notification for the next 5 mins, i.e. until 08:05 hrs. This indirectly means that SSAS will not rebuild its internal cache until 08:05 hrs. This property is useful in scenarios where it is acceptable for users to see stale data up to a certain level (say 3 hours). So even if there is a change to the source system every 15 mins, SSAS will not rebuild its cache every 15 mins. It would wait until the silence interval (in this case – 3 hours) elapses before it starts to rebuild the cache.
  2. Silence Override Interval – This property allows you to instruct SSAS to ignore change notifications x units of time after it has started rebuilding its cache. For example, if SSAS takes an hour to rebuild its cache, and 45 mins into this rebuilding, it receives another data change notification, it would drop whatever it has done for the last 45 mins and again start from scratch to rebuild the cache. However if the Silence override interval is set to 10 mins, then once SSAS has spent 10 mins rebuilding the cache, it will ignore all other change notifications until the cache has been completely rebuilt.
  3. Latency – Latency is a property which allows us to control the duration of time for which stale data can be served by SSAS. For example, if SSAS receives a change notification at 08:00 hrs. and it starts rebuilding its cache. However rebuilding the cache may take an hour or so to complete. If as per the business requirements, the maximum latency that can be tolerated is 30 mins, then 30 mins would be the value assigned to the Latency property. When the time specified by the Latency property expires and SSAS has yet to finish rebuilding its cache, it will temporarily switch over to the ROLAP mode and directly access the data in the source RDBMS. For the duration of time that it takes to complete the rebuilding of the cache, the partition will act like a ROLAP partition. The Latency property works in conjunction with the Bring Online immediately property.
  4. Rebuild interval – If you want the cache to be rebuilt based on a fixed schedule, regardless of whether the data in the source system has changed or not, you can specify an appropriate value in the Rebuild interval property.

 

In the settings above, I have specified a silence interval of 2 minutes. The initial value of the Internet Order Quantity measure is 60398.

Using SQL Server management Studio, I increment the value of the OrderQuantity column by 1 for the 1st row in the FactInternetSales table. However SSAS still continues to return me the old value for the Internet Order Quantity measure.

I will have to wait for 2 minutes before refreshing the cube browser to be able to see the new updated value.

A look at the Profiler trace for SQL Server confirms our understanding. Below we can see that SQL Server Management Studio fired an update query at 19:01 hrs. and SSAS reprocessed our partition at 19:03, exactly 2 minutes after the update.

General mis-conceptions regarding ROLAP mode and caching.

During most of my SSAS training sessions, there is one point where I often find experienced SSAS professionals misunderstanding proactive caching and most of the misunderstanding has to do with the ROLAP storage mode. A lot of SSAS professionals will tell me that if I always (every MDX query) want SSAS to pick up data from the source tables, I should use ROLAP as the storage mode and disable proactive caching. This way nothing will be cached and the data delivered as output will always be the most up-to-date data.

That is wrong. While by definition, the ROLAP storage mode leaves data in the underlying relational system and does not make a copy of it in the multidimensional database (SSAS database). However before serving any data as the output of an MDX query, SSAS always caches the data. The cache is mandatory and currently to the best of my knowledge there is no way you can turn this caching behavior off. You can clear the cache whenever you want, but you cannot turn the caching feature off. The important thing to understand here is that proactive caching <> SSAS cache. When you turn off proactive caching (for any storage mode including ROLAP) you are simply turning off the feature that allows SSAS to detect changes in the source system and refresh the cache with fresh data when changes happen. When you disable proactive caching, you are not; as some people imagine, turning off the caching capability of SSAS. Proactive caching on or off, the SSAS cache still remains.

To prove this, I am going to change the storage mode of one of the partitions to ROLAP. I have also disabled proactive caching.

Now, the cube browser currently shows me a value of 60398 for Internet Order Quantity. I increment the value of OrderQuantity in the source system by 1. And no matter how long I wait, the output returned will always be 60398.

The profiler trace for the relational engine shows no trace of activity from SSAS no matter how many times I refresh my cube browser.

However the profiler trace for SSAS shows us the presence of the Get Data from Cache event, indicating that even through the storage mode is ROLAP; SSAS is still serving the query from its internal cache and not making a round trip to the relational engine everytime I run an MDX query.

And since proactive caching has been disabled, there are no notifications sent to SSAS informing it about the data changes in the source system and SSAS continues to dish out stale data from its internal cache.

Taking a cue from this blog post by Jamie Thompson, you can clear the SSAS cache manually using a query as follows –

<ClearCache xmlns=”http://schemas.microsoft.com/analysisservices/2003/engine”&gt;

<Object>

<DatabaseID>Adventure Works DW 2008R2</DatabaseID>

</Object>

</ClearCache>

 

If I refresh my cube browser after executing the above XMLA query, I get the new data returned as output.

A quick glance at the RDBMS profiler trace confirms that SSAS did reconnect to the FactInternetSales table to rebuild its internal cache.

So this should effectively prove that even in the ROLAP mode, caching does take place and you are not making a round trip between SSAS and your relational system for every query that is fired.

Advertisements

2 Responses to Proactive Caching Explained

  1. Santhosh Kumar says:

    Nice post and thanks for your efforts.

  2. tushar joshi says:

    excellent blog .. explained the concept of proactive caching in detail.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: