Online Analytical Processing (OLAP)

OLAP, or online analytical processing, is another component of BI. It lets a user easily and selectively extract and view data from different points of view. OLAP data is stored in a multidimensional database. The main component of OLAP is the OLAP server, which sits between the client and the database management system (DBMS). OLAP tools let users analyze the different dimensions of multidimensional data; for example, they provide time series and trend analysis views. OLAP is often used in data mining.
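To make "viewing data from different points of view" concrete, here is a minimal sketch in plain Python. The cube contents, the dimension names (year, region, product), and the helper names `roll_up` and `slice_cube` are all hypothetical and chosen only for illustration; a real OLAP server does this at far larger scale.

```python
from collections import defaultdict

# Hypothetical cells of a tiny cube: (year, region, product) -> sales
CUBE = {
    (2023, "east", "bikes"): 10,
    (2023, "west", "bikes"): 15,
    (2024, "east", "bikes"): 12,
    (2024, "east", "helmets"): 4,
    (2024, "west", "helmets"): 6,
}
DIMENSIONS = ("year", "region", "product")


def roll_up(cube, keep):
    """Sum the measure over every dimension not named in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    out = defaultdict(int)
    for coords, value in cube.items():
        out[tuple(coords[i] for i in idx)] += value
    return dict(out)


def slice_cube(cube, dimension, member):
    """Keep only the cells where `dimension` equals `member`."""
    i = DIMENSIONS.index(dimension)
    return {c: v for c, v in cube.items() if c[i] == member}


by_year = roll_up(CUBE, ["year"])      # a trend-over-time view
by_region = roll_up(CUBE, ["region"])  # the same data viewed by region
east_only = slice_cube(CUBE, "region", "east")
```

The same cells feed every view; only the dimensions that are kept or fixed change.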

Online Analytical Processing is the capability to store and manage data in a way that lets it be used effectively to generate actionable information.
OLAP provides the building blocks that enable analysis (such as rich functions, multidimensional models, and analysis types). End-user tools (such as business modeling tools, data mining tools, and performance reporting tools) mostly sit on top of OLAP to provide a rich business intelligence interface.

Most OLAP systems use one or more of the following three storage paradigms to support multidimensional analysis. Let’s take a closer look at each of these:-

MOLAP:-

The “M” in MOLAP refers to Multidimensional. Multidimensional OLAP (MOLAP) stores the aggregated values in a multidimensional structure on the OLAP server.

An aggregated value is a pre-calculated value for a cell within an OLAP cube, where the cell is identified by its coordinates along the cube’s dimensions. Aggregations are pre-calculated summaries of data that improve query response time by having the answers ready before the questions are asked. For example, when a data warehouse fact table contains hundreds of thousands of rows, a query requesting the weekly sales totals for a particular product line can take a long time to answer if the fact table has to be scanned to compute the answer. With an aggregation in place, the total is read directly instead.
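The difference between scanning the fact table and reading a pre-calculated aggregate can be sketched as follows. This is an illustration only: the fact rows, the `(week, product_line)` key, and the function names are assumptions, not part of any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (week, product_line, sales_amount)
FACT_ROWS = [
    (1, "bikes", 100.0),
    (1, "bikes", 50.0),
    (1, "helmets", 20.0),
    (2, "bikes", 70.0),
    (2, "helmets", 30.0),
]


def scan_total(rows, week, product_line):
    """Answer the query by scanning every fact row (no aggregation)."""
    return sum(amount for w, p, amount in rows if w == week and p == product_line)


def build_aggregates(rows):
    """Pre-calculate one total per (week, product_line) cell."""
    totals = defaultdict(float)
    for week, product_line, amount in rows:
        totals[(week, product_line)] += amount
    return dict(totals)


def lookup_total(aggregates, week, product_line):
    """Answer the same query with a single dictionary lookup."""
    return aggregates.get((week, product_line), 0.0)


# Built once, ahead of time, before any question is asked.
AGGREGATES = build_aggregates(FACT_ROWS)
```

Both `scan_total(FACT_ROWS, 1, "bikes")` and `lookup_total(AGGREGATES, 1, "bikes")` return the same total, but the scan touches every row while the lookup touches one cell. The price is the time and storage spent building `AGGREGATES` up front.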

The benefit of a MOLAP storage mechanism is query speed. The retrieval of information from a MOLAP cube is fast, especially when compared to a ROLAP storage mechanism.

ROLAP:-

The “R” in ROLAP refers to “relational.” Relational OLAP (ROLAP) stores aggregations in a relational structure and leaves the partition’s source data in its existing relational structure.

This architecture places a heavy burden on the relational database management system (RDBMS) where the source tables for the OLAP cubes are stored. Every query issued against the OLAP information first passes through the OLAP definition, then is processed by the underlying RDBMS before a result is returned.
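The query path described above can be sketched with Python's built-in `sqlite3` module standing in for the relational database. The `sales` table, its columns, and the `rolap_query` helper are hypothetical; the point is only that each cube request becomes SQL that the database must execute at query time.

```python
import sqlite3

# An in-memory database stands in for the relational source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (week INTEGER, product_line TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "bikes", 100.0), (1, "bikes", 50.0), (2, "bikes", 70.0), (2, "helmets", 30.0)],
)

# The "OLAP definition": which columns may be used as dimensions.
ALLOWED_DIMENSIONS = {"week", "product_line"}


def rolap_query(conn, group_by):
    """Translate a cube request into a GROUP BY query run by the database."""
    if not group_by or not set(group_by) <= ALLOWED_DIMENSIONS:
        raise ValueError("unknown dimension")
    cols = ", ".join(group_by)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


weekly = rolap_query(conn, ["week"])
by_line = rolap_query(conn, ["product_line"])
```

Nothing is stored on the OLAP side here, so every call pays the cost of a fresh aggregation in the database.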

The primary driver for using ROLAP is data storage limitations. The benefit of ROLAP is twofold. First, very little additional disk storage is needed to accommodate a ROLAP implementation. Unlike MOLAP, the ROLAP storage mechanism doesn’t cause data explosion. Second, cube processing time (the time needed to build the cube) is very short compared to MOLAP. ROLAP does not pre-calculate and load the cube’s data into a multidimensional store, so very little processing occurs when the cube is created.

HOLAP:-

The “H” is for “hybrid.” Hybrid OLAP (HOLAP) stores aggregations in a multidimensional structure on the OLAP server and leaves the partition’s source data in its existing relational structure.

In the hybrid model, the aggregated (or pre-calculated) values are stored in the MOLAP section and queries that can be answered by the aggregated values have fast response times. When a query retrieves information below the aggregated values and accesses the ROLAP section, response time slows down as the relational system retrieves the data prior to passing it back through the OLAP system.
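A rough sketch of this routing, under the assumption that aggregates exist only at the product-line level and that anything more detailed must go back to the source rows. The data, the `holap_query` function, and the `"molap"`/`"rolap"` path labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows kept in the relational source: (day, product_line, amount)
DETAIL_ROWS = [
    ("2024-01-01", "bikes", 100.0),
    ("2024-01-02", "bikes", 50.0),
    ("2024-01-02", "helmets", 20.0),
]


def build_aggregates(rows):
    """Pre-calculate totals per product line (the MOLAP section)."""
    totals = defaultdict(float)
    for _day, product_line, amount in rows:
        totals[product_line] += amount
    return dict(totals)


AGGREGATES = build_aggregates(DETAIL_ROWS)


def holap_query(product_line, day=None):
    """Return (total, path), where path names the section that answered."""
    if day is None:
        # Answerable from the stored aggregate: fast path.
        return AGGREGATES.get(product_line, 0.0), "molap"
    # Below the aggregated level: fall back to scanning the source rows.
    total = sum(a for d, p, a in DETAIL_ROWS if d == day and p == product_line)
    return total, "rolap"
```

For example, `holap_query("bikes")` is served from the aggregate, while `holap_query("bikes", "2024-01-02")` has to read the detail rows.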

For business users, the storage mode is transparent, or at least it should be. Business people don’t care whether an OLAP system uses MOLAP, ROLAP, or HOLAP. What they will pay attention to is how much effort they need to expend and how long they have to wait to perform fast and flexible analysis of their data.
