- Transaction Processing(OLTP)
- Analytic Processing, using statistical method(OLAP) or machine/computational learning method(Data Mining)
which is based on E.F. Codd's relation model, is the traditional (maybe
most popular) application type of DBMS and most people are very
familiar with it. This post tries to summarize related technologies in
analytic processing, which is widely adopted in decision support
Part I - What Data to Analyze?
In Decision Support(or Business Intelligence)
system, data to be analyzed usually comes from operational system,
i.e., OLTP relational database. These relational databases are often
located at different departments/sites, may be using different DBMS
vendor's products, using different data schema and merely contain data
within a relatively short time span.
To make a good business
decisions, it's strongly desired to hold historical data, view them in
a uniformed way and not bothering the daily operational environment.
Thus comes out the Data Warehouse, which is a repository of an organization's electronically stored data and is designed to facilitate reporting and analysis.
Operational data (in OLTP system) is extracted, transformed(also cleaned) and loaded(by ETL subsystem) into
the data warehouse for further analyzing. OLAP and DM systems read
these data, analyze them, produce useful reports and present them to
end(business) users. (See diagram below)
Part II - How to Analyze?
There are two ways to analyze data in data warehouse:
1. Data Mining
Data Mining is the extraction of hidden predictive information from large databases. Someone also defines it as knowledge discovery process(using machine learning algorithms) in database.
- Typical Data Mining enabled information systems process data in terms of Record(or Case).
- Such system also provides some Language Extension to facilitate composing data mining related queries. (for example, MS SQL Server provides DMX - Data Mining Extensions)
The challenging part of data mining is various mining algorithms. Here is a list of data mining algorithms available in MS SQL Server Analysis Service.
See - A Basic Data Mining Tutorial using MS SQL Server
2. OnLine Analytical Processing(OLAP)
OLAP is the processing of large scale multidimensional data using statistical based methods. A typical OLAP system provides:
- Multidimensional Model
- Analytical Query Language (for example, MS SQL Server provides MDX - Multidimensional Expression)
- Analyzing Server(or Engine) that executes analytical queries
See - A Basic Tutorial for OLAP in MS SQL Server
2.1 Multidimensional Model
model view data as as cubes that generalize spreadsheets to any number
of dimensions. It categorizes data either as numerical values(a.k.a. measures) associated with some facts or textual values(a.k.a. dimensions) that characterize the facts.
Facts represent the subject - the interesting event in a enterprise that need to be analyzed.
Dimensions represent context information for facts, perspectives to view facts.
Measures represent those numeric properties of facts that decision makers want to analyze.
example, in a shoe shop, shoe purchasing events are the facts, the
selling price is a measure attribute and the color, the size, the
manufacture and the brand are all dimension attributes.
More tutorial explanation of this model can be found in .
2.2 OLAP Server Architecture
There are three ways to implement multidimensional model:
- ROLAP (Relational OLAP)
data is stored in relation model based storage system and some special
induces technologies may be adopted. In this architecture, measures are
derived from the records in the fact table and dimensions are derived
from the dimension tables.
of them contains fact table and dimension table, but in star schema,
there is only one table for one dimension, while in snowflake schema,
there are usually multiple tables for one particular dimension.
- MOLAP (Multidimensional OLAP)
data is stored in an optimized multi-dimensional array storage, i.e.,
the server supports the multidimensional model directly.
It's usually regarded to be faster but less scalable than ROLAP.
- HOLAP (Hybrid OLAP)
is a combination of ROLAP and MOLAP. HOLAP allows storing part of the
data in a MOLAP store and another part of the data in a ROLAP store.
Part III - Other technologies
Other related technologies include data visualization, metadata management, analytical query parallelizting etc.
Analytical database technology is very promising and also complicated, more survey paper can be found at.
 Course on DM & OLAP
 OLAP & Data Mining Links
 OLAP & Data Warehouse Bibliography
 A Brief Tutorial on Data Mining and OLAP
 An Overview of Data Warehousing and OLAP technology
 Providing OLAP to User-Analysts: An IT Mandate
 An Overview of Data Warehouse, OLAP and Data Mining Technology
 Multidimensional Database Technology
 Data Warehouse Architecture
 Oracle Data Warehousing Guide
 Data Mining FAQ
 Data Mining Introduction
 Data Mining Tech Summary
 Oracle Data Mining Concept
 Microsoft SQL Server Analysis Service - Data Mining
 OLAP Introduction
 OLAP Overview
 OLAP Council
 OLAP Wiki
 Oracle OLAP User Guide
 Microsoft SQL Server Analysis Service - OLAP