Data mining algorithms and their use cases

Hi folks,

I have compiled a comparison of the out-of-the-box (OOB) algorithms available in SSAS Data Mining, along with their use cases, for handy reference.

| Name | Description | Type | Use cases | Parent algorithm | Supported values |
| --- | --- | --- | --- | --- | --- |
| Association Rules | Builds rules describing which items are most likely to appear together in a transaction. | Association | The rules can be used to predict the presence of an item based on the presence of other items in a transaction. | - | - |
| Clustering | Uses iterative techniques to group records from a dataset into clusters with similar characteristics. | Clustering | Useful when you want to find general groupings in your data. | - | - |
| Sequence Clustering | Combines sequence analysis and clustering to identify clusters of similarly ordered events in a sequence. | Clustering/Sequencing | Clusters can be used to predict the likely ordering of events in a sequence based on known characteristics. | Clustering | - |
| Decision Trees | A classification algorithm. | Classification | Works well for predictive modeling. | - | Supports the prediction of both discrete and continuous attributes. |
| Linear Regression | A particular configuration of the Microsoft Decision Trees algorithm, obtained by disabling splits (the whole regression formula is built in a single root node). | Regression | Works well for regression modeling. | Decision Trees | Supports the prediction of continuous attributes. |
| Time Series | Uses a combination of ARIMA analysis and linear regression based on decision trees to analyze time-related data, such as monthly sales data or yearly profits. | Classification/Regression | Discovered patterns can be used to predict values for future time steps. The algorithm can be customized to use either the decision tree method, ARIMA, or both. | Linear Regression | - |
| Naive Bayes | A classification algorithm that is quick to build. | Classification | Works well for predictive modeling. | - | Supports only discrete attributes. |
| Neural Network | Uses a gradient method to optimize the parameters of multilayer networks to predict multiple attributes. | Classification/Regression | Can be used for classification of discrete attributes as well as regression of continuous attributes. | - | Discrete/Continuous |
| Logistic Regression | A particular configuration of the Microsoft Neural Network algorithm, obtained by eliminating the hidden layer. | Regression | Works well for regression modeling. | Neural Network | Supports the prediction of both discrete and continuous attributes. |
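
To make the association entry above a bit more concrete, here is a minimal Python sketch of how a rule such as "bread implies butter" is usually scored with support and confidence. It uses only the textbook definitions and is not how SSAS implements the algorithm internally; the function name `rule_stats` and the sample baskets are made up for illustration.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all transactions containing both sides of the rule
    confidence = share of transactions containing the antecedent that also
                 contain the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    # Transactions that contain every item on the left-hand side
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    # Transactions that contain every item on both sides
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical shopping baskets, for illustration only
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

support, confidence = rule_stats(transactions, {"bread"}, {"butter"})
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence is what lets you predict that an item is likely to be present given the other items already in the transaction.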