# Aggregation WITH CUBE

In my last post, I wrote about how aggregation WITH ROLLUP works.  In this post, I will discuss how aggregation WITH CUBE works.  Like the WITH ROLLUP clause, the WITH CUBE clause permits us to compute multiple "levels" of aggregation in a single statement.  To understand the difference between these two clauses, let's look at an example.  We'll use the same fictitious sales data from last week's example.

CREATE TABLE Sales (EmpId INT, Yr INT, Sales MONEY)
INSERT Sales VALUES(1, 2005, 12000)
INSERT Sales VALUES(1, 2006, 18000)
INSERT Sales VALUES(1, 2007, 25000)
INSERT Sales VALUES(2, 2005, 15000)
INSERT Sales VALUES(2, 2006, 6000)
INSERT Sales VALUES(3, 2006, 20000)
INSERT Sales VALUES(3, 2007, 24000)

Consider the following query from last week:

SELECT EmpId, Yr, SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH ROLLUP

It will be easier to see what is happening if we pivot the sales data:

 Yr 2005 2006 2007 ALL EmpId 1 12000.00 18000.00 25000.00 55000.00 2 15000.00 6000.00 21000.00 3 20000.00 24000.00 44000.00 ALL 120000.00

The table clearly shows that the WITH ROLLUP clause computes the total for each employee for all years and the grand total for all employees and all years.  The query does not compute the totals for each year for all employees.  Moreover, the order of the columns in the GROUP BY clause determines in which order the data is totaled.

Now let's repeat the same query but replace the WITH ROLLUP clause with a WITH CUBE clause:

SELECT EmpId, Yr, SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH CUBE

This query computes all possible sub-totals and totals:

 Yr 2005 2006 2007 ALL EmpId 1 12000.00 18000.00 25000.00 55000.00 2 15000.00 6000.00 21000.00 3 20000.00 24000.00 44000.00 ALL 27000.00 44000.00 49000.00 120000.00

Because the WITH CUBE clause causes the query to compute all possible totals, the order of the columns in the GROUP BY clause does not matter.  Of course, by default, SQL Server does not pivot the results of either of the above queries.  Here is the actual output from the WITH CUBE query:

```EmpId       Yr          Sales
----------- ----------- ---------------------
1           2005        12000.00
1           2006        18000.00
1           2007        25000.00
1           NULL        55000.00
2           2005        15000.00
2           2006        6000.00
2           NULL        21000.00
3           2006        20000.00
3           2007        24000.00
3           NULL        44000.00
NULL        NULL        120000.00
NULL        2005        27000.00
NULL        2006        44000.00
NULL        2007        49000.00```

Next, let's look at the query plan for the WITH CUBE query:

|--Compute Scalar(DEFINE:([Expr1004]=CASE WHEN [Expr1005]=(0) THEN NULL ELSE [Expr1006] END))
|--Concatenation
|--Stream Aggregate(GROUP BY:([Sales].[EmpId], [Sales].[Yr]) DEFINE:([Expr1005]=SUM([Expr1007]), [Expr1006]=SUM([Expr1008])))
|    |--Sort(ORDER BY:([Sales].[EmpId] ASC, [Sales].[Yr] ASC))
|         |--Table Spool
|              |--Stream Aggregate(GROUP BY:([Sales].[Yr], [Sales].[EmpId]) DEFINE:([Expr1007]=COUNT_BIG([Sales].[Sales]), [Expr1008]=SUM([Sales].[Sales])))
|                   |--Sort(ORDER BY:([Sales].[Yr] ASC, [Sales].[EmpId] ASC))
|                        |--Table Scan(OBJECT:([Sales]))
|--Compute Scalar(DEFINE:([Expr1012]=NULL))
|--Stream Aggregate(GROUP BY:([Sales].[Yr]) DEFINE:([Expr1005]=SUM([Expr1007]), [Expr1006]=SUM([Expr1008])))
|--Table Spool

This plan consists of two parts.  SQL Server has effectively rewritten our query as follows:

SELECT EmpId, Yr, SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH ROLLUP
UNION ALL
SELECT NULL, Yr, SUM(Sales)
FROM Sales
GROUP BY Yr

The first part of the plan computes the result for the WITH ROLLUP query above.  I described how this query plan works in last week's post.  The second part of this plan computes the missing year sub-totals yielding the entire CUBE result.  Note that this plan employs a common sub-expression spool.  As I discussed in this post, a common sub-expression spool copies its input rows into a worktable and then reads and returns the rows from the worktable multiple times - in this case twice.  The spool is meant to improve performance although, in this example, it has little impact since the server could just as easily have re-read the original Sales table.  However, if the input to the aggregation was more complex and cost more to evaluate, the spool would help.

If we use the WITH CUBE clause when aggregating on more than two columns, SQL Server simply generates increasingly complex plans with additional inputs to the concatentation operator.  As with the simple two column example, the idea is to compute the whole CUBE by computing all of the individual ROLLUPs that compose it.

Finally, we can actually combine WITH CUBE and PIVOT to generate the above table in a single simple statement.  (I actually proposed a variation of this query in an answer to a reader's comment on my post about the PIVOT operator but I like this solution better.)

SELECT EmpId, , , , [ALL]
FROM
(
SELECT
CASE WHEN GROUPING(EmpId) = 0
THEN CAST (EmpId AS CHAR(7))
ELSE 'ALL'
END AS EmpId,
CASE WHEN GROUPING(Yr) = 0
THEN CAST (Yr AS CHAR(7))
ELSE 'ALL'
END AS Yr,
SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH CUBE
) AS s
PIVOT (SUM(Sales) FOR Yr IN (, , , [ALL])) AS p

Here is the output from this query:

```EmpId   2005                  2006                  2007                  ALL
------- --------------------- --------------------- --------------------- ---------------------
1       12000.00              18000.00              25000.00              55000.00
2       15000.00              6000.00               NULL                  21000.00
3       NULL                  20000.00              24000.00              44000.00
ALL     27000.00              44000.00              49000.00              120000.00```

1. Craig Freedman's SQL Server Blog says:

In my last two posts, I gave examples of aggregation WITH ROLLUP and CUBE . SQL Server 2008 continues

2. Craig Freedman's SQL Server Blog says:

One of the most important uses of CTEs is to write recursive queries. In fact, CTEs provide the only

3. N Amaresh says:

Nice post.

4. M Qasim ali says:

please tell how to display pivot tables using graph

5. Craig Freedman says:

You'll need to use a tool such as Excel or Reporting Services to display the results of a query as a graph.  Both tools can link directly to the data stored in SQL Server.

HTH,

Craig

6. Sonal Bhosale says:

Thanx a lot again….

7. kannan says:

It's working fine. But i have adding another one column EmpName that showing problem?

8. Craig Freedman says:

I'm not certain that I understand your question.  You can certainly add other columns to the table and include them in the query.  Remember that any column that you select that is not an argument to an aggregate function must be including in the group by list.  For example:

SELECT EmpId, EmpName, Yr, SUM(Sales) AS Sales

FROM Sales

GROUP BY EmpId, EmpName, Yr WITH CUBE

HTH,

Craig

9. Tony A. says:

Hi Craig, very interesting and useful post and I have the following question:

In your example the number or Years [Yr] is known and they are specified in the Select and the Pivot, in my case the number of years is uncertain, because in some cases it can be 10 in other 35 depending on the record set I extract.

Is there a way to populate the Years [Yr] dynamically from the result set?

Regards

10. Craig Freedman says:

Hi Tony,

PIVOT does require that all values be known a priori and specified statically in the query text.  There are ways to work around this problem using dynamic SQL.  For example, see these articles by Itzik Ben-Gan:

sqlmag.com/…/enhancing-pivot-crosstab-queries

sqlmag.com/…/pivot-steroids

HTH,

Craig

11. Tony says:

Hi Craig,

Thanks for your kind answer which has pointed me in the right direction.

I've created this code which, after testing appears to address my issue and I'm posting it here in case someone else has the same requirement.

Declare @SQL_STMNT_1   NVarChar(500)

Declare @SQL_STMNT_1a NVarChar(500)

Declare @SQL_STMNT_1b NVarChar(500)

Declare @SQL_STMNT_1c NVarChar(500)

Declare @SQL_STMNT_2   NVarChar(800)

Declare @PERIOD Int

Declare @PERIOD_CHAR  Char(8)

Declare @Table      VarChar(15)

Set @Table        = '##Sales'

Set @SQL_STMNT_1  =  Space(1)

Set @SQL_STMNT_1b = Space(1)

Set @SQL_STMNT_1c =

'FROM

(

SELECT

CASE WHEN GROUPING(EmpId) = 0

THEN CAST (EmpId AS CHAR(7))

ELSE '+ ''''+ 'Total'+ '''' + '

END AS EmpId,

CASE WHEN GROUPING(Yr) = 0

THEN CAST (Yr AS CHAR(7))

ELSE '+ ''''+ 'Total'+ '''' + '

END AS Yr,

SUM(Sales) AS Sales

FROM '+ @Table +'

GROUP BY EmpId, Yr WITH CUBE

) AS s

PIVOT (SUM(Sales) FOR Yr IN ('

DECLARE MY_CURSOR CURSOR

LOCAL STATIC READ_ONLY FORWARD_ONLY

FOR

Select Distinct

Yr

From

##Sales

OPEN MY_CURSOR

FETCH NEXT FROM MY_CURSOR INTO @PERIOD

WHILE @@FETCH_STATUS = 0

BEGIN

Set @PERIOD_CHAR = Cast(@PERIOD As Char(8))

Set @SQL_STMNT_1  = LTRim(@SQL_STMNT_1) + 'IsNull(([' +RTrim(LTrim(@PERIOD_CHAR))+']),0) As '+''''+RTRim(@PERIOD_CHAR) + ''''+','

Set @SQL_STMNT_1a = 'Select EmpId,' + Left(@SQL_STMNT_1,Len(@SQL_STMNT_1) -1)

Set @SQL_STMNT_1b = LTRim(@SQL_STMNT_1b) + '['+RTrim(LTrim(@PERIOD_CHAR))+']'+','

FETCH NEXT FROM MY_CURSOR INTO @PERIOD

END

CLOSE MY_CURSOR

DEALLOCATE MY_CURSOR

Set @SQL_STMNT_2 = @SQL_STMNT_1a + Space(1) + ',[Total] ' +

@SQL_STMNT_1c +

@SQL_STMNT_1b + '[Total])) As p'

Execute sp_executesql @SQL_STMNT_2