Few words about: TSQL analytical functions

Transact-SQL give us possibility to automatically create aggregates based on a query. In this article I will show shortly how to use ROLLUP,CUBE and GROUPING SETS.

ROLLUP

ROLLUP is a part of GROUP BY clause and we use it like this

SELECT a,b,c, sum(d)

FROM Table_name

GROUP BY ROLLUP(a,b,c)

ROLLUP calculate sum of d attribute of few levels: ABC,AB,A and additionally sum of everything.

Example:

select

dpc.EnglishProductCategoryName,

dps.EnglishProductSubcategoryName,

dp.Color,

sum(fis.SalesAmount) as SalesAmount

from FactInternetSales fis

join DimProduct dp on dp.ProductKey=fis.ProductKey

join DimProductSubcategory dps on dps.ProductSubcategoryKey=dp.ProductSubcategoryKey

join DimProductCategory dpc on dpc.ProductCategoryKey=dps.ProductCategoryKey

group by ROLLUP(dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName, dp.Color)

The result will be:

Aggregate on ABC and AB level

Aggregate on ABC and AB level

Aggregate on A Level:

result

SUM of all levels:

result

Cube

CUBE is similar to ROLLUP  clause but the main difference is that CUBE creates aggregation based on all possible combinations for example

SELECT a,b,c, sum(d)

FROM Table_name

GROUP BY CUBE(a,b,c)

Aggregations will be on levels: ABC,AB,AC,A,BC,B,C:

Select dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName,dp.Color,sum(fis.SalesAmount) as SalesAmount

 from FactInternetSales fis join DimProduct dp on dp.ProductKey=fis.ProductKey

join DimProductSubcategory dps on dps.ProductSubcategoryKey=dp.ProductSubcategoryKey

join DimProductCategory dpc on dpc.ProductCategoryKey=dps.ProductCategoryKey

group by CUBE(dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName, dp.Color)

The result will be:

ABC,BC,C

resultAC

resultA

resultB

resultAB

result

GROUPING_ID

CUBE and ROLLUP are very helpful but what if one of our attributes are really NULL, how can we distinguish real NULL from CUBE/ROLLUP aggregate NULL? WE can use GROUPING_ID function. Use of this function is simple

As a parameter to GROUPING_ID function we have to provide ROLLUP/CUBE arguments for instance:

GROUPING_ID(dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName, dp.Color) as GROUP_ID

As a result we will see:

Grouping_ID

Integer number that we get is the identifier of a level of aggregation where 0 is in our notation ABC, 1 is AB and so on.

Finally I would like to mention something about GROUPING_SETS, this option give us control what summaries will be calculated.

group by GROUPING SETS((dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName, dp.Color),())

We will get two summaries:

dpc.EnglishProductCategoryName,dps.EnglishProductSubcategoryName, dp.Color

and

() – aggregate of everything in a query

GROUPING SETS are great if we want full control what will be calculated.

I hope you enjoy analytical functions in TSQL 🙂

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s