The OVER clause can be used with aggregate functions and ranking functions. It determines how the rows are partitioned and ordered before the aggregate or ranking function is applied. Used with an aggregate function, the OVER clause can solve many problems in a simpler way. Below is a sample that uses the OVER clause with aggregate functions.
SELECT SalesOrderID, p.Name AS ProductName, OrderQty
,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS TotalOrderQty
,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS "Avg Qty of Item"
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS "Total Number of Item"
,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS "Min order Qty"
,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS "Max Order Qty"
FROM Sales.SalesOrderDetail SOD
INNER JOIN Production.Product p ON SOD.ProductID = p.ProductID
WHERE SalesOrderID IN (43659, 43664)
The PARTITION BY clause tells each aggregate function to compute its result per SalesOrderID. Every order line is still returned, with the following columns added:
TotalOrderQty: the total quantity of products ordered in the sales order.
Avg Qty of Item: the average order quantity per order line in the sales order. In our case TotalOrderQty for SalesOrderID 43659 is 26 and there are twelve order lines, so the average quantity per order line is 26/12. Because OrderQty is an integer column, AVG returns the truncated integer 2.
Total Number of Item: the number of order lines (products) in the sales order.
Min Order Qty: the minimum quantity ordered in the sales order.
Max Order Qty: the maximum quantity ordered in the sales order.
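The same behaviour can be tried without AdventureWorks. The following is a minimal sketch, assuming a hypothetical table variable @OrderLine with made-up rows; it is only meant to show that the detail rows survive and that AVG on an integer column returns an integer.

```sql
-- Hypothetical sample data: @OrderLine is not part of AdventureWorks.
DECLARE @OrderLine TABLE (OrderID INT, Product VARCHAR(20), Qty INT);

INSERT INTO @OrderLine VALUES
    (1, 'Helmet', 1), (1, 'Gloves', 2), (1, 'Jersey', 4),
    (2, 'Helmet', 3), (2, 'Socks', 6);

SELECT OrderID, Product, Qty,
       SUM(Qty)       OVER (PARTITION BY OrderID) AS TotalQty,   -- 7 for order 1, 9 for order 2
       COUNT(Qty)     OVER (PARTITION BY OrderID) AS LineCount,  -- 3 for order 1, 2 for order 2
       AVG(Qty)       OVER (PARTITION BY OrderID) AS AvgQtyInt,  -- integer average: 2 and 4
       AVG(Qty * 1.0) OVER (PARTITION BY OrderID) AS AvgQtyDec   -- decimal average: about 2.33 and 4.5
FROM @OrderLine
ORDER BY OrderID, Product;
```

All five detail rows are returned, each carrying the totals of its own order. Multiplying by 1.0 is one way to get a decimal average when the truncated value is not wanted.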
The difference between GROUP BY and this method is that GROUP BY returns only the summary rows. In our case, GROUP BY alone would return only two rows, one per sales order. To get the same result as above using GROUP BY, we need to write the query as given below:
SELECT p.Name AS ProductName, SOD.OrderQty, GRPRESULT.*
FROM Sales.SalesOrderDetail SOD
INNER JOIN Production.Product p ON SOD.ProductID = p.ProductID
INNER JOIN (
SELECT SalesOrderID
,SUM(OrderQty) AS TotalOrderQty
,AVG(OrderQty) AS "Avg Qty of Item"
,COUNT(OrderQty) AS "Total Number of Item"
,MIN(OrderQty) AS "Min order Qty"
,MAX(OrderQty) AS "Max Order Qty"
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43659, 43664)
GROUP BY SalesOrderID
) GRPRESULT ON GRPRESULT.SalesOrderID = SOD.SalesOrderID
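To see why the extra join is needed, here is a minimal sketch of GROUP BY on its own. The inline rows and the names OrderLine, OrderID, Product and Qty are assumptions made for illustration, not AdventureWorks objects.

```sql
-- Hypothetical sample data supplied inline; not part of AdventureWorks.
SELECT OrderID,
       SUM(Qty)   AS TotalQty,
       COUNT(Qty) AS LineCount
FROM (VALUES (1, 'Helmet', 1), (1, 'Gloves', 2), (1, 'Jersey', 4),
             (2, 'Helmet', 3), (2, 'Socks', 6)) AS OrderLine (OrderID, Product, Qty)
GROUP BY OrderID;
-- Returns two rows, one per OrderID; the Product and Qty detail is gone.
```

The five input rows collapse to two, which is why the detail has to be joined back when GROUP BY is used instead of OVER.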
Another interesting point is that we can use the OVER clause without a PARTITION BY clause, in which case the aggregate is computed over the entire result set. Let us assume that we have a requirement to list all sales orders for the year 2008 with the sales order number, total amount and percentage of 2008 sales. It can be achieved easily as given below.
USE AdventureWorks2008
GO
SELECT SalesOrderNumber, TotalDue,
(TotalDue*100.)/ SUM(TotalDue) OVER() AS [%2008Sales]
FROM Sales.SalesOrderHeader
WHERE YEAR(OrderDate)=2008
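As a related sketch, adding ORDER BY inside OVER turns the aggregate into a cumulative total, which SQL Server supports for aggregate functions from version 2012 onward. The inline rows and the names Orders, OrderNo and Amount are assumptions made for illustration.

```sql
-- Hypothetical sample data supplied inline; requires SQL Server 2012 or later.
SELECT OrderNo, Amount,
       SUM(Amount) OVER (ORDER BY OrderNo
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
       Amount * 100.0 / SUM(Amount) OVER () AS PctOfTotal
FROM (VALUES (1, 100.00), (2, 250.00), (3, 150.00)) AS Orders (OrderNo, Amount)
ORDER BY OrderNo;
-- RunningTotal: 100.00, 350.00, 500.00
-- PctOfTotal:   20, 50 and 30 percent
```

The empty OVER() still aggregates over the whole result set, while the ordered window adds up only the rows seen so far.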
SQL Server 2012 adds more options to the OVER clause, such as displaying a cumulative total. ROW_NUMBER, RANK, DENSE_RANK and NTILE are the ranking functions that can be used with the OVER clause. A ranking function requires an ORDER BY clause inside OVER, and a PARTITION BY clause can be added as well. To explain the ranking functions, let us create a small table.
USE mydb
GO
CREATE TABLE Student
(
Name VARCHAR(10)
)
INSERT INTO Student VALUES ('aa'),('bb'),('cc'),('dd'),('ee')
INSERT INTO Student VALUES ('aa'),('bb'),('cc')
INSERT INTO Student VALUES ('aa'),('bb'),('cc')
INSERT INTO Student VALUES ('dd'),('ee')
INSERT INTO Student VALUES ('dd'),('ee')
INSERT INTO Student VALUES ('ff'),('gg'),('hh')
ROW_NUMBER() can be used in many scenarios, such as filtering records, removing duplicate records and implementing paging. Let us assume that we need to generate a serial number while listing the entries from the Student table.
SELECT ROW_NUMBER() OVER (ORDER BY NAME) AS [Si No],* FROM Student
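The paging scenario mentioned above can be sketched along the same lines. This assumes the Student table created earlier; the variables @PageNumber and @PageSize are hypothetical names introduced only for this example.

```sql
-- Assumes the Student table created above.
-- @PageNumber and @PageSize are hypothetical parameters.
DECLARE @PageNumber INT = 2, @PageSize INT = 5;

WITH Numbered AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Name) AS RowNo, Name
    FROM Student
)
SELECT RowNo, Name
FROM Numbered
WHERE RowNo BETWEEN (@PageNumber - 1) * @PageSize + 1
                AND @PageNumber * @PageSize   -- rows 6 to 10 for page 2
ORDER BY RowNo;
```

The row number has to be computed in a CTE or derived table because a window function cannot be referenced directly in the WHERE clause of the same query.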
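To remove the duplicate entries from the above table:
-- The leading semicolon terminates any preceding statement in the batch
;WITH cte_s
AS (
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS [SiNo],* FROM Student
)
-- Keep the first row of each name and delete the rest
DELETE FROM cte_s WHERE [SiNo]<>1
GO
SELECT * FROM Student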
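Before running a DELETE like this, it can help to preview the rows it would remove. The following is a minimal sketch, assuming the Student table still holds its original rows; the Copies column is an addition made for illustration.

```sql
-- Assumes the Student table created above, before the DELETE is run.
WITH cte_s AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS SiNo,
           COUNT(*)     OVER (PARTITION BY Name)               AS Copies,
           Name
    FROM Student
)
SELECT Name, SiNo, Copies
FROM cte_s
WHERE SiNo <> 1        -- the rows the DELETE would remove
ORDER BY Name, SiNo;
```

With the sample data this lists 10 rows, the two extra copies of each of aa through ee, so 8 distinct names remain after the delete.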
Let us assume that we have to divide the students into four groups for a game. The NTILE function will help us:
SELECT NTILE(4) OVER (ORDER BY NAME) AS [Grpno],* FROM Student
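As the total number of records, 18, is not divisible by 4, it has created two groups with 5 students and two other groups with 4 students. (The count of 18 assumes the Student table still holds all the original rows, that is, the duplicate removal above has not been run.)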
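The group sizes can be checked with a small follow-up query. This is a sketch that assumes the Student table holds the original 18 rows; the alias Members is introduced only for this example.

```sql
-- Assumes the Student table holds the original 18 rows.
SELECT Grpno, COUNT(*) AS Members
FROM (
    SELECT NTILE(4) OVER (ORDER BY Name) AS Grpno
    FROM Student
) AS g
GROUP BY Grpno
ORDER BY Grpno;
-- Grpno 1 and 2 have 5 members each; Grpno 3 and 4 have 4 each.
```

NTILE always places the extra rows in the lowest-numbered groups, so the larger groups come first.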
Let us use a slightly different table structure to understand the RANK and DENSE_RANK functions.
CREATE TABLE StudentMark(
Name VARCHAR(10),
Mark INT)
INSERT INTO StudentMark VALUES ('aa',10),('bb',14),('cc',16),('dd',22),('ee',25),('ff',25),('gg',11),('hh',21),('ii',16)
To assign a rank to each student based on their mark, we can use the query below:
SELECT RANK() OVER (ORDER BY mark DESC) AS 'Rank' ,* FROM StudentMark
In the output, the rank is assigned based on position. Two students have the same highest mark (25) and share rank 1, so the student with the next highest mark (22) comes in third position. This listing is suitable for a scenario like an entrance examination result with a total of 50 seats: a student with a rank above 50 will not get admission.
But in some scenarios we might need to display the actual rank without any gaps. The student who has the second highest mark should have the second rank, irrespective of how many students share the highest mark. The query below does that.
SELECT DENSE_RANK() OVER (ORDER BY mark DESC) AS 'Rank' ,* FROM StudentMark
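In the output, the two students with 25 marks share rank 1 and the student with 22 marks gets rank 2, so the ranks run from 1 to 7 without gaps.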
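The three functions can also be compared side by side in one query. This is a sketch that assumes the StudentMark table created above; the column aliases are chosen only for this example.

```sql
-- Assumes the StudentMark table created above.
SELECT Name, Mark,
       ROW_NUMBER() OVER (ORDER BY Mark DESC) AS RowNo,     -- 1 to 9, always unique
       RANK()       OVER (ORDER BY Mark DESC) AS RankNo,    -- 1,1,3,4,5,5,7,8,9
       DENSE_RANK() OVER (ORDER BY Mark DESC) AS DenseRank  -- 1,1,2,3,4,4,5,6,7
FROM StudentMark
ORDER BY Mark DESC;
```

Rows with equal marks get the same RANK and DENSE_RANK, while ROW_NUMBER breaks the tie in an arbitrary order unless a second sort column is added.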


Just to point out that ROW_NUMBER() with OVER has been available since SQL Server 2005.
Thanks for sharing this. This is a new technique I learnt today.
Thank you, this is exactly what I needed. This helped me save a lot of work.