Pages

Monday 8 April 2013

SQL Server : Part 9 : Filtered Index : A new way for Performance Improvemnt

Filtered index is a new feature introduced in SQL Server 2008. All the indexes that we have discussed till now were always on the entire table. In other words, Index and table will have same number of records.With filtered index, it is possible to create an index for a subset of the table.This is achieved by adding a where clause in the index creation statement.This helps to reduce the size of index in terms storage as well as depth of the index.The where condition specified in the index creation statement determine the existence of a record in the index.

This will be a great performance on larger table where a good chunk of queries work on smaller portion of the table.A regular index will be on entire table ignoring the fact that most of the queries are interested in a smaller portion of the index.This makes the index deeper and need more pages to store the index b tree structure which results in more IO. Where as filtered index will be created only for a portion of the table and hence less pages are required to store the index.

Let us consider an example of salesorder table which contain data for last five years. Major part of active queries on this table are based on the last calender year and the current calender year.A simple example of filtered index will be like this.

CREATE NONCLUSTERED INDEX ix_salesorder_Filter 
ON salesorder(SalesOrderId,OrderDate,Status,Customer_id,TotalDue
WHERE OrderDate>'2012-01-01'


Unique column with multiple null values: One of the significant use of filtered index is to define complex unique constraint on a column. The unique constraint will not allow you to have multiple null values in a unique column. How do you enforce uniqueness on column except for NULL values ? For example , Productcode column  in the product table can be null but should be unique if the value is defined on that column. Let us see the below example.


CREATE TABLE Product

(
        Productid            INT NOT NULL PRIMARY KEY,
        ProductCode     CHAR(10) ,
       ProductName     VARCHAR(100)

)
GO 
CREATE UNIQUE INDEX ix_Unique_Filtered ON Product(Productcode) WHERE productcode IS NOT NULL
GO
INSERT INTO Product VALUES(1,'AR-5381','Adjustable Race')
INSERT INTO Product VALUES(2,NULL,'Bearing Ball')
INSERT INTO Product VALUES(3,NULL,'BB Ball Bearing')
INSERT INTO Product VALUES(4,'AR-5381','Adjustable Race-Small')


We are able to insert multiple record with NULL value in the productcode column, but when we are trying to insert a record with duplicate value 'AR-5381', it is not allowing to insert duplicate value into the Productcode column. The unique filtered index defined on this table help us to enforce this kind of uniqueness.

Read operation: The above said is one of the common usage of filtered index.Another usage of filtered index is to support the queries. Let us see a sample below.

USE mydb
GO                               
SELECT * INTO SalesOrderheader FROM AdventureWorks2008.Sales.SalesOrderheader
GO
--Unique Clustered index
CREATE UNIQUE CLUSTERED INDEX ix_SalesOrderheader ON SalesOrderheader(SalesOrderid)
GO
--Filtered Inde
CREATE INDEX ix_filtered_index ON SalesOrderheader(orderdate) WHERE orderdate>'2008-01-01'

SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE 
orderdate>'2008-05-01'
GO
SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE 
orderdate='2008-03-01'
GO
SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE 
orderdate='2007-12-01'

An index is defined on the column orderdate for the records which are ordered after 2008-01-01. There are three select query with different condition on the orderdate. The first one is trying to fetch a range of records which comes under the filtered criteria (sub set of filtered result). Second query try to select records which ordered on specific date which comes under the filter criteria.The third one also try to fetch the orders placed on a specific date, but that date is not come under the filtered range.






















The first two queries are made use of the filtered index and the third one gone for clustered index scan. This is because the third query can not be served from filtered index as the predicate of the query is not comes under the filter criteria used in the index creation statement.In short , the records needed to perform the third select operation is not available in the filtered index.The filtered index gives lots of performance improvement for this type of queries especially when the number of records filtered out are much more than the number of records satisfying the filtering criteria.

It is not mandatory that filtering column should be part of the index but in that case predicate  of the select statement should exactly match withe filtered index predicate. Let us create an index on orderdate by filtering the records with territoryid<5

CREATE INDEX ix_TerritoryID_Filter ON SalesOrderheader (OrderDate) WHERE TerritoryID<=5
GO
SELECT salesorderid,orderdate FROM SalesOrderheader WHERE 
TerritoryID<=5
GO
SELECT salesorderid,orderdate FROM SalesOrderheader WHERE 
TerritoryID=4

Let us see the execution plan of the two select statement.



































The predicate of the first statement is exactly matching with the where condition of the filtered index creation statement. The second statement, is actually a subset of filtered index, but as the columns territoryid is not part of the index , SQL server will not be able to filter the records and hence execution is performed by clustered index scan.



Limitation of filtered index   The concept of filtered index is  very attractive but there are some limitation in the usage of filtered index , especially when it comes to parameterization.Let us rewrite the query 


SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate>'2008-05-01'
as

DECLARE @Orderdate date='2008-05-01'
SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate>@Orderdate













The execution plan is not utilizing the filtered index when we have rewritten the query with local parameter. The reason behind that is , at the compile time , the query optimizer does not know what value will be passed for the parameter @OrderDate. So optimizer  has to generate a safe plan to satisfy all the criteria. This is the same issue when we change the database property to forced parameterization  or defined the statement  as procedure. 


ALTER DATABASE mydb SET parameterization forced
GO
SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate>'2008-05-01'
GO

CREATE PROCEDURE GetSalesorder (@OrderDate date
AS
BEGIN
SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate=@OrderDate END 


In both these cases optimizer will not use the filtered index as it does not know what value will be passed during the run time and it try to generate a safe plan with out considering the availability of filtered index.


While we changing the database property parameterization to forced, the optimizer will replace all the static predicate with local variable. For example , the statement 

SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate>'2008-05-01'

will be treated as below by the optimizer

DECLARE @Orderdate date='2008-05-01'

SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE orderdate>@Orderdate


The option to force SQL server to use the filtered index is to make  dynamic statement  as given below.

DECLARE @Orderdate date='2008-05-01'
DECLARE @SQL NVARCHAR(1000)
SET @SQL=N'SELECT orderdate,SalesOrderid FROM SalesOrderheader WHERE 
orderdate>'''+CAST(@Orderdate AS CHAR(10))+''''
EXEC (@SQL)
GO

The execution plan is given below 









From the execution plan it is clear that ,it is making use of the filtered index.

If you liked this post, do like my page on FaceBook

8 comments:

  1. You should note that depending on your ANSI settings filtered indexes can actually break your application by throwing errors during INSERT statements. See my writeup: http://www.davewentzel.com/content/gotcha-sql-server-filtered-indexes

    --dave

    ReplyDelete
    Replies
    1. Thank you for reading this article and pointing it out. I have experienced the same long back in my live environment but not able to recall the incident exactly. With your permission, I will add that point to this article

      Delete
    2. certainly. Great article BTW.

      Delete




  2. I loved the way you discuss the topic great work thanks for the share, Let me share this, vmware training in pune

    ReplyDelete
  3. This is very detailed and informational post. Thanks a lot.gangstar vegas mod zombie tsunami mod cooking fever mod

    ReplyDelete

  4. غسيل خزانات بمكة غسيل خزانات بمكة
    غسيل خزانات بجدة غسيل خزانات بجدة
    غسيل خزانات بالدمام غسيل خزانات بالدمام
    غسيل خزانات بالمدينة المنورة غسيل خزانات بالمدينة المنورة

    ReplyDelete
  5. Roulette odds and payouts are the identical throughout all the three major variants, as proven beneath. 카지노사이트 Also, the home edge concerned in each variant differs, as discussed above. Even cash wagers will lose if the ball lands on both or in American roulette. To newbies learning means to|tips on how to} play roulette, the American model ought to be avoided in any respect costs for obvious causes. While there are three main roulette variants, you’ll discover other variations of the game at our on-line casinos.

    ReplyDelete