Pages

Tuesday 16 April 2013

SQL Server : Part 10: Importance of Key Column Position While Creating Index

Now we have discussed about different types of indexes in the last posts. In this post, let us discuss about the key column order (order of the column of indexes). The order of the key column of the index is decided based on the data access pattern and how do you want to organize the data.

The general guidelines for the order of the index key column is to keep the most selective column as the first column.It does not  meant that, a unique id column should be first column of all your index. The optimizer will decide to use the index based on the statistics available on the index. I will explain about the statistics in later post. Statistics gives the information about the density of the key column which give uniqueness of index, and histogram that stores the information about the distribution of the values within the column.

Let us consider an example of customer table which stores the information of customers from across the countries.The application running on top of this table deal with customers from a specific country based on the user permission/access location. 

CREATE customer (
   Customer_id     INT IDENTITY(1,1) NOT NULL
  
CountryCode     CHAR(3) NOT NULL,
  
FirstName       VARCHAR(100) NOT NULL,
  
LastName        VARCHAR(100) NOT NULL,
   Mobile
Phone     VARCHAR(20),
  
Email           VARCHAR(100)
  
)

GO
CREATE UNIQUE CLUSTERED INDEX Ix_Customerid_Countrycode ON customer(Customer_id,Countrycode)




The clustered index is created based on the general guideline to keep the most selective column on the left side.If I need fetch a single records based on the customer_id, this index will work perfectly.So what is the drawback of this index ? In case if I need to fetch all/many customers based on the countrycode , the optimizer opt for clustered index scan.

SET STATISTICS IO ON 
go
SELECT * FROM customer WHERE Countrycode='VNH' AND customer_id=1216468













Let us try to fetch all customers with countrycode VNH. The table has around 620 thousand records and there are 3066 customers with VNH countrycode

SELECT FROM customer WHERE Countrycode='VNH' 












From the execution plan, it is clear that, optimizer has opted for clustered index scan  by scanning all 6825 pages used to store this table. We can optimize this by changing the index with countrycode as the first column.

DROP INDEX customer.Ix_CustomerId_CountryCode 
GO
CREATE UNIQUE CLUSTERED INDEX Ix_CountryCode_CustomerId ON customer(Countrycode,Customer_id)



SET STATISTICS IO ON 
go
SELECT FROM customer WHERE Countrycode='VNH' AND customer_id=1216468










Let us try to fetch all customers with countrycode VNH. 

SELECT FROM customer WHERE Countrycode='VNH' 










From the execution plan, it is clear that , the optimizer used index seek in both the cases and IO operation has reduced drastically in the scenario while fetching all the customers with VNH countrycode.


Other problem will be, while keeping the customer_id as the first column , data will be stored in the order of customer_id column and you will have many pages (almost all pages) will have data belongs to multiple countrycode. This may lead to more blocking/deadlock issues.By defining the index with countrycode as the first column, only a few pages will have data overlapped with multiple countrycode and will help to reduce the blocking issues.The important point is, by defining the index with countrycode as the first column will cause for higher level of index fragmentation, but that can be controlled by defining proper fill factor value. I have experienced this in one of our project and experienced lot of improvement after changing the indexes with countrycode as the first column.

In short the general guidelines about the key column is a good starting point but at the same time you have to consider the data access pattern in your application.Hope this will help you to resolve some issue that you are facing.


If you liked this post, do like my page on FaceBook


7 comments:

  1. Awesome performance pointer. Thanks for posting!

    ReplyDelete




  2. I loved the way you discuss the topic great work thanks for the share, Let me share this, vmware training in pune

    ReplyDelete
  3. http://obat-aborsi99.com/ Obat Penggugur Kandungan Janin
    http://klinikfarma.com/ Jual Obat Aborsi Alsi
    http://situs-online.com/ Jual Obat Aborsi Penggugur Kandungan
    http://obataborsi-ampuh.com/ Jual Obat Aborsi Cytotec

    ReplyDelete
  4. فانت مع شركة تنظيف خزانات في امان
    فلدى شركة لمسات جدة خدمات أخرى متمثلة في تنظيف المنازل من الداخل لأن عمليات تنظيف المنازل من الأمور الصعبة التي تحتاج الى مكالمة شركة تنظيف منازل بجدة لصعوبة عملية التنظيف المنزلية ولدينا ايضا قسم خاص بتنظيف وغسيل المفروشات والسجاد والكنب بالبخار في شركة تنظيف كنب بالبخار بجدة نتميز بالدقة العالية لاستخدامنا أفضل أنواع المطهرات والمعقمات للمفارش والكنب المصرح بها عالميا
    وايضا في مجال التنظيف في مكة لدينا شركة تنظيف منازل بمكة متميزة ومتخصصة وعلى خبرة كبيرة بكل مجالات التنظيف ولهذا تتواصل شركة تنظيف منازل بالعملاء لمعرفة الآراء

    ReplyDelete
  5. Its really an Excellent post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog. Thanks for sharing...
    Docker online training
    Docker certification training
    Docker online course
    Docker training course

    ReplyDelete
  6. The agricultural sector in Europe is generally well developed. The process of improvement and modernization of agriculture in Central Europe continues, which is facilitated by the accession of the Central European states to the EU. The agricultural sector among EU member states is supported by the Common Agricultural Policy (CAP), which helps to provide farmers with a guaranteed minimum price for their products and subsidizes their exports, which increases the competitiveness of their products. https://www.immigration-residency.eu/residence-permit-latvia/real-estate/

    ReplyDelete