Pages

Wednesday 22 August 2012

SQL Server:Understanding the Data Record Structure

In our last post, we have gone through the data page structure and we have noticed that there is an overhead of 7 bytes in each records. Let us try to understand how SQL Server stores the records in a data pages.

Data record stores actual data. In other words, it store the records in heap or  the records in the leaf level of clustered index. The data records are stored in a structure which helps SQL server manage these record efficiently. Let us see a pictorial representation of a data record.












The section which are marked in blue are part of all data records.The other section will be part of the data record depends on fixed length / variable length column are available in the table structure.

The first one byte is used for status Bits 1 which define the properties of the record :

Bit 0 : Versioning information. In SQL server 2008 this is always 0
Bits 1 to 3: This is three bit value define the record type.
0 data record. 
1 Forwarded record. 
2 a forwarding stub. 
3 Index record. 
4 blob fragment or row overflow data. 
5 ghost index record.
6 ghost data record 
7 ghost version record 
Bit 4: Null bitmap exists or not.In SQL server 2008 null bitmap exists even if there is no null able columns
Bit 5: Indicate variable column exists or not.
Bit 6 :Indicate that row contain versioning information

Bit 7 : Not used in SQL server

The second byte used for status bits 2 . Only one bit is used in this to indicate that the record is ghost forwarded record.

The next 2 bytes are used to store the length of fixed portion of the record.This include the two status bytes,2 bytes used for this field and actual size of fixed length data in the table. For example if a table does not have any fixed length column the value in this field will be 4.This is the same value will be displayed in the pminlen field in the page header. (Refer the Page Structure post)

The next n bytes are used to store the fixed length data available in the table where n is the total size of the fixed length column in the table. This part will not be there in the record structure if all column in the table are variable length column.

Next 2 bytes are used to store the total number of column in the table.

Next n bytes are used for null bitmap. one bit for each column in the table. value 1 in each bit indicate the corresponding column has NULL value in that record. The value of n will be no.column the table/8 and round to next integer value.

Next 2 bytes are used to store the number of variable length column in the table.

Next n bytes are used to store variable column offset array .This is nothing but the starting offset value (with respect to the page) of each variable length column.Each variable column requires 2 bytes. The value of n will be 2 X no. of variable length column in the table.

Next n bytes are used to store the actual data of variable length column . The value of n will be the total size (actual data stored not in the definition of the table) of variable length column.

Let us see the same example that we considered in the page structure post

CREATE TABLE Customer (
   
FirstName CHAR(200),
   
LastName  CHAR(300),
   
Email     CHAR(200),
   
DOB       DATE, --Size is 3
)

GO
INSERT INTO Customer VALUES('William','James','William.J@yahoo.com','1982-01-20')
INSERT INTO Customer VALUES('Jade','Victor','Jade.V@yahoo.com','1985-08-12')
INSERT INTO Customer VALUES('Jonas','hector','Jonas.h@yahoo.com','1980-10-02') 
INSERT   INTO  Customer     VALUES('William1','James','William.J@yahoo.com','1982-01-20')
INSERT INTO Customer VALUES('Jade1','Victor','Jade.V@yahoo.com','1985-08-12')
INSERT INTO Customer VALUES('Jonas1','hector','Jonas.h@yahoo.com','1980-10-02')
INSERT INTO Customer VALUES('William2','James','William.J@yahoo.com','1982-01-20')
INSERT INTO Customer VALUES('Jade2','Victor','Jade.V@yahoo.com','1985-08-12')
INSERT INTO Customer VALUES('Jonas2','hector','Jonas.h@yahoo.com','1980-10-02')
INSERT INTO Customer VALUES('William3','James','William.J@yahoo.com','1982-01-20')
GO
DBCC IND('mydb','customer',-1)

This gives me page number 148 with page type 1

DBCC TRACEON(3604)
GO
DBCC page('mydb',1,148,3)

The output will look like the image given below:


















The pminlen value 707 in the header is the total size of the column (703 bytes) , 2 bytes used for status bytes and 2 bytes used to store the size of fixed length columns. The Length 710 mention in the record slot is the sum of pminlen, size of null bitmap (1 byte) and 2 bytes used to store the number column in the table.

Let us try another example with variable length column.

USE MyDb
GO
CREATE TABLE VariableLength(
  
Title         CHAR(10) NOT NULL,
  
FirstName     VARCHAR(100),
  
Lastname      VARCHAR(100),
  
email         VARCHAR(50), 
  
dob           date NOT NULL,
  
phone         CHAR(10),
  
Countrycode   CHAR(3),
  
Designation   VARCHAR(100),
  
PersonalPreference VARCHAR(100)
)

GO
INSERT INTO VariableLength VALUES ('Mr','Fedric','John','fedric.john01@abc.com','1980-01-01','XXXXXXXXXX','US','DBA','Nothing Spl')
GO
DBCC IND('mydb','VariableLength',-1)
GO
This gives me page number 173 with page type 1

DBCC TRACEON(3604)
GO
DBCC PAGE('mydb',1,173,3)--Please change page number appropriatly
The output will look like the image given below:






















The pminlen value 30 is the sum of
     1 bytes for status bits 1
     1 bytes for status bits 2
     2 bytes used to store the size of fixed length columns
    26 bytes of fixed length column (title,dob,phone,countrycode)

The length 91 shown in the slot 0 is the sum of :
     1 bytes for status bits 1
     1 bytes for status bits 2
     2 bytes used to store the size of fixed length columns
    26 bytes of fixed length column (title,dob,phone,countrycode)
     2 bytes used to store total number of columns
     2 bytes for null bitmap, round up total no of column/8 = 9/8=2
     2 bytes to store number of variable length columns
   10 bytes to store the variable length column offset (number of variable length column X2)
   45 bytes to store the variable length data (actual size of the data). This can be obtained by running the   below query
SELECT DATALENGTH(FirstName)+DATALENGTH(Lastname)+DATALENGTH(email)+
DATALENGTH(Designation)+DATALENGTH(PersonalPreference) FROM VariableLength


Hope you got clear idea about the storage of data in the SQL server page. 

Reference : Microsoft® SQL Server® 2008 Internals by Kalen Delaney, Paul S. Randal, Kimberly L. Tripp , Conor Cunningham , Adam Machanic. ISBN :0-7356-2624-3


If you liked this post, do like my page on FaceBook 

13 comments:

  1. Your blog is very informative! I am glad I stumbled upon it via a google search for blocking email notifications!

    ReplyDelete
    Replies
    1. Happy to hear that google search showing it up

      Delete
  2. Dear John, this blog is very helpful.

    Could you explain the case of Null data (existing of missing column) for both fixed and variable length cases?
    e.g.

    If phone char(10) in any row is omitted, I wonder that the length of fixed data is redued or not? Does 10 bytes space('20') be filled?

    How about similar case for varchar ?

    I'm sorry for my poor English. I'm not native in English.

    ReplyDelete
  3. Can you help me under stand if the table has around 100 columns how the data page will be

    ReplyDelete
  4. http://jumperads.unblog.fr/2017/04/15/%d8%b4%d8%b1%d9%83%d8%a7%d8%aa-%d9%86%d9%82%d9%84-%d8%a7%d9%84%d8%b9%d9%81%d8%b4-%d8%a8%d8%ac%d8%af%d8%a9/
    http://jumperads.unblog.fr/2017/03/23/%d8%b4%d8%b1%d9%83%d8%a9-%d8%aa%d9%86%d8%b8%d9%8a%d9%81-%d9%81%d9%84%d9%84-%d8%a8%d9%8a%d9%86%d8%a8%d8%b9-%d9%85%d8%aa%d8%ae%d8%b5%d8%b5%d8%a9/
    http://jumperads.unblog.fr/2017/03/23/%d8%b4%d8%b1%d9%83%d8%a9-%d9%85%d9%83%d8%a7%d9%81%d8%ad%d8%a9-%d9%88%d8%aa%d9%86%d8%b8%d9%8a%d9%81-%d8%a8%d8%a7%d9%84%d9%82%d8%b5%d9%8a%d9%85/
    http://support.wikipad.com/hc/en-us/community/posts/214559303-%D8%B4%D8%B1%D9%83%D8%A9-%D9%86%D9%82%D9%84-%D8%A7%D8%AB%D8%A7%D8%AB-%D8%A8%D8%A7%D9%84%D8%B1%D9%8A%D8%A7%D8%B6

    ReplyDelete
  5. Therefore, it can store data and code and are generally a lower cost system than hard disk options.Self Storage

    ReplyDelete
  6. This comment has been removed by the author.

    ReplyDelete
  7. This article is good and useful but in case of storing an Clustered index or Non clustered index does it have a difference in pminlen and length of a record as i have observed by using the following cases

    Example i have created three tables with same data and same structure

    Case 1 : used your example and size and calculation are same

    case 2: for the same table created clustered index and length of record increased by 2 bytes.

    case 3: for the same table without clustered index and 1 non clustered index has different length of a record on index pages for an integer column.

    Case 4: With Clustered and 2 Non clustered index I got different length of records for non clustered index columns and pages.

    ReplyDelete
  8. If you don"t mind proceed with this extraordinary work and I anticipate a greater amount of your magnificent blog entries

    digital marketing courses in hyderabad with placement

    ReplyDelete
  9. It will also check for any recently introduced new services or discounts that are finding enhanced interest among the customers. salesforce training in bangalore

    ReplyDelete
  10. Great Information sharing .. I am very happy to read this article .. thanks for giving us go through info.Fantastic nice. I appreciate this post. data science course in Nashik

    ReplyDelete