Solutions with Practical SQL DBA : SQL Server: Understanding the Data Page Structure

Sunday, 12 August 2012

SQL Server: Understanding the Data Page Structure

We all know very well that SQL server stores data in 8 KB pages and it is the basic unit of IO for SQL server operation. There are different types of pages like data , GAM,SGAM etc. In this post let us try to understand the structure of data pages.

SQL server use different types of pages to store different types of data like data, index data,BLOB etc.SQL servers stores the data records in data pages.Data records are rows in heap or in the leaf level of the clustered index.

A data page consist of three sections. Page Header ,actual data and row offset array. A schematic diagram of data pages looks like as below.

Before going into details let us see how this looks internally in SQL server. Let us create a table and insert some records into it.

CREATE DATABASE MyDb

GO

USE MyDb
GO

CREATE TABLE Customer (
   FirstName CHAR(200),
   LastName  CHAR(300),
   Email     CHAR(200),
   DOB       DATE,
)

GO
INSERT INTO Customer VALUES('William','James','William.J@yahoo.com','1982-01-20')
INSERT INTO Customer VALUES('Jade','Victor','Jade.V@yahoo.com','1985-08-12')
INSERT INTO Customer VALUES('Jonas','hector','Jonas.h@yahoo.com','1980-10-02')
INSERT INTO Customer VALUES('William1','James','William.J@yahoo.com','1982-01-20')

INSERT INTO Customer VALUES('Jade1','Victor','Jade.V@yahoo.com','1985-08-12')

INSERT INTO Customer VALUES('Jonas1','hector','Jonas.h@yahoo.com','1980-10-02')

INSERT INTO Customer VALUES('William2','James','William.J@yahoo.com','1982-01-20')

INSERT INTO Customer VALUES('Jade2','Victor','Jade.V@yahoo.com','1985-08-12')

INSERT INTO Customer VALUES('Jonas2','hector','Jonas.h@yahoo.com','1980-10-02')

INSERT INTO Customer VALUES('William3','James','William.J@yahoo.com','1982-01-20')

GO

Now we need to find out the pages allocated to this table. For that we have to use an undocumented command DBCC IND.

The syntax of DBCC IND is given below:

DBCC IND ( { 'dbname' | dbid }, { 'objname' | objid }, { nonclustered indid | 1 | 0 | -1 | -2 });
nonclustered indid = non-clustered Index ID
1 = Clustered Index ID
0 = Displays information in-row data pages and in-row IAM pages (from Heap)
-1 = Displays information for all pages of all indexes including LOB (Large object binary) pages and row-overflow pages
-2 = Displays information for all IAM pages

Run the below command from SSMS

DBCC IND('mydb','customer',-1)
The output will looks like as in below picture:

You can see two records, one with page type 10 and other one with 1. Page type 10 is an IAM page and we will talk about different page types in a different post.Page type 1 is data page and its page id is 114.

Now to see the row data stored in that page , we have to use the DBCC PAGE command. The syntax of DBCC PAGE :
dbcc page ( {'dbname' | dbid}, filenum, pagenum [, printopt={0|1|2|3} ]);Printopt:
0 - print just the page header
1 - page header plus per-row hex dumps and a dump of the page slot array
2 - page header plus whole page hex dump
3 - page header plus detailed per-row interpretation

By default the output of dbcc page is sent to error log. To get the output in the current connection , we have to enable the trace flag 3604.You can also use with tableresults along with dbcc page to get the output in table format. Run the below command to get the row data stored in the data page.

DBCC TRACEON(3604)
GO
DBCC page('mydb',1,114,3)
This will have four section in output.The first section is BUFFER which talk about in memory allocation and we are not interested in that section. The next section is page header which is fixed 96 bytes in size.The size of page header will be same for all pages. Page header section will looks like as below picture.

To know more about these field http://www.sqlskills.com/BLOGS/PAUL/post/Inside-the-Storage-Engine-Anatomy-of-a-page.aspx

The next section is slots where the actual data is stored. I have removed some hex dumps to make it more clear . Each records are stored in a slot. Slot 0 will have the first records in the page and slot 1 will have second records and so on ,but it is not mandatory that these slots should be in the physical order of the data.You can see from the below image that the size of the record is 710 bytes. Out of this 703 bytes are fixed length data and 7 bytes are row overhead.We will discuss about the record structure and row overhead in different post.

The last section of a page is row offset table and we should run dbcc page with option 1 to get the row offset table at the end.

DBCC page('mydb',1,114,1)

The row offset table will looks like below picture and this should read from the bottom to top.Each slot entry is just a two-bytes pointer into the page slot offset.In our example we have ten records and in the offset table we have ten entries. The first record pointing to the 96th bytes,just after the page header. It is not mandatory to have the first record at 96th bytes.This offset table will helps to manage the records in a page.Each records need 2 bytes of storage in the page for offset array.Consider a non-clustered index over a heap. Each non-clustered index row contains a physical pointer back to the heap row it maps too. This physical pointer is in form of [file:page:slot] - so the matching heap row can be found be reading the page, going to the slot number in the slot array to find the record's offset in the page and then reading the record at that offset.If we need to save a record in between, it is not mandatory to restructure the entire page. it can be easily possible by restructuring only the offset table.

In our case if you look into the page header, free space is 976 bytes, which is equal to
(8*1024)- 96-(10 * 703)-(10*7)-(10*2)
where 8*1024 = Total number of bytes in the page
96 = Size of Page Header
10*703 = Number of records * size of four columns in the table
10*7 = Number of records * row overhead
10*2 = Number of records * size in bytes to store the row offset table

Now we have seen the structure of the page. Let us summarize this . A page is 8KB size. That means 8192 bytes. Out of these, 96 bytes are used for page header which is in fixed size for all data pages. Below that, data records are stored in slots.The maximum length of data records is 8060 bytes. This 8060 include the 7 bytes row overhead also . So in a record you can have maximum of 8053 bytes. The below create table statement will fail.

CREATE TABLE Maxsize(
 id         CHAR(8000) NOT NULL,

id1        CHAR(54) NOT NULL
 )

Msg 1701, Level 16, State 1, Line 1

Creating or altering table 'Maxsize' failed because the minimum row size would be 8061, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

The remaining 36 bytes are reserved for slot array entry and any possible forwarding row back pointer(10 bytes). This does not meant that page can hold only 18 (36/2) records. Slot array can grow from bottom to top based on the size of the records.If the size of records is small, more records can be accommodate in a page and offset table will take more space from bottom to top.

Reference:I have learned about the page structure from Paul Randal excellent post on this subject.

If you liked this post, do like my page on FaceBook