Sunday 8 July 2012

Index Fragmentation


This article is related to Index Fragmentation.

Basic of Index Fragmentation

Index fragmentation is a phenomenon where index contents are no longer stored continuously in the storage. When index contents become scattered in the storage, fragmented, performance on index will degrade.

There are 2 types of Index fragmentation

1.    Extent or External fragmentation.
2.    Page or Internal fragmentation.

The Index fragmentation can occurs in both Clustered and Non-Clustered Index. To understand the External and Page fragmentation I am taking an example.

Un-Fragmented Index



In the above figure all the pages are full and the physical order of the pages is sequential.
As a result of data modifications, in the cases of extent fragmentation the pages get out of the physical order.  This type of fragmentation produces random IO, which does not perform as well as sequential IO.

The bellow figure demonstrates the extent fragmentation.

Extent Fragmentation



In the internal fragmentation there are gaps in the data pages, which reduce the amount of data that can be stored on each page. So it increases the overall amount of space needed to store the data. When the pages are not full and additional pages are required to store the data, as more IO will be required to retrieve those additional pages. The bellow figure demonstrates the process.

Page Fragmentation



During a page split, a new page is allocated, and half the records are moved from the old page to the newly-allocated page. Each page has a pointer to the previous and next page in the index, so those pages will also need to be updated. The bellow figure demonstrates the process.

Page splitting due to fragmentation



Analyzing Fragmentation

To analyse SQL Server indexes, you use the system function sys.dm_db_index_physical_statsto determine which indexes are fragmented and the extent of that fragmentation. We used this function to analysing all the index of the database or all the index of the table or a specified index.
The function sys.dm_db_index_physical_statsto takes the following parameters described below.

Database ID

A smallint value that represents the ID number of a database. If null is specified, the function retrieves index-related data from all databases on a SQL Server instance.

Object ID

An int value that represents the ID number of a table or view. If null is specified, the function retrieves index-related data for all tables and views in a specific database or SQL Server instance. If you specify null, you must also specify null for the index ID and partition number.

Index ID

An int value that represents the ID number of an index. If null is specified, the function retrieves index-related data for all indexes defined on the specified table or view. If you specify null, you must also specify null for the partition number. Also, if the object ID refers to a heap, use 0 as the index ID.

Partition number 

An int value that represents the partition number of an index or heap. If null is specified, the function retrieves index-related information for all partitions related to a specific object.

Mode 

The scan level used to obtain index-related information. Valid inputs include NULL, DEFAULT, or one of the following three modes:

LIMITED

Scans the smallest number of pages, which means this is the fastest mode. The LIMITED mode is equivalent to NULL and DEFAULT.

SAMPLED

Scans 1% of all pages. If an index contains fewer than 10,000 pages, then DETAILED mode is used.

DETAILED

Scans all index pages, which means this is the slowest mode, but most accurate.
The sys.dm_db_index_physical_stats function returns a number of values that provide details about the indexes we specify. SQL Server Books Online provides details about each of these values.



Along with other information, there are two important columns that for detecting fragmentation, which are as follows:

avg_fragmentation_in_percent

This is a percentage value that represents external fragmentation. For a clustered table and leaf level of index pages, this is Logical fragmentation, while for heap, this is Extent fragmentation. The lower this value, the better it is. If this value is higher than 10%, some corrective action should be taken.

avg_page_space_used_in_percent

This is an average percentage use of pages that represents to internal fragmentation. Higher the value, the better it is. If this value is lower than 75%, some corrective action should be taken.

Here is the quick script to find fragmentation of the database for all the indexes.
SELECT   ps.database_id,
         ps.OBJECT_ID,
         ps.index_id,
         b.name,
         ps.avg_fragmentation_in_percent
FROM     sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) ASps
         INNER JOIN sys.indexes AS b ON ps.OBJECT_ID = b.OBJECT_ID
                                        AND ps.index_id = b.index_id
WHERE ps.database_id = DB_ID()
ORDER BY ps.OBJECT_ID

Hope you like it.


Posted by: MR. JOYDEEP DAS

No comments:

Post a Comment