Thursday 24 April 2014

Statistics


Introduction

To improve the performance of any query the query optimizer uses the statistics to create query plan.  In most of the case the query optimizer generate the necessary statistics for high quality query plan. In few cases we need to create the additional statistics or modify the query design.

What is Statistics
Statistics for query optimization are objects that contain statistical information about the distribution of values in one or more columns of a table or indexed view.

Query optimizer uses this statistical information to estimate the number of rows, in the query result set and by this way the optimizer to create a high-quality query plan.

Query optimizer uses this cardinality estimation to choose the index seek operator instead of the index scan operator, and in doing so improve query performance.

How we can Update Statistics

We can update statistics by using UPDATE STATISTICS or sp_updatestats
But Microsoft recommended that to keeping AUTO_UPDATE_STATISTICS set to ON so that the query optimizer continues to routinely update statistics.

To determine when the statistics where last update use the 
STATS_DATE ( object_id , stats_id )

Example

SELECT name AS index_name,
    STATS_DATE(object_id, index_id) AS statistics_update_date
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.tbl_CUSTOMER');
GO

index_name                                                   statistics_update_date
PK__tbl_CUST__3D9DDA317BE3C753     2014-04-04 17:31:24.810

When to Update Statistics

We have to update the statistics in the following condition

·         Query execution times are slow.
·         Insert operations occur on ascending or descending key columns.
·         After maintenance operations.

Create Statistics
In MS SQL Server the statistics can be created by using CREATE STATICSTIC command or by CREATE INDEX command.

The statistical information created using CREATE STATISTICS command is equivalent to the statistics built by a CREATE INDEX command on the same columns.

The only difference is that the CREATE STATISTICS command uses sampling by default while the CREATE INDEX command gathers the statistics with full scan since it has to process all rows for the index anyway.

Example

CREATE STATISTICS [IX_STATS_CUSTID]
ON [dbo].[tbl_CUSTOMER]([CUSTID])
WITH SAMPLE 50 PERCENT;

As we are here using 50% sampling of the row, for any big table random sampling may not produce the actual statistics. Therefore, for bigger tables, we may need to use the resample option on UPDATE STATISTICS. The resample option will maintain the full scan statistics for the indexes and sample statistics for the rest of the columns.

The statistical information is updated when 20% of the rows changed.

Update Statistics
Updating of statistics ensures that any query that runs get the up-to-date statistics to satisfy the query needs. This is introduced in MS SQL 2005 version and it is similar to the sp_updatestats command.

UPDATE STATISTICS [dbo].[tbl_CUSTOMER]
WITH FULLSCAN, ALL
GO




Hope you like it.




Posted by: MR. JOYDEEP DAS

No comments:

Post a Comment