All about SQL Server statistics in simple words:

What are SQL Server statistics?

The query optimizer uses SQL Server statistics to build efficient execution plans. A statistics object stores the density of the column values and a histogram that describes how those values are distributed: the number of distinct values and the number of rows that fall into each range. From this information the optimizer estimates how many rows a query will return and chooses the cheapest plan it can find to retrieve the data.

When are these statistics created?

SQL Server creates a statistics object whenever we create an index on a table. It also creates one automatically when a SELECT query uses a non-indexed column in a WHERE condition (which can be a hint that an index is missing there). We can also create statistics manually.
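As a minimal sketch of the manual route, the statement below creates a statistics object by hand and then lists how each statistics object on the table came to exist. The column JobTitle and the name st_employee_JobTitle are assumptions made for illustration; only dbo.employee comes from this article.

```sql
-- Hypothetical names: JobTitle and st_employee_JobTitle are illustrative assumptions.
CREATE STATISTICS st_employee_JobTitle
ON dbo.employee (JobTitle)
WITH FULLSCAN;   -- read every row instead of a sample

-- Show every statistics object on the table and whether it was
-- created automatically by SQL Server or manually by a user.
SELECT s.name, s.auto_created, s.user_created
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'dbo.employee');
```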

Let’s see this with an example:

I created a new database called statistics and copied a table into it as dbo.employee (taken from AdventureWorks2012). The copied table does not have any indexes, as shown below.

Pic (1)


When I run a simple SELECT on this table, as in pic (2), the execution plan shows a table scan because the table has no index. SQL Server still creates statistics, pic (3), so that it can use them every time the same query runs.

Pic (2)


Pic (3)

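The same experiment can be sketched in T-SQL roughly as follows. The column JobTitle and the filter value are assumptions for illustration; any non-indexed column used in a WHERE condition should behave the same way while automatic statistics creation is enabled.

```sql
-- A filter on a non-indexed column; JobTitle and the literal are assumptions.
SELECT *
FROM dbo.employee
WHERE JobTitle = N'Design Engineer';

-- Statistics that SQL Server created on its own
-- (their names usually start with _WA_Sys_), with the column each one covers.
SELECT s.name AS stats_name, c.name AS column_name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
  ON sc.object_id = s.object_id AND sc.stats_id = s.stats_id
JOIN sys.columns AS c
  ON c.object_id = sc.object_id AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID(N'dbo.employee')
  AND s.auto_created = 1;
```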

Let’s create an index, pic (4), to benefit from it: as we all know, a seek is better than a scan when we select specific rows instead of the whole set of rows.

Pic (4)


Now that the index exists, SQL Server creates a statistics object specifically for it. These statistics tell the optimizer how selective the index is, so it can decide when using the index is the fastest way to get the data.
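A rough sketch of this step, assuming a JobTitle column and a hypothetical index name IX_employee_JobTitle (neither appears in the original screenshots). For index statistics the stats_id matches the index_id, which is what the join relies on.

```sql
-- Hypothetical index; the column and index name are assumptions.
CREATE NONCLUSTERED INDEX IX_employee_JobTitle
ON dbo.employee (JobTitle);

-- The index gets a statistics object with the same name.
SELECT i.name AS index_name,
       s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.indexes AS i
JOIN sys.stats AS s
  ON s.object_id = i.object_id AND s.stats_id = i.index_id
WHERE i.object_id = OBJECT_ID(N'dbo.employee')
  AND i.name = N'IX_employee_JobTitle';
```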

How do these statistics help the optimizer?

Let’s take the same simple SELECT and look at its estimated execution plan. Interestingly, the plan is now built from the statistics object that was created along with the index. It displays the estimated number of rows that will be returned, along with other estimates, and the optimizer read all of this from the statistics.

I’m not going into detail about how the query is processed internally once it reaches the SQL Server engine. In general, the optimizer, which is part of the relational engine, uses these statistics to create the estimated plan and hands it over to the storage engine to get the data; this is where the actual execution plan comes into the picture.

As long as the estimates in the plan match what actually happens at run time, there is usually no performance issue, because this means the optimizer had up-to-date statistics. When statistics are out of date, the actual row counts can differ widely from the estimates, which leads to less accurate plans from the optimizer and drags performance down.

Pic (5)


When do these statistics get out of date?

Usually, statistics become out of date or inaccurate as the data in a table changes over time. By default, the statistics for a table are updated automatically when:

  1. An empty table gets a row
  2. A table had fewer than 500 rows and has changed by 500 rows
  3. A table had more than 500 rows and has changed by 500 rows plus 20% of the total rows
  4. Trace flag 2371 is enabled, which changes the fixed 20% threshold into a dynamic percentage. The more rows a table has, the lower the percentage that triggers an update of the statistics. For example, with the trace flag active, an update is triggered on a table with 1 billion rows after about 1 million changes.

More information on this trace flag can be found here: TraceFlag2371 (I haven’t tested it myself, though).
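To make the thresholds above concrete, here is a small arithmetic sketch. The fixed rule is 500 plus 20% of the rows. For the dynamic rule it assumes the commonly cited approximation SQRT(1000 * rows), which Microsoft documents for databases at compatibility level 130 and higher; treat it as an approximation of what trace flag 2371 does on older versions. The row counts are arbitrary examples.

```sql
-- Compare the fixed and the (approximate) dynamic update thresholds.
SELECT v.table_rows,
       500 + 0.20 * v.table_rows   AS fixed_threshold,    -- 500 rows + 20%
       SQRT(1000.0 * v.table_rows) AS dynamic_threshold   -- assumed formula
FROM (VALUES (CAST(25000 AS bigint)),
             (CAST(1000000 AS bigint)),
             (CAST(1000000000 AS bigint))) AS v(table_rows);
```

For the 1 billion row case the dynamic value works out to 1 million changes, which agrees with the example in the list above, while the fixed rule would wait for more than 200 million changes.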

The best way to check whether statistics are out of date is to compare the estimated number of rows in an execution plan with the actual number of rows. If the two are almost the same, the statistics are accurate; if not, it is time to update them.
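One way to see both numbers side by side without the graphical plan is SET STATISTICS PROFILE, sketched below. The query itself is a placeholder; the JobTitle column and the filter value are assumptions.

```sql
-- Return the plan as an extra result set after the query results.
SET STATISTICS PROFILE ON;

SELECT *
FROM dbo.employee
WHERE JobTitle = N'Design Engineer';   -- hypothetical filter

SET STATISTICS PROFILE OFF;
-- In the extra result set, compare the Rows column (actual)
-- with the EstimateRows column (estimated) for each operator.
```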

How to automate updating these statistics?

The database options AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS let SQL Server create and update statistics automatically. But how well do they cope with highly transactional databases and multi-terabyte databases that take millions of rows in data loads every minute?

Both options are enabled by default.
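A quick sketch for checking and setting these options, using the example database from this article (the name is bracketed because STATISTICS is also a keyword):

```sql
-- Check the current settings for the example database.
SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
WHERE name = N'statistics';

-- Turn the options on explicitly (they are on by default).
ALTER DATABASE [statistics] SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE [statistics] SET AUTO_UPDATE_STATISTICS ON;
```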

How to update these statistics manually?

1.       The stored procedure below updates the statistics across a whole database:

      EXEC sp_updatestats;
      -- or
      EXEC sp_updatestats 'resample';

-    With 'resample', each statistic is updated using its most recent sample rate.

-    sp_updatestats updates only the statistics that require updating, based on the rowmodctr (row modification counter) information in sys.sysindexes.

2. The UPDATE STATISTICS command below updates all statistics on a specific table, or only a specific index or statistics object if one is named. It accepts several options; I will explain the few that matter most when running it against terabytes of data.

 UPDATE STATISTICS table_or_view_name [index_or_statistics_name] [WITH options]

FULLSCAN: Scans every row in the table to update the statistics.

SAMPLE number { PERCENT | ROWS }: If you do not want to scan the whole table (which is time consuming), this option lets you specify the number or percentage of rows to sample when updating or creating the statistics.

NORECOMPUTE: With this option, the statistics update completes and automatic updates (AUTO_UPDATE_STATISTICS) are then disabled for the statistics that were updated. We must be careful with this option, because those statistics will no longer be refreshed automatically.
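The options above might be used as in the sketch below. It assumes that an index (or statistics object) named IX_employee_JobTitle exists on dbo.employee; that name is hypothetical.

```sql
-- All statistics on the table, reading every row.
UPDATE STATISTICS dbo.employee WITH FULLSCAN;

-- One statistics object, sampling a quarter of the rows.
UPDATE STATISTICS dbo.employee IX_employee_JobTitle WITH SAMPLE 25 PERCENT;

-- Full scan, then switch off automatic updates for this statistics object.
UPDATE STATISTICS dbo.employee IX_employee_JobTitle WITH FULLSCAN, NORECOMPUTE;

-- See which statistics currently have automatic updates switched off.
SELECT name, no_recompute
FROM sys.stats
WHERE object_id = OBJECT_ID(N'dbo.employee');

-- Running the update again without NORECOMPUTE re-enables automatic updates.
UPDATE STATISTICS dbo.employee IX_employee_JobTitle WITH FULLSCAN;
```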

What happens to these statistics when rebuilding or reorganizing indexes?

Rebuilding an index also updates the statistics for that index, while reorganizing an index leaves the statistics unchanged.
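A small sketch to observe the difference, again assuming a hypothetical index IX_employee_JobTitle on dbo.employee. Run the date check after each ALTER INDEX statement and compare the timestamps.

```sql
-- Reorganize: the statistics date should not move.
ALTER INDEX IX_employee_JobTitle ON dbo.employee REORGANIZE;

SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID(N'dbo.employee');

-- Rebuild: the statistics that belong to this index are refreshed.
ALTER INDEX IX_employee_JobTitle ON dbo.employee REBUILD;

SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID(N'dbo.employee');
```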

How to check when your stats were last updated?

1. One way is to query the sys.stats catalog view, something like this:

select object_name(object_id) TableName, name, stats_date(object_id, stats_id) Last_updated
from sys.stats
where objectproperty(object_id, 'IsUserTable') = 1

2. The other way is:

 DBCC SHOW_STATISTICS (TableName, statsName)
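A third option, not covered above, is the sys.dm_db_stats_properties function, which also reports how many modifications have accumulated since the last update. This is a sketch that assumes a build where the function is available (it was added in service packs for SQL Server 2008 R2 and 2012).

```sql
-- Last update time, sampled rows and pending modifications per statistics object.
SELECT OBJECT_NAME(s.object_id) AS TableName,
       s.name                   AS StatsName,
       p.last_updated,
       p.rows,
       p.rows_sampled,
       p.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS p
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY p.modification_counter DESC;
```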

 

 


What is Parameter Sniffing in simple words?

I was recently asked in an interview what parameter sniffing is. Well, we all know how the optimizer in SQL Server works to retrieve data from tables. We also know that, for parameterized queries and stored procedures, the optimizer reuses a previously compiled plan from the cache instead of generating a new plan each time. When that plan is first compiled, the optimizer reads the parameter values we passed to the stored procedure or parameterized query and builds the plan around them. Reading the parameter values this way is called “sniffing”.

Is it good or bad for performance?

 “It depends.” Yes, “it depends”; I know we use this phrase many times with regard to databases.

Good:

Parameter sniffing is enabled by default: SQL Server reads the parameters passed to stored procedures or parameterized SQL queries and caches the resulting plans. We benefit because, when we execute the same query again, the optimizer finds the plan it compiled earlier in the cache and does not need to re-create it each time.

Bad:

This is all good only when good statistics are maintained and typical parameter values are used. What changes from day to day is the parameter values themselves: the optimizer tries to reuse the same old plan in the cache, which may not suit the new values, because the data has grown a lot, the statistics have not been updated yet, and the cached plan may end up doing table scans for the new parameter.

Work Around:

  1. Until we see a performance issue caused by parameter sniffing, it is better to leave it at the default, since we really do benefit from it.
  2. When we see a lot of performance issues, we can disable it using trace flag 4136, which disables parameter sniffing at the server level.
  3. To disable it only for a specific query, we can use the RECOMPILE query hint.

SQL Server 2012- Sample Databases

I was not able to find sample databases for SQL Server 2012, as most of the sites I visited only had the SQL Server 2000 sample databases, which cannot be restored to 2012. Here is a link to the 2012 sample .bak files I uploaded (easy to restore to 2012).

Pubs, AdventureWorks, Northwind - Download

How to bring a database online which is in restoring mode?

For example, in log shipping the secondary (standby) database stays in restoring mode while logs from the primary are applied to it every 15 minutes.

To perform a DR test, we need to take production down and bring the DR database online. This can be achieved by running the query below against the database on the secondary server.

-- Run the query below to bring a database that is in the Restoring state online

RESTORE DATABASE [DATABASE NAME] WITH RECOVERY
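To confirm the state before and after the restore, a check along these lines may help; DATABASE NAME is the same placeholder used above.

```sql
-- Shows RESTORING before the command above and ONLINE after it succeeds.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'DATABASE NAME';
```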

List Database Users, Roles and permissions through roles

The simple script below lists all database users and their mapped roles.

--Create temp table
create table #Users (
    DBName sysname,
    UserName varchar(100),
    LoginType sysname,
    mappedroles varchar(max),
    create_date datetime,
    modify_date datetime)

--Insert users and their roles into the temp table
--(sp_MSforeachdb is undocumented; it runs the query once per database, replacing ? with the database name)
INSERT INTO #Users
EXEC sp_MSforeachdb
' use [?] SELECT ''?'' AS DB_Name,
case prin.name when ''dbo'' then prin.name + '' ('' + (select SUSER_SNAME(owner_sid) from master.sys.databases where name = ''?'') + '')''
else prin.name end AS UserName,
prin.type_desc AS LoginType,
isnull(USER_NAME(mem.role_principal_id), '''') AS mappedroles,
create_date,
modify_date
FROM sys.database_principals prin
LEFT OUTER JOIN sys.database_role_members mem ON prin.principal_id = mem.member_principal_id
WHERE prin.sid IS NOT NULL and prin.sid NOT IN (0x00) and prin.is_fixed_role <> 1 AND prin.name NOT LIKE ''##%'''

--Retrieve data from temp table, one row per user with a comma-separated role list
SELECT dbname, username, logintype, create_date, modify_date,
    STUFF((SELECT ',' + CONVERT(VARCHAR(500), mappedroles)
           FROM #Users user2
           WHERE user1.DBName = user2.DBName AND user1.UserName = user2.UserName
           FOR XML PATH('')), 1, 1, '') AS User_Permissions
FROM #Users user1
where dbname not in ('master', 'msdb', 'tempdb', 'model')
GROUP BY dbname, username, logintype, create_date, modify_date
ORDER BY DBName, username

--Drop temp table
Drop table #Users

Configure SQL Server DBMAIL using T-SQL

Configuring DBMail involves 3 main steps:

Step 1: Creating a mail profile

Step 2: Creating a mail account

Step 3: Mapping the account to the profile

The script below performs these 3 steps and configures DBMail.

Source


-- Enable SQL DBMail, if disabled ('Database Mail XPs' is an advanced option)

EXEC sys.sp_configure N'show advanced options', 1
GO
RECONFIGURE
GO
EXEC sys.sp_configure N'Database Mail XPs', 1
GO
RECONFIGURE
GO

-- Add mail profile

EXEC msdb.dbo.sysmail_add_profile_sp @profile_name = N'Profile Name'
GO

-- Set as default profile (a principal is required; public makes it the default for all users)

EXEC msdb.dbo.sysmail_add_principalprofile_sp
   @profile_name   = N'Profile Name',
   @principal_name = N'public',
   @is_default     = 1
GO

-- Add mail account

EXEC msdb.dbo.sysmail_add_account_sp
   @account_name    = N'Account Name',
   @email_address   = N'DBA@yourcompany.com',   -- your email address
   @display_name    = N'Account Name',
   @replyto_address = N'DBA@yourcompany.com',   -- your reply-to address
   @mailserver_name = N'your SMTP Server',
   @mailserver_type = N'SMTP',
   @port            = 25,
   @use_default_credentials = 0,
   @enable_ssl      = 0
GO

-- Map account to profile (the account name must match the one created above)

EXEC msdb.dbo.sysmail_add_profileaccount_sp @profile_name = N'Profile Name', @account_name = N'Account Name', @sequence_number = 1
GO
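Once the profile and account are mapped, a test message is a reasonable way to confirm the setup. The sketch below reuses the placeholder profile name and email address from the script above; replace them with your own values.

```sql
-- Send a test message through the new profile.
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = N'Profile Name',
    @recipients   = N'DBA@yourcompany.com',
    @subject      = N'DBMail test',
    @body         = N'Database Mail is configured.';

-- Check whether recent messages were sent or failed.
SELECT TOP (10) mailitem_id, sent_status, sent_date
FROM msdb.dbo.sysmail_allitems
ORDER BY mailitem_id DESC;
```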