Wednesday, 6 February 2013

Performance Counters

Counters:
Current Queue Length: Sustained high queue for a period of time means that the IO subsystem has a problem. Average Queue Length: If the Average Queue Length for the disk is high more than 2. Then this indicates a potential IO subsystem problem. This would also make the disk sec/read and disk sec/write high. Disk Sec/Read: This counter should measure not more than 15ms in normal situation. Continuous values of more than 20ms indicate that the disks have a problem.

Under these conditions, consider the following

    Move the files to additional drives.
    Check for proper indexing on the tables.
    Check for the CPU and memory counters to find any other potential bottlenecks.

Disk Sec/Writes: This counter should measure not more than 15ms in normal situation. Continuous values of more than 20ms indicate that the disks have a problem.

Under these conditions, consider the following

    Move the files to additional drives.
    Check for proper indexing on the tables.
    Check for the CPU and memory counters to find any other potential bottlenecks.
    Placing transactional log files in separate drives to remove additional pressure on the drives

Memory/Cache:

Object Name: Memory:

Counters:
Page Faults/Sec: Page faults occur when the page that is not requested is not found in the memory. There are 2 types of page faults. Hard page faults and Soft page faults. Hard page faults require a disk access and where as Soft page faults are ones where the page is found elsewhere in the memory. High number of page faults/sec indicates that there is a problem with memory used by SQL Server. Use this counter in correlation with SQL Server:Memory Manager counters to check if there is a memory pressure. Pages/Sec: This counter is the actual value of the Hard faults and should be used in correlation with Page Faults/Sec and SQL Server memory Manager counters to find the memory pressure.

CPU:

Object Name: Processor

Counters:
%User Time: SQL Server runs in User mode. Privilege mode is used by Operating system to access hardware and memory components. This counter should have a value more than 70%. If it has a value less than that check the %privileged time and %processor time counters to see if there is a problem with the processor. %Privileged Time: Operating system moves thread to Privileged mode to access services. This counter should value less than 20%. %Processor Time: This counter tells the percentage of the time that CPU is executing over a period. Ideally it should value less than 70%. Interrupts/Sec: Interrupts/sec is the average rate, in incidents per second, at which the processor received and serviced hardware interrupts.

System: Object Name: System

Counters:
Processor Queue Length: Is the number threads that are waiting to be processed by the processor. This counter is calculated by the actual value divided by the number of processors. Ideally value should be less than 2.

SQL Server:

Object Name: Access Methods

Counters:
Forwarded Records/Sec: Number of records fetched through forwarded record pointers. Tables with NO clustered index can. If you start out with a short row, and update the row creating a wider row, the row may no longer fit on the data page. A pointer will be put in its place and the row will be forwarded to another page.

Can be avoided by:

    Using default values
    Using Char instead of Varchar.

Full Scans/Sec: Either table or Index in fully scanned. This could be caused because of wrong or unhelpful indexes. Page Splits/Sec: Number of page splits occurring as the result of index pages overflowing. Normally, this is associated with leaf pages of clustered indexes and non-clustered indexes. This can be avoided by having proper fill factor.

Object Name: Memory Manager

Counters:
Memory Grants Pending: Memory is needed to process each user request. If enough memory is not available then the user request waits for memory which cause performance hitch in executing the query.

This can be avoided by the following

    Adding more memory to the server
    Adding memory to SQL Server
    Creating proper Indexes

Object Name: Buffer Manager
Counters:
Buffer Cache Hit Ratio: Percentage of time the pages requested are already in memory. It should be ideally more than 99% and if less than 95% indicates that SQL server has no enough memory and adding more memory would be beneficial. Checkpoints/Sec: Pages written to the disk during the Checkpoint process freeing up SQL cache. Memory pressure is indicated if this counter is more than 300 Secs. Lazy Writes/Sec: Pages written to the disk during the LazyWriter process freeing up SQL cache. Memory pressure is indicated if this counter is more than 300 Secs. Page Life Expectancy: This is one counter that I would love to monitor to check the memory pressure. This would give you the time in seconds the data pages reside in the SQL Server cache. If the value is low indicates the following problems.

    SQL Cache is cold(Discussed in Page Faults)
    Memory Problems
    Missing indexes.

If Checkpoints/Sec, Lazywriter/Sec and Page life expectancy together is less than 300 seconds then it indicates the cause of memory pressure and your server needs more memory.

Object Name: Databases

Counters:
Transactions/Sec: Indicates the number of transaction that occur per second in the server.

Object Name: General Statistics
Counters: User Connections: Number of connections made to the SLQ Server.

Object Name: Latches

Counters:
Average Latch wait Time: Latches are short term light weight synchronization object. Latches are not held for the duration of a transaction. Typical latching operations during row transfers to memory, controlling modifications to row offset table, etc. A high value here indicates potential problems with IO subsystem or Memory.

Object Name: Locks

Counters
Average Wait Time(ms), Lock Wait time(ms) and Lock waits/Sec: All these counters are related to the locks held by the SQL Server. Transaction should be as short as possible and should hold locks only for a short time without blocking other user queries, A high value here would indicate the following.

    Memory Pressure
    IO Subsystem problem
    Improper indexes
    Improper SQL structures
    Improper placement of files in the disk subsystem
    Incorrect usage of SQL Isolation levels

These are the major counters that I would wish to check. Apart from this you can also check the following counters.
Object Name     Counter Name
SQL Statistics     Compilations/sec
SQL Statistics     Recompilations/sec
SQL Statistics     Batch Requests/sec
SQL Server: Buffer Manager     Readahead pages/sec


No comments:

Post a Comment