Microsoft SQL Server¶

Microsoft SQL Server is a popular database for web applications mostly powered by Microsoft IIS servers. SQL Server is monitored thanks to our Metrics-Agent. The initial setup is exactly the same as for Windows monitoring.

Supported Versions¶

All Microsoft SQL Server versions are supported.

Setup¶

You need to setup a Metrics Agent on the Windows machine where SQL Server is installed. Please follow Windows Pre-requisites and Setup instructions.

Configuration¶

SQL Server Configuration Wizard

The SQL Server monitor configuration is exactly the same as Windows. Provide the information to connect to the metrics agent which exposes the metrics through JMX on port 1099 by default.

Applications¶

The SQL Server monitor supports monitor specific devices like caches or databases. To fine tune monitored counters selection.

Caches¶

Cache refers to SQLServer: Plan Cache Perfmon objects:

The Plan Cache object provides counters to monitor how SQL Server uses memory to store objects such as stored procedures, ad hoc and prepared Transact-SQL statements, and triggers. Multiple instances of the Plan Cache object can be monitored at the same time, with each instance representing a different type of plan to monitor.

For more information about available caches, refer to the documentation on Plan Cache.

Databases¶

The SQLServer:Database Mirroring performance object contains performance counters that report information about SQL Server database mirroring. The table below lists the counters that this object contains.

For more information about available databases, refer to the documentation on Database.

Locks¶

The SQLServer:Locks object in Microsoft SQL Server provides information about SQL Server locks on individual resource types.

For more information about available locks, refer to the documentation on Locks.

Monitored Counters¶

The following SQL Server Perfmon counters are available:

Access Methods:

Forwarded Records/sec: Identifies use of a pointer which has been created when variable length columns have caused a row to move to a new page in a heap,
Full Scans/sec: Monitors the number of full scans on tables or indexes. Ignore unless high CPU coincides with high scan rates. High scan rates may be caused by missing indexes, very small tables, or requests for too many records. A sudden increase in this value may indicate a statistics threshold has been reached, resulting in an index no longer being used,
Index Searches/sec: Monitors the number of index searches when doing range scans, single index record fetches, and repositioning within an index. The threshold recommendation is strictly for OLTP workloads,
Page Splits/sec: Monitors the number of page splits per second which occur due to overflowing index pages and should be as low as possible. To avoid page splits, review table and index design to reduce non-sequential inserts or implement fillfactor and pad_index to leave more empty space per page,
Workfiles Created/sec: Number of work files created per second, usually as a part of tempdb processing when working with hashing joins and other hashing operations. High values can indicate thrash in tempdb and poorly coded queries,
Worktables Created/sec: Number of work tables created per second, usually as a part of tempdb processing when working with spools such as table spools, index spools, etc,
% Worktables From Cache Ratio: Percentage of work tables created whose initial two pages were immediately available from the worktable cache. A value less than 90% may indicate insufficient memory, since execution plans are being dropped,
Table Lock Escalations/sec: Number of times that SQL Server escalated locks from page- or row-level to table-level. This number should, generally be low. Frequent or even occasional spiking in this value may indicate poorly coded transactions.

Buffer Manager:

Free List Stalls/sec: Indicates the number of requests per second that had to wait for a free page,
Lazy Writes/Sec: Monitors the number of times per second that the Lazy Writer process moves dirty pages from the buffer to disk as it frees up buffer space. Lower is better with zero being ideal. When greater than 20, this counter indicates a need for more memory,
Checkpoint Pages/Sec: Monitors the number of dirty pages, per second, that are flushed to disk when SQL Server invokes the checkpoint process. Checkpoint frequency is influenced by the recovery interval setting in sp_configure. High values for this counter may indicate insufficient memory or that the recovery interval is too high,
Page Life Expectancy: Tells, on average, how many seconds SQL Server expects a data page to stay in cache. The target on an OLTP system should be at least 300 (5 min). When under 300, this may indicate poor index design (leading to increased disk I/O and less effective use of memory) or, simply, a potential shortage of memory,
Page Lookups/sec: The number of requests to find a page in the buffer pool. When the ratio of batch requests to page lookups crests 100, you may have inefficient execution plans or too many adhoc queries,
Page Reads/sec: Number of physical database page reads issued per second. Normal OLTP workloads support 80 - 90 per second, but higher values may be a yellow flag for poor indexing or insufficient memory,
Page Writes/sec: Number of database pages physically written to disk per second. Normal OLTP workloads support 80 - 90 per second. Values over 90 should be crossed checked with 'lazy writer/sec' and "checkpoint" counters. If the other counters are also high, then it may indicate insufficient memory,
Readahead/sec: Number of data pages read per second in anticipation of their use. If this value is makes up even a sizeable minority of total Page Reads/sec say, greater than 20% of total page reads), you may have too many physical reads occurring,
Database Pages: Number of database pages in the buffer pool, as opposed to other usages for memory such as free pages, procedure cache, etc,
Target Pages: The ideal number of pages in the buffer pool according the maximum memory granted to SQL Server in sp_configure,
Buffer Cache Hit Ratio: Long a stalwart counter used by SQL Server DBAs, this counter is no longer very useful. It monitors the percentage of data requests answer from the buffer cache since the last reboot.

Databases:

Per {Database}:
- Data File(s) Size (KB): Cumulative size (KB) of all the data files in the database including any automatic growth. Monitoring this counter is useful, for example, for determining the correct size of tempdb,
- Log Bytes Flushed/sec: Total number of log bytes flushed per second. Useful for determining trends and utilization of the transaction log,
- Log File(s) Size (KB): Cumulative size, in (KB), of all the transaction log files for the specific database. Useful for determining trends and utilization of the transaction log,
- Log File(s) Used Size (KB): The cumulative used size of all the log files in the database,
- Log Flush Wait Time: Total wait time, in milliseconds, to write all transaction log pages,
- Log Flush Waits/sec: Effectively, the number of times per second that SQL Server must wait for pages to be written to the transaction log,
- Log Flushes/sec: Technically, the number of log pages flushed to the transaction log per second,
- Log Growths: Total number of times the transaction log for the database has been expanded. Each time the transaction log grows, all user activity must halt until the log growth completes. Therefore, you want log growths to occur during predefined maintenance windows rather than during general working hours,
- Log Shrinks: Total number of times the transaction log for the database has been shrunk,
- Log Truncations: Total number of times the transaction log has been truncated for the database specified. Truncations should happen during log backups or, on databases in simple recovery mode, at checkpoint or the time period specified by recovery interval,
- % Log Used: Percent Log Used.

General Statistics:

Logins/sec: Total number of logins started per second. This does not include pooled connections,
Logouts/sec: Total number of logout operations started per second,
Connection resets/sec: Total number of logins started from the connection pool,
Processes blocked: Number of currently blocked processes,
User Connections: The number of users currently connected to the SQL Server. This counter should roughly track with 'Batch Requests/Sec'.

Latches:

Average Latch Wait Time (ms): The average latch wait time, in milliseconds, for any latch requests that had to wait. This value should generally correlate to "Latch Waits/sec" and move up or down with it accordingly,
Latch Waits/sec: The number of latches in the last second that had to wait. Latches are lightweight means of holding a very transient server resource, such as an address in memory,
Number of SuperLatches: Current number of SuperLatches,
SuperLatch Demotions/sec: Number of latches that have been demoted from SuperLatches in the last second,
SuperLatch Promotions/sec: Number of latches that have been promoted to SuperLatches in the last second,
Total Latch Wait Time (ms): The total latch wait time in milliseconds spent waiting for a latch in the last second. This value should stay stable compared to the number of latch waits per second.

Locks:

Per {Lock}:
- Average Wait Time (ms): Average amount of wait time (in milliseconds) for each lock request that resulted in a wait,
- Lock Requests/sec: Number of new locks and lock conversions per second requested from the lock manager,
- Lock Timeouts/sec: Number of lock requests per second that timed out, including requests for NOWAIT locks,
- Lock Wait Time (ms): Total wait time (in milliseconds) for locks in the last second,
- Lock Waits/sec: Number of new locks and lock conversions per second requested from the lock manager,
- Number of Deadlocks/sec: Number of lock requests per second that resulted in a deadlock.

Memory Manager:

Granted Workspace Memory (KB): Specifies the total amount of memorycurrently granted to executing processes, such as hash,sort, bulk copy, and index creation operations,
Maximum Workspace Memory (KB): Maximum amount of memory available for executing processes such as hash,sort, bulk copy, and index creation operations,
Memory Grants Outstanding: Total number of processes per second that have successfully acquired aworkspace memory grant,
Memory Grants Pending: Total number of processes per second waiting for a workspace memory grant.Numbers higher than 0 indicate a lack of memory,
Total Server Memory (KB): Shows the amount of memory that SQL Serveris currently using. This value should grow untilits equal to Target Server Memory, as it populatesits caches and loads pages into memory.When it has finished, SQL Server is said to be ina 'steady-state'. Until it is in steady-state, performancemay be slow and IO may be higher,
Target Server Memory (KB):Total amount of dynamic memory the server can consume.

Cache:

Per {Plan Cache}:
- Cache Hit Ratio: Ratio between cache hits and lookups,
- Cache Hit Ratio Base: For internal use only,
- Cache Object Counts: Number of objects in the cache,
- Cache Pages: Number of 8-kilobyte (KB) pages used by cache objects,
- Cache Objects in use: Number of cache objects in use.

SQL Statistics:

Auto-Param Attempts/sec: Number of auto-parameterization attempts per second. Total should be the sum of the failed, safe, and unsafe auto-parameterizations. Auto-parameterization occurs when an instance of SQL Server attempts to reuse a cached plan for a previously executed query that is similar to, but not the same as, the current query. For more information, see Autoparameterization in the SQL Server Books On-Line (BOL),
Failed Auto-Params/sec: Number of failed auto-parameterization attempts per second. This should be small,
Safe Auto-Params/sec: Number of safe auto-parameterization attempts per second,
Unsafe Auto-Params/sec: A query is designated as unsafe when it has characteristics that prevent its cached plan from being shared,
Batch Requests/Sec: Number of Transact-SQL command batches received per second. This statistic is affected by all constraints (such as I/O, number of users, cache size, complexity of requests, and so on). High batch requests mean good throughput,
SQL Compilations/sec: Number of times that Transact-SQL compilations occurred, per second (including recompiles). The lower this value is the better,
SQL Re-Compilations/sec: Number of statement recompiles per second. Counts the number of times statement recompiles are triggered. Generally, you want the recompiles to be low,
SQL Attention Rate/sec: Number of cancels and query timeouts occurring per second. This number should be as low as possible. A high sustained number indicates frequent query timeout or end-user cancellation of queries.

Transactions:

Longest Transaction Running Time: The time, in seconds, of the longest running transaction. When blocking is high, check this counter to see if transactions are open for long periods of time.