Microsoft Windows¶
Microsoft Windows is the leading operating system for PCs. A number of web servers are also powered by Windows.
Windows is monitored thanks to our Metrics-Agent. This Agent is a small Java application which exposes Perfmon metrics via the JMX (Java Management Extensions) protocol.
The Windows monitoring module ships with pre-selected counters and thresholds.
Supported Versions¶
OctoPerf supports monitoring Windows 7, 8, 8.1 and 10, and Windows Server 2008 to 2016.
For non-English installations, it is recommended to use at least the Pro edition.
Pre-requisites¶
A Windows machine can be monitored through JMX using our Metrics-Agent. The following pre-requisites must be met:
- Install a Java 8 JDK (or the Runtime only) on the Windows machine,
- Download and start the latest JMX Agent on the Windows machine to monitor,
- Open the network port 1099 (or any custom port specified on agent launch) in the firewall,
- Perfmon must be installed and enabled on the monitored machine.
Once these steps are complete, the agent is ready to be launched.
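Before configuring the monitor, it can help to confirm that the agent port is reachable through the firewall. The following is a minimal sketch using only the Python standard library. The function name `is_port_open` is hypothetical, and 1099 is the default port mentioned above. It only checks that a TCP connection can be opened and says nothing about whether the JMX Agent itself is healthy.

```python
import socket


def is_port_open(host: str, port: int = 1099, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


# Example call (the hostname below is a placeholder, not a real machine):
# is_port_open("windows-host.example.com", 1099)
```

If the call returns False from the machine running the monitoring agent, the firewall rule or the port given at agent launch is a reasonable first thing to check.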
Note: A JDK is recommended over a JRE because it includes the JConsole tool. The JMX Agent only supports Java 8 as of now.
Setup¶
Launch the Metrics Agent by following the instructions on the project's page. It usually takes a few minutes to download and launch the agent. The monitoring agent then connects to the JMX Agent through the network.
Configuration¶
Enter the hostname and port where the JMX Agent is running. The hostname is the hostname or IP address of the Windows machine to monitor. Click on Check to verify that the connection works.
Credentials¶
The Windows username and password are optional. They are only needed when the JMX Agent has been started with username/password protection. Leave these fields as is if you didn't enable authentication on the JMX Agent.
Applications¶
The Windows monitor supports monitoring specific devices such as physical disks, and even specific processes. To fine-tune the selection of monitored counters, the Windows monitor lets you select the following applications:
- CPUs: each CPU core,
- Disks: hard disk drives and solid state drives,
- Network Interfaces: network cards,
- Processes: running Windows processes.
_Total is available for most types of applications. The most relevant counters are pre-selected in the next step depending on the selected applications.
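For readers who want to cross-check a value in Perfmon itself, each application above corresponds to an instance in a Perfmon counter path of the form `\Object(Instance)\Counter`. The sketch below builds such paths. The helper name `counter_path` is hypothetical, and whether the Metrics-Agent uses these exact paths internally is an assumption.

```python
from typing import Optional


def counter_path(obj: str, counter: str, instance: Optional[str] = None) -> str:
    """Build a Perfmon counter path from an object, a counter and an optional instance."""
    if instance is None:
        # Objects without instances, such as Memory
        return f"\\{obj}\\{counter}"
    # Objects with instances, such as Processor or PhysicalDisk
    return f"\\{obj}({instance})\\{counter}"


# Aggregate CPU usage over all cores
print(counter_path("Processor", "% Processor Time", "_Total"))
# Memory has no instances
print(counter_path("Memory", "Available MBytes"))
```

The resulting strings can be pasted into the Perfmon "Add Counters" dialog or passed to the `typeperf` command on the monitored machine.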
Monitored Counters¶
The following Windows Perfmon counters are available:
CPUs:
Per {CPU}:
- % Processor Time: Percent of Processor Usage,
- % Idle Time: Percent of Processor idle time,
- Interrupts/sec: The average rate per second at which the processor handles interrupts from applications or hardware devices,
- % Privileged Time: The percentage of time the processor was running in privileged mode,
- % User Time: The percentage of time the processor was running in user mode,
- % Interrupt Time: The percentage of time the processor spends receiving and servicing hardware interrupts during the sample interval,
- DPC Queued/sec: DPCs (Deferred Procedure Calls) Queued per Second measures the rate at which DPC objects are added to the processor's DPC queue,
- % DPC Time: % of time spent in DPCs (Deferred Procedure Calls),
- DPC Rate: DPCs (Deferred Procedure Calls) Rate,
- % C1 Time: % of Processor time spent in C1 state,
- % C2 Time: % of Processor time spent in C2 state,
- % C3 Time: % of Processor time spent in C3 state,
- % C1 Transitions/sec: Number of processor transitions to the C1 state per second,
- % C2 Transitions/sec: Number of processor transitions to the C2 state per second,
- % C3 Transitions/sec: Number of processor transitions to the C3 state per second.
Memory:
- % Committed Bytes In Use: This measures the ratio of Committed Bytes to the Commit Limit, in other words the amount of virtual memory in use,
- Available MBytes: This measures the amount of physical memory, in megabytes, available for running processes. If this value is less than 5 percent of the total physical RAM that means there is insufficient memory, and that can increase paging activity. To resolve this problem, you should simply add more memory,
- Cache Bytes: This value includes not only the size of the cache but also the size of the paged pool and the amount of pageable driver and kernel code. Essentially, these values measure the system's working set,
- Cache Faults/sec: This is the rate at which pages sought in the cache were not found there and had to be obtained elsewhere in memory or on the disk,
- Commit Limit: Commit limit is the amount of virtual memory, in bytes, that can be committed without having to extend the paging file(s). Committed memory is physical memory which has space reserved on the disk paging files. There can be one or more paging files on each physical drive. If the paging file(s) are expanded, this limit increases accordingly,
- Demand Zero Faults/sec: Number of demand zero faults per second, meaning page faults that are satisfied with a zeroed page,
- Free System Page Table Entries: This indicates the number of page table entries not currently in use by the system. If the number is less than 5,000, there may well be a memory leak,
- Page Faults/sec: Shows the average number of pages faulted per second. It is measured in numbers of pages faulted; because only one page is faulted in each fault operation, this is also equal to the number of page fault operations. This counter includes both hard faults (those that require disk access) and soft faults (where the faulted page is found elsewhere in physical memory). Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause delays,
- Page Reads/sec: Page Reads/sec is the rate at which the disk was read to resolve hard page faults. It shows the number of read operations, without regard to the number of pages retrieved in each operation,
- Page Writes/sec: Page Writes/sec is the rate at which pages are written to disk to free up space in physical memory. Pages are written to disk only if they are changed while in physical memory, so they are likely to hold data, not code,
- Pages Input/sec: Pages Input/sec is the total number of pages read from disk to resolve hard page faults,
- Pages Output/sec: Shows the rate at which pages are written to disk to free up space in physical memory. A high rate of pages output might indicate a memory shortage. This counter shows numbers of pages, and can be compared to other counts of pages without conversion,
- Pages/sec: This measures the rate at which pages are read from or written to disk to resolve hard page faults. If the value is greater than 1,000, as a result of excessive paging, there may be a memory leak,
- Peak Cache Bytes: Shows the maximum number of bytes used by the file system cache since the system was last restarted. This might be larger than the current size of the cache,
- Pool Non Paged Allocs: Shows the number of calls to allocate space in the nonpaged pool. The nonpaged pool is an area of system memory for objects that can’t be written to disk,
- Pool Non Paged Bytes: This measures the size, in bytes, of the non-paged pool. This is an area of system memory for objects that cannot be written to disk but instead must remain in physical memory as long as they are allocated,
- Pool Paged Allocations: Shows the number of calls to allocate space in the paged pool. It is measured in numbers of calls to allocate space, regardless of the amount of space allocated in each call,
- Pool Paged Bytes: Pool Paged Bytes is the size, in bytes, of the paged pool, an area of system memory (physical memory used by the operating system) for objects that can be written to disk when they are not being used. Paged Pool is a larger resource than Nonpaged pool - however, if this value is consistently greater than 70% of the maximum configured pool size, you may be at risk of a Paged Pool depletion (Event ID 2020),
- Pool Paged Resident Bytes: Shows the current size, in bytes, of the paged pool. Space used by the paged and nonpaged pools is taken from physical memory, so a pool that is too large denies memory space to processes,
- System Cache Resident Bytes: This shows the size, in bytes, of pageable operating system code in the file system cache. This value includes only current physical pages and does not include any virtual memory pages not currently resident. It does not equal the System Cache value shown in Task Manager. As a result, this value may be smaller than the actual amount of virtual memory in use by the file system cache. This value is a component of Memory System Code Resident Bytes which represents all pageable operating system code that is currently in physical memory,
- System Code Resident Bytes: Shows the size, in bytes, of operating system code currently in physical memory that can be written to disk when not in use. This value is a component of Memory System Code Total Bytes, which also includes operating system code on disk. Memory System Code Resident Bytes (and Memory System Code Total Bytes) does not include code that must remain in physical memory and cannot be written to disk,
- System Code Total Bytes: This measures the size, in bytes, of pageable operating system code currently in virtual memory. It is a measure of the amount of physical memory being used by the operating system that can be written to disk when not in use. This value is calculated by adding the bytes in Ntoskrnl.exe, Hal.dll, the boot drivers, and file systems loaded by Ntldr/osloader. This counter does not include code that must remain in physical memory and cannot be written to disk,
- System Driver Resident Bytes: Shows the size, in bytes, of pageable physical memory being used by device drivers. It is the working set (physical memory area) of the drivers. This value is a component of Memory System Driver Total Bytes, which also includes driver memory that has been written to disk. Neither Memory System Driver Resident Bytes nor Memory System Driver Total Bytes includes memory that cannot be written to disk,
- System Driver Total Bytes: Displays the size, in bytes, of pageable virtual memory currently being used by device drivers. Pageable memory can be written to disk when it is not being used. It includes physical memory (Memory System Driver Resident Bytes) and code and data written to disk. It is a component of Memory System Code Total Bytes,
- Write Copies/sec: Shows the rate at which page faults are caused by attempts to write that have been satisfied by copying the page from elsewhere in physical memory. This is an economical way of sharing data since pages are only copied when they are written to; otherwise, the page is shared. This counter shows the number of copies, without regard to the number of pages copied in each operation.
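The rules of thumb quoted in the Memory descriptions above (5 percent of physical RAM, 5,000 page table entries, 1,000 pages per second, 70% of the paged pool) can be sketched as a simple check. This is only an illustration: the function name `memory_warnings` and the dictionary layout are hypothetical, and these are not the thresholds shipped with the Windows monitoring module.

```python
from typing import Dict, List, Optional


def memory_warnings(
    sample: Dict[str, float],
    total_ram_mb: float,
    max_paged_pool_bytes: Optional[float] = None,
) -> List[str]:
    """Apply the rule-of-thumb thresholds from the Memory counter descriptions."""
    warnings = []
    if sample["Available MBytes"] < 0.05 * total_ram_mb:
        warnings.append("Available MBytes is below 5% of physical RAM")
    if sample["Free System Page Table Entries"] < 5000:
        warnings.append("Free System Page Table Entries is below 5,000")
    if sample["Pages/sec"] > 1000:
        warnings.append("Pages/sec is above 1,000")
    # The paged pool check is skipped when the maximum pool size is unknown
    if max_paged_pool_bytes and sample["Pool Paged Bytes"] > 0.70 * max_paged_pool_bytes:
        warnings.append("Pool Paged Bytes is above 70% of the maximum pool size")
    return warnings


sample = {
    "Available MBytes": 300,
    "Free System Page Table Entries": 10000,
    "Pages/sec": 1500,
    "Pool Paged Bytes": 800,
}
print(memory_warnings(sample, total_ram_mb=8192))
```

A single sample above a threshold is rarely conclusive, so it is usually more meaningful to apply such checks to values observed over several intervals.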
Network Interfaces:
Per {Network Interface}:
- Bytes Received/sec: Shows the rate at which bytes are received over each network adapter. The counted bytes include framing characters. Bytes Received/sec is a subset of Network Interface\Bytes Total/sec,
- Bytes Sent Unicast/sec: Shows the rate at which bytes are requested to be transmitted to subnet-unicast addresses by higher-level protocols,
- Bytes Sent/sec: Shows the rate at which bytes are sent over each network adapter. The counted bytes include framing characters. Bytes Sent/sec is a subset of Network Interface\Bytes Total/sec,
- Bytes Total/sec: Shows the rate at which bytes are sent and received on the network interface, including framing characters. Bytes Total/sec is the sum of the values of Network Interface\Bytes Received/sec and Network Interface\Bytes Sent/sec,
- Current Bandwidth: Shows an estimate of the current bandwidth of the network interface in bits per second (BPS). For interfaces that do not vary in bandwidth or for those where no accurate estimation can be made, this value is the nominal bandwidth,
- Offloaded Connections: Number of Offloaded Connections on this interface,
- Output Queue Length: Shows the length of the output packet queue, in packets. If this is longer than two, there are delays and the bottleneck should be found and eliminated, if possible. Since the requests are queued by Network Driver Interface Specification (NDIS) in this implementation, this value is always 0,
- Packets Outbound Discarded: Shows the number of outbound packets that were chosen to be discarded even though no errors had been detected to prevent transmission. One possible reason for discarding a packet could be to free up buffer space,
- Packets Outbound Errors: Shows the number of outbound packets that could not be transmitted because of errors,
- Packets Received Discarded: Shows the number of inbound packets that were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. One possible reason for discarding such a packet could be to free up buffer space,
- Packets Received Errors: Shows the number of inbound packets that contained errors preventing them from being deliverable to a higher-layer protocol,
- Packets Received Non-Unicast/sec: Shows the rate at which non-unicast (subnet broadcast or subnet multicast) packets are delivered to a higher-layer protocol,
- Packets Received Unicast/sec: Shows the rate at which subnet-unicast packets are delivered to a higher-layer protocol,
- Packets Received Unknown: Shows the number of packets received through the interface that were discarded because of an unknown or unsupported protocol,
- Packets Received/sec: Shows the rate at which packets are received on the network interface,
- Packets Sent Non-Unicast/sec: Shows the rate at which packets are requested to be transmitted to nonunicast (subnet broadcast or subnet multicast) addresses by higher-level protocols. The rate includes packets that were discarded or not sent,
- Packets Sent Unicast/sec: Shows the rate at which packets are requested to be transmitted to subnet-unicast addresses by higher-level protocols. The rate includes the packets that were discarded or not sent,
- Packets Sent/sec: Shows the rate at which packets are sent on the network interface,
- Packets/sec: Shows the rate at which packets are sent and received on the network interface,
- TCP Active RSC Connections: Number of active TCP Receive Segment Coalescing connections,
- TCP RSC Average Packet Size: Average packet size for TCP Receive Segment Coalescing connections,
- TCP RSC Coalesced Packets/sec: Number of RSC coalesced packets per second,
- TCP RSC Exceptions/sec: Number of RSC exceptions per second.
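The relationship between the byte counters and Current Bandwidth can be illustrated with a short calculation. Bytes Total/sec is the sum of the received and sent rates, as stated above. The utilization figure is a commonly derived metric rather than a Perfmon counter, so treat it and the function names below as assumptions of this sketch.

```python
def bytes_total_per_sec(bytes_received_per_sec: float, bytes_sent_per_sec: float) -> float:
    """Bytes Total/sec is the sum of Bytes Received/sec and Bytes Sent/sec."""
    return bytes_received_per_sec + bytes_sent_per_sec


def utilization_percent(bytes_total: float, current_bandwidth_bps: float) -> float:
    """Derived metric: share of the nominal bandwidth in use, in percent."""
    if current_bandwidth_bps <= 0:
        return 0.0
    # Current Bandwidth is in bits per second, so bytes are converted to bits
    return bytes_total * 8 * 100 / current_bandwidth_bps


total = bytes_total_per_sec(50_000_000, 12_500_000)
print(total)
print(utilization_percent(total, 1_000_000_000))
```

The factor of 8 matters here because the byte counters are in bytes per second while Current Bandwidth is reported in bits per second.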
Physical Disks:
Per {Physical Disk}:
- % Disk Read Time: Percentage of elapsed time during which the disk was busy servicing read requests,
- % Disk Time: Percentage of time during which the disk was busy. The “% Disk Time” counter is nothing more than the “Avg. Disk Queue Length” counter multiplied by 100. It is the same value displayed in a different scale,
- % Disk Write Time: Percentage of elapsed time during which the disk was busy servicing write requests,
- % Idle Time: This counter provides a very precise measurement of how much time the disk remained in idle state, meaning all the requests from the operating system to the disk have been completed and there are zero pending requests,
- Avg. Disk Bytes/Read: Average size of read requests,
- Avg. Disk Bytes/Transfer: Average size of transfer requests,
- Avg. Disk Bytes/Write: Average size of write requests,
- Avg. Disk Queue Length: The Avg. Disk Queue Length counter is the “estimated” average number of requests that are either in process or waiting to be processed by the Disk. It is equal to (Disk Transfers/sec) * (Avg. Disk sec/Transfer),
- Avg. Disk Read Queue Length: Estimated number of read requests that are either in process or waiting to be processed by the Disk,
- Avg. Disk Write Queue Length: Estimated number of write requests that are either in process or waiting to be processed by the Disk,
- Avg. Disk sec/Read: The Avg. Disk sec/Read displays how long, in seconds, a read operation from the Disk takes on average,
- Avg. Disk sec/Transfer: The Avg. Disk sec/Transfer displays how long, in seconds, a transfer operation on the Disk takes on average,
- Avg. Disk sec/Write: The Avg. Disk sec/Write displays how long, in seconds, a write operation to the Disk takes on average,
- Current Disk Queue Length: The number of requests that are either in process or waiting to be processed by the Disk at the moment the sample is taken. This is an instantaneous counter, not an average. Observe its value over several intervals or use Avg. Disk Queue Length,
- Disk Bytes/sec: Rate of transfer for all requests,
- Disk Read Bytes/sec: Rate of transfer for read requests,
- Disk Reads/sec: Number of reads per second,
- Disk Transfers/sec: Number of transfers per second,
- Disk Write Bytes/sec: Rate of transfer for write requests,
- Disk Writes/sec: Number of writes per second,
- Split IO/sec: The Split IO/sec counter displays the physical disk requests that are split into multiple requests. This counter is a primary indicator if a disk is fragmented and needs to be optimized.
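The two relationships stated in the disk counter descriptions above can be worked through with a small example. This is a sketch only: the function names are hypothetical, and the transfer time is assumed to be expressed in seconds.

```python
def avg_disk_queue_length(transfers_per_sec: float, sec_per_transfer: float) -> float:
    """Avg. Disk Queue Length = Disk Transfers/sec * Avg. Disk sec/Transfer."""
    return transfers_per_sec * sec_per_transfer


def disk_time_percent(queue_length: float) -> float:
    """% Disk Time is the average queue length shown on a scale multiplied by 100."""
    return queue_length * 100


# 200 transfers per second at 5 ms (0.005 s) each keep one request in flight on average
queue = avg_disk_queue_length(200, 0.005)
print(queue)
print(disk_time_percent(queue))
```

Because % Disk Time is derived from the queue length, it can exceed 100 when more than one request is outstanding on average, which is why % Idle Time is often the easier counter to interpret.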
Process:
Per {Process}:
- % Privileged Time: The percentage of time a process was running in privileged mode,
- % Processor Time: The percentage of time the processor was busy servicing a specific process,
- % User Time: The percentage of time a process was running in user mode,
- Creating Process ID: Shows the identifier of the process that created the current process. Note that the creating process may have terminated since this process was created and so this value may no longer identify a running process,
- Handle Count: Shows the total number of handles currently open by this process. This number is equal to the sum of the handles currently open by each thread in this process,
- IO Data Bytes/sec: Shows the rate at which the process is reading and writing bytes in I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Data Operations/sec: Shows the rate at which the process is issuing read and write I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Other Bytes/sec: Shows the rate at which the process is issuing bytes to I/O operations that don't involve data such as control operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Other Operations/sec: Shows the rate at which the process is issuing I/O operations that are neither a read nor a write operation. An example of this type of operation would be a control function. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Read Bytes/sec: Shows the rate at which the process is reading bytes from I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Read Operations/sec: Shows the rate at which the process is issuing read I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Write Bytes/sec: Shows the rate at which the process is writing bytes to I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- IO Write Operations/sec: Shows the rate at which the process is issuing write I/O operations. This counter counts all I/O activity generated by the process to include file, network and device I/O's,
- Page Faults/sec: Shows the rate at which page faults by the threads executing in this process are occurring. A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. This does not cause the page to be fetched from disk if it is on the standby list and hence already in main memory, or if it is in use by another process with whom the page is shared,
- Page File Bytes Peak: Shows the maximum number of bytes that this process has used in the paging file(s). Paging files are used to store pages of memory used by the process that are not contained in other files. Paging files are shared by all processes, and lack of space in paging files can prevent other processes from allocating memory,
- Page File Bytes: Shows the current number of bytes that this process has used in the paging file(s). Paging files are used to store pages of memory used by the process that are not contained in other files. Paging files are shared by all processes, and lack of space in paging files can prevent other processes from allocating memory,
- Pool Nonpaged Bytes: Shows the number of bytes in the nonpaged pool, a system memory area where space is acquired by operating system components as they accomplish their appointed tasks. Nonpaged pool pages cannot be paged out to the paging file, but instead remain in main memory as long as they are allocated,
- Pool Paged Bytes: Shows the number of bytes in the Paged Pool, a system memory area where space is acquired by operating system components as they accomplish their appointed tasks. Paged Pool pages can be paged out to the paging file when not accessed by the system for sustained periods of time,
- Priority Base: Windows schedules threads of a process to run according to their priority. Threads inherit base priority from their parent processes. The base priority level of the process can range from lowest to highest: Idle, Normal, High, or Real Time,
- Private Bytes: Private Bytes is the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes,
- Process ID: Shows the unique identifier of this process. Process ID numbers are reused, so they only identify a process for the lifetime of that process,
- Process name: Name of this process,
- Thread Count: Shows the number of threads currently active in this process. An instruction is the basic unit of execution in a processor, and a thread is the object that executes instructions. Every running process has at least one thread,
- Uptime: Total time since the process started,
- Virtual Bytes Peak: Virtual Bytes Peak is the maximum size, in bytes, of virtual address space the process has used at any one time. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. However, virtual space is finite, and by using too much of it the process might limit its ability to load libraries,
- Virtual Bytes: Virtual Bytes is the current size, in bytes, of the virtual address space the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite, and by using too much of it the process can limit its ability to load libraries,
- Working Set - Private: Working Set - Private displays the size of the working set, in bytes, that is in use for this process only and is neither shared nor shareable by other processes,
- Working Set Peak: Shows the maximum size, in bytes, in the working set of this process at any point in time. The working set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a certain threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a certain threshold, pages are trimmed from working sets. If they are needed, they are then soft-faulted back into the working set before they leave main memory,
- Working Set: Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a threshold, pages are left in the Working Set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from Working Sets. If they are needed they will then be soft-faulted back into the Working Set before leaving main memory.
Server:
- Blocking Requests Rejected: Shows the number of times that the server has rejected blocking server message block requests (SMBs) due to insufficient count of free work items. This counter indicates whether the MaxWorkItem or MinFreeWorkItems server registry parameters might need tuning,
- Bytes Received/sec: Shows the rate at which the server is receiving bytes from the network. This counter indicates how busy the server is,
- Bytes Total/sec: Shows the rate at which the server is sending and receiving bytes through the network. This value provides an overall indication of how busy the server is,
- Bytes Transmitted/sec: Shows the rate at which the server is sending bytes on the network. This counter indicates how busy the server is,
- Context Blocks Queued/sec: Shows the rate at which work context blocks had to be placed on the server's FSP queue to await server action,
- Errors Access Permissions: Shows the number of times attempts to open files on behalf of clients have failed with the message STATUS_ACCESS_DENIED. This counter can indicate whether someone is attempting to access random files in the hope of reaching a file that was not properly protected,
- Errors Granted Access: Shows the number of times that attempts to access files successfully opened were denied. This counter can indicate attempts to access files without proper access authorization,
- Errors Logon: Shows the number of failed logon attempts to the server. This counter can indicate whether password guessing programs are being used to crack the security on the server,
- Errors System: Shows the number of times that an internal server error was detected. Errors can reflect problems with logon, security, memory allocation, disk operations, transport driver interface operations, communication (such as receipt of unimplemented or unrecognized SMBs), or I/O Request Packet stack size for the server. Many of these errors are also written to the System log and Security log in Event Viewer. The server can recover from most of the errors displayed by this counter, but they are unexpected and should be reported to Microsoft Product Support Services,
- File Directory Searches: Shows the number of searches for files currently active in the server. This counter indicates current server activity,
- Files Open: Shows the number of files currently opened on the server. This counter indicates current server activity,
- Files Opened Total: Shows the number of successful attempts to open a file performed by the server on behalf of clients. This counter is useful in determining the amounts of file I/O and overhead for path-based operations, and for determining the effectiveness of open locks,
- Logon Total: Shows all interactive logon attempts, network logon attempts, service logon attempts, successful logon attempts, and failed logon attempts since the computer was last rebooted,
- Logon/sec: Shows the rate of all interactive logon attempts, network logon attempts, service logon attempts, successful logon attempts, and failed logon attempts,
- Pool Nonpaged Bytes: Shows the size, in bytes, of nonpageable computer memory that the server is currently using. This value is useful for determining the values of the MaxNonpagedMemoryUsage entry in the Windows Registry,
- Pool Nonpaged Failures: Shows how many times allocations from the nonpaged pool have failed. This counter indicates that the computer's physical memory is too small,
- Pool Nonpaged Peak: Shows the maximum size, in bytes, of the nonpaged pool the server has had in use at any one point. This counter indicates how much physical memory the computer should have,
- Pool Paged Bytes: Shows the size, in bytes, of pageable computer memory that the server is currently using. This counter can help in determining good values for the MaxPagedMemoryUsage registry entry,
- Pool Paged Failures: Shows how many times allocations from the paged pool have failed. This counter indicates that the computer's physical memory or page file are too small,
- Pool Paged Peak: Shows the maximum size, in bytes, of the paged pool that the server has allocated. This counter indicates the proper sizes of the page file(s) and physical memory,
- Reconnected Durable Handles: The number of reconnected durable handles. The ratio of "reconnected durable handles" to "total durable handles" indicates how much performance is gained from reconnecting durable handles,
- Reconnected Resilient Handles: The number of reconnected resilient handles. The ratio of "reconnected resilient handles" to "total resilient handles" indicates how much performance is gained from reconnecting resilient handles,
- Total Durable Handles: The number of durable handles. It indicates how many durable handles are kept alive even when SMB2 sessions are disconnected,
- Total Resilient Handles: The number of resilient handles. It indicates how many resilient handles are kept alive even when SMB2 sessions are disconnected,
- SMB BranchCache Hash Bytes Sent: The amount of SMB BranchCache hash data sent from the server. This includes bytes transferred for both hash header requests and full hash data requests,
- SMB BranchCache Hash Generation Requests: The number of SMB BranchCache hash generation requests that were sent by SRV2 to the SMB Hash Generation service because a client requested hashes for the file and there was either no hash content for the file or the existing hashes were out of date,
- SMB BranchCache Hash Header Requests: The number of SMB BranchCache hash requests that were for the header only received by the server. This indicates how many requests are being done to validate hashes that are already cached by the client,
- SMB BranchCache Hash Requests Received: The number of SMB BranchCache hash requests that were received by the server,
- SMB BranchCache Hash Responses Sent: The number of SMB BranchCache hash responses that have been sent from the server,
- SMB BranchCache Hash V2 Bytes Sent: The amount of SMB BranchCache hash data sent from the server. This includes bytes transferred for both hash header requests and full hash data requests,
- SMB BranchCache Hash V2 Generation Requests: The number of SMB BranchCache hash generation requests that were sent by SRV2 to the SMB Hash Generation service because a client requested hashes for the file and there was either no hash content for the file or the existing hashes were out of date,
- SMB BranchCache Hash V2 Header Requests: The number of SMB BranchCache hash requests that were for the header only received by the server. This indicates how many requests are being done to validate hashes that are already cached by the client,
- SMB BranchCache Hash V2 Requests Received: The number of SMB BranchCache hash requests that were received by the server,
- SMB BranchCache Hash V2 Requests Served From Dedup.
- SMB BranchCache Hash V2 Responses Sent: The number of SMB BranchCache hash responses that have been sent from the server,
- Server Sessions: Shows the number of sessions currently active in the server. This counter indicates current server activity,
- Sessions Errored Out: Shows the number of sessions that have been closed due to unexpected error conditions. This counter indicates how frequently network problems are causing dropped sessions on the server. The Sessions Errored Out counter reports auto-disconnects along with errored-out sessions. For a more accurate value for errored-out sessions, obtain the value for Sessions Timed Out and reduce the Sessions Errored Out value by that amount,
- Sessions Forced Off: Shows the number of sessions that have been forced to log off. This counter can indicate how many sessions were forced to log off due to logon time constraints,
- Sessions Logged Off: Shows the number of sessions that have terminated normally. This counter is useful in interpreting the statistics from Sessions Timed Out and Sessions Errored Out,
- Sessions Timed Out: Shows the number of sessions that have been closed because idle time exceeded the AutoDisconnect parameter for the server. This counter shows whether the AutoDisconnect setting is helping to conserve resources,
- Work Item Shortages: Shows the number of times STATUS_DATA_NOT_ACCEPTED was returned at receive indication time. This occurs when no work item is available or can be allocated to service the incoming request. This counter indicates whether the InitWorkItems or MaxWorkItems registry entries might need to be adjusted.
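The correction described for Sessions Errored Out (subtracting Sessions Timed Out) can be sketched as below. The function name is hypothetical and the inputs are assumed to be counter values sampled at roughly the same time.

```python
def errored_out_excluding_timeouts(sessions_errored_out: int,
                                   sessions_timed_out: int) -> int:
    """Remove auto-disconnects (Sessions Timed Out) from Sessions Errored Out."""
    # Clamp at zero in case the two counters were sampled at slightly different times.
    return max(sessions_errored_out - sessions_timed_out, 0)


print(errored_out_excluding_timeouts(42, 30))  # 12
```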
System
- % Registry Quota In Use: Shows the percentage of the Total Registry Quota Allowed that is currently being used by the system,
- Alignment Fixups/sec: Shows the rate of alignment faults fixed by the system,
- Context Switches/sec: Shows the combined rate at which all processors on the computer are switched from one thread to another. Context switches occur when a running thread voluntarily relinquishes the processor, is preempted by a higher priority, ready thread, or switches between user-mode and privileged (kernel) mode to use an Executive or subsystem service. It is the sum of the values of Thread\Thread: Context Switches/sec for each thread running on all processors on the computer and is measured in numbers of switches. There are context switch counters on the System and Thread objects,
- Exception Dispatches/sec: Shows the rate at which exceptions are dispatched by the system,
- File Control Bytes/sec: Shows the overall rate at which bytes are transferred for all file system operations that are neither read nor write operations, including file system control requests and requests for information about device characteristics or status. It is measured in numbers of bytes per second,
- File Control Operations/sec: Shows the combined rate of file system operations that are neither read nor write operations, such as file system control requests and requests for information about device characteristics or status. This is the inverse of System\System: File Data Operations/sec and is measured in numbers of operations,
- File Data Operations/sec: Shows the combined rate of read and write operations on all logical disks on the computer. This is the inverse of System\System: File Control Operations/sec,
- File Read Bytes/sec: Shows the overall rate at which bytes are read to satisfy file system read requests to all devices on the computer, including read operations from the file system cache. It is measured in numbers of bytes per second,
- File Write Bytes/sec: Shows the overall rate at which bytes are written to satisfy file system write requests to all devices on the computer, including write operations to the file system cache. It is measured in numbers of bytes per second,
- File Read Operations/sec: Shows the combined rate of file system read requests to all devices on the computer, including requests to read from the file system cache. It is measured in numbers of read operations per second,
- File Write Operations/sec: Shows the combined rate of file system write requests to all devices on the computer, including requests to write to data in the file system cache. It is measured in numbers of write operations per second,
- Floating Emulations/sec: Shows the rate of floating emulations performed by the system,
- Processes: Total number of processes running,
- Processor Queue Length Per CPU: Shows the number of threads in the processor queue per processor. Unlike the disk counters, this counter shows ready threads only, not threads that are running. A sustained processor queue of greater than two threads generally indicates processor congestion,
- Processor Queue Length: Shows the number of threads in the processor queue. There is a single queue for processor time even on computers with multiple processors. Therefore, you may need to divide this value by the number of processors servicing the workload. Unlike the disk counters, this counter shows ready threads only, not threads that are running. A sustained processor queue of greater than two threads generally indicates processor congestion,
- Processor count: Number of processors,
- System Calls/sec: Shows the combined rate of calls to Windows system service routines by all processes running on the computer. These routines perform all of the basic scheduling and synchronization of activities on the computer, and provide access to nongraphic devices, memory management, and name space management,
- System Uptime: Total time since last startup,
- Threads: Total number of threads running.
TCP
- Connection Failures: Shows the number of times that TCP connections have made a direct transition to the CLOSED state from the SYN-SENT or SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state,
- Connections Active: Shows the number of times TCP connections have made a direct transition to the SYN-SENT state from the CLOSED state,
- Connections Established: Shows the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT,
- Connections Passive: Shows the number of times that TCP connections have made a direct transition to the SYN-RCVD state from the LISTEN state,
- Connections Reset: Shows the number of times that TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED or CLOSE-WAIT state,
- Segments Received/sec: Shows the rate at which segments are received, including those received in error. This count includes segments received on currently established connections. Segments Received/sec is a subset of TCP\Segments/sec,
- Segments Retransmitted/sec: Shows the rate at which segments containing one or more previously transmitted bytes are retransmitted,
- Segments Sent/sec: Shows the rate at which segments are sent. This value includes those on current connections, but excludes those containing only retransmitted bytes. Segments Sent/sec is a subset of TCP\Segments/sec,
- Segments/sec: Shows the rate at which TCP segments are sent or received. Segments/sec is the sum of the values of TCP\Segments Received/sec and TCP\Segments Sent/sec.
Going Further¶
Processor\%Processor Time¶
This counter is the primary indicator of processor activity. High values may not necessarily be bad. However, if other processor-related counters, such as Processor\% Privileged Time, are increasing linearly, high CPU utilization may be worth investigating.
System\Processor Queue Length¶
Processor Queue Length is the number of threads in the processor queue. Unlike the disk counters, this counter shows ready threads only, not threads that are running. There is a single queue for processor time even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of less than 10 threads per processor is normally acceptable, depending on the workload.
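As a minimal sketch of the per-processor normalization described above: the function names are hypothetical, and the default limit of 10 threads per processor is simply the figure quoted in this section. Since the guideline is about a sustained queue, it is best applied to an average over a period rather than to a single sample.

```python
def queue_length_per_processor(queue_length: float, processor_count: int) -> float:
    """Divide System\\Processor Queue Length by the number of processors."""
    if processor_count < 1:
        raise ValueError("processor_count must be at least 1")
    return queue_length / processor_count


def is_queue_acceptable(queue_length: float, processor_count: int,
                        max_per_processor: float = 10.0) -> bool:
    """True when the queue per processor stays below the chosen limit."""
    return queue_length_per_processor(queue_length, processor_count) < max_per_processor


# 24 ready threads on a 4-processor machine
print(queue_length_per_processor(24, 4))  # 6.0
print(is_queue_acceptable(24, 4))         # True
```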
Processor\%Privileged Time¶
This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode.
Memory\Free System Page Table Entries¶
Free System Page Table Entries is the number of page table entries not currently in use by the system. It shows whether the system is running out of free system page table entries (PTEs): the check looks for fewer than 5,000 free PTEs, with a warning when there are fewer than 10,000. A lack of PTEs can result in system-wide hangs and can also indicate a memory leak.
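The two thresholds can be expressed as a small classification helper. This is only an illustration: the function name and the level labels ("critical", "warning", "ok") are assumptions, not names used by the monitoring module.

```python
def classify_free_ptes(free_ptes: int,
                       critical_below: int = 5_000,
                       warning_below: int = 10_000) -> str:
    """Map a Memory\\Free System Page Table Entries value to a severity label."""
    if free_ptes < critical_below:
        return "critical"
    if free_ptes < warning_below:
        return "warning"
    return "ok"


print(classify_free_ptes(7_500))  # warning
```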
Memory\Page Reads/sec¶
This counter indicates that the working set of your process is too large for the physical memory and that it is paging to disk. It shows the number of read operations, without regard to the number of pages retrieved in each operation. Higher values indicate a memory bottleneck.
If a low rate of page-read operations coincides with high values for Physical Disk\% Disk Time and Physical Disk\Avg. Disk Queue Length, there could be a disk bottleneck. If an increase in queue length is not accompanied by a decrease in the pages-read rate, a memory shortage exists.
Memory\Pages/sec¶
This counter indicates the rate at which pages are read from or written to disk to resolve hard page faults. To determine the impact of excessive paging on disk activity, multiply the values of the Physical Disk\Avg. Disk sec/Transfer and Memory\Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time. If this occurs over a long period, you probably need more memory. A high value of Pages/sec indicates that your application does not have sufficient memory. The average of Pages Input/sec divided by the average of Page Reads/sec gives the number of pages per disk read. This value should not generally exceed five pages per second. A value greater than five pages per second indicates that the system is spending too much time paging and requires more memory (assuming that the application has been optimized).
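The two calculations above can be sketched as follows. The function names are hypothetical, and the sample values are made up for illustration; the inputs are assumed to be averages of the corresponding counters over the same period.

```python
def paging_share_of_disk_time(avg_disk_sec_per_transfer: float,
                              pages_per_sec: float) -> float:
    """Product of Avg. Disk sec/Transfer and Pages/sec (fraction of disk access time)."""
    return avg_disk_sec_per_transfer * pages_per_sec


def pages_per_disk_read(pages_input_per_sec: float,
                        page_reads_per_sec: float) -> float:
    """Average Pages Input/sec divided by average Page Reads/sec."""
    if page_reads_per_sec == 0:
        return 0.0  # no page reads during the period, so nothing to average
    return pages_input_per_sec / page_reads_per_sec


# 8 ms per transfer with 20 pages/sec: paging exceeds the 0.1 limit
print(paging_share_of_disk_time(0.008, 20.0) > 0.1)  # True
print(pages_per_disk_read(30.0, 10.0))               # 3.0
```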
Process\Handle Count¶
The total number of handles currently open by this process. This number is equal to the sum of the handles currently open by each thread in this process. This counter checks all of the processes to determine how many handles each has open and whether a handle leak is suspected. A process with a large number of handles and/or an aggressive upward trend could indicate a handle leak, which typically results in a memory leak.
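One possible way to quantify an "upward trend" is the least-squares slope of successive Handle Count samples. This is a sketch, not the method used by the monitoring module: the function names and the default growth limit are assumptions, and the samples are assumed to be taken at regular intervals.

```python
from typing import Sequence


def handle_growth_per_sample(handle_counts: Sequence[float]) -> float:
    """Least-squares slope of handle counts over equally spaced samples."""
    n = len(handle_counts)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = (n - 1) / 2
    mean_y = sum(handle_counts) / n
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(handle_counts))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


def is_handle_leak_suspected(handle_counts: Sequence[float],
                             max_growth_per_sample: float = 5.0) -> bool:
    """Flag a process whose handle count grows faster than the chosen limit."""
    return handle_growth_per_sample(handle_counts) > max_growth_per_sample


print(handle_growth_per_sample([100, 110, 120, 130]))  # 10.0
print(is_handle_leak_suspected([500, 500, 500]))       # False
```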
Physical Disks\Current Disk Queue Length¶
This counter is the number of requests outstanding on the disk at the time the performance data is collected. It also includes requests in service at the time of the collection. This is an instantaneous snapshot, not an average over the time interval. Bottlenecks can create a backlog that may spread beyond the current server accessing the disk and result in long wait times for end users. Possible solutions to a bottleneck may be to add more disks to the RAID array, replace with faster disks, or move some of the data to other disks.
Troubleshooting¶
JConsole¶
You can check if the agent is running fine by launching the Java JConsole.
How to launch JConsole
- Locate where the JDK is installed, usually C:\Program Files\Java\jdk1.8.0_xxx,
- Launch bin\jconsole.exe,
- Select Remote Process and enter localhost:1099 (1099 is the default port),
- Validate by selecting Insecure Connection and go to MBeans.
You should see Windows in the MBeans tree. If you encounter any issue connecting to the local JMX Agent:
- Check you launched the agent with a free port (1099 by default),
- Check the agent logs in the console: they should show Started Application in xx seconds at the end.
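The port check can also be scripted. The sketch below only tests whether something accepts TCP connections on the given port; it does not speak JMX, so it cannot confirm that the listener is actually the agent. The function name is hypothetical. Before launching the agent it should return False (the port is free), and after a successful launch it should return True.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 1099 is the default agent port; adapt it if a custom port was used.
    print(is_port_open("localhost", 1099))
```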
Missing CPU, process, network and disks counters¶
You might notice some counters are missing from the list. This usually happens when monitoring a Windows machine not configured in English. Windows Home or Home Basic editions do not always include multi-language support, so not all counters are supported on these versions.
A quick solution is to overwrite your regional counters with the English defaults. There are two ways to do so.
Copy the content of "Counter" registry key¶
- Copy HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\009,
- Paste into HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib\CurrentLanguage.
If you do not have the rights to edit the registry, the solution is to overwrite Windows perf config files.
Overwriting Windows Perf Files¶
In the registry, under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib, locate the folder for your current language. It should be named 00C or something close. Remember its name. It may then be possible to achieve the same result by editing the following files (adapt 00C to match that registry folder name):
- Rename the 4 files (for backup):
c:\windows\system32\perf*00C.dat to c:\windows\system32\perf*00C_bak.dat
- Duplicate and rename the 4 files:
c:\windows\system32\perf*009.dat to c:\windows\system32\perf*00C.dat
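The two file steps above could be scripted roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the script must run with sufficient rights to write to the system directory, and it stops with an error if a _bak file already exists on Windows rather than overwriting an earlier backup.

```python
import shutil
from pathlib import Path
from typing import List


def overwrite_perf_files(system32: Path, language_id: str,
                         english_id: str = "009") -> List[Path]:
    """Back up perf*<language_id>.dat, then replace them with the English files."""
    replaced = []
    suffix = f"{english_id}.dat"
    for english_file in sorted(system32.glob(f"perf*{suffix}")):
        prefix = english_file.name[: -len(suffix)]  # e.g. the "perf*" part of the name
        target = system32 / f"{prefix}{language_id}.dat"
        if target.exists():
            # Step 1: keep the regional file as a backup.
            target.rename(system32 / f"{prefix}{language_id}_bak.dat")
        # Step 2: duplicate the English file under the regional name.
        shutil.copy2(english_file, target)
        replaced.append(target)
    return replaced


# Example call (adapt 00C to your own language folder name):
# overwrite_perf_files(Path(r"c:\windows\system32"), "00C")
```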