Sunday, October 19, 2008

Monitoring Server Performance

It is a good idea to run a server performance health check from time to time and compare it with the baseline, especially after a major update or a new software application is installed.  These checks can also help troubleshoot a performance issue with the server or justify an upgrade to new or additional hardware.

 

One of the most commonly used tools is Microsoft Performance Monitor.  These are the key counters that I collect every 30 seconds or every minute for a period of one week and analyze later.

 

-          Processor(_Total)\% Processor Time

If you're monitoring this counter and it's running at or near 100% for extended periods, you should drill down to the process level by examining the Process(instance)\% Processor Time counter for the various process instances on your server.  For example, on an IIS web server you might track Process(InetInfo)\% Processor Time, while on an Exchange server a good counter to watch is Process(Store)\% Processor Time, and so on.
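As a rough illustration of what "at or near 100% for extended periods" could mean for a log sampled every 30 seconds, here is a minimal Python sketch. The 95% threshold, the 10-minute duration, and the function names are assumptions made for the example, not values from Performance Monitor.

```python
def longest_busy_run(samples, threshold=95.0):
    """Return the longest run of consecutive samples at or above threshold."""
    longest = current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_sustained(samples, interval_seconds=30, threshold=95.0, min_minutes=10):
    """True if the CPU stayed at or above threshold for min_minutes in a row.

    threshold and min_minutes are hypothetical cut-offs; tune them to your baseline.
    """
    run_seconds = longest_busy_run(samples, threshold) * interval_seconds
    return run_seconds >= min_minutes * 60
```

If this returns True for Processor(_Total), the same check can be run against each Process(instance) series to find the process responsible.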

-          Memory\Pages/sec

The number of pages per second should not exceed 50 per paging disk on your system.  Another key counter to watch is Memory\Available Bytes; if this counter is greater than 10% of the actual RAM in your machine, then you probably have more than enough RAM and don't need to worry.

Memory\Pages/sec counts the pages read from or written to disk each second to resolve hard page faults, that is, how often the computer must go to virtual memory on disk rather than physical memory.  A value of 20 is considered to be problematic, but it might indicate a problem with the way that your virtual memory is configured rather than a problem with the physical memory.
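The two memory rules of thumb above can be sketched as a small Python check. The function name and its parameters are made up for illustration, and the paging limit uses the 50 pages per second per paging disk figure rather than the stricter value of 20.

```python
def memory_report(pages_per_sec, available_bytes, total_ram_bytes, paging_disks=1):
    """Apply the post's rules of thumb to averaged counter values.

    pages_per_sec   -- average of Memory\\Pages/sec
    available_bytes -- average of Memory\\Available Bytes
    total_ram_bytes -- physical RAM installed (supplied by you, not a counter)
    paging_disks    -- number of disks holding a page file (assumed known)
    """
    pages_limit = 50 * paging_disks
    available_fraction = available_bytes / total_ram_bytes
    return {
        "paging_ok": pages_per_sec <= pages_limit,
        "ram_ok": available_fraction > 0.10,
        "available_fraction": available_fraction,
    }
```

For example, memory_report(30, 2 * 1024**3, 8 * 1024**3) describes a server with 8 GB of RAM, 2 GB available, and 30 pages per second on a single paging disk.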

-          Memory\Cache Bytes

This counter monitors the amount of memory being used for the file system cache.  Anything over 4 MB is considered to be too much.  The solution is to add more memory.

-          PhysicalDisk(instance)\Disk Transfers/sec

For each physical disk, if this counter goes above 25 disk I/Os per second, then you've got poor response time for your disk.  A disk bottleneck can significantly impact response time for applications running on your system, so you should investigate further by tracking PhysicalDisk(instance)\% Idle Time, which measures the percentage of time that your hard disk is idle during the measurement interval.  If you see this counter fall below 20%, then you've likely got read/write requests queuing up for a disk that is unable to service them in a timely fashion.  In this case, it's time to upgrade your hardware to faster disks, or to scale your application up or out to better handle the load.

-          PhysicalDisk(instance)\Avg. Disk Queue Length

This counter tells you how many I/O operations are waiting for the hard disk to become available.  Again, this number should be as low as possible.  Experts give different opinions on what an acceptable value is, but my opinion is that the average disk queue length should be 3 or less.
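The three disk thresholds discussed above (25 transfers per second, 20% idle time, and a queue length of 3) can be combined into one check per physical disk. This is only a sketch in Python; the function name and the wording of the findings are invented for the example.

```python
def disk_findings(transfers_per_sec, idle_percent, avg_queue_length):
    """Return a list of warnings for one physical disk, using the post's thresholds."""
    findings = []
    if transfers_per_sec > 25:
        findings.append("Disk Transfers/sec above 25")
    if idle_percent < 20:
        findings.append("% Idle Time below 20%")
    if avg_queue_length > 3:
        findings.append("Avg. Disk Queue Length above 3")
    return findings
```

An empty list means none of the three rules of thumb was triggered for that disk.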

-          Network Interface\Bytes Total/sec

This counter reports the bytes sent and received per second; multiply it by 8 to get bits per second.  The average should not exceed 40% of total bandwidth, depending on the number of machines and protocols used in the same segment.  The more devices transmit on the same segment, the more packet collisions occur.

Other network counters to check for more detailed network performance monitoring are:

-          Network Interface\Bytes Sent/sec

-          Network Interface\Bytes Received/sec

-          Network Interface\Current Bandwidth (the total bandwidth, in bits per second)
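The utilization arithmetic for the network counters (bytes per second times 8, compared against Current Bandwidth) can be sketched in Python as follows. The function names are hypothetical; the 40% limit is the rule of thumb given above.

```python
def network_utilization_percent(bytes_total_per_sec, current_bandwidth_bits):
    """Convert Bytes Total/sec to bits and express it as a percentage of bandwidth."""
    bits_per_sec = bytes_total_per_sec * 8
    return bits_per_sec * 100 / current_bandwidth_bits


def exceeds_limit(bytes_total_per_sec, current_bandwidth_bits, limit_percent=40):
    """True if average utilization is above the 40% rule of thumb."""
    return network_utilization_percent(
        bytes_total_per_sec, current_bandwidth_bits
    ) > limit_percent
```

For instance, an average of 6,000,000 bytes per second on a 100 Mbps interface (Current Bandwidth of 100,000,000) is 48,000,000 bits per second, or 48% utilization, which is over the limit.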

--
Alessandro
