Thursday, June 23, 2011

Performance Monitoring in Linux

A few useful tools can help you find the performance bottleneck on your Linux box.

What to monitor first?

The system load is a measure of the amount of work that a computer system performs. You can use this command to read system load:
uptime
Here is a sample output:
... load average: 1.07, 1.63, 2.81
The three load average values cover the past 1, 5, and 15 minutes of system operation. Read them relative to the number of CPUs: on a single-CPU system a load of 1 or less is comfortable, and a 4-CPU system is equally comfortable at a load of 4 or less. A load of 1.5 on a single CPU means demand exceeded capacity by about 50%: the extra work was queued for processing, which deserves attention. Note that on Linux the load average also counts processes in uninterruptible sleep (typically waiting on disk I/O), so a high load does not always mean the CPU is the bottleneck.
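The rule of thumb above can be checked by dividing the load by the number of CPUs. Here is a minimal sketch of that arithmetic; the function name load_per_cpu is made up for this example, and the live reading assumes a Linux system with /proc/loadavg and the nproc command available:

```shell
#!/bin/sh
# load_per_cpu LOAD CPUS
# Print LOAD divided by CPUS with two decimals.
# (Hypothetical helper for illustration, not a standard tool.)
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Live reading: /proc/loadavg starts with the same three averages
# that uptime prints (1, 5, and 15 minutes).
if [ -r /proc/loadavg ]; then
    read -r one five fifteen rest < /proc/loadavg
    echo "1-min load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

A result at or below 1.00 means the CPUs are keeping up; anything above it means work is queuing.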

System Monitoring

Real-time monitoring is available with the top and htop commands. htop is a more convenient alternative to top. In particular, it is handy to add two more columns (via 'F2' Setup) showing I/O read and I/O write rates.
htop
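Processor-related statistics are available with mpstat. Give it an interval in seconds; without one, it prints a single report of averages since boot, so running it under watch would just repeat nearly identical numbers:
mpstat 1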

Disk Monitoring

Disk I/O is one possible cause of performance degradation. The iotop tool tracks disk I/O by process and prints a summary report that is refreshed every second.
iotop
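Statistics for I/O devices and partitions can be monitored with iostat. When given an interval in seconds, the first report shows averages since boot and each following report covers only the last interval:
iostat 1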

Who is waiting and blocked?

It is useful to know how the system load is spread across processes, but the most interesting processes are those that are blocked waiting for an operation (usually I/O) to complete, and thus cause delays. Such processes show up with state D (uninterruptible sleep) in ps output. Here is a simple command that reports them every second:
watch -n 1 "(ps aux | awk '\$8 ~ /D/  { print \$0 }')"
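A variation on the command above is to ask ps for only the columns of interest, including the kernel wait channel, which hints at what a blocked process is waiting for. This is a rough sketch: the function name filter_blocked is made up for this example, and the ps invocation assumes the procps version of ps found on most Linux distributions:

```shell
#!/bin/sh
# filter_blocked: read "STATE PID WCHAN COMMAND" lines on stdin and keep
# only those whose state starts with D (uninterruptible sleep).
# (Hypothetical helper for illustration, not a standard tool.)
filter_blocked() {
    awk '$1 ~ /^D/ { print }'
}

# Live usage: state first, then PID, kernel wait channel, and command name.
# Prints nothing when no process is blocked.
if command -v ps >/dev/null 2>&1; then
    ps -eo state,pid,wchan:32,comm | filter_blocked
fi
```

Putting the state in the first column avoids depending on the position of the STAT column in the ps aux layout.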

Network Monitoring

Intensive network activity can cause high load as well. The iftop tool gives you a better idea of your network traffic utilization:
iftop