Xen Monitoring Tools and Techniques

From Virtuatopia
Revision as of 19:11, 22 April 2008 by Neil

So far in this book we have focused primarily on creating Xen guest domains (domainU). By this stage it is safe to assume that you have one or more domainU systems up and running on your server or desktop. This chapter of Xen Virtualization Essentials therefore provides an overview of the tools and techniques that can be used to monitor a Xen-based environment.

Why Monitor a Xen Environment?

It is important to keep in mind that Xen is an enterprise-level environment capable of supporting complex virtualization configurations. As with any complex system, it would be naive to assume that it will run without occasional performance issues or other problems. Deploying Xen virtualization therefore requires an understanding of the tools and techniques needed to monitor the running environment, identify performance issues and track down problems.


Obtaining Xen Configuration and System Information

Perhaps the most basic step in monitoring a Xen system or isolating a problem is to get a high-level overview of the Xen environment and its underlying configuration. This information is particularly important when requesting help from a vendor or forum. A good way to obtain it is to use the xm info command. The following example shows output from xm info on a Red Hat Enterprise Linux 5 (RHEL5) system:

xm info
host                   : localhost.localdomain
release                : 2.6.18-53.el5xen
version                : #1 SMP Wed Oct 10 17:06:12 EDT 2007
machine                : i686
nr_cpus                : 1
nr_nodes               : 1
sockets_per_node       : 1
cores_per_socket       : 1
threads_per_core       : 1
cpu_mhz                : 2993
hw_caps                : 0febfbff:20100000:00000000:00000180:0000a015:00000000:00000001
total_memory           : 255
free_memory            : 14
xen_major              : 3
xen_minor              : 1
xen_extra              : .0-53.el5
xen_caps               : xen-3.0-x86_32p 
xen_pagesize           : 4096
platform_params        : virt_start=0xf5800000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20070626 (Red Hat 4.1.2-14)
cc_compile_by          : brewbuilder
cc_compile_domain      : build.redhat.com
cc_compile_date        : Wed Oct 10 16:30:55 EDT 2007
xend_config_format     : 2
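
When this information needs to be checked from a script rather than read by eye, the key : value layout shown above is straightforward to parse. The following is a minimal Python sketch, not part of the Xen tools: the function name parse_xm_info is hypothetical, the sample text is a shortened copy of the output above, and the memory figures are assumed to be in megabytes.

```python
def parse_xm_info(text):
    """Parse `xm info` style output into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only: values such as hw_caps or the
        # version timestamp contain colons themselves.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines or an echoed command line
        info[key.strip()] = value.strip()
    return info


# Shortened copy of the xm info output shown above.
sample = """\
host                   : localhost.localdomain
version                : #1 SMP Wed Oct 10 17:06:12 EDT 2007
total_memory           : 255
free_memory            : 14
xen_major              : 3
xen_minor              : 1
"""

info = parse_xm_info(sample)

# Memory that is not reported as free (assumed to be megabytes).
not_free_mb = int(info["total_memory"]) - int(info["free_memory"])

print("Xen %s.%s: %d of %s MB not free"
      % (info["xen_major"], info["xen_minor"], not_free_mb, info["total_memory"]))
```

On a live system the text passed to the function would come from running xm info itself, for example by capturing the command's output with Python's subprocess module.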


Monitoring Xen Performance with XenMon

The XenMon tool is useful for monitoring the performance of Xen domains, particularly when identifying which domains are responsible for the highest I/O or processing loads on a system.

XenMon is started from the command line using the xenmon.py command. The following figure shows a typical XenMon session:

[Figure: Monitoring Xen Performance with XenMon]

The XenMon display shows two sets of data. On the left-hand side are statistics captured over the preceding 10 seconds, and on the right-hand side are the data for the last 1 second.

For each domain three sets of data are provided. The first row (the grammatically dubious Gotten) for each domain is the amount of time the domain has spent executing. The Blocked row shows statistics for idle time. Finally, the Waited row indicates the amount of time the domain has been in a wait state. For each category, the amount of time spent in that state is displayed, together with that time as a percentage of the corresponding period (i.e. 1 or 10 seconds). The final value depends on the category: for Gotten it represents processor time, for Blocked the average blocked time and for Waited the average waiting time.
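
The percentage column is simply the time spent in a state divided by the length of the sampling window. As a rough illustration, the short Python sketch below performs that calculation. The helper name percent_of_interval and the input figures are hypothetical and are not taken from a real XenMon session, and times are assumed to be expressed in milliseconds.

```python
def percent_of_interval(time_ms, interval_s):
    """Express a time in milliseconds as a percentage of an interval in seconds."""
    return 100.0 * time_ms / (interval_s * 1000.0)


# Hypothetical figures: a domain that executed for 250 ms of the last
# second, and for 1200 ms of the last 10 seconds.
gotten_1s = percent_of_interval(250.0, 1)
gotten_10s = percent_of_interval(1200.0, 10)

print("Gotten: %.1f%% (1 s window), %.1f%% (10 s window)" % (gotten_1s, gotten_10s))
```

A domain whose 1 second percentage is well above its 10 second percentage, as in this made-up case, has recently become busier than its longer-term average.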

When XenMon is exited (using the q key) a summary of data collected during the monitoring session is displayed:

ms_per_sample = 100
Initialized with 1 cpu
CPU Frequency = 2993.98
Event counts:
00000000        Other
00000000        Add Domain
00000000        Remove Domain
00000000        Sleep
00022838        Wake
00022838        Block
00045666        Switch
00000000        Timer Func
00045666        Switch Prev
00045666        Switch Next
00000000        Page Map
00000000        Page Unmap
00000000        Page Transfer
processed 182674 total records in 288 seconds (634 per second)
woke up 288 times in 288 seconds (1 per second)
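
If several monitoring sessions need to be compared, the summary can be reduced to a few numbers with a short script. The following Python sketch is one possible approach and is not part of XenMon: the function name parse_xenmon_summary is hypothetical, and the sample text is a shortened copy of the summary above that omits most of the event types with a zero count.

```python
import re


def parse_xenmon_summary(text):
    """Return (event_counts, records, seconds) from a XenMon exit summary."""
    counts = {}
    records = seconds = None
    for line in text.splitlines():
        # Event lines look like "00022838        Wake".
        m = re.match(r"^(\d{8})\s+(.+)$", line)
        if m:
            counts[m.group(2).strip()] = int(m.group(1))
            continue
        m = re.match(r"^processed (\d+) total records in (\d+) seconds", line)
        if m:
            records, seconds = int(m.group(1)), int(m.group(2))
    return counts, records, seconds


# Shortened copy of the summary shown above.
summary = """\
Event counts:
00000000        Sleep
00022838        Wake
00022838        Block
00045666        Switch
00045666        Switch Prev
00045666        Switch Next
processed 182674 total records in 288 seconds (634 per second)
"""

counts, records, seconds = parse_xenmon_summary(summary)
switches_per_second = counts["Switch"] / float(seconds)

print("%d records in %d s, about %.1f switches per second"
      % (records, seconds, switches_per_second))
```

In this particular sample the individual event counts add up to the total number of records reported on the processed line, which makes a convenient sanity check when parsing.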